Git Clone vs cp -R --> WTF?
I knew git was fast, and I even knew it was faster than a lot of plain Linux local file operations. Still, this blew me away:
rsanheim@ares:~/src/personal/oss $ du -hd 0 insoshi/
26M insoshi/
rsanheim@ares:~/src/personal/oss $ time git clone insoshi/ /tmp/insoshi
Initialize /tmp/insoshi/.git
Initialized empty Git repository in /private/tmp/insoshi/.git/
Checking out files: 100% (2193/2193), done.
real 0m3.826s
user 0m0.251s
sys 0m0.658s
rsanheim@ares:~/src/personal/oss $ time cp -R insoshi/ /tmp/insoshi_cp
real 0m9.065s
user 0m0.114s
sys 0m1.442s
OK, so a 26 MB repo takes more than twice as long to copy with a recursive cp as it does with a local git clone (about 9.1s vs 3.8s). That's a fairly small repo, so let's try something bigger:
rsanheim@ares:~/src/relevance $ du -hd 0 rails
75M rails
rsanheim@ares:~/src/relevance $ time git clone rails /tmp/rails2
Initialize /tmp/rails2/.git
Initialized empty Git repository in /private/tmp/rails2/.git/
real 0m2.321s
user 0m0.151s
sys 0m0.465s
rsanheim@ares:~/src/relevance $ time cp -R rails/ /tmp/rails
real 0m7.133s
user 0m0.067s
sys 0m1.505s
At 75 MB, the rails repo still clones about 3 times faster than it copies (about 2.3s vs 7.1s).
Obviously, this is not scientific at all, but the point is pretty clear. Git is doing some magic that lets it clone a local repo 2 to 3 times faster than a plain recursive copy. From looking at the man page, I would guess it has something to do with git using hard links for the files in .git/objects when cloning locally. My Linux-fu falls down a bit here – what are the ramifications of using hard links versus doing a “real” copy?
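As a rough way to poke at the hard link question, here is a minimal sketch that compares a hard link with a real copy in a throwaway directory. The file names (original, hardlink, copy) and the scratch directory are made up for illustration and have nothing to do with git's own layout. The short version: a hard link is just a second name for the same data on disk, so nothing is copied, and it only works within a single filesystem.

```shell
#!/bin/sh
# Sketch: hard link vs. real copy, in a scratch directory (all names hypothetical).
set -e

demo=$(mktemp -d "${TMPDIR:-/tmp}/hardlink-demo.XXXXXX")
cd "$demo"

printf 'object data\n' > original
ln original hardlink   # second name for the same inode; no data is copied
cp original copy       # independent file with its own inode and its own data

# The second column of `ls -l` is the link count: how many names point at the inode.
links_original=$(ls -l original | awk '{print $2}')
links_copy=$(ls -l copy | awk '{print $2}')
echo "link count: original=$links_original copy=$links_copy"

# Writing through one name changes what the other name sees; the copy is unaffected.
printf 'appended\n' >> hardlink
lines_original=$(wc -l < original | tr -d ' ')
lines_copy=$(wc -l < copy | tr -d ' ')
echo "lines after append: original=$lines_original copy=$lines_copy"

# Removing one name does not remove the data while another name still exists.
rm original
survivor=$(head -n 1 hardlink)
echo "after rm original, hardlink still reads: $survivor"
```

So the main ramification is shared state: an in-place edit through either name shows up in both, while deleting one name is harmless to the other. As far as I understand it, that sharing is safe for .git/objects because git does not rewrite object files once they exist. One way to check the hard link guess would be to time a clone with the --no-hardlinks option of git clone, which forces a real copy of the files under .git/objects, and compare it against the numbers above.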
(This also makes me want to try out gitbak even more…)