Do I ever need to run git gc on a bare repo?
As Jefromi commented on Dan's answer, git gc should be called automatically during "normal" use of a bare repository.
I just ran git gc --aggressive on two bare, shared repositories that have been actively used: one with about 38 commits over the past 3-4 weeks, and the other with about 488 commits over roughly 3 months. Nobody has manually run git gc on either repository.
$ git count-objects
333 objects, 595 kilobytes
$ git count-objects -v
count: 333
size: 595
in-pack: 0
packs: 0
size-pack: 0
prune-packable: 0
garbage: 0
$ git gc --aggressive
Counting objects: 325, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (323/323), done.
Writing objects: 100% (325/325), done.
Total 325 (delta 209), reused 0 (delta 0)
Removing duplicate objects: 100% (256/256), done.
$ git count-objects -v
count: 8
size: 6
in-pack: 325
packs: 1
size-pack: 324
prune-packable: 0
garbage: 0
$ git count-objects
8 objects, 6 kilobytes
$ git count-objects
4315 objects, 11483 kilobytes
$ git count-objects -v
count: 4315
size: 11483
in-pack: 9778
packs: 20
size-pack: 15726
prune-packable: 1395
garbage: 0
$ git gc --aggressive
Counting objects: 8548, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (8468/8468), done.
Writing objects: 100% (8548/8548), done.
Total 8548 (delta 7007), reused 0 (delta 0)
Removing duplicate objects: 100% (256/256), done.
$ git count-objects -v
count: 0
size: 0
in-pack: 8548
packs: 1
size-pack: 8937
prune-packable: 0
garbage: 0
$ git count-objects
0 objects, 0 kilobytes
I wish I had thought of it before I gc'ed these two repositories: I should have run git gc without the --aggressive option first to see the difference. Luckily I have a medium-sized active repository left to test (164 commits over nearly 2 months).
$ git count-objects -v
count: 1279
size: 1574
in-pack: 2078
packs: 6
size-pack: 2080
prune-packable: 607
garbage: 0
$ git gc
Counting objects: 1772, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (1073/1073), done.
Writing objects: 100% (1772/1772), done.
Total 1772 (delta 1210), reused 1050 (delta 669)
Removing duplicate objects: 100% (256/256), done.
$ git count-objects -v
count: 0
size: 0
in-pack: 1772
packs: 1
size-pack: 1092
prune-packable: 0
garbage: 0
$ git gc --aggressive
Counting objects: 1772, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (1742/1742), done.
Writing objects: 100% (1772/1772), done.
Total 1772 (delta 1249), reused 0 (delta 0)
$ git count-objects -v
count: 0
size: 0
in-pack: 1772
packs: 1
size-pack: 1058
prune-packable: 0
garbage: 0
git gc clearly made a large dent in the count-objects numbers, even though we regularly push to and fetch from this repository. But upon reading the manpage for git config, I noticed that the default loose object limit (gc.auto) is 6700, which we apparently had not yet reached.
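To see how far a repository is from that limit, the loose object count reported by git count-objects -v can be compared against the threshold. This is only a rough sketch: the helper name loose_over_threshold is made up for illustration, and git's own gc --auto check may not use this exact count, so treat the answer as an approximation.

```shell
# Hypothetical helper: reads `git count-objects -v` output on stdin and
# prints "yes" if the loose object count exceeds the given threshold.
loose_over_threshold() {
    threshold=$1
    count=$(awk -F': ' '$1 == "count" { print $2 }')
    if [ "$count" -gt "$threshold" ]; then
        echo yes
    else
        echo no
    fi
}

# Sample input taken from the medium-sized repository above (1279 loose objects).
printf 'count: 1279\nsize: 1574\n' | loose_over_threshold 6700
```

In a real repository you would pipe the live output instead, for example git count-objects -v | loose_over_threshold 6700.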
So it appears that the conclusion is no, you don't need to run git gc manually on a bare repo;* but with the default setting for gc.auto, it might be a long time before garbage collection occurs automatically.
* Generally, you shouldn't need to run git gc. But sometimes you might be strapped for space, in which case you should run git gc manually or set gc.auto to a lower value. My case for the question was simple curiosity, though.
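Lowering gc.auto for one repository might look like the sketch below. It works on a throwaway bare repository created with mktemp so that it can be run safely; the value 1000 is an arbitrary example, not a recommendation.

```shell
# Throwaway bare repository so the sketch does not touch a real one.
repo=$(mktemp -d)
git init --bare --quiet "$repo"

# 1000 is an arbitrary example value; the default is 6700.
git -C "$repo" config gc.auto 1000

# Read the setting back to confirm it was stored.
threshold=$(git -C "$repo" config --get gc.auto)
echo "gc.auto is now $threshold"
```

For an existing shared repository, the same git config gc.auto command would be run inside that repository's directory.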
git-gc man page:
Users are encouraged to run this task on a regular basis within each repository to maintain good disk space utilization and good operating performance.
Emphasis mine. Bare repositories are repositories too!
Further explanation: one of the housekeeping tasks that git-gc performs is packing and repacking of loose objects. Even if you never have any dangling objects in your bare repository, you will -- over time -- accumulate lots of loose objects. These loose objects should periodically get packed, for efficiency. Similarly, if a large number of packs accumulate, they should periodically get repacked into larger (fewer) packs.
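The packing of loose objects can be watched on a small scale. The sketch below uses a throwaway, non-bare repository (simpler to commit into than a bare one) and a hypothetical loose_count helper; the user name and email are placeholders.

```shell
# Hypothetical helper: print the loose object count of the repository at $1.
loose_count() {
    git -C "$1" count-objects -v | awk -F': ' '$1 == "count" { print $2 }'
}

# Throwaway repository with a single commit (placeholder identity).
repo=$(mktemp -d)
git init --quiet "$repo"
echo hello > "$repo/file.txt"
git -C "$repo" add file.txt
git -C "$repo" -c user.name=Example -c user.email=example@example.com \
    commit --quiet -m "first commit"

before=$(loose_count "$repo")   # blob, tree and commit are still loose
git -C "$repo" gc --quiet       # pack them
after=$(loose_count "$repo")

echo "loose objects before gc: $before, after gc: $after"
```

The same before/after comparison applies to a bare repository, where the loose objects arrive through pushes rather than local commits.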
The issue with git gc --auto is that it can be blocking. But with the new (Git 2.0, Q2 2014) setting gc.autodetach, you can now do it without any interruption:
See commit 4c4ac4d and commit 9f673f9 (Nguyễn Thái Ngọc Duy, aka pclouds):
gc --auto takes time and can block the user temporarily (but not any less annoyingly).
Make it run in background on systems that support it.
The only thing lost with running in background is printouts. But gc output is not really interesting.
You can keep it in foreground by changing
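For the gc.autodetach setting mentioned above, keeping gc --auto in the foreground would look roughly like this. The sketch uses a throwaway bare repository so it is safe to run as-is; on a real repository only the git config line is needed.

```shell
# Throwaway bare repository so the sketch does not touch a real one.
repo=$(mktemp -d)
git init --bare --quiet "$repo"

# gc.autodetach defaults to true (run in background where supported);
# false keeps `gc --auto` in the foreground.
git -C "$repo" config gc.autodetach false

# Read the setting back as a normalized boolean.
detach=$(git -C "$repo" config --bool --get gc.autodetach)
echo "gc.autodetach is $detach"
```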
Note: only Git 2.7 (Q4 2015) will make sure not to lose the error message.
See commit 329e6e8 (19 Sep 2015) by Nguyễn Thái Ngọc Duy (
(Merged by Junio C Hamano -- gitster -- in commit 076c827, 15 Oct 2015)
gc: save log from daemonized gc --auto and print it next time
While commit 9f673f9 (gc: config option for running --auto in background - 2014-02-08) helps reduce some complaints about 'gc --auto' hogging the terminal, it creates another set of problems.
The latest in this set is, as the result of daemonizing, stderr is closed and all warnings are lost. This warning at the end of cmd_gc() is particularly important because it tells the user how to avoid "gc --auto" running repeatedly.
Because stderr is closed, the user does not know, naturally they complain about 'gc --auto' wasting CPU.
gc --auto will not run and gc.log printed out until the user removes
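Checking for a leftover gc.log and clearing it could be scripted as in the sketch below. It assumes the log sits directly inside the repository's git directory (the repository itself for a bare repo); the helper name and the demo message are made up, and the demo runs against a stand-in directory rather than a real repository.

```shell
# Hypothetical helper: show a leftover gc.log from the given git directory,
# then delete it so that `gc --auto` is allowed to run again.
show_and_clear_gc_log() {
    log="$1/gc.log"
    if [ -f "$log" ]; then
        cat "$log"
        rm -- "$log"
    fi
}

# Demonstration against a stand-in directory with a made-up log message.
gitdir=$(mktemp -d)
echo "example warning (made up for this demo)" > "$gitdir/gc.log"
show_and_clear_gc_log "$gitdir"
```

It is probably wise to read what the log says before deleting it, since the warning usually points at the underlying problem.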
Some operations run git gc --auto automatically, so there should never be a need to run git gc manually; git should take care of this by itself.
Contrary to what bwawok said, there actually is (or might be) a difference between your local repo and that bare one: what operations you do with it. For example, dangling objects can be created by rebasing, but you may never rebase the bare repo, so you may never need to remove them (because there are never any). And thus you may not need to use git gc that often. But then again, like I said, git should take care of this automatically.
Updated on October 16, 2020
Ralph Sinsuat over 2 years
man git-gc doesn't have an obvious answer in it, and I haven't had any luck with Google either (although I might have just been using the wrong search terms).
I understand that you should occasionally run git gc on a local repository to prune dangling objects and compress history, among other things -- but is a shared bare repository susceptible to these same issues?
If it matters, our workflow is multiple developers pulling from and pushing to a bare repository on a shared network drive. The "central" repository was created with git init --bare --shared.
VonC about 9 years
Note: setting gc.autodetach (Git 2.0, Q2 2014) can help run git gc --auto without blocking the user. See my answer below.
Ralph Sinsuat almost 13 years
+1 Thanks for clarifying one of the reasons that gc might be necessary on a bare repo.
Cascabel almost 13 years
It's definitely true that gc needs to be run on all repos, bare or not. It's also true that enough commands run it automatically that you essentially never have to. In the case of a bare repo, it's gc --auto. (Sometimes you may want to manually run git gc --aggressive, which will "more aggressively optimize the repository at the expense of taking much more time", but you may not find that to be important.)
Dan Moulding almost 13 years
@Jefromi: I agree. The problem is that it doesn't seem to be very well documented which commands run git gc --auto. I checked the git-receive-pack man page before writing my answer, and there's no mention of it there. So for the average user, I think it's difficult to know if git gc needs to be manually run. The fact that the git gc man page still recommends that users do run it manually seems to only add more confusion! Perhaps this is something that should be mentioned on the mailing list.
Cascabel almost 13 years
Yeah, git's documentation unfortunately can be a bit spotty sometimes. Maybe if I get ambitious I'll submit a patch. From a quick survey of the source: rebase --interactive, and gc --auto directly. That's not a complete list, though, since other commands may call those.
Tino about 9 years
git gc --help also mentions option git prune which might come in handy in bare repos, depending on the usage type