veni vidi Scripsi

How I Recovered from a Corrupt Git Repository

How my Git repository got corrupt in the first place

OK, so the fact that my Git repository got corrupt was (sort of) my own fault. And the fact that it was a problem was definitely entirely my own fault.

I had two stashes in my repository and decided for no apparent reason to pop the second stash straight after popping the first stash. Git told me there were conflicts in the second pop. I decided that I shouldn't really have popped them straight after each other (lesson #1 here) and decided to restore a Time Machine backup to get to the state where both stashes were still stashed. And that's where things went horribly wrong (lesson #2).

After restoring the Time Machine backup I couldn't get a tree view in GitX (my GUI of choice). I ran git status and that was fine, then tried to switch branches and that was not so fine: I got a fatal: unable to read tree {commit hash}. Then I ran git fsck and it turned out I had a lot of missing objects and broken links.
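The checks above can be repeated from the command line. This is a minimal sketch, assuming a POSIX shell and a Git version that supports `git -C` (1.8.5 or later); `check_repo` is a hypothetical helper name, not something from my original troubleshooting:

```shell
# Hypothetical helper: run the same two checks against a repository path.
check_repo() {
    repo="${1:-.}"   # defaults to the current directory

    # Working tree state; this may still succeed in a damaged repository.
    git -C "$repo" status --short || return 1

    # Verify the object database; reports missing objects and broken links.
    git -C "$repo" fsck --full
}

# Usage (path is a placeholder):
#   check_repo ~/projects/my-repo && echo "object database looks intact"
```

git fsck exits with a non-zero status when it finds problems, so the function can be used as a condition in a script.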

I Googled some and found several Stack Overflow questions about this topic (this one described my situation best), but didn't really find a solution. Some suggested that it had something to do with the .git folder not being restored, but deliberately restoring it didn't help. I read something about packed and unpacked objects, but I couldn't really work out how that would fix my problem.

Why it was a problem for me

In short: I had a local branch containing a lot of work that hadn't been pushed anywhere, so I couldn't just clone from somewhere and be done. Lesson #3: regularly push your branches to some other (remote) repository while you're working.

How I finally (mostly) recovered from it

Like I said: I couldn't find anything that worked for me on the internet, so in the end I did the following:

  1. Move the corrupt repository somewhere else;
  2. Re-clone the repository (master) from a remote repository (the remote obviously didn't contain my local branch);
  3. Notice that the new clone only contained packed objects in the .git/objects/ folder;
  4. Copy all the unpacked objects (the folders 00 to ff in .git/objects/) from my corrupt repository over to the new clone;
  5. Copy my local branch in .git/refs/heads/ over to the new clone as well.
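For context: Git first writes every new object as a loose (unpacked) file under .git/objects/, in a folder named after the first two hex digits of its hash, and only later bundles objects into pack files under .git/objects/pack/. Steps 4 and 5 could be sketched roughly as below. I did the copying by hand, so this is an illustration rather than the exact commands; `graft_loose_objects` and all paths are hypothetical, and it assumes the branch exists as a plain file in .git/refs/heads/ (not only in packed-refs):

```shell
# Hypothetical helper: copy loose objects and one branch ref from a
# corrupt repository into a fresh clone.
graft_loose_objects() {
    corrupt="$1"   # path to the corrupt repository
    fresh="$2"     # path to the fresh clone
    branch="$3"    # name of the local branch to carry over

    # Loose objects live in two-hex-digit folders; pack/ and info/ are skipped.
    for dir in "$corrupt"/.git/objects/[0-9a-f][0-9a-f]; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        mkdir -p "$fresh/.git/objects/$name"
        for obj in "$dir"/*; do
            [ -f "$obj" ] || continue
            target="$fresh/.git/objects/$name/$(basename "$obj")"
            # Never overwrite an object the fresh clone already has.
            [ -e "$target" ] || cp -p "$obj" "$target"
        done
    done

    # Copy the branch ref itself (a small file containing a commit hash).
    src_ref="$corrupt/.git/refs/heads/$branch"
    dst_ref="$fresh/.git/refs/heads/$branch"
    [ -f "$src_ref" ] || { echo "no loose ref for $branch" >&2; return 1; }
    mkdir -p "$(dirname "$dst_ref")"
    cp "$src_ref" "$dst_ref"
}

# Usage (all names are placeholders):
#   graft_loose_objects ~/corrupt-repo ~/fresh-clone my-local-branch
```

Running git fsck in the fresh clone afterwards is a reasonable way to see whether the copied objects are enough for the branch to be readable.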

I then tried to check out my local branch and that worked. However, when I ran git status, a lot of files showed up as staged for deletion. I had no idea what was going on there, so what I did next was:

  1. Push my branches to a remote repository;
  2. Move this repository to a temporary location in case I might need it again;
  3. Re-clone again from this remote repository.
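These last steps could look something like the sketch below. `push_and_reclone` and its arguments are hypothetical names for illustration, and it pushes a single branch rather than all of them. A push has to send every object reachable from the branch, so a successful push is a reasonable sign that the branch history is intact:

```shell
# Hypothetical helper: push one branch to a remote, then make a clean clone.
push_and_reclone() {
    repo="$1"     # repository holding the rescued branch
    remote="$2"   # URL or path of the remote repository
    branch="$3"   # branch to push
    dest="$4"     # where the new clone should go

    # Push the branch under the same name on the remote.
    git -C "$repo" push "$remote" "refs/heads/$branch:refs/heads/$branch" || return 1

    # Start over from a clean clone of that remote.
    git clone "$remote" "$dest" || return 1

    # Check out the rescued branch; Git creates it from origin/<branch>.
    git -C "$dest" checkout "$branch"
}

# Usage (all names are placeholders):
#   push_and_reclone ~/fresh-clone git@example.com:me/project.git my-local-branch ~/project
```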

I've not got my stashes back, but there wasn't much work in them anyway and I probably could get them back if I tried. I did manage to get back the most important thing: the local branch with the huge amount of work in it.
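I never actually tried this, so treat it as an untested idea rather than part of the recovery: a stash is stored as ordinary commit objects, so if those objects were among the copied loose objects, they should show up as dangling commits. A sketch for listing them, where `list_dangling_commits` is a hypothetical helper name:

```shell
# Hypothetical helper: print the hashes of commits that no branch, tag,
# or other object points to (a dropped or orphaned stash is one of these).
list_dangling_commits() {
    repo="${1:-.}"   # defaults to the current directory
    # --no-reflogs: do not treat reflog entries as references.
    git -C "$repo" fsck --no-reflogs 2>/dev/null |
        awk '$1 == "dangling" && $2 == "commit" { print $3 }'
}

# Usage (path is a placeholder):
#   list_dangling_commits ~/fresh-clone
```

Each hash can then be inspected with git show, and a commit that looks like a stash can be applied with git stash apply followed by the hash.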

What I've learned

Three things really (in order of importance):

  1. Regularly push your local stuff to a (remote) repository;
  2. Don't pop your stashes one after the other without first examining the changes being applied (I might also ask myself: should I be using stashes this much? And isn't using Time Machine to correct your Git stuff really hackish and prone to problems?);
  3. You cannot rely on Time Machine to restore your Git repository correctly for you (which goes hand in hand with no. 1).

Disclaimer

The reason I didn't regularly push my branch to a remote repository was that this branch could not yet be merged with master and therefore frequently had to be rebased on the master branch. Pushing this branch to a remote repository, then rebasing it on master and trying to push it again is usually a bad idea, because the rebase rewrites commits that have already been pushed. However, it can be done, and since I'm not going to share this branch with anyone, it would probably have been a better idea to do it anyway.
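For a branch that nobody else uses, one way to do this is a forced push after each rebase. The sketch below is an assumption about how that could look, not something I did at the time; `rebase_and_push` is a hypothetical helper, and it assumes the branch has been pushed to the remote at least once before:

```shell
# Hypothetical helper: rebase a private branch on master and update its
# backup copy on the remote. Run it from inside the repository.
rebase_and_push() {
    branch="$1"             # private work branch
    remote="${2:-origin}"   # remote that holds the backup copy

    git checkout "$branch" || return 1
    git rebase master || return 1

    # A plain push would be rejected because history was rewritten.
    # --force-with-lease overwrites the remote branch only if it still
    # matches what this repository last saw there.
    git push --force-with-lease "$remote" "$branch"
}

# Usage (branch name is a placeholder):
#   rebase_and_push my-local-branch origin
```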
