
Category: LINUX

2012-01-12 23:30:34

Git is one hell of a powertool.

Like with any such tool, as soon as you get to know it enough, you start pushing the boundaries. Git gives you a lot of control over your repository:

The list goes on…

More traditional version control systems don’t give you as much power as Git by any stretch of the imagination. They are like taking a walk in the woods with your parents, at age 14.

You’re probably gonna see and do neat stuff, but you sure ain’t gonna get lost or anything.

Using Git on the other hand is more akin to being handed a cool motocross to go play alone in the woods… Also at age 14.

Insane motocross shit

We all know what’s bound to happen, right?

You’ll smash into a tree.

The source control equivalent to slamming into a tree is losing commits. Getting all of Git’s power and flexibility at once can be somewhat dangerous. You’ll find it so easy and helpful to branch and merge that you’ll start doing it way more often. On the other hand — especially in the beginning — you’ll misunderstand or plainly miss some important warnings, and make errors. Or you may just end up in weird merging situations you never thought of, and don’t necessarily understand. These situations can often result in losing commits or whole branches.

My goal with this article is to make sure you understand the situation you’re really in: you have temporarily lost commits or branches.

Disclaimer

This article assumes a basic knowledge of how git works, e.g. committing, branching and merging.

My first time

The first time I lost a commit was a good while ago. I can’t remember the details, but basically I got bitten by the fact that under the covers, Git uses hard links liberally. Which means that copying and pasting your code directory as a recovery solution isn’t going to save you when you attempt a potentially damaging operation you don’t fully understand.

Note that compressing your code directory will, though.

So there I was, after attempting an operation I didn’t really understand. I knew I had failed what I attempted and I knew I had lost my last commit. Ironically, I still had Gitk open, displaying that very commit. As long as I didn’t refresh the Gitk view with F5 I could see the lost commit.

Here’s a fun fact: under OSX (not sure about Linux) you cannot select and copy text from Gitk’s interface, except for the SHA1 field [1]. I knew Git probably had a way to recover from that… But you know, I just wanted to get back to work and NOT search documentation and blog posts endlessly.

So I took screenshots, passed them real quick through GOCR, just to see how far it would get.

The result: GOCR doesn’t like the font Monaco :-)

How to (really) recover lost commits with Git

Recently I lost a commit again. This time however, Gitk was not up to date. I knew I’d just lost something I wouldn’t necessarily remember in its entirety. It was a commit an hour old, touching many files. And I have a crappy memory.

This time I had to do it the right way. I found out it’s really easy (once you figure it out), but I found no really clear explanation anywhere. So here goes.

Initial setup

If you wanna follow along (and I strongly recommend it), here are the few boring steps to create a dummy repo and bring it up to speed for the rest of this article. We’re going to beat the hell out of this repo and it’s going to be fun.

So just paste the following into a console:

mkdir recovery
cd recovery
git init
touch file
git add file
git commit -m "First commit"
echo "Hello World" > file
git add .
git commit -m "Greetings"
git branch cool_branch
git checkout cool_branch
echo "What up world?" > cool_file
git add .
git commit -m "Now that was cool"
git checkout master
echo "What does that mean?" >> file

Ok, let’s look at where we’re at:

gitk --all &

The --all option lets you see all branches at the same time, as well as your stashes.


Initial setup - Recovering git commits

We can see the cool_branch as well as some yet uncommitted changes over the master branch.
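If Gitk is not at hand, a text-mode overview is a reasonable substitute. The following is a minimal sketch, not part of the original walkthrough: it rebuilds a throwaway copy of the repo in a temporary directory (the `demo` identity and the `repo`, `graph` and `changes` variables are made up for the illustration) and prints the commit graph plus the pending changes.

```shell
set -eu
# Throwaway identity so the commits work on a machine without Git config (assumption)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the branch name used in this article

echo "Hello World" > file
git add file
git commit -q -m "Greetings"

git checkout -q -b cool_branch
echo "What up world?" > cool_file
git add cool_file
git commit -q -m "Now that was cool"
git checkout -q master

# Text-mode overview of every branch
graph=$(git log --graph --oneline --decorate --all)
echo "$graph"

# Uncommitted changes are not part of the graph, so list them separately
echo "What does that mean?" >> file
changes=$(git status --short)
echo "$changes"
```

The graph only knows about commits, which is why the pending change to `file` has to be checked with `git status`.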

mathieu@ml recovery (master)$ ls -l
total 16
-rw-r--r-- 1 mathieu staff 15B 7 Jun 18:19 cool_file
-rw-r--r-- 1 mathieu staff 33B 7 Jun 18:19 file

Got my 2 files, I’m good to go.

Let’s make a mistake

Let’s say I decide I want to bring in these cool changes in master. I’ll do it with a rebase. I know there’s no big risk of conflicts so that’s a no-brainer.

mathieu@ml recovery (master)$ git rebase cool_branch
file: needs update

My ugly mug

Now if you look carefully you’ll notice I wasn’t paying attention when Git gave me a feeble complaint about ‘file’.

Everything’s well, so I think “Ok, I don’t need cool_branch anymore”.

mathieu@ml recovery (master)$ git branch -d cool_branch
error: The branch 'cool_branch' is not an ancestor of your current HEAD.
If you are sure you want to delete it, run 'git branch -D cool_branch'.

Huh? Whatever you say, Linus. Let’s get on with it.

mathieu@ml recovery (master)$ git branch -D cool_branch
Deleted branch cool_branch.

Ahh, it feels good to be a Git ninja. Now let’s see where we’re at and refresh Gitk with F5.

Gitk - oh shit moment

Oops, my cool commit is gone! That thing can’t be right. Let’s panic:

mathieu@ml recovery (master)$ ls
file
mathieu@ml recovery (master)$ git status
# On branch master
# Changed but not updated:
# (use "git add ..." to update what will be committed)
#
# modified: file
#
no changes added to commit (use "git add" and/or "git commit -a")
mathieu@ml recovery (master)$ git diff
diff --git a/file b/file
index 557db03..f2a8bf3 100644
--- a/file
+++ b/file
@@ -1 +1,2 @@
 Hello World
+What does that mean?

Oh shit face
Oh sh!t

So the ‘file: needs update’ message back there meant that the rebase didn’t happen, because I had pending changes.

Helpful.
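One way to avoid this trap is to check for pending changes before rebasing, and to check that the branch really made it into master before deleting it. Here is a minimal sketch of both checks in a throwaway repo; the `demo` identity and the `rebased` and `merged` variables are assumptions made for the illustration, not something from the original session.

```shell
set -eu
# Throwaway identity so the commits work on a machine without Git config (assumption)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo "Hello World" > file
git add file
git commit -q -m "Greetings"
git checkout -q -b cool_branch
echo "What up world?" > cool_file
git add cool_file
git commit -q -m "Now that was cool"
git checkout -q master
echo "What does that mean?" >> file      # the pending change that blocked the rebase

# Check 1: refuse to rebase while tracked files have uncommitted changes
if [ -n "$(git status --porcelain --untracked-files=no)" ]; then
    rebased=no
    echo "working tree is dirty: commit or stash first"
else
    git rebase cool_branch
    rebased=yes
fi

# Check 2: list the branches whose tip is already contained in master;
# a branch missing from this list is not safe to delete yet
merged=$(git branch --merged master)
echo "$merged"
```

In this run the rebase is skipped and `cool_branch` does not show up in the merged list, which is exactly the warning that `git branch -d` was trying to give.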

Recovering a lost commit

Since I don’t think my uncommitted work is complete, I’ll just stash it instead of committing it. Then I’ll hunt down my lost work.

mathieu@ml recovery (master)$ git stash save "Questioning the universe"
Saved working directory and index state "On master: Questioning the universe"
HEAD is now at 6da726f... Greetings

In the name of paranoia, let’s make sure this got in right:

In a paranoia moment, we make sure the stash is saved correctly

Ok, let’s get on with our rescue mission:

mathieu@ml recovery (master)$ git fsck --lost-found
dangling commit 93b0c51cfea8c731aa385109b8e99d19b38a55be

That sounds right, exactly one commit in the lost and found.
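With more than one dangling commit, reading the raw list gets tedious. The sketch below reproduces the lost branch in a throwaway repo and prints a one-line summary for each dangling commit. Two assumptions to flag: the `demo` identity and the `lost` and `found` variables are invented for the example, and `--no-reflogs` is added because on recent Git versions a commit that is still mentioned in the reflog may not be reported as dangling otherwise.

```shell
set -eu
# Throwaway identity so the commits work on a machine without Git config (assumption)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo "Hello World" > file
git add file
git commit -q -m "Greetings"
git checkout -q -b cool_branch
echo "What up world?" > cool_file
git add cool_file
git commit -q -m "Now that was cool"
lost=$(git rev-parse HEAD)        # remembered only so the sketch can check itself
git checkout -q master
git branch -q -D cool_branch      # the mistake

# Keep only the IDs of dangling commits, ignoring what the reflog still references
found=$(git fsck --no-reflogs --lost-found 2>/dev/null |
        awk '$1 == "dangling" && $2 == "commit" {print $3}')

# One summary line per candidate: short ID and subject
for sha in $found; do
    git show -s --format='%h %s' "$sha"
done
```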

Let’s just make sure:

mathieu@ml recovery (master)$ git show 93b0c51cfea8c731aa385109b8e99d19b38a55be | mate

We see in textmate that this is our lost commit

Bingo!
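A commenter on the original post suggests `git log -g` as another way to find the commit. It walks the reflog, which is the record of every position HEAD has been in. A minimal sketch in a throwaway repo follows; the `demo` identity and the `lost` and `found` variables are assumptions for the illustration.

```shell
set -eu
# Throwaway identity so the commits work on a machine without Git config (assumption)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo "Hello World" > file
git add file
git commit -q -m "Greetings"
git checkout -q -b cool_branch
echo "What up world?" > cool_file
git add cool_file
git commit -q -m "Now that was cool"
lost=$(git rev-parse HEAD)        # remembered only so the sketch can check itself
git checkout -q master
git branch -q -D cool_branch      # the mistake

# Every place HEAD has been: short ID, reflog selector, what happened
git log -g --format='%h %gd %gs'

# Pick the newest reflog entry that was recorded by committing with that subject
found=$(git log -g --format='%H %gs' |
        awk '/commit: Now that was cool/ {print $1; exit}')
echo "$found"
```

Unlike `git fsck`, the reflog also tells you what you were doing when the commit was created, which helps when several lost commits look alike.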

Different ways to recover the commit

There are a few different ways to recover that commit. Obviously we can just copy and paste that snippet, but in the case of a bigger commit, that approach will just amount to a lot of error-prone busywork.

I’ll reclaim my Git ninja status and try it a few different ways.

Recover it with rebase

Let’s just replay this change on top of master:

mathieu@ml recovery (master)$ git rebase 93b0c51cfea8c731aa385109b8e99d19b38a55be
First, rewinding head to replay your work on top of it...
HEAD is now at 93b0c51... Now that was cool
Fast-forwarded master to 93b0c51cfea8c731aa385109b8e99d19b38a55be.

Commit recovered with rebase

Neat! Now I feel like a ninja worthy of the title again.

So let’s rewind one commit and try it another way.

mathieu@ml recovery (master)$ git reset --hard HEAD^
HEAD is now at 6da726f... Greetings

Rewinding to a state where we’ve lost our commit

Ok, the commit’s gone.

(Don’t tell anyone but my inner ninja is feeling queasy again.)

Recover it with merge

There are cases where rebase is not powerful enough. For example when you expect to face a lot of conflicts. In this case merge is a better solution:

mathieu@ml recovery (master)$ git merge 93b0c51cfea8c731aa385109b8e99d19b38a55be
Updating 6da726f..93b0c51
Fast forward
 cool_file | 1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 cool_file

Commit recovered with merge

Too easy… Rewind!

mathieu@ml recovery (master)$ git reset --hard HEAD^
HEAD is now at 6da726f... Greetings

Recover it with cherry-pick

If instead you had a few commits one after another but you just want to pick the last one, rebase and merge won’t do. They would bring the whole branch back in master. That’s a situation for cherry-pick.

mathieu@ml recovery (master)$ git cherry-pick 93b0c51cfea8c731aa385109b8e99d19b38a55be
Finished one cherry-pick.
Created commit f443703: Now that was cool
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 cool_file

Commit recovered with cherry-pick
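The dummy repo only has one lost commit, so it cannot show the difference with merge and rebase. Here is a sketch of the "few commits one after another" case, in a throwaway repo where the lost branch has two commits and only the last one is picked. The file names, commit messages, `demo` identity and `tip` variable are all made up for the illustration.

```shell
set -eu
# Throwaway identity so the commits work on a machine without Git config (assumption)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo "Hello World" > file
git add file
git commit -q -m "Greetings"

# A branch with two commits, then lost
git checkout -q -b cool_branch
echo one > first_file
git add first_file
git commit -q -m "First lost commit"
echo two > second_file
git add second_file
git commit -q -m "Second lost commit"
tip=$(git rev-parse HEAD)
git checkout -q master
git branch -q -D cool_branch

# Bring back only the change made by the last commit
git cherry-pick "$tip" > /dev/null
git log --oneline
ls
```

After the cherry-pick, master has `second_file` but not `first_file`. A merge or rebase onto the same ID would have brought both.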

Insane!
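There is also a route that does not touch master at all: since a branch is just a name pointing at a commit, the deleted branch can be re-created at the recovered ID. A minimal sketch, assuming a throwaway repo and a made-up `demo` identity; `lost` stands in for the ID reported by `git fsck`.

```shell
set -eu
# Throwaway identity so the commits work on a machine without Git config (assumption)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo "Hello World" > file
git add file
git commit -q -m "Greetings"
git checkout -q -b cool_branch
echo "What up world?" > cool_file
git add cool_file
git commit -q -m "Now that was cool"
lost=$(git rev-parse HEAD)        # stands in for the ID found with git fsck
git checkout -q master
git branch -q -D cool_branch      # the mistake

# Put the branch name back on the lost commit; master is left untouched
git branch cool_branch "$lost"
git log --oneline cool_branch
```

This puts the repo back in the state it was in before the deletion, so the merge or rebase can then be retried calmly.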

This only leaves one open question: WHO’S YOUR DADDY NOW, GIT?

Now that we’ve established the answer to that question, let’s get back to work!

Let’s make a second mistake

mathieu@ml recovery (master)$ git stash clear

Or was it Git stash apply?

Oops! Accidentally lost the stash

Oh jeez, there we go again…

mathieu@ml recovery (master)$ git fsck --lost-found
dangling commit 24e3752f7a73ae98b361ce1c260e1f285d653447
dangling commit 93b0c51cfea8c731aa385109b8e99d19b38a55be

Ok, we still see the one we lost earlier, 93b0c51… Let’s look at the other one.

mathieu@ml recovery (master)$ git show 24e3752f7a73ae98b361ce1c260e1f285d653447
commit 24e3752f7a73ae98b361ce1c260e1f285d653447
Merge: 6da726f... c90f079...
Author: Mathieu Martin
Date: Sat Jun 7 16:02:57 2008 -0400

    On master: Questioning the universe

diff --cc file
index 557db03,557db03..f2a8bf3
--- a/file
+++ b/file
@@@ -1,1 -1,1 +1,2 @@@
  Hello World
++What does that mean?

Spot on. Let’s try something wild, while we’re here.

mathieu@ml recovery (master)$ git checkout 24e3752f7a73ae98b361ce1c260e1f285d653447
Note: moving to "24e3752f7a73ae98b361ce1c260e1f285d653447" which isn't a local branch
If you want to create a new branch from this checkout, you may do so
(now or later) by using -b with the checkout command again. Example:
git checkout -b
HEAD is now at 24e3752... On master: Questioning the universe
mathieu@ml recovery (24e3752...)$

As you may have noticed, my console always indicates which branch I’m in, so far [2]. But now I seem to be in some kind of twilight zone, which Gitk confirms.

Oops! Accidentally lost the stash

Let’s follow Git’s suggestion and make that a branch.

mathieu@ml recovery (24e3752...)$ git checkout -b recovery
Switched to a new branch "recovery"
mathieu@ml recovery (recovery)$

Stash recovered as a branch

Looks weird, like stashed items always do, but at least we have our commit.

After fiddling around with what’s been recovered from the stash, I recommend NOT keeping it as a commit.

If you try to replay the change in the recovery branch over master’s most recent commit, you lose the “Questioning the universe” commit. Probably because a stash is a weird kind of commit, or maybe because of a bug. I don’t know.

(Don’t follow this one in your console)

mathieu@ml recovery (recovery)$ git rebase master #I said don't do this one
First, rewinding head to replay your work on top of it...
HEAD is now at 93b0c51... Now that was cool
Nothing to do.

Rebasing the recovered stash over master doesn’t work
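A likely explanation, offered here as an educated guess rather than something the original session verified: a stash is stored as a merge commit (note the "Merge:" line in the `git show` output earlier), and a plain rebase drops merge commits instead of replaying them. The sketch below checks the first half of that claim in a throwaway repo by counting the parents of a fresh stash. The `demo` identity and the variable names are assumptions.

```shell
set -eu
# Throwaway identity so the commits work on a machine without Git config (assumption)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo "Hello World" > file
git add file
git commit -q -m "Greetings"

# Stash a pending change, like in the walkthrough
echo "What does that mean?" >> file
git stash push -q -m "Questioning the universe"

# Print the stash commit followed by its parents, then count the parents
parents=$(git rev-list --parents -n 1 refs/stash)
echo "$parents"
set -- $parents
parent_count=$(( $# - 1 ))
echo "stash commit has $parent_count parents"

# The first parent is the commit HEAD was on when the stash was made
first_parent=$(git rev-parse 'refs/stash^1')
```

Two parents means Git treats it as a merge, which would account for the "Nothing to do." answer.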

If instead I checkout master and then rebase its last change over the ‘recovery’ branch it seems to work.

Recovered stash back in master

However since I just saw a commit disappear when rebasing the other way around, I get the feeling that this isn’t a normal commit and it may come back to haunt me later.
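If the dangling commit really is a stash, another option is to hand its ID straight back to `git stash apply`, which accepts a commit created by the stash machinery and not just an entry of the stash list. A minimal sketch in a throwaway repo; the `demo` identity and the `lost_stash` variable are assumptions for the illustration.

```shell
set -eu
# Throwaway identity so the commits work on a machine without Git config (assumption)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo "Hello World" > file
git add file
git commit -q -m "Greetings"

# Stash a pending change, then lose the stash
echo "What does that mean?" >> file
git stash push -q -m "Questioning the universe"
git stash clear                   # the mistake

# The stash commit is now dangling; grab its ID
lost_stash=$(git fsck --no-reflogs --lost-found 2>/dev/null |
             awk '$1 == "dangling" && $2 == "commit" {print $3; exit}')

# Re-apply it to the working tree as an uncommitted change
git stash apply -q "$lost_stash"
git status --short
```

The change comes back as a pending modification, which matches what the stash was meant to hold in the first place.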

Recover it by applying a diff

Let’s just apply the diff to master. I’ll do as if it actually was a substantial commit, involving lots of modifications on lots of files, and apply it automatically with ‘git apply’.

First let’s visualize where we’re at, again:

Stash recovered as a branch

A diff against master is not what we want since master includes a new (very cool) commit.

Instead we just want to see the changes introduced by the current commit. To do this we can compare it with the common ancestor between the master and recovery branches. So let’s start by finding its ID.

Finding the ID of the common ancestor
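The common ancestor can also be found without a graphical tool, using `git merge-base`. The sketch below builds a small throwaway repo with the same shape (a `recovery` branch and a `master` that has moved on) and diffs against the ancestor. Here the `recovery` branch is an ordinary commit rather than a recovered stash, and the `demo` identity and variable names are assumptions.

```shell
set -eu
# Throwaway identity so the commits work on a machine without Git config (assumption)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo "Hello World" > file
git add file
git commit -q -m "Greetings"
base=$(git rev-parse HEAD)        # remembered only so the sketch can check itself

# recovery branches off here with one change...
git checkout -q -b recovery
echo "What does that mean?" >> file
git commit -q -a -m "Questioning the universe"

# ...while master gets a different commit
git checkout -q master
echo "What up world?" > cool_file
git add cool_file
git commit -q -m "Now that was cool"

# ID of the most recent commit shared by both branches
ancestor=$(git merge-base master recovery)
echo "$ancestor"

# Changes made on recovery since that point
# (git diff master...recovery, with three dots, is shorthand for the same thing)
git diff "$ancestor" recovery
```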

mathieu@ml recovery (recovery)$ git diff 6da726f37683c83947d54314cd32ca1ee9d490e0
diff --git a/file b/file
index 557db03..f2a8bf3 100644
--- a/file
+++ b/file
@@ -1 +1,2 @@
 Hello World
+What does that mean?

Looks good. Now we throw that diff upstairs.

git diff 6da726f37683c83947d54314cd32ca1ee9d490e0 > ../recovery.diff

Then apply it to our master branch.

mathieu@ml recovery (recovery)$ git checkout master
Switched to branch "master"
mathieu@ml recovery (master)$ git apply ../recovery.diff

And we finally confirm that everything’s under control.

mathieu@ml recovery (master)$ git status
# On branch master
# Changed but not updated:
# (use "git add ..." to update what will be committed)
#
# modified: file
#
no changes added to commit (use "git add" and/or "git commit -a")
mathieu@ml recovery (master)$ git diff
diff --git a/file b/file
index 557db03..f2a8bf3 100644
--- a/file
+++ b/file
@@ -1 +1,2 @@
 Hello World
+What does that mean?

This change was first stashed rather than committed because I felt it was not complete. Applying it with Git apply only introduces it as an unstaged change, which works perfectly for this situation. Now I can keep banging at the code until I feel this actually deserves to be committed.

mathieu@ml recovery (master)$ echo "I don't know" >> file
mathieu@ml recovery (master)$ git commit -a -m "Conversation of staggering depth"
Created commit 65a4794: Conversation of staggering depth
 1 files changed, 2 insertions(+), 0 deletions(-)

Cleaning up the crud

Ok, so now I still have this weird looking recovery branch.

Now we want to get rid of this weird recovery branch

Since it’s now useless we can get rid of it.

mathieu@ml recovery (master)$ git branch -d recovery
error: The branch 'recovery' is not an ancestor of your current HEAD.
If you are sure you want to delete it, run 'git branch -D recovery'.

Aha! This time everything’s committed correctly, so I know I can delete it for real. Git is complaining because that commit was not included through its normal merge or rebase commands. So it warns me that I may be about to lose something. However I know I got everything through the diff I made and re-applied.

mathieu@ml recovery (master)$ git branch -D recovery
Deleted branch recovery.

Now that I’m aware that commits are reachable even if they’re not in a branch anymore, I wonder about my repo’s size.

Repository size with a few dangling commits: 224kb

mathieu@ml recovery (master)$ git gc
Counting objects: 22, done.
Compressing objects: 100% (14/14), done.
Writing objects: 100% (22/22), done.
Total 22 (delta 7), reused 0 (delta 0)
mathieu@ml recovery (master)$ git prune

Repo size after cleaning up the crud: 152kb

Fair enough. I would expect the unused commits to be gone by now, but strangely enough:

mathieu@ml recovery (master)$ git fsck --lost-found
dangling commit 49ed65cdea22443af3f1fd400754fe1517421b24
dangling commit 4b1bf4792cba929e88114379d7d5e86a2dc9990f
dangling commit 6cdf88318109dede7bd3c1a75be76c7255708ded
dangling commit 715a6b2cfe797383216d0f9b04fe8f50e90e779f
dangling commit f443703e5060d9f3b4d97504bda5f97e5a0b31e8

If anyone finds out what that’s all about, please let me know!

Maybe Git’s just refusing to do any work unless it’s going to actually save a considerable amount of space? I have no idea.
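Readers of the original post offered a likely answer in the comments: Git deliberately keeps unreferenced commits around for a grace period, and the reflog still points at them. If that is what is happening, expiring the reflog and pruning with no grace period should make them go away. The sketch below tries this on a throwaway repo. It permanently deletes the lost commit, so it is the opposite of a recovery tool; the `demo` identity and the `lost` and `still_there` variables are assumptions for the illustration.

```shell
set -eu
# Throwaway identity so the commits work on a machine without Git config (assumption)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo "Hello World" > file
git add file
git commit -q -m "Greetings"
git checkout -q -b cool_branch
echo "What up world?" > cool_file
git add cool_file
git commit -q -m "Now that was cool"
lost=$(git rev-parse HEAD)
git checkout -q master
git branch -q -D cool_branch

# Forget every reflog entry, so nothing references the lost commit anymore
git reflog expire --expire=now --all

# Repack and delete unreachable objects immediately, with no grace period
git gc --quiet --prune=now

# Check whether the object still exists in the repository
if git cat-file -e "$lost" 2>/dev/null; then
    still_there=yes
else
    still_there=no
fi
echo "lost commit still present: $still_there"
```

On a real repo it is usually better to leave the defaults alone: that grace period is precisely what makes the recoveries in this article possible.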

Conclusion

Once you know how to recover from bad mistakes, you’ll find that Git is not only a very powerful tool, but also a very forgiving one. As opposed to a motocross.

The following commands will help you figure your way out of most bad situations:

  • git show
  • git fsck --lost-found
  • git diff

And these ones will actually get you out of these bad situations:

  • git rebase
  • git cherry-pick
  • git merge
  • git apply

As I think I demonstrated, Git gives you the ability to recover from most bad mistakes. The fact that any single commit can be cherry-picked, checked out, rebased or merged makes it really easy to recover from hairy situations.

The only case where you might actually lose information is when something has not been committed or stashed yet, which I think is perfectly reasonable.

So if you take only one thing away from this article, let it be this. Git is much safer than a motocross.

Footnotes

[1] At the time I didn’t know that just having the SHA1 id was enough to save me.

[2] See how to configure your console in the same manner and also get auto-completion for Git here.

16 Responses to “The illustrated guide to recovering lost commits with Git”
  1. anonymous says:

    Also check out “git log -g”.

  2. I think the reason those dangling commits are still there at the end is because git doesn’t get rid of random commits. It keeps them around for 30 days AFAIK. More info:

  3. Nice article Mat!

    One tip I always give to avoid getting in situation like this is to always run git status between commands. It’s a nice habit to get into and the output is generally more helpful than the git error/warning message.

    I checked the docs about the dangling commits. I think if you ran git prune before gc, it would have removed them. git gc runs prune but keeps commits for 30 days. Then it packs the repository, and according to the prune doc, packed commits do not get removed. That’s my guess.

  4. The illustrated guide to recovering lost commits with Git…

    [...]A walkthrough to recovering from most kinds of bad situations you can get into with Git, the revolutionary version control system. With cameos by ninjas and motocrosses![...]…

  5. Tim Abell says:

    In “mathieu@ml recovery (master)$ git fsck –lost-found” the “–lost-found” has had the leading double dash mangled.

  6. webmat says:

    @Tim
    Good catch, I fixed the few occurrences where it was mangled. WordPress’ markup language is playing tricks on us :-)

    @Rory and Anon: I’ll be posting more about Git shortly. I’ll revisit both your suggestions while I’m at it.

  7. [...] is the same. Git has tonnes of wonderful magic toys, but it seems it also has a little trick for saving the bacon when you accidentally loose something. One to bookmarked for those special [...]

  8. [...] for all the world like there is no way to get that work back.  But all is not lost!  Between  this great post in Mathieu Martin’s blog and Eric available just when I needed him even though he’s over in [...]

  9. [...] remember, in case of an emergency, you can always refer to my illustrated guide to recovering lost commits with Git [...]

  10. mikong says:

    At first I thought you were just using Windows that’s why you used ‘md recovery’ command instead of ‘mkdir recovery’. But with the screenshots you’re using Mac OS X right?

  11. webmat says:

    Mikong: you’re right, I’m using ‘md’ :-)

    In fact I made an alias md=’mkdir’. But I am under OS X, yes. I’ll fix the typo…

  12. webmat says:

    Glad to be of help, Daniel :-)

  13. Felix Rabe says:

    Try this:

    git-fsck --lost-found
    gitk [--all]

    Right-click, “make branch” -> same as git-checkout -b something

  14. Felix Rabe says:

    Bah, i meant:

    gitk [--all]

  15. webmat says:

    @Felix
    It is indeed a very cool trick to see what commits lead to a specific lost commit. Thanks for the tip :-)

    Unfortunately, as a Mac heretic I don’t have a Tk implementation that allows me to interact much with Gitk, as opposed to Linux users. I can copy/paste the SHA ID, click the buttons / select the combo boxes, but I can’t right-click anything. Well, I can, but nothing happens ;-)

    Now if only an approximate equivalent to TextMate could appear on Linux…

    An example of Felix’ trick:

    Gitk displaying two lost commits in sequence

  16. Brian Estes says:

    Ah, you saved me with this post! Thank you so much.
