git cherry-pick: it might not be what you think

by Rob Friesel

Among the myriad of powerful features in Git, cherry-pick is probably the cause of and solution to most of my source control problems. I use it often, and probably way more than I should; I say the former because it seems like the obvious way to selectively and surgically move around “just those” changes, and the latter because the temptation to use cherry-pick is strong, and it’s easy to forget that there is some peril hidden in there as well.

cherry-pick: a short definition

If you don’t know what Git’s cherry-pick is, here is a definition from the man page’s description:

Given one or more existing commits, apply the change each one introduces, recording a new commit for each.

It does what it says on the tin: it “cherry picks” changes from one branch to another. In plain English, use it like this:

  1. Check out the target branch. (For example: master.)
  2. Run cherry-pick and give it the SHA of the commit that you want to bring into that branch.

(“Real” sample commands later on.)
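
As a minimal sketch of those two steps, assuming a throwaway repository in a temp directory (the branch name topic and the file names app.txt and fix.txt are made up for illustration):

```shell
set -e
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false

echo "base" > app.txt
git add app.txt && git commit -q -m "initial commit"

# Some work on a topic branch that we want on master right away.
git checkout -q -b topic
echo "patched" > fix.txt
git add fix.txt && git commit -q -m "patch defect"
fix_sha=$(git rev-parse HEAD)

# Step 1: check out the target branch.
git checkout -q master
# Step 2: cherry-pick the commit by its SHA.
git cherry-pick "$fix_sha" >/dev/null

git log --oneline master
```

After this runs, master should carry the "patch defect" change without any of the other topic branch history.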

As I mentioned, I often run into cases where cherry-pick seems like the perfect solution. For example, working on a topic branch, I identify an existing defect, patch it, and then immediately cherry-pick it across to develop or master[1] so it can get integrated before the rest of the topic branch.[2] Or (a variation on the previous) a subset of features/changes from a topic branch needs to get integrated before the rest of the topic branch.

Generally speaking, these are the kinds of places where cherry-pick seems to fit the bill nicely.

But…

But these are also the places where cherry-pick is likely to bite you. Why? Let’s look at that definition again:

Given one or more existing commits, apply the change each one introduces, recording a new commit for each.

That’s right. cherry-pick introduces a new commit. Same change, different commit. If you’re throwing away the rest of the topic branch, this isn’t a big deal — cherry-pick to your heart’s content! But if you’re planning on integrating that topic branch later, you could be in for some problems.
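
One way to see "same change, different commit" for yourself is to compare commit IDs against patch IDs. This is only a sketch under assumptions: the repository, branch, and file names are invented, and master gets one unrelated commit first so that the cherry-picked commit is certain to have a different parent. git patch-id hashes the diff itself, so the two commits should share a patch ID while their SHAs differ.

```shell
set -e
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false

echo "readme" > README
git add README && git commit -q -m "initial commit"

git checkout -q -b topic
echo "reducer" > reducer.txt
git add reducer.txt && git commit -q -m "add reducer"
orig_sha=$(git rev-parse HEAD)

# master moves on a little, then picks up the topic commit.
git checkout -q master
echo "unrelated" > notes.txt
git add notes.txt && git commit -q -m "unrelated work on master"
git cherry-pick "$orig_sha" >/dev/null
picked_sha=$(git rev-parse HEAD)

# patch-id prints "<patch id> <commit id>"; keep only the patch id.
orig_patch=$(git show "$orig_sha" | git patch-id | cut -d' ' -f1)
picked_patch=$(git show "$picked_sha" | git patch-id | cut -d' ' -f1)

echo "commit ids: $orig_sha / $picked_sha"
echo "patch ids:  $orig_patch / $picked_patch"
```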

Example

Let’s take a look at a scenario where cherry-pick might give us some heartache. Imagine a repository with a master branch that has one commit on it.

[Figure: cherry-pick illustrated: master with 1 commit]

A simple enough place to start.

Now let’s create a topic branch and start making our changes.

[Figure: cherry-pick illustrated: topic branch with 3 commits]

This should be pretty familiar. You’re plugging away at your development, making your commits as you go. Progress.

Now let’s assume that you need to get that first topic commit (b1c590d) onto master ahead of everything else. Naturally we think: cherry-pick!

And we get this:

[Figure: cherry-pick illustrated: cherry-pick that commit into master]

Hooray! Now master has the changes that we so urgently needed to incorporate. Now we can return to our topic branch and finish up those features. But when we perform our merge, we have conflicts:

[Figure: cherry-pick illustrated: after our merge... conflicts!]

Remember that thing about “recording a new commit”? Yeah. That.

Luckily, in our contrived example, this merge conflict is trivial to resolve. But your real-world encounter with this probably won’t be so easy.
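
The scenario can be approximated with a short script. Treat it as a sketch under assumptions rather than a replay of the original repository: the file name reducer.txt and its contents are invented, the later topic commits are assumed to keep editing the lines that the first topic commit introduced, and the commit dates are pinned so that the cherry-pick is a distinct commit even when the script runs within a single second.

```shell
set -e
# Throwaway repository; file names and contents are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false

# Pin the dates of the original commits.
export GIT_AUTHOR_DATE="2015-01-01T12:00:00" GIT_COMMITTER_DATE="2015-01-01T12:00:00"

echo "readme" > README
git add README && git commit -q -m "initial commit"

# Three commits on the topic branch, each touching the same lines.
git checkout -q -b topic
echo "reducer v1" > reducer.txt
git add reducer.txt && git commit -q -m "add reducer"
first_topic=$(git rev-parse HEAD)
echo "reducer v2 (verbose)" > reducer.txt
git commit -q -am "verbose reducer"
echo "reducer v3 (even more verbose)" > reducer.txt
git commit -q -am "even more verbosity"

# Bring only the first topic commit over to master, a day "later".
git checkout -q master
GIT_COMMITTER_DATE="2015-01-02T12:00:00" git cherry-pick "$first_topic" >/dev/null
picked=$(git rev-parse HEAD)

# Now try to merge the whole topic branch.
if git merge --no-edit topic >/dev/null 2>&1; then
  merge_result="clean"
else
  merge_result="conflict"
fi
echo "original $first_topic, cherry-picked $picked, merge: $merge_result"
git status --short
```

Master and topic each end up with their own version of reducer.txt relative to the shared initial commit, so Git has no way to choose between them. If the later topic commits had touched different files instead, the merge would likely have gone through cleanly.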

The Moral of This Story

When first learning about cherry-pick, don’t lose yourself in the excitement of being able to copy those arbitrary commits. Remember that last bit in the description about the changes being recorded as a new commit. If you don’t read that part closely, it’s easy to believe that Git is going to recognize these commits as the same thing (in our example, b1c590d and 0ef8523).[3] It doesn’t, but that’s because cases like the above example aren’t what cherry-pick is for. Again: cherry-pick is really “for” the times when you’re going to throw away the topic branch and only keep a handful of the “salvageable” changes.

This isn’t to say that you can’t use cherry-pick for cases like the one in our example. Just be aware that you (and not Git) are on the hook for resolving those conflicts. They’re not guaranteed to happen, but there’s a strong possibility that they will. Instead of cherry-pick, consider other strategies. For example, if you identify a defect on the topic branch, instead of fixing it there and cherry-picking it back to develop, consider switching to develop or master, performing the fix on a separate topic branch, and then merging those changes to the places where you need them.
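
Here is one way that alternative might look, as a sketch with invented names (the hotfix branch and all file names are hypothetical). The fix is committed once, on a branch cut from master, and then merged into both master and the topic branch, so both histories share the very same commit.

```shell
set -e
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false

echo "readme" > README
git add README && git commit -q -m "initial commit"

git checkout -q -b topic
echo "feature work" > feature.txt
git add feature.txt && git commit -q -m "start feature"

# Defect spotted: make the fix on its own branch, cut from master.
git checkout -q -b hotfix master
echo "defect patched" > fix.txt
git add fix.txt && git commit -q -m "patch defect"

# Merge the fix everywhere it is needed.
git checkout -q master
git merge -q --no-edit hotfix
git checkout -q topic
git merge -q --no-edit hotfix

# Finish the feature and integrate the topic branch as usual.
echo "more feature work" >> feature.txt
git commit -q -am "finish feature"
git checkout -q master
git merge -q --no-edit topic

git log --oneline --graph master
```

Because the fix exists as a single commit reachable from both branches, the final merge has nothing to reconcile for it, and "patch defect" shows up in the log only once.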

As I mentioned, cherry-pick is a powerful tool, but it has that one little quirk that can trip you up if you aren’t paying close enough attention. Use it, but use it well and wisely.

  1. And here I’m taking for granted that you’re using “Git Flow” or something like it. Be warned: I’ll probably lapse into that assumption a couple of times.
  2. And/but: “You really shouldn’t solve that problem that way…”
  3. I freely admit it: for the first year or so of using Git, I had come to that erroneous conclusion and made that mistake often. It took me a while to figure out why I kept (a) seeing the same changes twice in the log and (b) getting so many merge conflicts.

About Rob Friesel

Software engineer by day, science fiction writer by night. Author of The PhantomJS Cookbook and a short story in Please Do Not Remove.

17 Responses to git cherry-pick: it might not be what you think

Ivan Urwin says:

Your diagrams are superb. What did you use to create them? I use Dot from Graphviz to draw some similar directed graphs, but they are nowhere near as smooth and sleek looking as yours.

dodo says:

Thanks for the post.
There is a typo in your graph:
“git cherry-pick 0ef8523”
should be
“git cherry-pick b1c590d”

hbt says:

I tested this and the changes were merged recursively — which makes sense because although they are different commits, it is the same change.

Could you please clarify when/why we would end up with a conflict?

Thank you!

found_drama says:

@hbt – What version of git are you using? I suppose the behavior may have changed (“been fixed”) in later versions. It may also have to do with whether the merge is a fast-forward merge or not. The main thing in my example has to do with the intermediary commits.

Eric says:

This was a VERY helpful explanation, thanks much.

It looks like (a) cherry-pick is great if we’re deleting the feature branch, and (b) cherry-pick not so great if we’re keeping the feature branch to merge back in later

(Especially if it’s a big long feature branch)

So the question is: Do you think you could recommend the RIGHT way to do things in situation B? (possibly using more of your very clear diagrams)

Is it something along the lines of checking out the earlier “add reducer” commit, making a 2nd tiny feature branch called “hotfix” or something, and merging that onto master?

Or does that end up with the exact same problems as cherry-pick?

found_drama says:

@Eric– I think it really depends on the situation. By and large, my opinion is that cherry-pick is best reserved for two types of scenarios: (1) where you want a specific commit or two off of a feature branch that you’re otherwise abandoning; or (2) when you need to bring over a change to master quickly and are prepared to deal with a merge conflict later should one arise.

Most of the time, my advice would be to stick with vanilla merges and rebases.

Dan Moulding says:

It gets much worse than just this. Conflicts are a minor problem compared to the breakage that cherry-picks can cause.

Imagine that someone later realizes that this commit that was cherry-picked onto master was *wrong*: the changes introduce a bad bug. They revert the commit on master. And let’s say the person doesn’t know that this commit was cherry-picked from the topic branch. So they don’t revert the change on the topic branch. The topic branch still has the bug introduced by the bad commit. When that topic branch gets merged, there will be no conflict with the bad commit. It will get cleanly merged. BUT the bad changes that were reverted on master will silently be added back to master! This kind of thing can be very difficult to detect in a busy project.

Derrick says:

Isn’t a way around this problem to just commit directly to master, and then merge master down into the topic branch?

Roger Qiu says:

Say I cherry-pick a commit that changes file A. Let’s say it adds “blah” to the end of the file. If file A already had changes from other commits in the same topic branch, prior to the commit that I’m cherry-picking, how does git manage the cherry-pick? Does it transfer all changes to file A up to the point in time of the commit I’m cherry-picking, or does it forget those prior changes? If it does forget, how does it deal with the cherry-picked commit assuming a different state of file A for the delta?

Rob Friesel says:

@Roger – I’m not really sure, to be honest. I’m not very familiar with the internals of how specifically cherry-pick determines what is/not part of the change. I’m fairly confident though that when cherry-picking, it does “transfer all changes” as you put it, but only inasmuch as it’s diffing the specific target commit against the current HEAD state. In other words, it isn’t a merge where you’re bringing “this commit” and its trail of commits, but rather just the two specific states (i.e., what’s in the tree as it manifests on the file system).

HK says:

Thank you for the useful post!
By the way, how about using the --ff option for the cherry-pick command?
======================================
--ff
If the current HEAD is the same as the parent of the cherry-pick’ed commit, then a fast forward to this commit will be performed.
======================================

Dan says:

Wouldn’t the topic branch be merged *after* the cherry-pick? That order of commits on master would be as follows (top = newest):

7446735 even more verbosity
5ba28b7 verbose reducer
b1c590d add reducer <-- merged from topic
0ef8523 add reducer <-- cherry-pick
689f5c7 initial commit

(Your diagram: cherry-picker-illustrated-4.png)

Ben says:

I saw a lot of good comments here, but I think the concerns of a lot of you could be solved by a few different options.

Firstly, use the -x option when picking to add a line to the commit message that tells others it was a picked commit, and also provide the SHA-1 of the original. And if possible, the --ff option is good too.

Second, when you need to merge a “picked” branch, rebase first. Rebasing will compare the snapshots or deltas, and is smart enough to not duplicate the commits. Also, since the parent of the branch is pointed at the head of the current branch, a fast-forward takes place and no extra merge commit is applied. You end up with a chronology based off of when the commits were applied (no timestamp data gets lost!!!), and without a duplicate commit. YAY!

At first glance, you might notice an issue with chronology. The picked commit might be out of order. This is due to git showing the author date by default. This is the date the commit was created, not applied. You can see the commit date as well by passing --pretty=fuller to git log.
