Pulling and merging with git

When you have a multi-user workflow with git, you usually have a central repository that you clone, and then you periodically push your changes back to it and pull down the changes others have made. On the surface, this looks like a Subversion workflow with the added ability to commit locally; in practice, however, it is very different. Each clone of the repository is potentially its own branch of changes. You and Mike (a fictional über-coder) may both start from the same version in the central repository, and then you make eight commits and he makes 29. What, then, does git do?
In essence, every commit you make in git can have multiple `children'. The history that grows from a given commit is, then, a tree. A lot of the time, that tree degenerates to a list: if I commit six times in a row, each new commit is a child of the previous one, and the history is linear. However, in the case we mentioned above, the last commit that you and Mike synchronized on from the central repository has two children -- one is your first commit, and one is Mike's first commit. From there, each of you commits linearly, in parallel with the other. How you merge these two lines is the important part.
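If you want to see that shape for yourself, here is a minimal sketch that builds it in a throwaway repository. It is only an illustration: two local branches (the made-up names `you` and `mike`) stand in for the two clones, and the file names and commit identity are invented for the example.

```shell
#!/bin/sh
# Sketch: one sync commit with two children. Two local branches
# (hypothetical names "you" and "mike") stand in for two clones.
set -e

# Identity for the throwaway commits (illustrative values only).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q

# The last commit both sides synchronized on.
echo "base" > shared.txt
git add shared.txt
git commit -q -m "sync point"
sync=$(git rev-parse HEAD)

# Your line of commits.
git checkout -q -b you
echo "yours" > yours.txt
git add yours.txt
git commit -q -m "your first commit"

# Mike's line of commits, started from the same sync point.
git checkout -q -b mike "$sync"
echo "mikes" > mikes.txt
git add mikes.txt
git commit -q -m "Mike's first commit"

# Draw the history: the sync commit sits below two separate lines.
git --no-pager log --graph --oneline --all

# Count the commits that name the sync commit as their parent.
children=$(git rev-list --all --parents | grep -c " $sync")
echo "sync commit has $children children"
```

The graph drawn by `git log --graph` is the tree described above: one commit at the bottom, with your line and Mike's line growing out of it.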
At some point, Mike pushes his changes. He does:

# git push
Counting objects: 104, done.
Compressing objects: 100% (73/73), done.
Writing objects: 100% (73/73), 7.13 KiB, done.
Total 73 (delta 58), reused 0 (delta 0)
To git@github.com:magic/mikeFreakingRox0rz.git
  bc06779..d79eb17 master -> master

Because Mike pushed his repository to the server before anyone else, what ends up on the server is his version of events: a linear list of commits starting from the last synchronization point. Until anyone else wants to work with this, that's fine. Then, you try to push:

# git push
To git@github.com:magic/mikeFreakingRox0rz.git
 ! [rejected]    master -> master (non-fast forward)
error: failed to push some refs to 'git@github.com:magic/mikeFreakingRox0rz.git'

Uh-oh. What happened? Well, a push is basically copying your repository back to the server. But if you were to do that now, you would overwrite all of Mike's changes. Since your local set of commits and the remote set of commits have now diverged, git doesn't let you push. The key difference between what you did and what Mike did is that when Mike pushed, the remote server had a subset of his changes, whereas when you pushed the remote server had changes that you have never seen. All in all, this is similar to when you try to commit your changes to an SVN repository when someone else has made commits before you: in order to avoid incoherency in the differences that are stored, you can't do this.
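The whole situation can be replayed locally if you'd like to poke at it. The following is a rough sketch, not the setup from the transcripts above: a bare repository in a temporary directory plays the server, the directory names `seed`, `central.git`, `mike` and `you` are made up, and so are the file contents and commit identity.

```shell
#!/bin/sh
# Sketch: Mike pushes first, so your push is rejected.
# All directory and file names here are hypothetical.
set -e

export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
cd "$work"

# Create the sync point, then turn it into a bare "central" repository.
git init -q seed
echo "v1" > seed/prove.rb
git -C seed add prove.rb
git -C seed commit -q -m "sync point"
branch=$(git -C seed symbolic-ref --short HEAD)   # master or main, whichever git picked
git clone -q --bare seed central.git

# You and Mike both clone the same version.
git clone -q central.git mike
git clone -q central.git you

# Mike commits and pushes first: the server only has a subset of his
# history, so the push goes through.
echo "mike" > mike/mike.txt
git -C mike add mike.txt
git -C mike commit -q -m "Mike's commit"
git -C mike push -q origin "$branch"

# You commit and push second: the server now has a commit you have
# never seen, so git refuses rather than overwrite it.
echo "you" > you/you.txt
git -C you add you.txt
git -C you commit -q -m "Your commit"
if git -C you push -q origin "$branch"; then
    push_result=accepted
else
    push_result=rejected
fi
echo "your push was $push_result"
```

The exact wording of the rejection message varies between git versions, so don't expect it to match the transcript above character for character.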
In SVN, the next step is to update. This updates the changes, and, if you are fortunate, there are no conflicts with changes you made and you are good to go. In git, the next step is to pull:

# git pull
remote: Counting objects: 20, done.
remote: Compressing objects: 100% (11/11), done.
remote: Total 11 (delta 8), reused 0 (delta 0)
Unpacking objects: 100% (11/11), done.
 From git@github.com:magic/mikeFreakingRox0rz.git
  38620e5..058cf00 master   -> origin/master
Merge made by recursive. 
 mike/is/awesome/prove.rb |  4 ++-- 
 1 files changed, 2 insertions(+), 2 deletions(-)

Usually you just fire this off, do your push again, and then forget about it. But it's important to understand what is going on in this step. When you pull from the remote repository and you have commits that are not in that remote repository, the differences need to be reconciled. In terms of what the commits look like, the original sync commit has two lines of children. Those two lines must now be joined by a single commit, which represents a coherent continuation point for everyone. In essence, the two branches that have diverged need to converge again before you push back to the server. A branch in git has a single head commit -- essentially the commit at the end of your tree -- so the two lines have to meet in one commit before the branch can move forward.
Anytime you pull from a repository that has commits you don't have, those commits are added to your local repository. If you also have local commits that the remote doesn't have, a single merge commit is then created that merges the last commit from the remote repository with the last commit from your local repository. In the above transcript from git, that happened automatically:

Merge made by recursive.

This means git went ahead and merged the two and didn't find any conflicts it couldn't resolve. In this case, git also automatically created the merge commit for you. If you do a git log afterwards, you will see it:

# git log
commit eb8442977176a95568e27b40c169e2d97ab4e8f7
Merge: 0f6ef2d... 058cf00... 
Author: Antonio Salazar Cardozo  
Date:  Wed Apr 22 16:55:41 2009 -0400

   Merge branch 'master' of git@github.com:magic/mikeFreakingRox0rz.git

At this point, your repository is ready to push back, and you can do a git push as above and this time it will work.
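Here is a small sketch of that happy path, replayed in a temporary directory. It spells the pull out as a `git fetch` followed by a `git merge`, which is what a merging pull amounts to. One caveat: recent versions of git may ask you to choose between merging and rebasing when a plain `git pull` finds diverged history, and `git pull --no-rebase` asks for the merging behaviour described here. The directory names, file names and commit identity below are all made up for the example.

```shell
#!/bin/sh
# Sketch: fetch + merge creates a merge commit with two parents,
# after which the push is accepted. All names are hypothetical.
set -e

export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
cd "$work"

# Sync point, a bare "central" repository, and two clones.
git init -q seed
echo "v1" > seed/prove.rb
git -C seed add prove.rb
git -C seed commit -q -m "sync point"
branch=$(git -C seed symbolic-ref --short HEAD)
git clone -q --bare seed central.git
git clone -q central.git mike
git clone -q central.git you

# Mike commits and pushes; you commit locally.
echo "mike" > mike/mike.txt
git -C mike add mike.txt
git -C mike commit -q -m "Mike's commit"
git -C mike push -q origin "$branch"

echo "you" > you/you.txt
git -C you add you.txt
git -C you commit -q -m "Your commit"

# First half of a pull: bring Mike's commits into your repository.
git -C you fetch -q origin

# Second half: merge them into your branch. The two edits touch
# different files, so git creates the merge commit by itself.
git -C you merge -q --no-edit "origin/$branch"

# The new head commit has two parents: your commit and Mike's.
parents=$(git -C you log -1 --format=%P HEAD | wc -w | tr -d ' ')
echo "merge commit has $parents parents"
git --no-pager -C you log --graph --oneline

# Your history now contains everything on the server, so the push works.
git -C you push -q origin "$branch"
```

The `--no-edit` flag only keeps git from opening an editor for the merge message, which is handy in a script; interactively you can leave it off.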
The above is the ideal case. There are two important things that can go wrong when you pull. A pull in git is a combination of a fetch command (which downloads the remote repository's current commits into a local copy of that repository) and a merge command (which merges that local copy into the branch you are working on). Generally, the fetch works (unless you have a connection issue), and the merge occasionally fails. For example:

# git pull
remote: Counting objects: 48, done.
remote: Compressing objects: 100% (26/26), done.
remote: Total 26 (delta 20), reused 0 (delta 0)
Unpacking objects: 100% (26/26), done.
 From git@github.com:magic/mikeFreakingRox0rz.git
  38620e5..eb84429 master   -> origin/master
Updating 38620e5..eb84429
mike/is/awesome/prove.rb: needs update
error: Entry 'mike/is/awesome/prove.rb' not uptodate. Cannot merge.

This happens when you have made changes locally that haven't yet been committed. Git merges committed files happily, but it doesn't even try to merge changes that are not yet committed into the repository. The solution here is to just commit:

# git commit -m "He is awesomer." mike/is/awesome/prove.rb

Once that's done, you can do one of two things. You can run a pull again, but that will redo the fetch, and you've already fetched the remote repository. The alternative is faster, which is to just redo the merge part of the pull:

# git merge origin/master

This merges from the master branch of the origin remote, which is the remote repository and branch we've been working with. At this point, the merge will usually succeed. However, it may fail yet again, with an error that looks more like:

Auto-merged mike/is/awesome/prove.rb 
CONFLICT (content): Merge conflict in mike/is/awesome/prove.rb

This may happen for multiple files. At this point, you need to resolve the merge conflict. The repository is left in the middle of a merge: the conflicted files contain conflict markers, and nothing has been committed yet. The only way to really move forward is to resolve the conflict and then commit. The commit will be the merge commit we saw in the log above, and will include any automatic merging that git has done successfully. The easiest way to resolve the conflict is:

# git mergetool
merge tool candidates: kdiff3 tkdiff xxdiff meld gvimdiff opendiff emerge vimdiff 
Merging the files: mike/is/awesome/prove.rb 

Normal merge conflict for 'mike/is/awesome/prove.rb': 
  {local}: modified
  {remote}: modified 
Hit return to start merge resolution tool (opendiff):

git mergetool essentially walks you through each file with conflicts and offers you a program to resolve the conflict with. It detects which programs are available and makes an intelligent selection based on that. On a Mac, it tends to be opendiff. Once the file is open, you can resolve the conflict and then save it. mergetool may prompt you to verify that you've resolved the conflict (that usually only happens if you leave everything unchanged). When you've gone through all the files, you are returned to the command prompt. Then run git status, double-check that everything looks good, add anything you want to commit if you haven't done so yet, and commit. With that, your merge commit is done and you can push again.
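If you'd rather not use a merge tool, the same resolution can be done by hand: edit the file, add it, commit. The sketch below walks through that in a temporary directory, as an illustration only. The directory names, the contents of `prove.rb` and the "resolution" that keeps both lines are invented for the example; in real life you decide what the merged file should say.

```shell
#!/bin/sh
# Sketch: a conflicting merge resolved by hand instead of with mergetool.
# All names and file contents are hypothetical.
set -e

export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
cd "$work"

git init -q seed
echo 'puts "Mike is awesome"' > seed/prove.rb
git -C seed add prove.rb
git -C seed commit -q -m "sync point"
branch=$(git -C seed symbolic-ref --short HEAD)
git clone -q --bare seed central.git
git clone -q central.git mike
git clone -q central.git you

# Both of you change the same line.
echo 'puts "Mike is very awesome"' > mike/prove.rb
git -C mike commit -q -am "More awesome"
git -C mike push -q origin "$branch"

echo 'puts "He is awesomer."' > you/prove.rb
git -C you commit -q -am "He is awesomer."

# Fetch, then merge: git cannot reconcile the two edits by itself.
git -C you fetch -q origin
if git -C you merge -q --no-edit "origin/$branch" >/dev/null 2>&1; then
    merge_result=clean
else
    merge_result=conflict
fi
echo "merge result: $merge_result"

# The file now contains conflict markers, and the merge is unfinished.
markers=$(grep -c '^<<<<<<<' you/prove.rb)
git -C you status --short

# Resolve by hand (here: keep both lines), then add and commit.
printf 'puts "Mike is very awesome"\nputs "He is awesomer."\n' > you/prove.rb
git -C you add prove.rb
git -C you commit -q -m "Merge Mike's changes"

# No markers are left, and the head commit is a merge with two parents.
leftover=$(grep -c '^<<<<<<<' you/prove.rb || true)
parents=$(git -C you log -1 --format=%P HEAD | wc -w | tr -d ' ')
git -C you push -q origin "$branch"
```

Adding the file is what tells git the conflict in it is resolved; git does not check the contents for you, so make sure the markers are really gone before you commit.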
One last thing worth mentioning is an alternative to committing your changes when you get the ``not uptodate'' error. That error means that you have uncommitted changes in your working tree on a file that git is trying to merge. You can commit them, but you may not be done with those changes yet. In these cases, you can use git stash. If you just run git stash, it will tuck your uncommitted changes to tracked files away in the stash, and leave your working tree clean for a merge. Then, when you've done the merge, you can do git stash apply, which will apply the latest stashed changes to your working tree. This is a good way to pause your work, merge, then continue your work.
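As a rough sketch of that stash, merge, apply sequence, the script below sets up a half-finished edit in a temporary clone and carries it across a merge. The directory names and the five-line `prove.rb` are made up, and the two edits are deliberately placed on lines far apart so that the stash applies cleanly; if your unfinished change overlapped Mike's, `git stash apply` could report a conflict of its own.

```shell
#!/bin/sh
# Sketch: stash unfinished work, merge, then bring the work back.
# All names and file contents are hypothetical.
set -e

export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
cd "$work"

git init -q seed
printf 'line1\nline2\nline3\nline4\nline5\n' > seed/prove.rb
git -C seed add prove.rb
git -C seed commit -q -m "sync point"
branch=$(git -C seed symbolic-ref --short HEAD)
git clone -q --bare seed central.git
git clone -q central.git mike
git clone -q central.git you

# Mike changes the first line and pushes.
printf 'line1 (mike)\nline2\nline3\nline4\nline5\n' > mike/prove.rb
git -C mike commit -q -am "Mike edits line 1"
git -C mike push -q origin "$branch"

# You have an unfinished, uncommitted change to the last line.
printf 'line1\nline2\nline3\nline4\nline5 (you, unfinished)\n' > you/prove.rb

# The merge refuses to touch a file with uncommitted changes.
git -C you fetch -q origin
if git -C you merge -q --no-edit "origin/$branch" >/dev/null 2>&1; then
    first_try=merged
else
    first_try=refused
fi
echo "merge with uncommitted changes: $first_try"

# Stash the work, merge with a clean working tree, then re-apply it.
git -C you stash >/dev/null
git -C you merge -q --no-edit "origin/$branch"
git -C you stash apply >/dev/null

# The file now has Mike's change and your unfinished change.
first=$(sed -n 1p you/prove.rb)
last=$(sed -n 5p you/prove.rb)
cat you/prove.rb
```

Note that `git stash apply` leaves the entry in the stash list; once you are happy with the result, `git stash drop` removes it.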
