Wednesday, August 15, 2012

Automating git

This is a long-overdue follow-up to my previous post about using git to fix Moodle bugs. Thanks to Andrew Nichols of LUNS for nudging me into writing this.

Git has an efficient command-line interface, but even so, there are some sequences of commands that you find yourself typing repeatedly. Git provides a mechanism called aliases which can be used to reduce this repetitive typing. This post explains how I use it in my Moodle development.

Basic usage

Let us start with the simplest possible example. I get bored typing git cherry-pick in full all the time. The solution is to edit the file .gitconfig in my home directory, and add

[alias]
        cp     = cherry-pick

Then git cp … is equivalent to git cherry-pick …. That saves 9 characters every time I have to copy a bug fix to a stable branch.
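If you would rather not edit the file by hand, the same alias can be set with git config. This is only a sketch: it writes to a scratch file (the cfg variable is purely illustrative) so that your real .gitconfig is left alone. In real use you would pass --global instead of --file.

```shell
# Scratch file standing in for ~/.gitconfig (illustrative only).
cfg=$(mktemp)

# Equivalent to adding "cp = cherry-pick" under [alias].
git config --file "$cfg" alias.cp cherry-pick

# Read the alias back to check what git stored.
stored=$(git config --file "$cfg" --get alias.cp)
echo "cp = $stored"
```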

Simple aliases like this can also be used to supply options. Another one I have set up is

        ff     = merge --ff-only

I use that when I need to update one of my local branches to match a remote branch. Suppose I think I am on the master branch, and I want to update that to the latest moodle/master. Normally one would just type git merge moodle/master and it would look like this:

timslaptop:moodle_head tim$ git merge moodle/master
Updating ddd84e9..b658200
Fast-forward

Suppose, however, that I had made a mistake, and I was actually on some other branch. Then git would try to do a merge between master and that branch, which is not what I want. The --ff-only option tells git not to do that. Instead it will stop with an error if it can't do a fast forward. So, to prevent mistakes, I normally use that option, and I do it frequently enough I found it worthwhile to create the alias.
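To see the refusal for yourself without touching a real check-out, something like the following should work. It is a sketch under assumptions: it builds a throwaway repository, the branch name other and the commit helper are made up for the demonstration, and the local master stands in for moodle/master.

```shell
# Throwaway repository; every name here is illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master

# Helper: make an empty commit without relying on a configured identity.
commit() {
    git -c user.name=Example -c user.email=example@example.com \
        commit -q --allow-empty -m "$1"
}

commit "base"
git branch other              # a second branch starting from the same commit
commit "new work on master"   # master moves ahead
git checkout -q other
commit "work on other"        # other now diverges from master

# On the wrong branch a fast-forward is impossible, so git stops with an error
# instead of creating a merge commit.
if git merge --ff-only master; then
    result=merged
else
    result=refused
fi
echo "$result"
```

Without --ff-only, the same merge would have gone ahead and created a merge commit on other.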

Getting more ambitious

Sometimes the repeated operation you want to automate is a sequence of git commands. For example, when a new weekly build of Moodle comes out, I need to type a sequence of commands like this:

git checkout master
git fetch moodle
git merge --ff-only moodle/master
git push origin master

That updates my local copy of the master branch with the latest from moodle.org and then copies that to my area on github. To automate this sort of thing, you have to start using the power of Unix shell scripting. (If you are on Windows, don't worry, because you typically get the bash shell when you install git.)

Fortunately, you don't need to know much scripting, and you can probably just copy these examples blindly. The first thing to know is that you can put two commands on one line if you separate them using a semi-colon (just like in PHP). The previous sequence of commands could be typed on one line as

git checkout master; git fetch moodle; git merge --ff-only moodle/master; git push origin master

(Note that these lines of code are getting quite long, and will probably line-wrap. Each one should, however, be typed as a single line of code.)

Doing it this way turns out to be a bad idea. What happens if one of the commands gives an error? Well, the system will just move on to the next command, even though the error from the previous command probably left things in an unknown state. Dangerous! Fortunately there is a better way. If you use && instead of ; then any error will cause everything to stop immediately. If you are familiar with PHP, then just imagine that every command is a function that returns true or false to say whether it succeeded or not. That is not so far from the truth. So, the right way to join the commands together looks like this:

git checkout master && git fetch moodle && git merge --ff-only moodle/master && git push origin master
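You can see the difference without involving git at all. In this small sketch the standard command false stands in for a git command that fails, and echo stands in for the step after it.

```shell
# "false" stands in for a git command that fails.
with_semicolon=$(sh -c 'false; echo "next step ran"')
with_and=$(sh -c 'false && echo "next step ran"' || true)

echo "with ;  -> '$with_semicolon'"
echo "with && -> '$with_and'"
```

With ; the second step runs regardless. With && it is skipped, so nothing is printed.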

Now we know what we want to automate, we need to teach this to git. It is a bit tricky because we don't just want to convert one single git command into another single git command. Instead we want to convert one git command into a sequence of shell commands. Fortunately this is supported. You just need to know the right syntax:

        updatemaster = !sh -c 'git checkout master && git fetch moodle && git merge --ff-only moodle/master && git push origin master' -

Now I just have to type git updatemaster to run that sequence of commands.

Parameterising your aliases

That is all very well for master, but what about all the stable branches? Do I have to create lots of separate aliases like update23, update22, update21, …? Of course not. Git was created by and for computer programmers. Shell scripts can take parameters, and the solution is an alias that looks like

        update = !sh -c 'git checkout MOODLE_$1_STABLE && git fetch moodle && git merge --ff-only moodle/MOODLE_$1_STABLE && git push origin MOODLE_$1_STABLE' -

With that alias, git update 23 will update my MOODLE_23_STABLE branch, git update 22 will update my MOODLE_22_STABLE, and so on.
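The trailing - in these aliases matters. The first word after the quoted script becomes $0, and only the words after that become $1, $2 and so on. Git appends whatever you type after the alias name to the end of the command, so the - soaks up $0 and your argument lands in $1. A sketch using echo in place of the git commands:

```shell
# The word after the quoted script ("-") becomes $0; the rest become $1, $2, ...
branch=$(sh -c 'echo MOODLE_$1_STABLE' - 23)

# Without the placeholder, 23 is swallowed as $0 and $1 is empty.
without_dash=$(sh -c 'echo "[$1]"' 23)

echo "$branch"
echo "$without_dash"
```

Writing $1_STABLE works because a positional parameter like $1 is a single digit. A named variable would need braces, as in ${name}_STABLE.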

You can use any number of parameters. If you remember my previous blog post, typically I will create the bug fix on a branch with a name like MDL-12345 that starts from master, and then I will want to copy that to a branch called MDL-12345_23 branching off MOODLE_23_STABLE. With the following alias, I just have to type git cpfix MDL-12345 23 in my Moodle 2.3 stable check-out:

        cpfix = !sh -c 'git fetch -p origin && git checkout -b $1_$2 MOODLE_$2_STABLE && git cherry-pick origin/master..origin/$1 && git push origin $1_$2' -
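Before trusting an alias like this, it can help to check how the parameters expand. The sketch below uses the same command list as cpfix, but echoes each step instead of running it, so nothing is fetched, created or pushed.

```shell
# Same command list as cpfix, with each step echoed rather than executed.
expanded=$(sh -c '
    echo "git fetch -p origin" &&
    echo "git checkout -b $1_$2 MOODLE_$2_STABLE" &&
    echo "git cherry-pick origin/master..origin/$1" &&
    echo "git push origin $1_$2"
' - MDL-12345 23)

echo "$expanded"
```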

One final example that belongs in this section:

        killbranch = !sh -c 'git branch -d $1 && git push origin :$1' -

That deletes a branch both in the local repository and also from my area on github. That is useful once one of my bug fixes has been integrated. I then no longer need the MDL-12345 branch and can eliminate it with git killbranch MDL-12345.
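If you want to try this out safely first, the following sketch runs the body of killbranch against a throwaway repository. The bare repository in a temporary directory stands in for my area on github, and all the paths are illustrative.

```shell
# Throwaway "remote" and working repository; all paths and names are illustrative.
remote=$(mktemp -d)
work=$(mktemp -d)
git init -q --bare "$remote"
cd "$work" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "base"
git remote add origin "$remote"

# Publish a fix branch that is already contained in master.
git branch MDL-12345
git push -q origin master MDL-12345

# The body of killbranch, with the branch name passed as $1.
sh -c 'git branch -d $1 && git push -q origin :$1' - MDL-12345

# Both of these should now be empty.
remaining_local=$(git branch --list MDL-12345)
remaining_remote=$(git ls-remote --heads origin MDL-12345)
echo "local: '$remaining_local' remote: '$remaining_remote'"
```

Note that git branch -d refuses to delete a branch that has not been merged, and because of the && nothing is then deleted from the remote either.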

To boldly go …

Of course, all this automation comes with some risk. If you are going to screw up, automation lets you screw up more things quicker. I feel obliged to emphasise that at this point. If you are going to shoot yourself in the foot, a machine gun gives the most spectacular results, and we are about to build one, at least metaphorically.

We just saw the killbranch command that can be used to clean up branches that have been integrated. What happens if I submitted lots of branches for integration last week? I have to delete lots of branches. Can that be automated? Using git I can at least get a list of those branches:

timslaptop:moodle_head tim$ git checkout master
Already on 'master'
timslaptop:moodle_head tim$ git branch --merged
  MDL-12345
  MDL-23456
* master

Those are the branches that are included in master, and so are presumably ones that have already been integrated. It is a bit irritating that the master branch itself is included in the list, but I can get rid of it using the standard command grep:

timslaptop:moodle_head tim$ git branch --merged | grep -v master
  MDL-12345
  MDL-23456
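One thing to be aware of: grep -v master drops every line that contains the text master anywhere, not just the master branch itself. The sketch below uses simulated output (the branch MDL-34567-master-fix is made up) to compare that with a stricter pattern that only matches a whole line naming master.

```shell
# Simulated "git branch --merged" output; MDL-34567-master-fix is a made-up name.
branches='  MDL-12345
  MDL-34567-master-fix
* master'

# Loose: removes any line containing "master".
loose=$(printf '%s\n' "$branches" | grep -v master)

# Strict: -x only removes lines that are exactly "* master" or "  master".
strict=$(printf '%s\n' "$branches" | grep -v -x -E '[* ] master')

echo "loose:"
echo "$loose"
echo "strict:"
echo "$strict"
```

With my branch naming this does not matter, which is why I use the simpler form.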

I have a list of branches to delete, but how can I actually delete them? I need to execute a command for each of those branch names. Once again, we find that shell scripting was developed by hackers, for hackers. The command xargs does exactly that: with the -I option, it executes a given command once for each line of input it receives. Feeding in the list of branches, and getting it to execute the killbranch command looks like this:

git branch --merged | grep -v master | xargs -I "{}" git killbranch "{}"

Now to make that into an alias:

        killmerged = !sh -c 'git checkout $1 && git branch --merged | grep -v $1 | xargs -I "{}" git killbranch "{}"' -

With that in place, git killmerged master will delete all my branches that have been integrated into master. Note that you can use one alias (killbranch) inside another (killmerged). That makes it easier to build more complex aliases.
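Given what is at stake, a dry run is a sensible first step. This sketch feeds a simulated branch list through the same xargs call, with echo placed in front so that the commands are printed rather than run.

```shell
# Simulated list of merged branches, as left after the grep step.
merged='  MDL-12345
  MDL-23456'

# "echo" in front of the command turns the real thing into a dry run.
planned=$(printf '%s\n' "$merged" | xargs -I "{}" echo git killbranch "{}")
echo "$planned"
```

Once the printed commands look right, removing the echo gives the real pipeline.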

Once I have deleted all the things that were integrated, I am left with the branches I have in progress that have not been integrated yet. Those all need to be rebased, and that can be automated too:

        updatefix = !sh -c 'git checkout $1 && git rebase $2 && git checkout $2 && git push origin -f $1' -
        updatefixes = !sh -c 'git checkout $1 && git branch | grep -v $1 | xargs -I "{}" git updatefix "{}" $1' -

With those in place, I just type git updatefixes master, and that will rebase all my branches, both locally and on github. Use at your own risk!
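Part of that risk is that && only stops the commands for one branch. As far as I can tell, xargs itself carries on to the next line of input when a command fails, unless the command exits with status 255. The sketch below simulates this without git: each input line stands in for a branch, and b plays the part of a rebase that hits a conflict.

```shell
# Each input line stands in for a branch; "b" simulates a rebase that fails.
step='[ "$1" != b ] && echo "rebased $1"'
continued=$(printf 'a\nb\nc\n' | xargs -I "{}" sh -c "$step" - "{}" || true)

# Exiting with status 255 tells xargs to stop processing further input.
halting='[ "$1" != b ] && echo "rebased $1" || exit 255'
halted=$(printf 'a\nb\nc\n' | xargs -I "{}" sh -c "$halting" - "{}" 2>/dev/null || true)

echo "continued past the failure:"
echo "$continued"
echo "halted at the failure:"
echo "$halted"
```

So if one rebase fails part way through, check the state of the repository before assuming the remaining branches were handled.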

That's all folks

To summarise, here is the whole of the alias section of my .gitconfig file:

[alias]
        cp     = cherry-pick
        ff     = merge --ff-only
        cpfix  = !sh -c 'git fetch -p origin && git checkout -b $1_$2 MOODLE_$2_STABLE && git cherry-pick origin/master..origin/$1 && git push origin $1_$2' -
        update = !sh -c 'git checkout MOODLE_$1_STABLE && git fetch moodle && git merge --ff-only moodle/MOODLE_$1_STABLE && git push origin MOODLE_$1_STABLE' -
        killbranch = !sh -c 'git branch -d $1 && git push origin :$1' -
        killmerged = !sh -c 'git checkout $1 && git branch --merged | grep -v $1 | xargs -I "{}" git killbranch "{}"' -
        updatefix = !sh -c 'git checkout $1 && git rebase $2 && git checkout $2 && git push origin -f $1' -
        updatefixes = !sh -c 'git checkout $1 && git branch | grep -v $1 | xargs -I "{}" git updatefix "{}" $1' -

There is limited documentation for this on the git config man page. There is more on the git wiki.

2 comments:

  1. If you like this, you might also be interested in Eloy's git scripts: lovingit.

  2. It's often my practice (in Mercurial, I'm afraid) to put the command into the comment, particularly for merges.

    $ hg merge shiny-new-thing # Good enough for now.
    $ hg commit -m'$ hg merge shiny-new-thing # Good enough for now'

    There's less benefit in doing this if I am using my own abbreviations for commands.
