Evaluation: Git

Last time I talked about Mercurial, and was generally disappointed with it. I also evaluated Git, another major distributed version control system (DVCS).

Short Review: Quirky, but a promising winner.

Git, like Mercurial, was spawned as a result of the Linux-BitKeeper feud. It was initially written by Linus Torvalds, apparently during quite a lull in Linux development. It is, for obvious reasons, a very Linux-focused tool, and I’d heard that performance is poor on Windows. I was not optimistic about it being usable there.

Installation actually went very smoothly. Git for Windows is basically powered by MSYS, the same Unix tools port that accompanies the Windows GCC port called MinGW. The installer is neat and sets up everything for you. It even offers a set of shell extensions that provide a graphical interface. Note that I opted not to install this interface, and I have no idea what it’s like. A friend tells me it’s awful.

Once the installer is done, git is ready to go. It’s added to PATH and you can start cloning things right off the bat. Command line usage is simple and straightforward, and there’s even a ‘config’ command that lets you set things up nicely without having to figure out which config file you need and where it lives. It’s still a bit annoying, but I like it a lot better than Mercurial’s approach. I’ve heard some people complain about git being composed of dozens of binaries, but I haven’t seen this on either my Windows or Linux boxes. I suspect this is a complaint about old versions, where each git command was its own binary (git-commit, git-clone, git-svn, etc.), but that’s long since been retired. Most of the installed binaries are just the MSYS ports of core Unix programs like ls.
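For the record, that initial setup amounts to a couple of one-liners (the name and email are obviously placeholders):

    git config --global user.name "Your Name"
    git config --global user.email you@example.com
    git config --global color.ui auto    # colored output is worth turning on
    git config --global --list           # shows everything you've set so far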

I was also thrilled with the git-svn integration. Unlike Mercurial, the support is built in and flat-out works with no drama whatsoever. I didn’t try committing back into the Subversion repository from git, but apparently the two-way support is excellent. Creating a git repository from Subversion is simple, though it can be time consuming, since git replays every single Subversion check-in into the new repository. I tested on a small repository with only about 120 revisions, which took maybe two minutes.
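For reference, the whole import boils down to something like this (the URL and the layout flag are placeholders for whatever your repository actually looks like):

    # clone the SVN repo, replaying each revision as a git commit
    git svn clone --stdlayout http://svn.example.com/myproject myproject
    cd myproject
    git svn rebase     # later: pull down new Subversion revisions
    git svn dcommit    # push local commits back into Subversion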

This is where I have to admit I have another motive for choosing Git. My favorite VCS frontend has a Git edition called SmartGit. It’s a stand-alone (not shell-integrated) client that is free for non-commercial use and works really well. It even handled SSH beautifully, which I’m thankful for. It’s technically still in beta, but I haven’t noticed any problems.

Now the rough stuff. I already mentioned that Git for Windows comes with a GUI that is apparently not good. What I discovered is that getting git to authenticate from Windows is fairly awful. In Subversion, you configure users and passwords explicitly in a plain-text file. Git doesn’t support anything of the sort; its ‘git-daemon’ server allows fully anonymous pulls and can even be configured to accept anonymous pushes, but there is no notion of named users anywhere. Authentication is entirely dependent on the filesystem permissions and the users configured on the server (barring workarounds), which means that most of the time, authenticated Git transactions happen inside SSH sessions. If you want to do anything else, it’s complicated at best. Oh, and good luck with HTTP integration if you choose a web server other than Apache. I have to imagine running a Windows-based git server is difficult.
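For the curious, a bare git-daemon setup looks roughly like this (paths made up), and you can see there’s nowhere to put a username or password:

    # serve every repository under /srv/git, read-only, to anonymous clients
    git daemon --base-path=/srv/git --export-all

    # per repository, you can also opt in to anonymous pushes (not a great idea)
    git config daemon.receivepack true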

Let me tell you about SSH on Windows. It can be unpleasant. Most people use PuTTY (which is very nice), and if your server uses public key authentication, you’ll end up running a program called Pageant, which holds your keys and serves them to other applications. Pageant doesn’t use OpenSSH-compatible keys, so you have to convert your keys over (PuTTYgen does this), and watch out because the current stable version of Pageant won’t do RSA keys. Git in turn depends on a third program called Plink, PuTTY’s command-line connection tool, which handles the actual talking to Pageant; git finds it via the GIT_SSH environment variable. The long and short of it is that getting Git to log into an SSH server with public key auth is quite painful. I discovered that SmartGit simply reads OpenSSH keys and connects without any complications, so I rely on it for transactions with our main server.
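If you do want plain command-line git to work, the incantation is roughly this (the PuTTY path, server, and repo name are placeholders), after converting your key with PuTTYgen and loading it into Pageant:

    # from the Git Bash prompt: tell git to use plink instead of the bundled ssh
    export GIT_SSH="/c/Program Files/PuTTY/plink.exe"
    git clone git@server:myproject.git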

I am planning to transition over to git soon, because I think the workflow of a DVCS is better overall. It’s really clear, though, that these are raw tools compared to the much more established and stable Subversion. They’re also a little more complicated to understand; whether you’re using git, Mercurial, or something else, it’s valuable to read the free ebooks that explain how to work with them. There are all kinds of quirks in these tools. Git, for example, uses a ‘staging area’ that records a snapshot of each file at the moment you add it, and if you’re not careful you can wind up committing older versions of your files than what’s on disk. I don’t know why — it seems like the opposite extreme from Mercurial.
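A contrived illustration of the trap (file name made up):

    echo "version 1" > notes.txt
    git add notes.txt             # stages a snapshot of "version 1"
    echo "version 2" > notes.txt  # keep editing...
    git commit -m "Add notes"     # commits "version 1", not what's on disk
    # run 'git add' again (or use 'git commit -a') to commit the current contents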

It’s because of these types of issues that I favor choosing the version control system with the most momentum behind it. Git and Mercurial aren’t the only DVCSes out there; Bazaar, monotone, and many more are available. But these tools already have rough (and sharp!) edges, and by sticking to the popular ones you are likely to get the most community support. Both Git and Mercurial have full-blown books dedicated to them that are available electronically for free. My advice is that you read them.

Evaluation: Mercurial

I’ve been a long-time Subversion user, and I’m very comfortable with its quirks and limitations. It’s an example of a centralized version control system (CVCS), which is very easy to understand. However, there’s been a lot of talk lately about distributed version control systems (DVCS), of which there are two well-known examples: git and Mercurial. I’ve spent a moderate amount of time evaluating both, and I decided to post my thoughts. This entry is about Mercurial.

Short review: A half-baked, annoying system.

I started with Mercurial, because I’d heard anecdotally that it’s more Windows friendly and generally nicer to work with than git. I was additionally spurred by reading the first chapter of HgInit, an e-book by Joel Spolsky of ‘Joel on Software’ fame. Say what you will about Joel — it’s a concise and coherent explanation of why distributed version control is, in a general sense, preferable to centralized. Armed with that knowledge, I began looking at what’s involved in transitioning from Subversion to Mercurial.

Installation was smooth. Mercurial’s site has a Windows installer ready to go that sets everything up beautifully. Configuration, however, was unpleasant. The Mercurial guide starts with this as your very first step:

As first step, you should teach Mercurial your name. For that you open the file ~/.hgrc with a text-editor and add the ui section (user interaction) with your username:

Yes, because what I’ve always wanted from my VCS is for it to be a hassle every time I move to a new machine. Setting up extensions is similarly a pain in the neck; more on that in a moment. Basically, Mercurial’s configuration is a headache.
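For reference, that first step (plus switching on an extension, since you’ll be doing that too) amounts to dropping something like this into ~/.hgrc, or Mercurial.ini on Windows, on every machine you touch:

    [ui]
    username = Your Name <you@example.com>

    [extensions]
    # extensions ship with Mercurial but stay off until you list them here
    color =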

Then there’s the actual VCS. You see, I have one gigantic problem with Mercurial, and it’s summed up by Joel:

Whereas, in Mercurial, all commands always apply to the entire tree. If your code is in c:\code, when you issue the hg commit command, you can be in c:\code or in any subdirectory and it has the same effect.

This is an incredibly awkward design decision. The basic idea, I guess, is that somebody got really frustrated about forgetting to check in changes and decided this was the solution. My take is that this is a stupid restriction that makes development unpleasant.

When I’m working on something, I usually have several related projects in a repository. (Mercurial fans freely admit this is a bad way to work with it.) Within each project, I usually wind up making a few sets of parallel changes. These changes are independent and shouldn’t be part of the same check-in. The idea with Mercurial is, I think, that you simply produce new branches every time you do something like this, and then merge back together. Should be no problem, since branching is such a trivial operation in Mercurial.
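That workflow looks roughly like this (branch names made up):

    hg branch parser-fix          # start an isolated line of work
    # ...edit files, then...
    hg commit -m "Fix the parser"
    hg update default             # jump back to the main line
    hg merge parser-fix           # fold the branch back in
    hg commit -m "Merge parser fix"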

So now I have to stop and think about whether I should be branching every time I make a tweak somewhere?

Oh, but wait, what about the extension mechanism? I should be able to patch in whatever behavior I need, and surely this is something that bothers other people! As it turns out, that’s definitely the case. Apart from the branching suggestions, there are not one but half a dozen extensions to handle this problem, all of which have their own quirks and pretty much all of which involve jumping back into the VCS frequently. This is apparently a problem the Mercurial developers are still puzzling over.
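One example, to give a flavor: the bundled record extension does interactive, hunk-by-hunk commits, roughly like so:

    # enabled in ~/.hgrc
    [extensions]
    record =

    # then, at commit time, pick changes hunk by hunk
    hg record -m "Just the parser changes"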

Actually, there is one tool that’s solved this the way you would expect: TortoiseHg. Which is great, save two problems. Number one, I want my VCS features to be available from both the command line and the front-end. Number two, I really dislike Tortoise. The alternative Mercurial frontends are trash, and an unbelievable pain to set up. If you’re working with Mercurial, TortoiseHg and the command line are really your only sane options.

It comes down to one thing: workflow. With Mercurial, I have to be constantly conscious about whether I’m in the right branch, doing the right thing. Should I be shelving these changes? Do they go together or not? How many branches should I maintain privately? Ugh.

Apart from all that, I ran into one serious showstopper. Part of this evaluation involves migrating my existing Subversion repository, and Mercurial includes a convenient extension for the job. Wait, did I say convenient? I meant borderline functional:

Subversion’s Python bindings are a prerequisite. The bindings (generated with SWIG) are installed separately on Windows, and can be found on http://subversion.tigris.org/ . Note that you can’t do this with the Win32 Mercurial binaries — there’s no way to install the Subversion bindings into its built-in Python library. So you’ll need to use a Mercurial installed on top of a stand-alone Python, and you may also need to do something like “set HG=python c:\Python25\Scripts\hg” to override the default Win32 binaries if you have those installed also. For Mac OS X, the easiest way is to install the CollabNet Subversion build, and then copy the content of /opt/subversion/lib/svn-python to the site-package directory of the python installation.
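Assuming you do fight your way to working bindings, the conversion command itself is at least short (the URL is a placeholder):

    # with the bundled convert extension enabled in ~/.hgrc ([extensions] convert =)
    hg convert http://svn.example.com/myproject myproject-hg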

The silver lining is that there are apparently third-party tools for this that are far better, but at this point Mercurial has tallied up a lot of irritations and I’m ready to move on.

Spoiler: I’m transitioning to git. I’ll go into all the gory details in my next post, but I found git to be vastly better to work with.

Confirmed: Clear WiMax Bandwidth Throttling

Earlier this week, an Engadget post appeared reporting that Clear WiMax (aka 4G wireless) throttles bandwidth for users who have used a ‘substantial’ amount of it. The exact threshold isn’t clear, but 10GB appears as a common number. Once you trigger the throttle, you’re stuck at 0.25 megabits for an indeterminate amount of time. As it turns out, I’ve signed up for Clear and am still within their trial period. If there’s a cap, now would be the time to find it.

Some background: Clear is a WiMax service (formerly known as Clearwire), owned mostly by Sprint, with significant investment from Comcast. They let you sign up for ‘unlimited’ service at a fairly reasonable fee. Although they don’t promise specific speeds, their home service has clocked in at 10+ megabits with a good signal, and the mobile on-the-go service can do about 5-6 megabits as well. They’ve been pushing it very hard here in Baltimore, which was one of the very first markets to get it (Portland being the other). I liked the idea and decided to try it.

I tested for a bandwidth cap by downloading fairly substantial but still reasonable amounts of data, starting on the third day of my service. The data was mainly HTTP downloads from TechNet/MSDN, but also included some low-level torrent activity overnight. I started the testing on Wednesday, Sept. 29, 2010; this morning, Saturday, Oct. 2, I’ve hit the cap. Speedtest.net reports about 0.24 megabits. As it turns out, Clear has a site that reports how much bandwidth you’ve used. Let’s see how I did!

Not so hot. The total is just north of 18GB, and the throttle has kicked in. I don’t consider myself a heavy bandwidth user, and I think bandwidth caps (like Comcast’s 250GB) are completely reasonable. However, 18GB is simply ridiculous for a service that advertises itself as ‘unlimited’, especially when Clear’s customer service assured me earlier in the week that there is no throttling. It’s a level of usage that would be easy to hit with video streaming services or digital software purchases (Steam!).

How does this compare to the Verizon ADSL service I already had? Pricing is the same at about $45 per month for home service, but Clear has me on a two-year contract while Verizon is month-to-month. Speeds are comparable, peaking at about 6.5 megabits. I’ve never hit any kind of cap on Verizon’s service, whereas I crashed into Clear’s cap almost immediately.

I’ll be canceling my Clear service first thing Monday morning.

UPDATE: Take a look at this graph from Netflix:

See which ISP is on the bottom? Clearwire.