I came across “The Thing About Git” on my tweet scan for Git today, and it led me down a two-hour path of reading about Git features that I have never used. Along the way I had an insight about version control and history that I had missed until now, because I have only ever used VCSs like CVS or SVN in the most primitive ways. To everybody else this should be a “duh,” but here is the description that did it for me (from here):
The typical development workflow goes something like this:
- Checkout copy of upstream code base.
- Implement feature X.
- Commit.
- Implement independent feature Y.
- Commit.
- Implement independent feature Z.
- Commit.
- Push new features back upstream.
Now, what really happens is that when I’m implementing Y or Z, I’ll realize that I made a mistake in X. The trick is then fixing X so that my fix is part of the changeset/patch for X that ultimately gets pushed upstream in the last step. That way, the upstream folks will see only a single, clean patch for feature X – not a mishmash of patches that together represent X.
When I read this it clicked. For some reason the importance of a clean history to a revision control system had never occurred to me. On the projects I work on, I never take the time to go back and fix X as in the example above. I commit all types of missteps, and I kick off a build or stop working as soon as my unit tests make it look like everything works. I have tons of commits before Z which reverse previous changes or are simply wrong. I wouldn’t even know how to clean up history in Perforce or SVN. And, gasp, I have never submitted, received, or applied a patch.
I am now starting to understand how many of Git’s features are oriented around working easily on lightweight topical branches and then merging those branches into a clean history of exactly those patches that make up a fix or feature. There is really a de facto standard in the open-source community that lets you communicate a series of commits or patches to other consumers of your repository (update: I am so stupid for never realizing that in a clean history the patches are the commits themselves).
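To make the “fix X while working on Y” idea concrete, here is a minimal sketch of one way it can be done, using `git commit --fixup` followed by `git rebase -i --autosquash`. It is only an illustration in a throwaway repository: the file names, commit messages, and the `base` and `x_commit` variables are made up for this example, and the quoted article may well have had a different technique in mind.

```shell
#!/bin/sh
set -eu

# Scratch repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Stand-in for the upstream code base.
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial upstream state"
base=$(git rev-parse HEAD)      # remember where upstream ends

# Feature X (with a mistake), then independent feature Y.
echo "x v1 (has a mistake)" > x.txt
git add x.txt
git commit -q -m "Implement feature X"
x_commit=$(git rev-parse HEAD)

echo "y" > y.txt
git add y.txt
git commit -q -m "Implement feature Y"

# While working on Y, fix the mistake in X and record it as a fixup of X's commit.
echo "x v2 (fixed)" > x.txt
git add x.txt
git commit -q --fixup="$x_commit"

# Fold the fixup into X. GIT_SEQUENCE_EDITOR=true accepts the generated
# rebase todo list without opening an editor.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash "$base"

# The history now shows one commit per feature, with the fix inside X.
git log --oneline
```

The rebase rewrites the local commits, so this kind of cleanup is best done before the work is pushed anywhere others pull from.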
Git has all types of tools to help with all this stuff including:
Anyway, I am just now starting to use these tools, but at least now I understand the motivation for them: a nice clean version history for others to consume when you send them your changes. I am now motivated to keep a cleaner history on my personal projects. It’s almost like coding for a while, creating a bunch of spaghetti code, and then refactoring and cleaning things up so that the code can easily be used later, by you and by others. Git allows a model of development that lets you just go at it and make a big mess, and then easily clean it up and make it neat again. Most importantly, it allows your work to be easily consumed and built on by others. I never realized that this matters as much with version control as it does with code.
Now I want to be on a more collaborative project so that I can use these techniques of mailing patches, pushing, pulling, and cherry-picking. On my Adobe project we are using Perforce, yuck. I am going to sneak Git in.
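As a small illustration of why a clean history is easy for others to consume, here is a hedged sketch that turns each commit into its own patch file with `git format-patch`. Again, the scratch repository, file names, commit messages, and the `patches` directory are assumptions made up for the example.

```shell
#!/bin/sh
set -eu

# Scratch repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Stand-in for the upstream code base.
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial upstream state"
base=$(git rev-parse HEAD)

# Two clean commits, one per feature.
for f in x y; do
    echo "feature $f" > "$f.txt"
    git add "$f.txt"
    git commit -q -m "Implement feature $f"
done

# Write one patch file per commit made since $base into ./patches.
git format-patch -q -o patches "$base"
ls patches
```

Each file in `patches` corresponds to exactly one commit, which is the sense in which, in a clean history, the patches are the commits themselves.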