Introduction to Version Control with GitHub

Why should I care?

Version control (also known as revision control or source control) is a procedure for managing changes to files over time; essentially, an intelligent backup system designed specifically for the management of source code. It provides a tidier and vastly more robust alternative to saving multiple copies of files as changes are made, e.g. “readme.txt”, “readme.txt.bak”, “readme.txt.bak2”. Generally each project will have its own repository, storing all the code and assets for that project. For every file in the repository there is a full record of its changes over time. Version control systems (VCSs) provide tools for interacting with this history, allowing the user to revert code to an older version or work on an experimental feature in an isolated environment. A large number of version control systems exist; popular examples include Bazaar, CVS, Mercurial and Subversion. However, this article focuses on Git, due to its widespread adoption and the popularity of GitHub.

A breakdown of the advantages of using version control on your projects:

  • As each change to a file is stored, individual files and entire projects can be restored to any point in the history of the project.
  • Changes can be pushed to a remote repository, providing off-site backup.
  • An unlimited number of people can work on the same project, allowing for collaboration with any size of team.
  • Each change can be accompanied by a small explanation, allowing users to easily see what changes have been implemented.
  • Using branches it’s easy to work on larger experimental features in isolation and later merge the new code into the main branch painlessly.

Even when working without any other collaborators, the vast majority of developers still use version control to allow them to easily revert changes and experiment in isolation from the main working source tree.
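
To make the first advantage concrete, here is a minimal sketch of restoring a file to an earlier version. The scratch directory, the file name and the identity settings are placeholders invented for this example; they are not part of the tutorial repository.

```shell
# Hypothetical scratch repository; every name below is a placeholder.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Record two versions of the same file.
echo "first draft" > readme.txt
git add readme.txt
git commit -q -m "Add first draft of readme"
echo "second draft" > readme.txt
git commit -q -am "Rewrite readme"

# List the recorded history of the file, newest first.
git log --oneline -- readme.txt

# Restore the file as it was one commit ago.
git checkout HEAD~1 -- readme.txt
cat readme.txt
```

No “.bak” copies are needed: the older content comes straight out of the repository's history.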

Installation and Authentication

Git clients are available for Linux, Windows and Mac OS X; comprehensive tutorials on installing Git on each of these platforms are available from GitHub. GitHub primarily uses SSH for user authentication; as such, you’ll need to generate an SSH key and link it to your account in order to access repositories using Git. The installation tutorials detail how to go about generating and linking the keys on each of the platforms.

Getting Started

If you’re intending to work on an existing repository, you’ll need to perform a clone operation on it. This creates a local copy of the entire repository, including all the file history, allowing you to perform operations such as branching without having to connect to the remote repository. Git is one of several distributed version control systems, which differ from centralised systems in that every clone is created equal: at a technical level there is no canonical or primary repository. In practice, one repository is often chosen as the canonical repository by the developers (when using GitHub, usually this is the GitHub repository, or ‘origin’ as it is called), but this isn’t recognised on a technical level.
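
As a rough illustration of what "a local copy of the entire repository" means, the sketch below clones a small repository and then queries its history without touching the source again. A local directory stands in for the remote here, which is an assumption made purely to keep the example self-contained; the paths and file names are hypothetical.

```shell
# Hypothetical stand-in for a remote: a local repository with two commits.
work=$(mktemp -d)
git init -q "$work/original"
cd "$work/original"
git config user.name "Example User"
git config user.email "user@example.com"
echo "one" > notes.txt
git add notes.txt
git commit -q -m "First commit"
echo "two" >> notes.txt
git commit -q -am "Second commit"

# Clone it; the clone receives every commit, not just the latest files.
git clone -q "$work/original" "$work/copy"
cd "$work/copy"

# These history queries read only the local copy.
git log --oneline
git rev-list --count HEAD
```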

Hardcore Forking Action

Fork a repository in GitHub

Before we can clone a repository we need to have a repository to clone. For this tutorial I’ve created a tutorial repository on GitHub, which is going to allow me to demonstrate exactly how you’d go about contributing to any open source project on GitHub, and in the process teach you how to use Git for your own projects.

When you visit the repository linked above you should see the ‘Watch’ and ‘Fork’ buttons shown in the figure to the right. Forking a project creates an exact copy of it, with the full change history and branches, on your own GitHub account. This gives you a remote copy that you can edit and push changes to, either so you can contribute changes back into the project you forked from, or so you can use the project as a basis for your own (within the license terms for that particular project!).

Now, go ahead and fork my project.

Cloning

Once GitHub has finished forking the project, you’ll be presented with an almost identical screen to the one you were on previously. However, you’ll notice that rather than ‘rossbearman/git-tutorial’, the repository title reads ‘your-name/git-tutorial’. Congratulations! You’ve forked the repository successfully. Now let’s clone it to your local computer and make some changes!

 git clone git@github.com:{USERNAME}/git-tutorial.git

This first command will create a new directory called git-tutorial and clone your fork into it. Git automatically adds a remote called ‘origin’ that links to your fork of the repository; a remote is essentially an easy-to-remember name that points towards a remote repository. You’ll also want to manually add a remote to point to the repository you forked from, so you can pull any updates directly into your clone.

cd git-tutorial
git remote add upstream https://github.com/rossbearman/git-tutorial.git
git fetch upstream
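
If you want to check how remotes end up configured, the following sketch reproduces the same layout offline. The two bare repositories are hypothetical local stand-ins for your fork and the original project, so the paths differ from the GitHub URLs used in this tutorial.

```shell
# Hypothetical local stand-ins for the two GitHub repositories.
work=$(mktemp -d)
git init -q --bare "$work/upstream.git"   # plays the original project
git init -q --bare "$work/fork.git"       # plays your fork

# Cloning the fork sets up 'origin' automatically.
# (Git may warn that the cloned repository is empty; that is expected here.)
git clone -q "$work/fork.git" "$work/git-tutorial"
cd "$work/git-tutorial"

# Add the second remote by hand, as in the tutorial.
git remote add upstream "$work/upstream.git"

# List every remote with the location it points to.
git remote -v
```

Running git remote -v in your real clone should list both ‘origin’ and ‘upstream’ in the same way.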

Making our Changes

Branching

Branches are an essential tool in any Git workflow. A branch acts as (yet another) copy of the full source tree; the idea, however, is to work on new features or changes whilst keeping them isolated from the working code. In this manner you can easily keep your main branch (‘master’) always compilable and continue to commit potentially broken changes to your new branch. You can have an unlimited number of branches on any project; for example you might have an ‘interface’ branch for working on a new interface and a ‘validation’ branch for working on some new validation code. If you’re using an issue tracker you might create a new branch for fixing a specific bug, and use the ID of the bug in the branch name, e.g. ‘bug-2817’.

Vincent Driessen wrote an article proposing a complex but highly-scalable branching structure for software projects. It’s worth reading once you’ve got to grips with the basics of Git usage.

Although our tutorial repository is a trivial example, and won’t really gain anything from the use of a branch, we’ll create one to demonstrate how to use them. Our branch will be named ‘development’ and is created with the following line in the terminal:

git checkout -b development

The checkout command switches to a different branch, and the -b flag tells it to create the specified branch and then switch to it.
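
For reference, the -b shorthand can be spelled out as two separate steps. The sketch below does so in a throwaway repository; the directory, the identity settings and the README contents are assumptions made for the example.

```shell
# Hypothetical scratch repository with a single commit on 'master'.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git symbolic-ref HEAD refs/heads/master   # pin the first branch name to 'master'
echo "Git tutorial" > README
git add README
git commit -q -m "Initial commit"

# Long form of 'git checkout -b development':
git branch development        # create the branch at the current commit
git checkout -q development   # switch the working tree to it

# The asterisk in the listing marks the branch that is checked out.
git branch
```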

Finally we’ll just pull in any changes that have since been made to the ‘master’ branch on the original repository you forked from, using the remote we created in the last section. This ensures our new branch is up to date before we start making any modifications.

git pull upstream master
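
With default settings, pulling from a remote branch is roughly a fetch followed by a merge. The sketch below shows the two steps separately, using a hypothetical local repository in place of the original GitHub project so that it runs without network access.

```shell
# Hypothetical stand-in for the original project.
work=$(mktemp -d)
git init -q "$work/upstream"
cd "$work/upstream"
git config user.name "Example User"
git config user.email "user@example.com"
git symbolic-ref HEAD refs/heads/master
echo "line 1" > README
git add README
git commit -q -m "Initial commit"

# Clone it, then let the original move ahead by one commit.
git clone -q "$work/upstream" "$work/clone"
echo "line 2" >> README
git commit -q -am "Add second line"

cd "$work/clone"
git remote add upstream "$work/upstream"

# 'git pull upstream master' is roughly these two steps:
git fetch -q upstream          # download new commits; local branches are untouched
git merge -q upstream/master   # integrate them into the current branch
cat README
```

Note the difference in spelling: pull takes the remote and the branch as two arguments, while ‘upstream/master’ with a slash names the local record of that remote branch.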

Editing

Now that we have forked, cloned and branched, we can start editing. On a real project you might fix a bug, or implement a new feature, but for this tutorial we’ll just edit the README file and add a line of text to the end. It can be anything: your signature, a quote, or just some silly text.

After editing the file, run the status command and you should see that README is listed as having been modified.

git status

Another useful command for getting an overview of the changes is diff. This will show you the line-by-line changes to the files, and gives an insight into how Git sees your changes.

git diff
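
If the diff output is unfamiliar, this small sketch may help with reading it. It appends one line to a README in a scratch repository (all names are placeholders for the example) and then asks for the full diff and a one-line summary.

```shell
# Hypothetical scratch repository with one committed README.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "Git tutorial" > README
git add README
git commit -q -m "Initial commit"

# Make an edit that has not been staged yet.
echo "Signed, a reader" >> README

# Full diff: lines starting with '+' were added, lines starting with '-' were removed.
git diff

# Summary only: one row per changed file.
git diff --stat
```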

Submitting our Changes

Once you’re happy with your changes it’s time to add them to the staging area, commit them to your local repository, push them out to your remote repository and request a pull of your changes into the original repository.

Staging

The staging area (also known as the index) is, as it sounds, a temporary location where you piece together all the elements for a single commit. To add changes to the staging area, use the add command. You can either specify the files individually, or use the -A flag to stage every change in the repository. To continue, run one of the following commands; on this small example they will both have the same result, adding your changes to the README file to the staging area.

git add -A
git add README

When working with larger changes, potentially spanning multiple files, this staging system allows you to temporarily store your changes as you progress. You can then package all those changes into a single commit with a good description, resulting in a tidier, easier to manage history.
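
Here is a small sketch of that selectivity, assuming a scratch repository with two hypothetical files: both are edited, but only one is staged and committed.

```shell
# Hypothetical scratch repository with two tracked files.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "docs" > README
echo "code" > app.txt
git add README app.txt
git commit -q -m "Initial commit"

# Change both files, but stage only one of them.
echo "more docs" >> README
echo "more code" >> app.txt
git add README

# Short status: the first column is the staging area, the second the working tree.
git status --short

# Names of the files that the next commit will contain.
git diff --staged --name-only

# Commit the staged change; the edit to app.txt stays uncommitted.
git commit -q -m "Expand the README"
git status --short
```

After the commit, only app.txt is still reported as modified, ready to go into a separate commit with its own description.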

Committing

Let’s go ahead and commit the changes currently in our staging area to our local repository. For this we use the commit command, optionally with the -m flag to provide a short commit message. If you don’t specify a commit message using the -m flag, your chosen text editor will open so you can enter a more detailed message.

git commit -m 'Added my signature to the tutorial README.'
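
To check what was recorded, you can read the message back with the log command. The sketch below also uses a second -m flag, which Git treats as a separate paragraph of the message; the repository and the message text are invented for the example.

```shell
# Hypothetical scratch repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "Git tutorial" > README
git add README

# The first -m is the subject line; a second -m adds a body paragraph.
git commit -q -m "Add the tutorial README" -m "A longer explanation of the change can go here."

# Read the message back from the most recent commit.
git log -1 --format=%s   # subject only
git log -1 --format=%b   # body only
```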

Merging

Remember how we were making these changes in the development branch? Now that we’ve finished what this branch was created to do, we can merge it back into the master branch and delete development. First we switch to the master branch.

git checkout master

Then we merge the changes from the development branch into our current branch.

git merge development

Finally we can delete the no longer needed development branch.

git branch -d development
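
The whole branch, merge and delete cycle can be rehearsed safely in a scratch repository before trying it on real work. The sketch below is one such rehearsal, with placeholder names throughout; it also shows how to ask Git which branches still hold unmerged commits.

```shell
# Hypothetical scratch repository with one commit on 'master'.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git symbolic-ref HEAD refs/heads/master
echo "Git tutorial" > README
git add README
git commit -q -m "Initial commit"

# Do some work on a separate branch.
git checkout -q -b development
echo "Signed, a reader" >> README
git commit -q -am "Add my signature to the README"

# Back on master, development still holds a commit that master lacks.
git checkout -q master
git branch --no-merged

# Merge it, then confirm that nothing is left unmerged.
git merge -q development
git branch --merged

# -d only deletes a branch whose commits are already merged.
git branch -d development
```

Because -d refuses to delete a branch with unmerged commits, it is a reasonably safe default for tidying up.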

Pushing

Now that our changes are merged into the master branch and stored in our local repository we can use the push command to send the commit to our forked repository on the GitHub servers.

git push origin master

If you now view your forked repository on GitHub you should see your commit message displayed at the top, and viewing the README file will reflect your changes.
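
You can also confirm a push from the command line. In the sketch below a local bare repository plays the part of your fork on GitHub, which is an assumption made so the example runs offline; the commands after the setup are the same ones you would use against the real remote.

```shell
# Hypothetical local stand-in for your fork on GitHub.
work=$(mktemp -d)
git init -q --bare "$work/fork.git"

# (Git may warn that the cloned repository is empty; that is expected here.)
git clone -q "$work/fork.git" "$work/git-tutorial"
cd "$work/git-tutorial"
git config user.name "Example User"
git config user.email "user@example.com"
git symbolic-ref HEAD refs/heads/master
echo "Git tutorial" > README
git add README
git commit -q -m "Add the tutorial README"

# Send the local master branch to the remote named 'origin'.
git push -q origin master

# Ask the remote which branches it has and which commit each one points to.
git ls-remote --heads origin

# Compare with the local commit ID; the two should match.
git rev-parse HEAD
```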

Request a Pull

Send a pull request to the original repository

If you were working on your own repository, or building on top of a forked repository, this is where your interaction with Git would likely end. However, if you’re contributing to another project, there’s one final step to get your changes merged into the original repository: the pull request.

A pull request simply notifies the repository maintainers that your fork of the repository has some changes they may want to merge into the official repository.

To make a pull request of your changes to the README file into the main git-tutorial repository, go to the page for your forked repository and click on ‘Pull Request’. The new page will allow you to write a message to accompany your request; once you’ve written it, simply click ‘Send pull request’ and you’re done.

I’ll endeavour to merge all pull requests into the original repository as soon as possible, so you can see the final results of your work.

Future

Now that you’re equipped with a basic understanding of how to use Git and GitHub to contribute to open source projects, I urge you to go and do so! Even if it’s simply improving the documentation, GitHub makes it trivially easy to do.

Further Reading

Git Quick Reference

Pro Git Online Book
