Version Control (Beta)

1 What is version control and what can it do for you?

Version control is a system that records changes to a file or set of files over time so that you can recall specific versions later. You know how Microsoft Word will periodically save your file, and you can recover previous versions if you really mess up? That is a simple form of version control. Other forms offer far more functionality, including the ability to save and recover versions of an entire project, tools for collaboration, and more.

Some commonly used version control systems are Git, Mercurial, and Subversion (SVN). Here at RFF, we strongly advise using Git, which we will discuss for the remainder of the section. We also recommend using GitHub to host Git repositories online. Here are some of the benefits of using Git with GitHub for version control:

  • Grants peace of mind knowing work is stored safely (even when a computer breaks!) and can be recalled with minimal effort, which gives greater liberty to test out new ideas

  • Facilitates collaboration by allowing multiple versions of the code to coexist and to borrow changes from one another efficiently

  • Enables efficient review and discussion of code changes before incorporating them into the main branch of the repository

  • Documents when, why, and by whom specific changes were made, which helps with debugging and troubleshooting

  • Consolidates and organizes project code and documentation in the same easily accessible, backed-up location

  • Provides easy web browser access to all versions of the code

  • Provides public access to datasets and code under the auspices of a software license

  • Empowers audiences to answer their own questions about the data and methods used in publications, reducing overhead for researchers who would otherwise be given that task

This section of the guidance will help you get started in making version control a standard part of the project life cycle.

2 Version Control Basics

Before we start using Git for version control, let’s cover a few basic principles. Git is a popular version control system: software that we can download and run on our own computers. Git works by tracking changes to the contents of an otherwise ordinary file folder, called a repository. In a Git repository, a user can tell Git which file changes to keep and which to discard, label those changes, go back to previous file versions, and much more!
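To make these ideas concrete, here is a minimal sketch of keeping, labeling, discarding, and recalling changes from the command line. The folder, the file name report.txt, the user details, and the commit messages are all placeholders made up for illustration, and the sketch assumes Git is already installed.

```shell
set -e

# Create a scratch folder and turn it into a Git repository.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Placeholder identity, set for this repository only (assumption for the demo).
git config user.name "Example User"
git config user.email "example@example.com"

# Keep a change: stage the file, then commit it with a label (the message).
echo "draft 1" > report.txt
git add report.txt
git commit -q -m "Add first draft of report"

# Keep a second change; -a stages files Git already tracks.
echo "draft 2" > report.txt
git commit -q -a -m "Revise report"

# Discard a change that has not been committed.
echo "an experiment gone wrong" > report.txt
git checkout -- report.txt

# List the labeled versions, then print the file as it was one commit ago.
git log --oneline
git show HEAD~1:report.txt
```

After the script runs, report.txt holds the second draft again, and the first draft is still recoverable from the history.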

While it is possible to use Git only on your computer without posting your repository online, it is often advisable to host repositories online so that they are synced in the cloud, making it easy to share them and back them up. There are many websites that can host Git repositories, but the most popular is GitHub, which is what we recommend using at RFF. (Some others you may come across are GitLab and Bitbucket.) Not only does GitHub host these repositories, but it also provides convenient ways to host documentation, discuss code changes, and report bugs.
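The relationship between a repository on your computer and a hosted copy can be sketched as follows. So that the example runs without an account or network access, a local bare repository (hosted.git) stands in for the one a site like GitHub would host; with a real host, the path given to "git remote add" would be the repository URL that the host provides. All folder names, file names, and user details here are placeholders.

```shell
set -e

work=$(mktemp -d)
cd "$work"

# Stand-in for the hosted repository (an assumption for this demo).
git init -q --bare hosted.git

# An ordinary repository on "your computer".
git init -q local
cd local
git config user.name "Example User"
git config user.email "example@example.com"
echo "hello" > README.txt
git add README.txt
git commit -q -m "Add README"

# Connect the local repository to the hosted one and upload the commits.
git remote add origin "$work/hosted.git"
git push -q origin HEAD

# A collaborator (or a replacement computer) can now fetch a full copy.
cd "$work"
git clone -q hosted.git collaborator-copy
cat collaborator-copy/README.txt
```

The clone at the end is what makes the hosted copy useful both as a backup and as a way to share work: anyone with access can recover the full history from it.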