Assignment 3: Source Control

Getting started with Git

Source control, also known as version control, is one of the most fundamental tools for developers at every experience level. In modern software development, teams routinely collaborate on shared codebases where changes from multiple contributors must be tracked systematically. Without this discipline, projects risk conflicting modifications, unrecoverable work loss, and significant challenges in tracing bug origins. Individual developers also gain substantial value through maintainable code histories and easy restoration of prior versions.

Tools like Git have become indispensable across technology sectors for managing these workflows, yet many new graduates have never used them before starting a professional role. You have been using Git to submit your lab and project work throughout this course. Today, we will cover the essential principles of source control and teach you how to use Git to manage and save your source code independently.

§1 Why Source Control? #

Source control helps manage the evolution of codebases (or any text) over time. Here’s why it’s important:

  1. Tracking Changes: Every time code is modified, source control systems track what was changed, who changed it, and when it was changed. This allows developers to trace the history of a project and revert to previous versions when necessary.
  2. Collaboration: Source control systems enable multiple developers to work on the same codebase simultaneously without overwriting each other’s work. They manage the merging of changes and help resolve conflicts when two people modify the same part of the code.
  3. Backup and Recovery: Code is always saved in a central or distributed repository. Even if your computer fails, your code is safe, and you can recover it from the repository.
  4. Branching and Experimentation: Developers can easily experiment with new features or bug fixes without affecting the main project. Once changes are stable, the changes can be merged into the main project.

§2 Distributed vs Centralized Source Control #

There are two main types of version control systems:

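Centralized Version Control Systems (CVCS): A single central server holds the repository and its full history. Developers check out a working copy and commit directly to that server, so most operations need a network connection. Subversion and Perforce are examples.

Distributed Version Control Systems (DVCS): Every developer has a complete copy of the repository, including its full history. Commits, branches, and history lookups work locally and offline, and changes are exchanged between copies by pushing and pulling. Git and Mercurial are examples.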

§3 Source Control Software Comparison #

§3.1 Git #

Git is a distributed version control system (DVCS) known for its excellent performance and scalability, ideal for projects of all sizes. Its lightweight branching and decentralized collaboration model enable flexible workflows, making it the go-to for open source and enterprise projects alike. Although Git has a learning curve, it integrates well with platforms like GitHub and GitLab, providing powerful tools for code review and CI/CD. It is widely used in agile development and distributed teams.

§3.2 Perforce (Helix Core) #

Perforce is a paid centralized version control system (CVCS), renowned for handling large-scale projects with large binary files, especially in game development and multimedia. Its centralized model supports sophisticated branching (streams) and excels in performance and scalability for enterprise environments. Although setup is more complex, it’s highly efficient once configured. Perforce integrates well with CI/CD tools and is commonly used for projects requiring high performance with large files.

§3.3 PlasticSCM #

PlasticSCM is a paid distributed or centralized version control system, optimized for handling large teams and files, particularly in game development. Its visual branching and merging tools make complex workflows easier to manage. PlasticSCM integrates seamlessly with Unity and Unreal Engine, making it popular in game development. Though setup can be complex, its intuitive visual tools are well-suited for large, collaborative projects.

§3.4 Subversion (SVN) #

Subversion is a centralized version control system suitable for small to mid-sized projects. While it supports branching and merging, it can become cumbersome with large projects. SVN’s centralized model is preferred by teams that need a single authoritative repository. It remains in use for legacy systems and offers easier adoption for beginners, though it lacks the scalability and flexibility of newer systems like Git.

§3.5 Mercurial #

Mercurial is a distributed version control system that offers simpler branching and merging than Git, with a more linear approach. While it handles large projects well, it has been overshadowed by Git in terms of adoption and ecosystem support. Mercurial is easier to learn and offers a clean interface, making it appealing for teams with simpler workflows. However, it is less commonly used today, limiting its community and integration options.

§4 Git: The Industry Standard #

In this assignment, we will focus on Git, the most widely used version control system in modern software development.

Git Advantages:

§4.1 Basic Concepts in Git #

Repository: A repository (or “repo”) is where your project’s source code and its history of changes are stored. In Git, a repository can be local (on your machine) or remote (on a server like GitHub).

Commit: A commit is a snapshot of your code at a given point in time. Each commit in Git has a unique identifier (a SHA-1 hash by default, shown as 40 hexadecimal characters) and typically includes a message describing the changes associated with that commit.

Branch: A branch is a parallel version of your codebase that allows you to work on a feature or fix a bug without affecting the main codebase. Once the work is complete, the branch can be merged back into the main project line.

Merge: Merging is the process of integrating changes from one branch into another. Conflicts can arise when the same part of a code file is modified in two different branches, and they must be resolved before the merge can be completed.

Pull and Push: In Git, “pull” downloads changes from a remote repository and merges them into your current local branch, while “push” sends your local commits to a remote repository.

Clone: Cloning a repository creates a local copy of a remote repository on your machine, allowing you to work on it independently.
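The concepts above can be tried out together in a throwaway sandbox. The following is a minimal sketch, not part of the assignment: the directory names (origin-repo, copy-repo), the file notes.txt, the branch name feature, and the "Demo User" identity are all made up for illustration.

```shell
# Throwaway sandbox so nothing here touches your real projects.
sandbox=$(mktemp -d)
cd "$sandbox"

# Repository: create one and set an identity for this repository only.
git init -q origin-repo
cd origin-repo
git config user.name "Demo User"
git config user.email "demo@example.com"

# Commit: record a snapshot of a file.
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Branch: work on a parallel line without touching the starting branch.
start_branch=$(git symbolic-ref --short HEAD)   # main or master, depending on setup
git checkout -q -b feature
echo "feature line" >> notes.txt
git commit -q -am "Extend notes on feature branch"

# Merge: bring the feature work back into the starting branch.
git checkout -q "$start_branch"
git merge -q feature

# Clone: make an independent copy of the repository, history included.
cd "$sandbox"
git clone -q origin-repo copy-repo
commit_count=$(git -C copy-repo rev-list --count HEAD)
echo "commits in the clone: $commit_count"
```

The clone carries both commits, which illustrates that a cloned repository contains the full history and not only the latest files.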

§5 Git CLI Tutorial #

Git is incredibly beneficial, even when you’re working on projects alone. It allows you to maintain a detailed history of all changes made to your code, providing a safeguard if something goes wrong. If you accidentally introduce a bug or break functionality, Git makes it easy to roll back changes or retrieve earlier versions.

§5.1 Preparation #

You are going to do this assignment on your own computer. You will need to access a terminal to complete this work:

macOS/Linux Users: You can simply use the built-in terminal (Terminal.app on macOS, or your distribution's terminal emulator on Linux).

Windows Users: You will have the best experience using Linux. You can install Windows Subsystem for Linux (WSL) and use that. Install instructions here: https://learn.microsoft.com/en-us/windows/wsl/install

After you have your terminal open, create a new directory to work from:

mkdir -p git-assignment
cd git-assignment

Also create a file to use with git (paste this whole block in the terminal):

echo '#include <iostream>

int main() {
    std::cout << "Hello, world!";
}' > main.cpp

§5.2 Installation #

Git can be installed by following the installation guide for your operating system here: https://git-scm.com/book/en/v2/Getting-Started-Installing-Git

§5.3 Setting up Git #

Before you start using Git, you need to configure your user information. These commands set your name and email address, which will appear in your commits.

git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"
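Identity can also be set for a single repository by leaving out --global, in which case the value is stored in that repository's .git/config and takes precedence over the global one. A small sketch of this, using a temporary directory and a made-up "Course Demo" identity:

```shell
# Work in a temporary repository so your global settings stay untouched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# No --global: these values apply to this repository only.
git config user.name "Course Demo"
git config user.email "demo@example.com"

# Read the effective values back to confirm what Git will use here.
configured_name=$(git config --get user.name)
configured_email=$(git config --get user.email)
echo "commits here will be authored by: $configured_name <$configured_email>"
```

Reading the values back with git config --get is a quick way to check your setup before making your first commit.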

§5.4 Initializing a Git Repository #

To start tracking a project with Git, you first need to initialize a repository.

In the terminal, make sure you are inside your project folder (e.g. cd ~/git-assignment), then run:

git init

This will create a hidden .git directory in your current directory. This folder contains the full history of your project’s changes, along with metadata.
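To confirm that initialization worked, you can look for the hidden directory and ask Git whether it recognizes the folder. A minimal sketch in a temporary directory (the variable names are illustrative):

```shell
# Create and enter a temporary project folder.
project=$(mktemp -d)
cd "$project"
git init -q

# The hidden .git directory only shows up with the -a flag.
ls -a

# Ask Git directly whether the current folder is part of a repository.
inside=$(git rev-parse --is-inside-work-tree)
echo "inside a Git work tree: $inside"
```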

§5.5 Checking the Status of Your Repository #

To see which files have been changed, added, or deleted in your project, run:

git status

This command shows the state of your working directory and staging area, listing any new, modified, or deleted files since your last commit. In a freshly initialized repository, every file shows up as untracked:

$ git status
On branch main

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        main.cpp

nothing added to commit but untracked files present (use "git add" to track)

Here, a new file named main.cpp is found but is not yet tracked by Git. The branch name on the first line may be master instead of main, depending on your Git version and configuration.
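For a more compact view, git status accepts a --short flag that prints one line per file with a two-character state code. A small sketch, assuming a fresh temporary repository with a single new file:

```shell
# Fresh temporary repository containing one new, untracked file.
repo=$(mktemp -d)
cd "$repo"
git init -q
echo 'int main() {}' > main.cpp

# "??" in the first column marks a file Git is not tracking yet.
short_status=$(git status --short)
echo "$short_status"
```

In this situation the output is the single line "?? main.cpp".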

§5.6 Adding Files to Staging #

When you are ready to record your file changes to the history, you need to “stage” your files. Run:

git add <filenames>

You can also add all files that have been modified or newly created by using:

git add .
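Staging is selective: only the files you add are included in the next commit. The sketch below, which uses two made-up files a.txt and b.txt in a temporary repository, stages one of them and then lists what is staged.

```shell
# Temporary repository with two new files.
repo=$(mktemp -d)
cd "$repo"
git init -q
echo "one" > a.txt
echo "two" > b.txt

# Stage only a.txt; b.txt stays untracked.
git add a.txt

# List the files currently in the staging area.
staged=$(git diff --cached --name-only)
echo "staged: $staged"

# The short status shows "A" for the staged file and "??" for the untracked one.
git status --short
```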

§5.7 Committing Changes #

Once files are staged, you can “commit” them to the repository. This stores all staged changes in the project history, along with a message.

git commit -m "Commit message here"

This command saves the changes in the staging area along with a message. The message can contain anything, but ideally it describes the changes.
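Putting staging and committing together, a complete first commit might look like the following sketch. The file name, commit message, and "Course Demo" identity are placeholders chosen for this example.

```shell
# Temporary repository with a repository-local identity.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Course Demo"
git config user.email "demo@example.com"

# Create a file, stage it, and commit it with a descriptive message.
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Add notes file"

# Confirm the commit exists and that nothing is left uncommitted.
subject=$(git log -1 --format=%s)
remaining=$(git status --short)
echo "last commit: $subject"
echo "uncommitted changes: ${remaining:-none}"
```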

§5.8 Viewing Commit History #

To view the commit history for your repository, run:

git log

You can add options like --oneline to display a more compact version:

git log --oneline
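The log is easier to explore once a repository has a few commits. This sketch builds three commits in a temporary repository and then prints the history in two ways. The --format string is just one possible choice (%h is the short hash, %an the author name, %s the subject line).

```shell
# Temporary repository with a repository-local identity.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Course Demo"
git config user.email "demo@example.com"

# Make three small commits.
for n in 1 2 3; do
    echo "line $n" >> notes.txt
    git add notes.txt
    git commit -q -m "Add line $n"
done

# Compact history, newest commit first.
git log --oneline

# Only the two most recent commits, in a custom format.
git log -n 2 --format='%h %an: %s'

newest=$(git log -n 1 --format=%s)
total=$(git rev-list --count HEAD)
echo "newest commit: $newest ($total commits in total)"
```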

§5.9 Restoring Changes #

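If you have uncommitted changes that you want to undo, you can restore all tracked files to the state of the last commit. Be careful: this permanently discards those uncommitted changes (untracked files are left alone).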

git reset --hard

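You can also restore to any point in your Git history with git checkout <commit-hash>. The <commit-hash> can be found with git log or git log --oneline. Checking out a commit this way puts the repository in a “detached HEAD” state, which is meant for inspecting old code; run git checkout <branch-name> to return to your branch.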

git checkout <commit-hash>

Example (a unique prefix of the hash, such as the first 8-10 characters, is enough):

git checkout 5284eac3
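Both restore commands can be tried safely in a sandbox. The sketch below assumes a temporary repository with two commits. The file contents ("v1", "v2", "oops") and the variable names are illustrative only.

```shell
# Temporary repository with two committed versions of one file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Course Demo"
git config user.email "demo@example.com"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Version 1"
first=$(git rev-parse HEAD)                 # remember the first commit's hash
branch=$(git symbolic-ref --short HEAD)     # remember the branch name
echo "v2" > notes.txt
git commit -q -am "Version 2"

# Make an unwanted edit, then throw it away.
echo "oops" > notes.txt
git reset -q --hard
after_reset=$(cat notes.txt)

# Look at the first version, then return to the tip of the branch.
git checkout -q "$first"
at_first=$(cat notes.txt)
git checkout -q "$branch"
at_tip=$(cat notes.txt)

echo "after reset: $after_reset, at first commit: $at_first, back on branch: $at_tip"
```

Returning to the branch with git checkout after inspecting an old commit matters, because new commits made while a specific hash is checked out do not belong to any branch.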

§5.10 Checking Differences #

Git can show a diff of what has changed between any two commits in the history.

Compare the current files with a specific version:

git diff <commit-hash>

Compare two specific versions with each other:

git diff <commit-hash1> <commit-hash2>

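Running git diff with no arguments shows what has changed between the current working directory and the staging area, which is the same as the last commit when nothing has been staged. To compare against the last commit regardless of staging, use git diff HEAD.
Example: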

$ git diff
diff --git a/main.cpp b/main.cpp
index 44cc6ab..83ffc6b 100644
--- a/main.cpp
+++ b/main.cpp
@@ -1,5 +1,5 @@
 #include <iostream>
 
 int main() {
-    std::cout << "Hello, world!";
+    std::cout << "Hello, Class!";
 }

Diffs can be very useful for understanding what has changed. You should “commit” any time you have working code or are about to make potentially breaking changes. If you run into trouble and can’t get your code to work, you can easily compare what has changed or restore to a previously working state with Git.
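One detail that is easy to miss is how staging affects git diff. The following sketch, using a made-up greeting.txt in a temporary repository, records which files each variant reports before and after git add.

```shell
# Temporary repository with one committed file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Course Demo"
git config user.email "demo@example.com"
echo "Hello, world!" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"

# Edit the file without staging it: plain git diff reports the change.
echo "Hello, Class!" > greeting.txt
unstaged_before=$(git diff --name-only)

# After staging, plain git diff is empty because it compares against the staging area.
git add greeting.txt
unstaged_after=$(git diff --name-only)

# The staged change is still visible with --staged, or by comparing against HEAD.
staged=$(git diff --staged --name-only)
vs_head=$(git diff --name-only HEAD)

echo "before add: '$unstaged_before', after add: '$unstaged_after'"
echo "staged: '$staged', against last commit: '$vs_head'"
```

If git diff unexpectedly prints nothing, it is worth checking whether the changes have already been staged.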

§5.11 Reverting Changes #

Sometimes you need to undo work that has already been committed.
The safest way is to use git revert, which creates a new commit that undoes the changes without erasing history.

git revert <commit-hash>

Example (using only the first ~8-10 characters of the hash):

git revert 5284eac3

After the revert finishes, git log will show both the original commit and the new “Revert …” commit, keeping the repository history intact.
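A revert can be rehearsed in a sandbox before trying it on real work. In this sketch the files, commit messages, and identity are placeholders, and the --no-edit flag accepts Git's default revert message instead of opening an editor.

```shell
# Temporary repository with a good commit followed by an unwanted one.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Course Demo"
git config user.email "demo@example.com"
echo "stable" > app.txt
git add app.txt
git commit -q -m "Add app file"
echo "experiment" > bad.txt
git add bad.txt
git commit -q -m "Add experimental file"
bad=$(git rev-parse HEAD)

# Undo the unwanted commit by adding a new commit that reverses it.
git revert --no-edit "$bad"

# History now holds three commits, and the file from the bad commit is gone.
total=$(git rev-list --count HEAD)
latest=$(git log -n 1 --format=%s)
git log --oneline
echo "$total commits, latest: $latest"
```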

§5.12 More Git Tutorials #

This guide covered only the basics of Git. For more in-depth learning, check out these more comprehensive beginner's guides:

§6 Assignment #

Objective: Familiarize yourself with basic Git operations like initializing a repository, adding files, committing changes, viewing commit history, and viewing changes.

Instructions:

  1. Create a new directory and initialize it as a Git repository.
  2. Create a README.md file that contains just your name and student ID.
  3. Stage and commit the new README.md file.
  4. Add something new to the README.md file, then stage and commit your new changes. Repeat this a few times.
  5. Use git log to view the commit history. There should be a minimum of 4 entries.
  6. View what has changed in your project with git diff <target-commit>.

Deliverables: Submit a screenshot of the output of your commit history (git log) and the diff between your first commit and the latest commit (git diff <first-commit-hash>).