EatGit - Part 3 | DotBashHistory

Preface
Patch Files
Cherry Picking
Interactive Rebase
Stashes
Miscellaneous
- Bisect
- Blame
- Reflog
- Archive
Summary

Preface

In Part 1, we covered the essentials of Git, including how to create commits and branches and how to investigate the history of a project. In Part 2, we explored how and where Git stores its data structures. We examined what goes inside a commit and how branches are represented in Git. Additionally, we created a commit using Git’s Plumbing commands. In this part, I’ll wrap up our Git journey as a solo developer. We’ll delve into more advanced tools that we may need to use from time to time.

Patch Files

Imagine you’ve done some work and want someone else to review it. If they have access to your repository, they can use git diff and git log to see what you’ve done. However, in the open-source community, this isn’t always the case. Patch files offer a way to share your work without relying on branches, making them more flexible. You can store patch files on a USB stick or send them via email. Creating patch files is quite easy.

To demonstrate, let’s continue with our CNN anchors example. Since you may have conducted experiments in previous part, I will remove the directory and repopulate it with two branches.

If you seek examples with more substance, feel free to subscribe to one of the channels at vger.kernel.org and apply patches that contributors post. In addition to the patch files, you also need to clone the Linux kernel or Git repositories.

rm -rf cnn/
mkdir cnn
cd cnn/

git init

git config --local user.name 'Mohammad Rahimi'
git config --local user.email 'rahimi.mhmmd@outlook.com'

echo 'Anderson Cooper' >> employees.md
echo 'Jake Tapper' >> employees.md
echo 'Chris Wallace' >> employees.md

git add employees.md
git commit -m 'Add Cooper, Tapper and Wallace'

We create another branch and add some other employees. If you don’t get the meaning behind these changes, Google their names ;)

git checkout -b woke main

echo 'Abby Phillip' >> employees.md
git add employees.md
git commit -m 'Add Phillip'

echo 'Audie Cornish' >> employees.md
echo 'Zain Asher' >> employees.md
git add employees.md
git commit -m 'Add Cornish and Asher'

git log --oneline --all --graph
# * 2dcbac8 (HEAD -> woke) Add Cornish and Asher
# * 6bc2c0a Add Phillip
# * a48e1bb (main) Add Cooper, Tapper and Wallace

Now, to create patch files that include the changes made on our woke branch but are not present on the main branch:

git checkout woke
git format-patch main
# 0001-Add-Phillip.patch
# 0002-Add-Cornish-and-Asher.patch

cat 0001-Add-Phillip.patch
# From 6bc2c0aae0c3449b128a4f8434602dfe6dab9e1a Mon Sep 17 00:00:00 2001
# From: Mohammad Rahimi <rahimi.mhmmd@outlook.com>
# Date: Mon, 20 May 2024 09:10:33 +0800
# Subject: [PATCH 1/2] Add Phillip
#
# ---
#  employees.md | 1 +
#  1 file changed, 1 insertion(+)
#
# diff --git a/employees.md b/employees.md
# index 0d59cdb..6454a94 100644
# --- a/employees.md
# +++ b/employees.md
# @@ -1,3 +1,4 @@
#  Anderson Cooper
#  Jake Tapper
#  Chris Wallace
# +Abby Phillip
# -- 
# 2.45.GIT
#

Patch files are essentially text files. If you look inside one of them, you will see it is strikingly similar to the output of the git diff command. In fact, you can create a patch file by storing the output of git diff in a file:

# equivalent to 0002-Add-Cornish-and-Asher.patch
git diff 6bc2c0a 2dcbac8 > hand-made-patch

The format-patch command, in addition to the diff, includes the corresponding commit hash, author’s name, and timestamp of the modifications. You can email these patch files to repository maintainers. If they’re accepted, the maintainers will incorporate the changes into the main branch by committing them.

To apply a patch file onto a branch:

git checkout main

git apply 0001-Add-Phillip.patch

git status 
# On branch main
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git restore <file>..." to discard changes in working directory)
#         modified:   employees.md
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#         0001-Add-Phillip.patch
#         0002-Add-Cornish-and-Asher.patch
#         hand-made-patch
#
# no changes added to commit (use "git add" and/or "git commit -a")

As you can see, the changes are not committed yet. You can add them to the Index and then commit them, which will have your name as both the author and committer. Alternatively, you can use the following command to commit the patch file directly onto the main branch with the original author’s name:

git restore employees.md

git config --local user.name "Maintainer's name"

git am 0001-Add-Phillip.patch

git config --local user.name 'Mohammad Rahimi'

To see the above change:

git log --oneline --all --graph
# * 548e193 (HEAD -> main) Add Phillip
# | * 2dcbac8 (woke) Add Cornish and Asher
# | * 6bc2c0a Add Phillip
# |/  
# * a48e1bb Add Cooper, Tapper and Wallace

git cat-file -p 548e193
# tree 03e53233c394eb4559d98e1a10a14eac4044f726
# parent a48e1bb85df53651fbc173d2aba47ebe3db4a38b
# author Mohammad Rahimi <rahimi.mhmmd@outlook.com> 1716167433 +0800
# committer Maintainer's name <rahimi.mhmmd@outlook.com> 1716169076 +0800
#
# Add Phillip

To apply the second commit on the woke branch, using the hand-made-patch file:

git checkout main

git apply hand-made-patch

git add employees.md
git commit -m 'Add Cornish and Asher'

If you create a patch file using either of the two mentioned methods and send it to someone, they might not be able to apply the patch cleanly if their file has changed in the meantime. This is because the lines referenced in the patch file’s diff may have been deleted or moved. In such cases, Git will output an error message and won’t apply any of the patches.

# error: patch failed: <file>:<line>
# error: <file>: patch does not apply

To solve this problem, you need to update your main branch. Switch back to your feature branch and either merge it with or rebase it onto the main branch . Resolve any conflicts that arise and generate the patch files again. Merging and rebasing will be explained in Part 4.

Patch files can be very handy and come to your rescue in times of need. They also play a central role in contributing to major open-source projects. I recommend getting comfortable using them. This concludes my introduction to patch files.

Cherry Picking

If you’ve done some work in a feature branch that turns out to be unnecessary, you can still cherry-pick useful commits from that branch. Another use case for cherry-picking is when you fix a bug in one branch and need the same fix in another. Or perhaps you committed your work to the wrong branch by mistake. In all these scenarios, cherry-picking can help.

To avoid conflicts in our example project, let’s explore the last mentioned use case. We’ll create a commit but mistakenly place it on our woke branch. Conflicts and resolving them will be covered in Part 4.

git checkout woke

echo 'Michael Smerconish' >> employees.md
git add employees.md
git commit -m 'Add Smerconish'

git log --oneline --all --graph
# * f892560 (HEAD -> woke) Add Smerconish
# * 2dcbac8 Add Cornish and Asher
# * 6bc2c0a Add Phillip
# | * cf560f0 (main) Add Cornish and Asher
# | * 548e193 Add Phillip
# |/
# * a48e1bb Add Cooper, Tapper and Wallace

Now, let’s correct our mistake by cherry-picking the last commit to our main branch:

git checkout main
git cherry-pick f892560

git log --oneline --all --graph
# * c588da3 (HEAD -> main) Add Smerconish
# * cf560f0 Add Cornish and Asher
# * 548e193 Add Phillip
# | * f892560 (woke) Add Smerconish
# | * 2dcbac8 Add Cornish and Asher
# | * 6bc2c0a Add Phillip
# |/
# * a48e1bb Add Cooper, Tapper and Wallace

You can see the same commit from the woke branch now on the main branch.

Interactive Rebase

Rebasing is a procedure that affects the history of a branch. As mentioned several times before, changing history can lead to many complications if others share that history with you. That being said, it is the most effective tool for organizing your work and creating a clean history.

In this section, I will show you interactive rebasing. There is also a contribution workflow that involves rebasing, which is different from this one. Both use the git rebase command, but interactive rebasing also uses the -i flag. I will explain merging and rebasing in the next part.

With interactive rebasing, you can specify a range of commits and an action at each commit. The command goes through each one and applies the action. I typically use interactive rebasing to combine my commits on feature branch into one before publishing my work. This way, as time passes, the project will maintain a clean history. You won’t see commit messages like “Fix minor mistake” or “Rename variable” after a meaningful commit like “Fix bug report #6623”.

In our example project, before publishing our main branch, let’s combine the first two commits together (to minimize the chance of getting canceled ;)).

git checkout main
git rebase -i --root

The git rebase -i command expects a commit and will list all the commits from HEAD up to but not including that commit. Because of this behavior, you cannot edit the initial commit. That’s why I used the --root flag. The list is ordered from old at the top to new at the bottom.

Edit the file and change the action at the second commit from pick to squash . Save and close the editor. The opened file also includes instructions on commands that you can use at each step.

pick a48e1bb Add Cooper, Tapper and Wallace
squash 548e193 Add Phillip
pick cf560f0 Add Cornish and Asher
pick c588da3 Add Smerconish

The interactive rebase will show you one more editor for changing the commit message after combining the first two commits. Provide a meaningful message.

Our new history should look like the following:

git log --oneline
# 08206a8 (HEAD -> main) Add Smerconish
# 301703c Add Cornish and Asher
# cd0bbc3 Add Cooper, Tapper, Wallace and Phillip

Although it can be scary, I recommend using interactive rebase in your daily tasks. Keeping the history clean should be something that all team members strive for.

Stashes

Imagine you are working on a feature when your boss informs you that a customer reported a bug after your last commit was published. You need to switch back to the main branch immediately and start investigating if the bug is caused by your changes. However, you have already made significant progress in your feature branch, but it is not finished or in a state that deserves its own commit. If you don’t want to lose your changes and also do not want to commit them, you can stash the changes. Each stash contains uncommitted work, whether it is in the Working Tree or the Index. Let’s go through a slightly more complicated scenario.

You are on a feature branch, refactoring some code. Your boss requests that you postpone the refactoring and instead write some additional tests for your recent module. To switch to another branch without losing the work you’ve already done:

git checkout -b refactor main

# do some work, stage and do more work

# boss asks for tests
git stash push --all -m 'Refactor code'
git checkout -b add-tests main

# write some tests, stage and write more

While you are working on the tests, a customer reports a critical bug in the latest release, and you are assigned to fix it. To save your current work before committing:

git stash push --all -m 'Add tests for module'
git checkout -b fix-bug main

# fix the bug, then
git commit -m 'Fix bug'

Now go back to writing tests:

git checkout add-tests
git stash pop --index

# finish rest of the tests, then
git commit -m 'Add tests for module'

After writing the tests, continue refactoring the code:

git checkout refactor
git stash pop --index

It is possible to stash your work on one branch and apply it to another branch. Additionally, you can list all your stashes and apply any specific stash:

git stash list
# stash@{0}: On add-tests: Add tests
# stash@{1}: On refactor: Refactor code

git stash apply --index stash@{1}

To remove all the stashes use git stash clear.

If any of the modified or staged files have a different revision on the other branch, the git checkout command will result in an error stating that your modified files need to be committed or stashed before switching. However, if checking out another branch won’t overwrite any of those files, switching between branches will maintain the state of both the Index and the Working Tree.

Miscellaneous

In this section, I’ll introduce some Git tools that you might not use frequently but are nonetheless useful.

Bisect

The git bisect command is used to find the exact commit that introduced a bad behavior. You begin by providing this command with a commit that has the issue and a commit that you know has been good. To start the binary search in the it-tools repository that we acquired in Part 1:

# commit with the tag v2024.5.10-33e5294 is bad
# commit with the tag v2023.12.21-5ed3693 is good
git bisect start v2024.5.10-33e5294 v2023.12.21-5ed3693
# Bisecting: 12 revisions left to test after this (roughly 4 steps)
# [a07806cd15fdbd24c88afaf618a2d0c16d66bb3f] refactor(home): lightened tool cards (#882)

You can tag a commit with an explanation, which is usually used to mark a release version. The command can be something like:

git tag -a v2.0 abc123 -m "Version 2.0 LTS"

If you run git log --oneline --all, you will see that the git bisect command has positioned HEAD at a commit in the middle of both tags. You can examine the code and determine whether the issue exists or not. If it is a good commit, continue with:

git bisect good
# Bisecting: 6 revisions left to test after this (roughly 3 steps)
# [221ddfa75c5731d7a5dc1f0b03663ba4fd9e7965] fix(language): English language cleanup (#1036)

And if it is a bad commit:

git bisect bad
# Bisecting: 2 revisions left to test after this (roughly 2 steps)
# [23f82d956a8af21e176f7268c9414244168bd4eb] fix(bcrypt tool): allow salt rounds up to 100 (#987)

You continue investigating commits until Git tells you that it has found the commit that introduced the issue. You can stop the search midway with git bisect reset.

If you have automated tests and some of them started to fail after an unknown change, you can use git bisect run my_script arguments to automatically find the problematic commit. The my_script or any other executable file should exit with code 0 if the current source code is good and exit with a code between 1 and 127 (inclusive), except 125, if the current source code is bad.

Blame

The git show command is used to display detailed information about a specific commit, including the files it altered. Alternatively, you can use the git blame command to see which commit introduced each line of a file. If you find a bug and want to blame someone for it, you can use git blame to trace each line back to its corresponding commit. To see which commit introduced each line of code:

git blame employees.md
# ^cd0bbc3 (Mohammad Rahimi 2024-05-20 09:00:20 +0800 1) Anderson Cooper
# ^cd0bbc3 (Mohammad Rahimi 2024-05-20 09:00:20 +0800 2) Jake Tapper
# ^cd0bbc3 (Mohammad Rahimi 2024-05-20 09:00:20 +0800 3) Chris Wallace
# ^cd0bbc3 (Mohammad Rahimi 2024-05-20 09:00:20 +0800 4) Abby Phillip
# 301703c3 (Mohammad Rahimi 2024-05-20 19:05:45 +0800 5) Audie Cornish
# 301703c3 (Mohammad Rahimi 2024-05-20 19:05:45 +0800 6) Zain Asher
# 08206a8e (Mohammad Rahimi 2024-05-21 07:05:39 +0800 7) Michael Smerconish

The ^ symbol next to a commit hash indicates that the line of code was created by the very first commit that introduced the file into the repository.

Reflog

Whenever there is a change to HEAD or any other references in Git, the reference log is updated. For example, when you switch branches, a new entry is added to this log. This entry indicates the commit hash that HEAD now points to:

git checkout 301703c
git reflog
# 301703c (HEAD) HEAD@{0}: checkout: moving from refactor to 3017
# 08206a8 (refactor, main, add-tests) HEAD@{1}: checkout: moving from add-tests to refactor

When attempting to untangle branches in your Git history, mistakes can lead to losing commits. Without the hash of the lost commit, recovering it becomes impossible. However, git reflog can help retrieve the lost hash. As long as you have the hash of the last commit on a branch, you can do anything, except running the garbage collector in Git. I won’t cover the garbage collection command in this tutorial.

To create a new branch from where HEAD used to be 3 modifications before:

git reflog -3
git checkout -b recovered HEAD@{2}

or to search for a lost branch:

git reflog | grep -E "<key-word>"

You can improve the git reflog output with the following:

git config --local alias.flog 'reflog --pretty=format:"%C(White)%gd %C(Yellow)%h %C(Cyan)%ch %C(Green)%an %C(magenta)%gs %C(White)%<(80,trunc)%s"'
git flog

As a side note, there are two other syntaxes involving HEAD in Git. One of them is HEAD~, which refers to the parent commit of HEAD, while HEAD~2 refers to the grandparent commit of HEAD. The other syntax, HEAD^, is less commonly used and is primarily used to refer to the first parent of a merge commit. We will explore merging in more detail in Part 4.

Summary

In this part, we explored some useful tools offered by Git that can come in handy in your daily use. We learned how to create patch files, cherry-pick commits, and clean up our history with interactive rebase. Additionally, we used stashes to temporarily store work that isn’t ready to be committed yet, allowing us to start new tasks. We also introduced some other less common tools.

In Part 4, we’ll delve into what Git offers in a collaborative environment. We’ll take on the roles of different team members and explore how they can work together to maintain a well-structured, functional codebase.