EatGit - Part 2
Preface
In Part 1, we covered the essentials of Git, so now we know how to create commits and branches, and how to investigate the history of a project. In this part, we’ll take a look behind the scenes and create a commit using Git Plumbing commands. Git commands are divided into two sets: Porcelain and Plumbing. Porcelain commands are the ones you use for everyday tasks, while Plumbing commands are the lower-level commands that Git combines to perform higher-level tasks.
Objects
Here, I recommend deleting the example project from Part 1 because you may have been experimenting with it, resulting in many objects in your .git directory. Let’s recreate the example project with only one commit:
rm -rf cnn
mkdir cnn && cd cnn
echo 'John Berman' >> employees.md
echo 'Dana Bash' >> employees.md
git init
git config --local user.name 'Mohammad Rahimi'
git config --local user.email 'rahimi.mhmmd@outlook.com'
git add employees.md
git commit -m 'Add Berman and Bash'
# 130a62108af843b46b9f90a588da8bf056a28778
Note that your commit hashes will be different from mine. Let’s see if we can find our commit among the Git object files. To list the contents of the .git/objects/ directory:
tree .git/objects/
# .git/objects/
# ├── 13
# │ └── 0a62108af843b46b9f90a588da8bf056a28778
# ├── 74
# │ └── 5b5efa7d5acb852d11cf14bc3a07c494bee605
# ├── 98
# │ └── 6e14542dcfbc7a3d13f16f8a8aeabbcda326de
# ├── info
# └── pack
#
# 6 directories, 4 files
You might need to install tree
:
sudo apt-get install tree
Git distributes objects into subdirectories named after the first two characters of their hash. This helps speed up finding objects. Our commit is in the .git/objects/13 directory. Git provides a Plumbing command for inspecting objects. To inspect type of an object:
git cat-file -t 130a
# commit
You don’t need to use the entire hash, just enough characters for Git to uniquely identify the object. Note that auto-complete does not work for Plumbing commands because they are not intended for end users. The command above will confirm that the object is indeed a commit. To pretty-print the content of an object:
git cat-file -p 130a
# tree 986e14542dcfbc7a3d13f16f8a8aeabbcda326de
# author Mohammad Rahimi <rahimi.mhmmd@outlook.com> 1716062920 +0800
# committer Mohammad Rahimi <rahimi.mhmmd@outlook.com> 1716062920 +0800
#
# Add Berman and Bash
There are two pieces of information here that I want to highlight. First, you
can see both the author and the committer, each with an associated timestamp.
This is because you can make changes and send them to someone else to commit,
and they might commit those changes a few days or months later. The second piece
of information is tree 986e14542dcfbc7a3d13f16f8a8aeabbcda326de
. The tree
represents the structure of your directory when you authored the changes. This
tree is also an object in Git that we can inspect. Your hash should be the same
as mine for this tree object.
git cat-file -t 986e
# tree
git cat-file -p 986e
# 100644 blob 745b5efa7d5acb852d11cf14bc3a07c494bee605 employees.md
This means that after the changes, our directory structure contained only one file named employees.md. Let’s inspect further:
git cat-file -t 745b
# blob
git cat-file -p 745b
# John Berman
# Dana Bash
Now you can see the file content. A Git blob (binary large object) is the object type used to store the contents of each file in a repository.
Initial commits are different because they don’t have any parent commits before them. Let’s create another commit and see how Git stores our changes.
# to make the change
echo 'Wolf Blitzer' >> employees.md
# to create the commit
git add employees.md
git commit -m 'Add Blitzer'
# to see the last changes
git show
# commit 7c214b3f0f4a7a9feeecedd6804ce692e4452f5d (HEAD -> main)
# Author: Mohammad Rahimi <rahimi.mhmmd@outlook.com>
# Date: Sun May 19 04:15:27 2024 +0800
#
# Add Blitzer
#
# diff --git a/employees.md b/employees.md
# index 745b5ef..9ead660 100644
# --- a/employees.md
# +++ b/employees.md
# @@ -1,2 +1,3 @@
# John Berman
# Dana Bash
# +Wolf Blitzer
# to show the history
git log --oneline --all --graph
# * 7c214b3 (HEAD -> main) Add Blitzer
# * 130a621 Add Berman and Bash
The object we’re interested in is our last commit, identified by 7c214b3.
git cat-file -p 7c21
# tree 081fb438ab187f1fd3f82d196bce3bc42c2b4d52
# parent 130a62108af843b46b9f90a588da8bf056a28778
# author Mohammad Rahimi <rahimi.mhmmd@outlook.com> 1716063327 +0800
# committer Mohammad Rahimi <rahimi.mhmmd@outlook.com> 1716063327 +0800
#
# Add Blitzer
You can see that the parent has our previous commit hash. If you change something in employees.md file before making the first commit, the file hash changes. Consequently, the commit hash changes as well, and the parent commit in the following commits also changes. Thus, every commit hash in your repository will change. If your colleagues or other open-source contributors based their work on your commit, meaning they have a parent commit somewhere that refers to a commit you made, and you change something in the history, those who have your commit hash in their repository will be left disconnected from your changes , and you will no longer share the same past. Furthermore, there will be a new group of people who will base their work on your new hashes. So, you cannot go back either. This can cause a divergence that’s very difficult, if not impossible, to fix. That’s why it’s advised not to change what you have published. Git heavily relies on commit hashes to track changes in a repository and determine what actions need to be taken.
Branches
Here, I intend to show you how branches are represented internally in Git. We already have a branch called main. Let’s create another branch and add a commit to it. To create a branch called feat from HEAD:
git checkout -b feat HEAD
git log --oneline --all --graph
# * 7c214b3 (HEAD -> feat, main) Add Blitzer
# * 130a621 Add Berman and Bash
HEAD is a pointer to a commit or a branch and usually points to the last commit made on the checked-out branch. We will see that shortly. To create a commit on the current branch:
echo 'Jamie Gangel' >> employees.md
git add employees.md
git commit -m 'Add Gangel'
git log --oneline --all --graph
# * 94e59e2 (HEAD -> feat) Add Gangel
# * 7c214b3 (main) Add Blitzer
# * 130a621 Add Berman and Bash
Now we have two branches that differ by one commit. First things first, what is HEAD?
cat .git/HEAD
# ref: refs/heads/feat
This means that you are currently on the last commit of a branch called feat. If you make any changes, stage them, and create a commit, the parent of that commit will be what refs/heads/feat points to. Let’s see where refs/heads/feat is pointing to:
cat .git/refs/heads/feat
# 94e59e239a86cff253166be935bf96d626912344
It should be familiar to you. It is the last commit we made after switching to the feat branch. Let’s see where .git/refs/heads/main is pointing to:
cat .git/refs/heads/main
# 7c214b3f0f4a7a9feeecedd6804ce692e4452f5d
It points to the second commit that we made in this repository. At that time, we were on the main branch.
As you can see, branches are nothing but pointers to commits. Since each commit also points to its parent(s), you can have multiple lines of work in your repository simultaneously and switch between them. To switch back to the main branch:
git checkout main
We can also check out a commit without creating a new branch. This state is called a Detached HEAD because HEAD is not pointing to a branch. Occasionally, you may need to check out a specific commit to investigate if it contains an issue. To check out a commit, use the following command:
git checkout 130a621
# Note: switching to '130a621'.
#
# You are in 'detached HEAD' state. You can look around, make experimental
# changes and commit them, and you can discard any commits you make in this
# state without impacting any branches by switching back to a branch.
#
# If you want to create a new branch to retain commits you create, you may
# do so (now or later) by using -c with the switch command. Example:
#
# git switch -c <new-branch-name>
#
# Or undo this operation with:
#
# git switch -
#
# Turn off this advice by setting config variable advice.detachedHead to false
#
# HEAD is now at 130a621 Add Gangel
cat .git/HEAD
# 130a62108af843b46b9f90a588da8bf056a28778
The purpose of this section was to help you understand that branches are nothing but pointers to commits. This fact becomes important when, in Part 5, I explain how to fix common issues that arise when working with Git.
A Hand-made Commit
To create a commit using Plumbing commands, ensure that nothing is staged. First, let’s add a change:
git checkout main
echo 'Brianna Keilar' >> employees.md
Then, we add this change to the index. The index is a binary file located at .git/index.
git update-index employees.md
git status
git diff --staged
We need our directory structure to be recorded in the commit. Let’s create a tree object from the Index:
cat employees.md
# John Berman
# Dana Bash
# Wolf Blitzer
# Brianna Keilar
git hash-object employees.md
# c3638757c5a8d73d8390acb58d4154d45a8f699d
git write-tree
# 758f284a55059c6f03464c345ab0e451cd53a833
git cat-file -p 758f
# 100644 blob c3638757c5a8d73d8390acb58d4154d45a8f699d employees.md
First, I printed the content of employees.md. Then, I calculated its SHA256
hash. This hash, with the same content, should always be the same. If you use
the -w
flag with git hash-object
, Git will actually write the object into
its internal files. Then, I created a tree object. This tree object will be
associated with our commit. Your tree object should also have the same hash as
mine.
Now is the time to create the actual commit:
git commit-tree 63bce7 -p 181590f -m 'Add Keilar'
# 1f2ebc318742f42f86a33db0948f37b724106b87
git log --oneline --all --graph
# * 94e59e2 (feat) Add Gangel
# * 7c214b3 (HEAD -> main) Add Blitzer
# * 130a621 Add Berman and Bash
We will not see our commit in the git log
output because our commit has no
reference pointing to it. In other words, it is not on any branch. In the
git commit-tree
command, I used my first commit as the parent, so it cannot
be associated with any existing branches, as its history is divergent. To create
a new branch from that commit:
git update-ref refs/heads/new-feat 1f2ebc3
git log --oneline --all --graph
# * 1f2ebc3 (new-feat) Add Keilar
# | * 94e59e2 (feat) Add Gangel
# | * 7c214b3 (HEAD -> main) Add Blitzer
# |/
# * 130a621 Add Berman and Bash
git show new-feat
# commit 1f2ebc318742f42f86a33db0948f37b724106b87 (new-feat)
# Author: Mohammad Rahimi <rahimi.mhmmd@outlook.com>
# Date: Sun May 19 04:36:52 2024 +0800
#
# Add Keilar
#
# diff --git a/employees.md b/employees.md
# index 745b5ef..c363875 100644
# --- a/employees.md
# +++ b/employees.md
# @@ -1,2 +1,4 @@
# John Berman
# Dana Bash
# +Wolf Blitzer
# +Brianna Keilar
You can also open a text editor and put the commit hash in refs/heads/new-feat . However, this is not recommended because we should not alter any file in the .git directory without Git’s knowledge.
Summary
In this part, we explored Git’s Plumbing commands, inspected the .git directory, created blob, tree, and commit objects, and gained an understanding of how branches are represented internally in Git. At this point, we should have the confidence required to tackle more advanced topics in Git.
In Part 3, I will introduce you to more useful tools that Git has to offer. After Part 3, we will be able to continue our journey in Git at a higher altitude and learn more about concepts that involve Git, rather than being stuck with low-level everyday commands.