This content originally appeared on Level Up Coding - Medium and was authored by Praveen Mathew
Git under the hood
Git is a database that stores snapshots of a codebase throughout its development. Although most developers are familiar with the basic commands, many are unaware of how Git works internally.
This is a hands-on tutorial on how Git works internally.
Getting Started
Let’s set up a local Git repository.
$ mkdir git-internals-tutorial ; cd $_ # create a new directory
$ git init # create a git repository locally
A Git repository stores its snapshots in a .git folder. We can visualize the folder with the tree command.
$ tree .git
.git
├── HEAD
├── config
├── hooks
├── objects
│   ├── info
│   └── pack
└── refs
    ├── heads
    └── tags
Git Hash Object
The git hash-object command computes the hash of a file's contents. This hash is the ID under which the content is stored in the database.
Let's say we have a file named file1.txt:
$ echo 'hello' > file1.txt
$ cat file1.txt
hello
$ git hash-object file1.txt
ce013625030ba8dba906f756967f9e9ca394464a
The hash is the ID of the content in the Git database. The object ID is computed from the content alone (prefixed with a small header holding the object type and size) and is independent of the file name.
$ echo 'hello' > file2.md
$ git hash-object file2.md
ce013625030ba8dba906f756967f9e9ca394464a
The object id may be computed even without creating a file.
$ echo 'hello' | git hash-object --stdin
ce013625030ba8dba906f756967f9e9ca394464a
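To see where this ID comes from, the computation can be sketched by hand. Git hashes a short header (the object type, a space, the content size in bytes, and a NUL byte) followed by the content itself. The sketch below assumes the sha1sum utility from GNU coreutils is available and that the repository uses Git's default SHA-1 object format; the variable names are only illustrative.

```shell
# Recompute the blob ID of 'hello' without calling Git.
content='hello'

# echo appends a newline, so the stored content is 6 bytes, not 5.
size=$(printf '%s\n' "$content" | wc -c | tr -d ' ')

# Header "blob <size>\0" followed by the content, then SHA-1 of the whole stream.
blob_id=$(printf 'blob %s\0%s\n' "$size" "$content" | sha1sum | cut -d ' ' -f 1)

echo "$blob_id"   # → ce013625030ba8dba906f756967f9e9ca394464a
```

The result matches the ID printed by git hash-object, which is why two files with the same content always share one object ID.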
Saving Git Blob object to Database
To write the file content to the database, add the -w flag:
$ echo 'hello' | git hash-object --stdin -w
ce013625030ba8dba906f756967f9e9ca394464a
This stores the contents in a file under the objects directory, with the object ID as its name:
$ tree .git
.git
├── HEAD
├── config
├── hooks
├── objects
│   ├── ce
│   │   └── 013625030ba8dba906f756967f9e9ca394464a
│   ├── info
│   └── pack
└── refs
    ├── heads
    └── tags
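The path in the listing follows directly from the object ID: the first two hexadecimal characters name a subdirectory of .git/objects, and the remaining 38 characters name the file. A minimal sketch of that split, using only standard shell tools (the variable names are illustrative):

```shell
# Derive the on-disk path of a loose object from its ID.
object_id='ce013625030ba8dba906f756967f9e9ca394464a'

dir_part=$(printf '%s' "$object_id" | cut -c 1-2)   # first two hex characters
file_part=$(printf '%s' "$object_id" | cut -c 3-)   # remaining 38 characters

object_path=".git/objects/$dir_part/$file_part"
echo "$object_path"   # → .git/objects/ce/013625030ba8dba906f756967f9e9ca394464a
```

The file itself is not plain text: Git stores the object zlib-compressed, so it has to be read through Git rather than with cat.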

The git cat-file command reads the contents back from a Git object:
$ git cat-file -p ce013625030ba8dba906f756967f9e9ca394464a
hello
The -p flag pretty-prints the contents of the object. Similarly, the -t flag outputs the type of the object. In our case, it is a blob.
$ git cat-file -t ce013625030ba8dba906f756967f9e9ca394464a
blob
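Besides -p and -t, git cat-file also accepts -s, which prints the size of the object's content in bytes. The sketch below runs in a throwaway repository created with mktemp so that it does not touch the tutorial repository; the directory and variable names are assumptions made for illustration.

```shell
# Work in a scratch repository so the tutorial repository stays untouched.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet

# Store the same content as before and query the resulting object.
blob_id=$(echo 'hello' | git hash-object --stdin -w)
blob_type=$(git cat-file -t "$blob_id")   # object type
blob_size=$(git cat-file -s "$blob_id")   # content size in bytes

echo "$blob_type $blob_size"   # → blob 6
```

The size is 6 rather than 5 because echo appends a newline to 'hello'.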
Creating a Git Tree Object
Like a blob, a tree is another type of object in Git. A tree contains blobs and can also contain other trees.
Use the git write-tree command to write the files in the index to a tree:
$ git add file1.txt # add file1.txt to the index (staging area)
$ git write-tree # build a tree from the index and write it to the database
dca98923d43cd634f4359f8a1f897bf585100cfe
This writes the tree to the database, with the file contents referenced as a blob. The git ls-tree command can be used to view the contents of the tree:

$ git ls-tree dca98923d43cd634f4359f8a1f897bf585100cfe
100644 blob ce013625030ba8dba906f756967f9e9ca394464a file1.txt
$ git cat-file -t dca98923d43cd634f4359f8a1f897bf585100cfe
tree
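As a cross-check, git cat-file -p also works on tree objects and prints the same kind of entry as git ls-tree: a mode (100644 is a regular, non-executable file), the object type, the object ID, and the file name. A minimal sketch in a scratch repository, with illustrative directory and variable names:

```shell
# Build a one-file tree in a scratch repository and inspect it.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet

echo 'hello' > file1.txt
git add file1.txt                  # stage the file in the index
tree_id=$(git write-tree)          # write the index out as a tree object

entry=$(git cat-file -p "$tree_id")
tree_type=$(git cat-file -t "$tree_id")

echo "$entry"       # mode, type, blob ID and file name of the single entry
echo "$tree_type"   # → tree
```

Note that the tree stores only the blob's ID and the file name; the file content lives in the blob object.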
The tree object can also be found in the .git folder:
$ tree .git
.git
├── HEAD
├── config
├── hooks
├── objects
│   ├── ce
│   │   └── 013625030ba8dba906f756967f9e9ca394464a
│   ├── dc
│   │   └── a98923d43cd634f4359f8a1f897bf585100cfe
│   ├── info
│   └── pack
└── refs
    ├── heads
    └── tags
Making A Git Commit
Most readers will be familiar with the git commit command, which creates a commit from the changes in the staging area and updates the branch reference to point to that commit. Let's use the low-level commands to do the same.
A commit object in Git has a tree, a link to the parent commit (if there is one), and information such as the commit message and details about the author and the committer.
Let's start with the dca98923d43cd634f4359f8a1f897bf585100cfe tree that we created in the last section.
$ git commit-tree dca98923d43cd634f4359f8a1f897bf585100cfe -m "Commit Message" # create a commit from the tree with the given message
1185a9903f20ca3059dcc96662fb05cc219bd654

A commit with the ID 1185a9903f20ca3059dcc96662fb05cc219bd654 has been created. You can view it with git log:
$ git log 1185a9903f20ca3059dcc96662fb05cc219bd654
commit 1185a9903f20ca3059dcc96662fb05cc219bd654
Author: Praveen Mathew <email@email.com>
Date: Wed Feb 17 19:18:40 2021 +0530
Commit Message
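The parts of a commit object described above can be inspected directly with git cat-file -p. The sketch below recreates a first commit in a scratch repository; the identity set through the GIT_AUTHOR_* and GIT_COMMITTER_* environment variables is a placeholder, and the directory and variable names are assumptions for illustration. The commit ID will differ from the one in this tutorial because it depends on the author details and timestamp.

```shell
# Create a commit with commit-tree in a scratch repository and print the raw object.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet

# Placeholder identity so commit-tree does not depend on global configuration.
export GIT_AUTHOR_NAME='Example Author' GIT_AUTHOR_EMAIL='author@example.com'
export GIT_COMMITTER_NAME='Example Author' GIT_COMMITTER_EMAIL='author@example.com'

echo 'hello' > file1.txt
git add file1.txt
tree_id=$(git write-tree)
commit_id=$(git commit-tree "$tree_id" -m 'Commit Message')

git cat-file -p "$commit_id"   # raw commit: tree, author, committer, blank line, message
first_line=$(git cat-file -p "$commit_id" | head -n 1)
commit_type=$(git cat-file -t "$commit_id")
```

The first line of the output is the tree ID, followed by the author and committer lines, an empty line, and the commit message. There is no parent line yet because this is the first commit.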
Let's create another commit on top of it using the low-level Git commands:
$ touch file2.txt
$ git add file2.txt
$ git write-tree
5d649a7d0557d17655fbdd362b34158a36b0a39d
$ git commit-tree 5d649a7d0557d17655fbdd362b34158a36b0a39d -m "Second Commit Message" -p 1185a9903f20ca3059dcc96662fb05cc219bd654 # -p sets the parent commit
7a4834354b351022ea9ddb818b7b2a889bdbb3cf

Now the log of the commit 7a4834354b351022ea9ddb818b7b2a889bdbb3cf would show both the commits
$ git log 7a4834354b351022ea9ddb818b7b2a889bdbb3cf --oneline
7a48343 Second Commit Message
1185a99 Commit Message
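The history shown by git log comes from the parent link stored in the second commit. A minimal sketch that makes the link visible, again in a scratch repository with a placeholder identity; for brevity both commits reuse the same tree, which is a simplification compared with the steps above, and all names are illustrative.

```shell
# Chain two commits with -p and read the parent link back.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
export GIT_AUTHOR_NAME='Example Author' GIT_AUTHOR_EMAIL='author@example.com'
export GIT_COMMITTER_NAME='Example Author' GIT_COMMITTER_EMAIL='author@example.com'

echo 'hello' > file1.txt
git add file1.txt
tree_id=$(git write-tree)

first_id=$(git commit-tree "$tree_id" -m 'Commit Message')
second_id=$(git commit-tree "$tree_id" -p "$first_id" -m 'Second Commit Message')

git cat-file -p "$second_id" | grep '^parent'    # the stored parent line
parent_id=$(git rev-parse "$second_id^")         # resolve the first parent
commit_count=$(git rev-list --count "$second_id")

echo "$commit_count"   # → 2
```

Following parent lines from commit to commit is how git log walks the history.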
However, the current branch that HEAD points to still has no commits, so the new commits do not show up there:
$ git log
fatal: your current branch 'main' does not have any commits yet
Apart from creating the commit, the git commit command does one more thing: it updates the reference of the branch that HEAD points to. The low-level command for this is git update-ref:
$ git update-ref refs/heads/main 7a4834354b351022ea9ddb818b7b2a889bdbb3cf # points main reference to the commit

Now the main branch points to the commit.
$ git log --oneline
7a48343 (HEAD -> main) Second Commit Message
1185a99 Commit Message
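What git update-ref changes can be observed on disk. The sketch below assumes Git's default file-based reference storage, where a branch is a small file under .git/refs/heads containing a commit ID. It uses git symbolic-ref to look up the branch HEAD points to, because the default branch name may not be main on every installation. The identity, directory, and variable names are placeholders.

```shell
# Point the current branch at a commit and look at the resulting ref file.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init --quiet
export GIT_AUTHOR_NAME='Example Author' GIT_AUTHOR_EMAIL='author@example.com'
export GIT_COMMITTER_NAME='Example Author' GIT_COMMITTER_EMAIL='author@example.com'

echo 'hello' > file1.txt
git add file1.txt
tree_id=$(git write-tree)
commit_id=$(git commit-tree "$tree_id" -m 'Commit Message')

branch_ref=$(git symbolic-ref HEAD)        # e.g. refs/heads/main
git update-ref "$branch_ref" "$commit_id"  # move the branch to the commit

ref_file_content=$(cat ".git/$branch_ref") # the ref file holds the commit ID
head_commit=$(git rev-parse HEAD)          # HEAD now resolves to the same commit
echo "$ref_file_content"
```

HEAD itself is unchanged: it still names the branch, and only the branch now names a commit.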
Thus, we have a branch with some commits.

Praveen Mathew | Sciencx (2021-04-06T15:23:28+00:00) Git Internals. Retrieved from https://www.scien.cx/2021/04/06/git-internals/