How I built a version control system (VCS) using pure Go

Every single artifact related to the creation of your software should be under version control. [1]

A VCS is a system that tracks revisions (versions) of files over time. [2]

Demo

https://asciinema.org/a/487303

Source Code

https://github.com/Abdulsametileri/vX

Motivation

While reading a beautiful book [3] to better understand event-driven systems and the idea of event sourcing, I came across a very good example:

As an analogy, imagine you are building a version control system like SVN or Git. When a user commits a file for the first time, the system saves the whole file to disk. Subsequent commits, reflecting changes to that file, might save only the delta — that is, just the lines that were added, changed, or removed. Then, when the user checks out a certain version, the system opens the version-0 file and applies all subsequent deltas, in order, to derive the version the user asked for.” [3]

I wanted to implement a VCS with this strategy, so I started an experimental hobby project called vX. All of this is just three days of hard work. 😅

Architecture

First of all, all related files are within the .vx folder. (The Go tool ignores any directories or files which have names that begin with an “_” or “.”)

.vx
├── checkout
│   ├── v1
│   └── v2
├── commit
│   ├── v1
│   └── v2
├── staging-area.txt
└── status.txt

v1, v2, ..., vN are commit versions. I describe what these folders contain in more detail in the Commands section.

checkout is a folder that holds the result of merging all files up to the specified commit version. For example, checkout/v2 is the combination of commit/v1 + commit/v2.

It’s reasonable to create a checkout directory because “a good practice in the UNIX world is to deploy each version of the application into a new directory and have a symbolic link that points to the current version” [1]. I have not implemented the symbolic link yet, but the directory layout is ready for it.

The staging area holds the files that are going to be part of the next commit. In this project, it’s just a plain text file opened in append-only mode, because a file can be updated and added again after its first entry. Each line is formatted as file path | file modification time | file status. For example, part of the file looks like this:

"testdata/status.txt|2022-04-14 05:42:15|Created",
"testdata/z.go|2022-04-14 05:11:04|Created",
"README.md|2022-04-14 05:42:11|Created",
"testdata/a1.txt|2022-04-13 06:58:03|Created",
"README.md|2022-04-14 05:49:09|Updated",

For example, README.md is a file that was added twice, with different modification times and statuses, before the commit operation. So the latest state of this file is Updated at 2022-04-14 05:49:09. This is very similar to the idea of event sourcing; that is, representing the changes to a database as a log of immutable events [4].
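The "latest state" of an append-only log like this can be derived by replaying it from top to bottom. The snippet below is a minimal sketch of that idea, not the code vX actually uses: the entry type and the parseLine and latestState functions are hypothetical names, and it assumes each line is stored as plain path|time|status without the surrounding quotes and commas shown above.

```go
package main

import (
	"fmt"
	"strings"
)

// entry mirrors one staging-area line: path|modification time|status.
// The type and its field names are illustrative assumptions.
type entry struct {
	Path    string
	ModTime string
	Status  string
}

// parseLine splits a "path|time|status" line into an entry.
func parseLine(line string) (entry, error) {
	parts := strings.Split(strings.TrimSpace(line), "|")
	if len(parts) != 3 {
		return entry{}, fmt.Errorf("malformed staging line %q", line)
	}
	return entry{Path: parts[0], ModTime: parts[1], Status: parts[2]}, nil
}

// latestState replays the log in order. A later line for the same path
// overwrites the earlier one, so the map ends up holding the newest state.
func latestState(lines []string) (map[string]entry, error) {
	state := make(map[string]entry)
	for _, l := range lines {
		if strings.TrimSpace(l) == "" {
			continue // skip blank lines
		}
		e, err := parseLine(l)
		if err != nil {
			return nil, err
		}
		state[e.Path] = e
	}
	return state, nil
}

func main() {
	log := []string{
		"README.md|2022-04-14 05:42:11|Created",
		"testdata/a1.txt|2022-04-13 06:58:03|Created",
		"README.md|2022-04-14 05:49:09|Updated",
	}
	state, err := latestState(log)
	if err != nil {
		panic(err)
	}
	fmt.Println(state["README.md"].Status, state["README.md"].ModTime)
	// prints "Updated 2022-04-14 05:49:09"
}
```

Because the log is never edited in place, duplicate entries are harmless: replaying it always yields the same final state.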

status is a text file that keeps track of all files persistently. Because I clear the contents of the staging area after a successful commit, I need another place that persistently records which files are under version control.

Project structure

I followed the project structure that was recommended for any Cobra-based application.

I used a testdata directory. (The Go tool ignores any directory called testdata, so these files are ignored when compiling your application.)

Not supported actions

Currently, I do not know how to detect deleted files, so I only track created and modified files. The status is based on the file modification time provided by the file system.
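A status decision based on modification times can be sketched as a comparison against the last recorded time for each path. This is an illustration under assumptions rather than vX's real logic: the tracker type, the classify method, the mustParse helper, and the Unchanged status are all hypothetical. In a real tool the modification time would typically come from os.Stat(path) followed by ModTime().

```go
package main

import (
	"fmt"
	"time"
)

// layout matches the timestamp format shown in the staging area.
const layout = "2006-01-02 15:04:05"

// tracker maps a file path to the modification time recorded for it.
// The type is an assumption made for this sketch.
type tracker map[string]time.Time

// mustParse converts a timestamp string and panics on bad input.
// It exists only to keep the example short.
func mustParse(s string) time.Time {
	t, err := time.Parse(layout, s)
	if err != nil {
		panic(err)
	}
	return t
}

// classify returns "Created" for an unknown path, "Updated" when the
// file is newer than the recorded time, and "Unchanged" otherwise.
// "Unchanged" is a hypothetical status, not one taken from vX.
func (t tracker) classify(path string, modTime time.Time) string {
	prev, ok := t[path]
	switch {
	case !ok:
		return "Created"
	case modTime.After(prev):
		return "Updated"
	default:
		return "Unchanged"
	}
}

func main() {
	tr := tracker{"README.md": mustParse("2022-04-14 05:42:11")}
	fmt.Println(tr.classify("README.md", mustParse("2022-04-14 05:49:09"))) // prints "Updated"
	fmt.Println(tr.classify("Makefile", mustParse("2022-04-14 05:50:00")))  // prints "Created"
}
```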

Providing checkout by storing only the changes and merging them on demand is really hard to implement, so I have postponed that task to another release. For now, every commit operation saves the files in the staging area as a whole, and after some research I chose rsync to combine them at checkout time.

Commands

init : creates directories and files.

status : reads the staging area text files, parses them into an appropriate struct, and uses tablewriter to show the results. If you look carefully, the functions accept an io.Writer interface. In unit tests, I pass a bytes.Buffer and assert on its contents easily. I recommend reading this great article about interfaces in Go.
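The io.Writer pattern described here can be sketched as follows. To stay self-contained, this example swaps the tablewriter library for the standard library's text/tabwriter, and the fileStatus type and the printStatus and renderStatus functions are hypothetical names rather than vX's own.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"text/tabwriter"
)

// fileStatus is an illustrative struct for one parsed status row.
type fileStatus struct {
	Path, ModTime, Status string
}

// printStatus writes an aligned table to any io.Writer, so the caller
// decides whether output goes to the terminal, a file, or a buffer.
func printStatus(w io.Writer, files []fileStatus) error {
	tw := tabwriter.NewWriter(w, 0, 4, 2, ' ', 0)
	fmt.Fprintln(tw, "FILE\tMODIFIED\tSTATUS")
	for _, f := range files {
		fmt.Fprintf(tw, "%s\t%s\t%s\n", f.Path, f.ModTime, f.Status)
	}
	return tw.Flush()
}

// renderStatus shows the testing trick: pass a bytes.Buffer instead of
// os.Stdout and inspect the captured string afterwards.
func renderStatus(files []fileStatus) (string, error) {
	var buf bytes.Buffer
	if err := printStatus(&buf, files); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	files := []fileStatus{
		{"README.md", "2022-04-14 05:49:09", "Updated"},
		{"testdata/z.go", "2022-04-14 05:11:04", "Created"},
	}
	if err := printStatus(os.Stdout, files); err != nil {
		panic(err)
	}
}
```

Accepting the interface rather than writing to os.Stdout directly is what keeps the command testable without capturing the process output.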

history : shows all commits. To implement this, I keep a metadata.txt file in every commit directory. In this file, I store the commit message and time, separated by |.

.
├── v1
│   ├── ...
│   └── metadata.txt
└── v2
    ├── ...
    └── metadata.txt
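Reading one metadata.txt entry could look like the sketch below. It assumes the message comes first and the time second, and it splits on the last | so that a commit message containing the separator still parses. The commitInfo type, the parseMetadata function, and the sample values are made up for illustration and are not taken from vX.

```go
package main

import (
	"fmt"
	"strings"
)

// commitInfo holds the two fields kept in metadata.txt.
// The field order (message, then time) is an assumption.
type commitInfo struct {
	Message string
	Time    string
}

// parseMetadata splits "message|time" on the last separator, which
// tolerates a '|' character inside the commit message itself.
func parseMetadata(line string) (commitInfo, error) {
	line = strings.TrimSpace(line)
	i := strings.LastIndex(line, "|")
	if i < 0 {
		return commitInfo{}, fmt.Errorf("metadata %q has no '|' separator", line)
	}
	return commitInfo{Message: line[:i], Time: line[i+1:]}, nil
}

func main() {
	// Sample content invented for this example.
	info, err := parseMetadata("add README|2022-04-14 05:49:09\n")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s (%s)\n", info.Message, info.Time)
	// prints "add README (2022-04-14 05:49:09)"
}
```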

add : adds the specified files and directories to status.txt and staging-area.txt. As mentioned earlier, status.txt keeps only the latest state of each file so that updated statuses can be shown, so I truncate it and write fresh data every time. staging-area.txt is append-only, so new entries are simply appended and duplicate entries are not a problem. After a successful commit, I calculate the latest state.

commit : reads the staging-area.txt file, copies the listed files into the directory of the new commit (v1, v2, ...), and truncates staging-area.txt once the operation finishes.

For example, suppose that in the v1 commit the user added README.md and testdata/, and in the v2 commit the user added Makefile. The commit folders will then look like this:

├── commit
│   ├── v1
│   │   ├── README.md
│   │   ├── metadata.txt
│   │   └── testdata
│   │       └── example
│   │           ├── a1.txt
│   │           ├── a2.txt
│   │           ├── example.go
│   │           ├── src
│   │           │   └── hello.js
│   │           └── z.go
│   └── v2
│       ├── Makefile
│       └── metadata.txt
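One detail a commit step needs is the name of the next commit directory. A minimal sketch, assuming directories are always named v followed by a positive number: the nextVersion function is a hypothetical helper, not part of vX. In practice the names would come from listing the commit folder, for example with os.ReadDir.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nextVersion returns the directory name for the next commit given the
// names already present in the commit folder. Names that do not look
// like "v<number>" are ignored.
func nextVersion(existing []string) string {
	highest := 0
	for _, name := range existing {
		if !strings.HasPrefix(name, "v") {
			continue
		}
		n, err := strconv.Atoi(strings.TrimPrefix(name, "v"))
		if err != nil || n <= 0 {
			continue
		}
		if n > highest {
			highest = n
		}
	}
	return fmt.Sprintf("v%d", highest+1)
}

func main() {
	fmt.Println(nextVersion(nil))                  // prints "v1"
	fmt.Println(nextVersion([]string{"v1", "v2"})) // prints "v3"
}
```

Comparing the numeric part instead of sorting the names as strings avoids ordering v10 before v2.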

checkout : uses rsync to copy files from the commit/ directory into the checkout/ directory for the specified commit id. rsync also takes care of files that appear in more than one commit.

├── checkout
│   ├── v1
│   │   ├── README.md
│   │   └── testdata
│   │       └── example
│   │           ├── a1.txt
│   │           ├── a2.txt
│   │           ├── example.go
│   │           ├── src
│   │           │   └── hello.js
│   │           └── z.go
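The overlay idea behind checkout can also be sketched in plain Go. This is not how vX does it (vX shells out to rsync); it is an illustrative stand-in that copies commit/v1 through commit/vN into checkout/vN in order, so a file from a later commit overwrites the same file from an earlier one. The overlay, checkout, and runDemo functions, the choice to skip metadata.txt, and the sample file contents are all assumptions.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// overlay copies every file under src into dst, overwriting files that
// already exist. A top-level metadata.txt is skipped (an assumption,
// based on the checkout tree not listing it).
func overlay(src, dst string) error {
	return filepath.WalkDir(src, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(src, path)
		if err != nil {
			return err
		}
		target := filepath.Join(dst, rel)
		if d.IsDir() {
			return os.MkdirAll(target, 0o755)
		}
		if rel == "metadata.txt" {
			return nil
		}
		data, err := os.ReadFile(path)
		if err != nil {
			return err
		}
		return os.WriteFile(target, data, 0o644)
	})
}

// checkout rebuilds version n under root/checkout/vN by applying
// root/commit/v1 ... root/commit/vN in order.
func checkout(root string, n int) (string, error) {
	dst := filepath.Join(root, "checkout", fmt.Sprintf("v%d", n))
	for i := 1; i <= n; i++ {
		src := filepath.Join(root, "commit", fmt.Sprintf("v%d", i))
		if err := overlay(src, dst); err != nil {
			return "", err
		}
	}
	return dst, nil
}

// runDemo builds a throwaway tree with two commits and returns the
// README.md content found in checkout/v2.
func runDemo() (string, error) {
	root, err := os.MkdirTemp("", "vx-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root)
	files := map[string]string{ // sample data invented for the demo
		"commit/v1/README.md":    "first",
		"commit/v1/metadata.txt": "init|2022-04-14 05:42:11",
		"commit/v2/README.md":    "second",
		"commit/v2/Makefile":     "build:",
	}
	for name, content := range files {
		p := filepath.Join(root, filepath.FromSlash(name))
		if err := os.MkdirAll(filepath.Dir(p), 0o755); err != nil {
			return "", err
		}
		if err := os.WriteFile(p, []byte(content), 0o644); err != nil {
			return "", err
		}
	}
	dst, err := checkout(root, 2)
	if err != nil {
		return "", err
	}
	data, err := os.ReadFile(filepath.Join(dst, "README.md"))
	return string(data), err
}

func main() {
	got, err := runDemo()
	if err != nil {
		panic(err)
	}
	fmt.Println(got) // prints "second"
}
```

Since whole files are stored in every commit, "merging" here simply means the newest copy wins; a delta-based design would instead apply each change set on top of the previous result.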

Source Code

https://github.com/Abdulsametileri/vX

References

[1] Continuous Integration: Improving Software Quality and Reducing Risk by Andrew Glover, Paul Duvall, and Steve Matyas

[2] Software Engineering at Google: Lessons Learned from Programming Over Time by Titus Winters, Tom Manshreck, and Hyrum Wright

[3] Designing Event-Driven Systems by Ben Stopford

[4] Making Sense of Stream Processing by Martin Kleppmann


How I built a version control system (VCS) using pure Go 🚀 was originally published in Level Up Coding on Medium, where people are continuing the conversation by highlighting and responding to this story.


This content originally appeared on Level Up Coding - Medium and was authored by Abdulsamet İLERİ

Abdulsamet İLERİ | Sciencx (2022-04-17T11:53:38+00:00) How I built a version control system (VCS) using pure Go. Retrieved from https://www.scien.cx/2022/04/17/how-i-built-a-version-control-system-vcs-using-pure-go/
