Inside Git: How It Works and the Role of the .git Folder

Most developers use Git every day, but many only interact with commands like git add, git commit, and git push without understanding what actually happens behind the scenes.

Learning how Git works internally changes how you think about version control. Instead of memorizing commands, you begin to understand Git as a data storage system that tracks snapshots of your project efficiently and safely.

What Is Git Really?

At its core, Git is:

A distributed version control system
A content-addressable filesystem
A snapshot-based database

Contrary to what many people think, Git does not primarily store differences between files.

Instead, Git stores snapshots of your project over time.

Every commit represents a snapshot of your files at a particular moment.

Why the `.git` Folder Exists

When you run:

git init

Git creates a hidden folder named:

.git

This folder is the heart of the repository.

The project files you see and edit are only the working directory.

The .git folder stores:

Commit history
Branches
Configuration
Git objects
Staging information
References

Without the .git folder, your project is just a normal folder.

Basic Structure of the `.git` Folder

A simplified structure looks like this:

.git/
├── objects/
├── refs/
├── HEAD
├── config
├── index
└── logs/

Let’s look at the important parts of .git.

`objects/` — The Git Database

This is where Git stores all internal objects.

Everything you commit is eventually stored as an object.

Examples:

File contents
Directories
Commits

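Git names each object by the hash of its content. A freshly written ("loose") object is stored zlib-compressed in a file whose path comes from that hash: the first two hex characters name a subdirectory of objects/, and the remaining characters name the file.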
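As a minimal sketch of that naming scheme, the Python function below maps an object hash to the path a loose object would have. The function name loose_object_path is made up for this illustration, and the sketch ignores packfiles, where Git may later move objects.

```python
from pathlib import Path


def loose_object_path(git_dir: str, object_id: str) -> Path:
    """Return the path a loose object with this hex ID would have."""
    # The first two hex characters name a subdirectory of objects/,
    # the remaining characters name the file inside it.
    return Path(git_dir) / "objects" / object_id[:2] / object_id[2:]


path = loose_object_path(".git", "e965047ad7c57865823c7d992b1d046ea66edf78")
print(path.as_posix())  # .git/objects/e9/65047ad7c57865823c7d992b1d046ea66edf78
```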

`refs/` — Branch References

A branch is just a lightweight pointer: a small file that contains a commit hash.

Example:

refs/heads/main

This file stores the hash of the latest commit on the main branch. Tags are kept the same way under refs/tags/.

`HEAD`

HEAD tells Git which branch or commit you are currently on.

Example:

ref: refs/heads/main

Meaning:

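“You are currently on the main branch.”

If you check out a specific commit instead of a branch, HEAD contains that commit's hash directly. This state is called a detached HEAD.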
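The two forms HEAD can take are easy to tell apart in code. The following is a small sketch, not Git's own logic; parse_head and the keys of the dictionary it returns are names invented for this example.

```python
def parse_head(head_text: str) -> dict:
    """Interpret the contents of a .git/HEAD file."""
    text = head_text.strip()
    if text.startswith("ref: "):
        # Symbolic form: HEAD points at a ref such as refs/heads/main.
        ref = text[len("ref: "):]
        prefix = "refs/heads/"
        branch = ref[len(prefix):] if ref.startswith(prefix) else None
        return {"detached": False, "ref": ref, "branch": branch}
    # Otherwise HEAD holds a commit hash directly (detached HEAD).
    return {"detached": True, "commit": text}


print(parse_head("ref: refs/heads/main\n"))
print(parse_head("e965047ad7c57865823c7d992b1d046ea66edf78\n"))
```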

`index` — The Staging Area

The index file represents the staging area.

When you run:

git add .

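Git updates the index.

The staging area is basically a list of the files prepared for the next commit. For each staged file, the index records its path, its mode, and the hash of the blob that holds its content.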

Git’s Core Idea: Everything Is an Object

Git stores data as objects.

Git has four object types (the fourth is the annotated tag). The three most important are:

Blob
Tree
Commit

Understanding these is the key to understanding Git.

Blob Object — File Content

A blob stores the contents of a file.

Example:

hello.txt

Containing:

Hello World

Git creates a blob object for that content.

Important:

A blob only stores raw content.

It does NOT store:

Filename
Folder location
Permissions

Just the content itself. File names and modes are recorded by tree objects.
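To make this concrete, here is a short sketch of how a blob's ID can be computed for a SHA-1 repository: Git hashes a small header ("blob", the size in bytes, and a NUL byte) followed by the raw content. The helper name blob_id is an assumption for this example. Notice that no file name appears anywhere in the input.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Compute the SHA-1 object ID Git would assign to a blob."""
    # Header: object type, a space, the content size in bytes, then NUL.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


print(blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (the empty blob)
print(blob_id(b"Hello World\n"))
```

If Git is installed, git hash-object on a file with the same bytes should print the same ID.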

Tree Object — Directory Structure

A tree object represents a directory.

It connects:

Filenames
Blob objects
Other tree objects

You can think of a tree as a folder snapshot.

Example project:

project/
├── app.js
└── styles/
    └── main.css

Git creates:

Blob for app.js
Blob for main.css
Tree for styles
Root tree for entire project
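The sketch below builds those four objects for the example project and prints the root tree's ID. It assumes SHA-1, regular non-executable files (mode 100644), and made-up file contents. The names object_id and tree_id are illustrative, not Git APIs.

```python
import hashlib


def object_id(kind: bytes, body: bytes) -> bytes:
    """Raw 20-byte SHA-1 of '<kind> <size>' + NUL + body."""
    return hashlib.sha1(kind + b" %d\x00" % len(body) + body).digest()


def tree_id(directory: dict) -> bytes:
    """directory maps a name to bytes (a file) or to a nested dict (a subdirectory)."""
    entries = []
    for name, value in directory.items():
        if isinstance(value, dict):
            # Subdirectory: mode 40000, pointing at another tree.
            # Git sorts directory names as if they ended with "/".
            entries.append((name + "/", b"40000", name, tree_id(value)))
        else:
            # Regular file: mode 100644, pointing at a blob.
            entries.append((name, b"100644", name, object_id(b"blob", value)))
    entries.sort(key=lambda entry: entry[0].encode())
    # Each entry is "<mode> <name>" + NUL + the 20 raw bytes of the object ID.
    body = b"".join(
        mode + b" " + name.encode() + b"\x00" + oid
        for _, mode, name, oid in entries
    )
    return object_id(b"tree", body)


# File contents here are placeholders chosen for the example.
project = {
    "app.js": b"console.log('hi');\n",
    "styles": {"main.css": b"body { margin: 0; }\n"},
}
print(tree_id(project).hex())
print(tree_id({}).hex())  # 4b825dc642cb6eb9a060e54bf8d69288fbee4904 (the empty tree)
```

Because the root tree's body contains the ID of the styles tree, changing main.css changes the styles tree's ID and therefore the root tree's ID as well.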

Commit Object — Project Snapshot

A commit object represents a saved snapshot.

It contains:

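Reference to a tree (the root directory of the snapshot)
References to parent commits (none for the first commit, two or more for a merge)
Author and committer info
Commit message
Timestamps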

A commit does NOT directly store files.

Instead:

Commit → Tree → Blob

This relationship is extremely important.
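A commit object is plain text, which makes the chain easy to see. The sketch below assembles one for a SHA-1 repository. The author identity, the fixed timestamp, and the helper names commit_body and commit_id are assumptions made for the example, and the tree used is the well-known empty tree.

```python
import hashlib


def commit_body(tree: str, parents: list, who: str, timestamp: int, message: str) -> bytes:
    """Assemble the text of a commit object."""
    lines = ["tree " + tree]
    lines += ["parent " + parent for parent in parents]
    # Author and committer are recorded separately; this sketch uses one identity for both.
    lines.append(f"author {who} {timestamp} +0000")
    lines.append(f"committer {who} {timestamp} +0000")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()


def commit_id(body: bytes) -> str:
    """SHA-1 object ID of a commit with the given body."""
    return hashlib.sha1(b"commit %d\x00" % len(body) + body).hexdigest()


EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
WHO = "Ada Example <ada@example.com>"  # hypothetical identity

root = commit_body(EMPTY_TREE, [], WHO, 0, "Initial commit")
child = commit_body(EMPTY_TREE, [commit_id(root)], WHO, 60, "Second commit")
print(child.decode())
```

The commit names only a tree and its parents. File contents are reachable through the tree, never stored in the commit itself.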

How Git Tracks Changes

Git tracks changes by comparing snapshots.

When you modify a file:

Old Snapshot vs New Snapshot

Git determines what changed by comparing the two.

These differences are computed on demand, though. What Git actually stores is the content itself, as objects identified by hashes.
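Comparing two snapshots can be sketched as comparing two maps from path to blob ID. This is a simplification: the flat dictionaries and the changed_paths helper are assumptions for the example, whereas Git walks nested tree objects.

```python
import hashlib


def blob_id(content: bytes) -> str:
    return hashlib.sha1(b"blob %d\x00" % len(content) + content).hexdigest()


def changed_paths(old: dict, new: dict) -> dict:
    """Compare two snapshots given as {path: blob ID} maps."""
    result = {"added": [], "removed": [], "modified": []}
    for path in sorted(old.keys() | new.keys()):
        if path not in old:
            result["added"].append(path)
        elif path not in new:
            result["removed"].append(path)
        elif old[path] != new[path]:
            # Different IDs mean different content; the files are never opened.
            result["modified"].append(path)
    return result


old = {
    "app.js": blob_id(b"v1\n"),
    "styles/main.css": blob_id(b"css\n"),
    "old.txt": blob_id(b"x\n"),
}
new = {
    "app.js": blob_id(b"v2\n"),
    "styles/main.css": blob_id(b"css\n"),
    "README.md": blob_id(b"docs\n"),
}
print(changed_paths(old, new))
# {'added': ['README.md'], 'removed': ['old.txt'], 'modified': ['app.js']}
```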

Git Uses Hashes Everywhere

Git identifies every object by a hash.

By default this is a 40-character SHA-1 hash. Newer Git versions can also create repositories that use SHA-256.

Example:

e965047ad7c57865823c7d992b1d046ea66edf78

This hash is computed from the object's type, its size, and its content.

Meaning:

Same content → same hash
Different content → different hash

This is called content-addressable storage.

Why Hashes Matter

Hashes give Git several advantages.

1. Integrity

If an object's stored content changes unexpectedly, it no longer matches the hash it is stored under.

Git can detect this mismatch when it reads the object, or when you run git fsck.
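The check itself amounts to hashing the stored bytes again and comparing the result with the object's name. Here is a minimal sketch of that idea for a zlib-compressed SHA-1 object. The helpers encode_object and is_intact are invented for this example and are not how git fsck is implemented.

```python
import hashlib
import zlib


def encode_object(kind: bytes, content: bytes):
    """Return (object ID, compressed bytes) the way a loose object is stored."""
    raw = kind + b" %d\x00" % len(content) + content
    return hashlib.sha1(raw).hexdigest(), zlib.compress(raw)


def is_intact(object_id: str, compressed: bytes) -> bool:
    """Re-hash the stored bytes and compare with the name they are stored under."""
    try:
        raw = zlib.decompress(compressed)
    except zlib.error:
        return False
    return hashlib.sha1(raw).hexdigest() == object_id


object_id, stored = encode_object(b"blob", b"Hello World\n")
# Simulate corruption: one character of the content is altered.
damaged = zlib.compress(zlib.decompress(stored).replace(b"World", b"Wor1d"))

print(is_intact(object_id, stored))   # True
print(is_intact(object_id, damaged))  # False
```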

2. Deduplication

If two files contain identical content:

Hello World

Git stores only one blob internally.

This makes storage very efficient.
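Deduplication falls out of content addressing without extra work, as this small in-memory sketch suggests. The ObjectStore class is a stand-in invented for the example, not a Git data structure.

```python
import hashlib


class ObjectStore:
    """A toy content-addressable store: the key is the hash of the value."""

    def __init__(self):
        self.objects = {}

    def put_blob(self, content: bytes) -> str:
        oid = hashlib.sha1(b"blob %d\x00" % len(content) + content).hexdigest()
        # Identical content hashes to the same key, so it is stored only once.
        self.objects.setdefault(oid, content)
        return oid


store = ObjectStore()
first = store.put_blob(b"Hello World\n")   # e.g. the content of a.txt
second = store.put_blob(b"Hello World\n")  # e.g. the content of copy/b.txt

print(first == second)     # True
print(len(store.objects))  # 1
```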

3. Fast Comparison

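Git can compare hashes instead of entire files.

If two trees have the same hash, Git knows everything beneath them is identical without opening a single file. This is much faster for large repositories.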

What Happens During `git add`

Suppose you create:

notes.txt

Then run:

git add notes.txt

Internally Git:

Reads the file content
Creates a blob object
Generates a hash
Stores blob in .git/objects
Updates the staging area (index)

Important:

git add does NOT create a commit.

It only prepares content for commit.

Internal Flow of `git add`

Working Directory
       │       
Create Blob Object
       │
Store in .git/objects
       │
Update Index (Staging Area)
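The flow above can be imitated in a few lines of Python. Treat this as a sketch under stated assumptions: it uses SHA-1, writes a loose object into a temporary directory, and uses a plain dictionary in place of the index, which in real Git is a binary file holding more metadata. The add_file function is hypothetical.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def add_file(work_dir: Path, git_dir: Path, index: dict, rel_path: str) -> str:
    content = (work_dir / rel_path).read_bytes()        # read the file content
    raw = b"blob %d\x00" % len(content) + content       # create the blob object
    oid = hashlib.sha1(raw).hexdigest()                 # generate its hash
    obj_path = git_dir / "objects" / oid[:2] / oid[2:]  # store it in .git/objects
    if not obj_path.exists():
        obj_path.parent.mkdir(parents=True, exist_ok=True)
        obj_path.write_bytes(zlib.compress(raw))
    index[rel_path] = oid                               # update the staging area
    return oid


with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    git_dir = work / ".git"
    (work / "notes.txt").write_bytes(b"remember the milk\n")

    index = {}  # stand-in for the index: path -> blob ID
    oid = add_file(work, git_dir, index, "notes.txt")
    stored_ok = (git_dir / "objects" / oid[:2] / oid[2:]).exists()

print(index)
print(stored_ok)  # True
```

No commit exists after this runs. Only a blob and an index entry were created.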

What Happens During `git commit`

When you run:

git commit -m "Add notes"

Git internally:

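Reads the staged entries from the index
Creates tree objects for the staged directory structure
Creates a commit object that points to the root tree
Links the commit to its parent commit
Updates the current branch reference to point to the new commit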

Internal Flow of `git commit`

Index (Staging Area)
        │
Create Tree Objects
        │
Create Commit Object
        │
Update Branch Reference
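Here is a compact, in-memory sketch of that flow. It assumes SHA-1, a flat index containing only top-level regular files (so a single tree is enough), a hypothetical author identity, and a fixed timestamp. The objects and refs dictionaries stand in for .git/objects and .git/refs.

```python
import hashlib

objects = {}  # object ID -> raw object bytes (stand-in for .git/objects)
refs = {}     # ref name -> commit ID (stand-in for files under .git/refs)


def store(kind: bytes, body: bytes) -> str:
    raw = kind + b" %d\x00" % len(body) + body
    oid = hashlib.sha1(raw).hexdigest()
    objects[oid] = raw
    return oid


def commit(index: dict, branch: str, message: str) -> str:
    # Read the staged entries from the index (path -> blob ID).
    entries = sorted(index.items(), key=lambda item: item[0].encode())
    # Create the tree object describing the staged files.
    tree_body = b"".join(
        b"100644 " + path.encode() + b"\x00" + bytes.fromhex(blob)
        for path, blob in entries
    )
    tree = store(b"tree", tree_body)
    # Create the commit object and link it to the parent, if the branch has one.
    lines = ["tree " + tree]
    parent = refs.get(branch)
    if parent:
        lines.append("parent " + parent)
    who = "Ada Example <ada@example.com> 0 +0000"  # hypothetical identity, fixed time
    lines += ["author " + who, "committer " + who, "", message, ""]
    commit_id = store(b"commit", "\n".join(lines).encode())
    # Update the branch reference to the new commit.
    refs[branch] = commit_id
    return commit_id


index = {"notes.txt": store(b"blob", b"remember the milk\n")}
first = commit(index, "refs/heads/main", "Add notes")
index["todo.txt"] = store(b"blob", b"write tests\n")
second = commit(index, "refs/heads/main", "Add todo")

print(refs["refs/heads/main"] == second)                # True
print(("parent " + first).encode() in objects[second])  # True
```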

Git Is Basically a Snapshot System

Many beginners imagine Git like this:

Version 1 → differences → Version 2

But Git is conceptually closer to:

Snapshot 1
Snapshot 2
Snapshot 3

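Git stores snapshots efficiently by sharing objects between them.

A file that did not change between two commits has the same blob hash in both, so both snapshots point to the same stored blob instead of a duplicate.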
