Project by Asif Sayyed

The Ship of Theseus

Track how much of your codebase's original code still survives over time.

How much has changed?
Share of original code replaced

How old is the code?
Repository birth year
Earliest code still here

When was the biggest rewrite?
Biggest rewrite date

Which year's code survives most?
Year with most surviving code

How fast is code being replaced?
Lines replaced per month

How many times did the original codebase die?
Complete rebuilds

What's the average code age?
Average code age

How to read this chart

The X-axis shows time. The Y-axis shows total lines of code. Each colored band represents code that was originally written in a specific year.
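
To make the band layout concrete, here is a minimal sketch of how per-year line counts could be stacked into chart bands. The `stack_bands` function and the snapshot shape (a label plus a mapping from origin year to line count) are assumptions made for illustration, not the project's actual data format.

```python
def stack_bands(snapshots):
    """Turn per-snapshot line counts into stacked (bottom, top) bands.

    snapshots: list of (label, {origin_year: line_count}) in time order.
    Returns {origin_year: [(bottom, top), ...]} with one pair per snapshot.
    The snapshot shape is a hypothetical one chosen for this example.
    """
    # Oldest years go at the bottom of the stack.
    years = sorted({year for _, counts in snapshots for year in counts})
    bands = {year: [] for year in years}
    for _, counts in snapshots:
        base = 0
        for year in years:
            top = base + counts.get(year, 0)
            bands[year].append((base, top))
            base = top  # the next band starts where this one ends
    return bands


snapshots = [
    ("2013-01", {2013: 100}),
    ("2014-01", {2013: 80, 2014: 50}),
]
bands = stack_bands(snapshots)
```

In this toy input the 2013 band shrinks from 100 lines to 80 while the 2014 band appears on top of it, so the top edge of the highest band at each point is the total line count on the Y-axis.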

How the data is collected

Every month, we analyze the repository and use git blame to determine when each line was last modified. Each monthly snapshot records how many lines date from each year, and comparing snapshots shows how much of the original code survives over time.
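
As a rough illustration of the git blame step, the sketch below tallies the lines of one file by the year they were last modified. It relies on the real `git blame --line-porcelain` output format; the function names, the use of `author-time` rather than `committer-time`, and the one-file-at-a-time approach are assumptions for this example and may differ from what the project does.

```python
import subprocess
from collections import Counter
from datetime import datetime, timezone


def lines_by_year(porcelain: str) -> Counter:
    """Count blamed lines by the year of the commit that last touched them.

    Expects the text output of `git blame --line-porcelain`, which repeats
    the full commit header (including `author-time <unix seconds>`) for
    every line of the file.
    """
    counts = Counter()
    for row in porcelain.splitlines():
        # File content rows start with a TAB, so they never match this prefix.
        if row.startswith("author-time "):
            timestamp = int(row.split()[1])
            year = datetime.fromtimestamp(timestamp, tz=timezone.utc).year
            counts[year] += 1
    return counts


def blame_file(repo_dir: str, rev: str, path: str) -> Counter:
    """Hypothetical helper: blame one file at one revision and tally it."""
    result = subprocess.run(
        ["git", "blame", "--line-porcelain", rev, "--", path],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        errors="replace",  # tolerate bytes that are not valid text
        check=True,
    )
    return lines_by_year(result.stdout)
```

Summing these counters over every tracked file at a given commit would give one snapshot. Using `author-time` means a line keeps its original date through rebases and cherry-picks; `committer-time` would be the other reasonable choice.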

Ancient Code Fragments

Historical Fossil

The oldest line ever written in this repo's history, even if deleted long ago.

Living Fossil

The oldest line that is still alive in the codebase today.

Where did this all come from?

Honestly, I'm just a guy who spent a bit too much time reading Plato and not enough time touching grass. This project is basically what happens when you combine a philosophy obsession with a healthy dose of data engineering. I've always felt that data isn't just numbers in a JSON file; it's a living record of evolution, like a digital ancestry.

I wanted to see if I could apply the Ship of Theseus paradox to software. If you haven't heard of it, it's an ancient Greek thought experiment that asks: if you replace every single part of a ship, plank by plank, is it still the same ship? Or is it just a new ship wearing its ancestor's name tag?

We do this to codebases all the time. We refactor, delete, and rewrite until the original 2013 'timber' is long gone. This tool is my way of staring at that Identity Problem without having to write a 50-page thesis. It gives us a window into how our projects are constantly being reborn. Is it still the same repo? I have no idea, but the data is fascinating, and looking at entropy is better than staring at a blank terminal.

If you find this digital paradox as fascinating as I do, consider dropping a ⭐ on GitHub. It helps keep the ship afloat!

— Asif Sayyed
Data Scientist who also happens to read far too much philosophy