The Ship of Theseus
Track how much of your codebase's original code still survives over time.
How much has changed?
--
of original code replaced
How old is the code?
--
Repository birth year
--
Earliest code still here
When was the biggest rewrite?
--
Biggest rewrite date
Which year's code survives most?
--
Year with most surviving code
How fast is code being replaced?
--
Lines replaced per month
How many times did the original codebase die?
--
Complete rebuilds
What's the average code age?
--
Average code age
How to read this chart
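The X-axis shows time. The Y-axis shows total lines of code. Each colored band represents code that was originally written in a specific year, and the band's height at any point in time is the number of those lines still present in the codebase at that moment. A band that thins out as you move right is code from that year being replaced.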
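To make the stacking concrete, here is a minimal sketch of how one snapshot could be turned into band boundaries. The function name `stack_bands`, the input shape (a mapping from year to surviving line count), and the choice to put the oldest year at the bottom are assumptions for illustration, not necessarily how this chart is drawn.

```python
def stack_bands(snapshot: dict[int, int]) -> list[tuple[int, int, int]]:
    """Return (year, bottom, top) for each band in one snapshot.

    `snapshot` maps a year to the number of lines from that year
    that are still present. Oldest year goes at the bottom (assumed).
    """
    bands = []
    bottom = 0
    for year in sorted(snapshot):
        top = bottom + snapshot[year]
        bands.append((year, bottom, top))
        bottom = top  # the next band starts where this one ends
    return bands


# Hypothetical snapshot: line counts are made up for the example.
example = {2014: 300, 2013: 120, 2015: 80}
bands = stack_bands(example)
total_lines = bands[-1][2]  # top of the last band = total lines on the Y-axis
```

The top edge of the highest band is the total line count at that point in time, which is what the Y-axis reports.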
How the data is collected
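Every month, we analyze the repository and use git blame to determine when each line was last modified. Each run produces one snapshot of the codebase broken down by year, and putting the monthly snapshots side by side shows how much of the original code survives over time.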
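As a rough illustration of that step, the sketch below counts lines per year from the output of `git blame --line-porcelain`, which repeats the full commit header (including an `author-time` Unix timestamp) for every line. The helper names `lines_by_year` and `blame_file`, and the choice of author time rather than committer time, are assumptions made for this example and may differ from what this project actually does.

```python
import subprocess
from collections import Counter
from datetime import datetime, timezone


def lines_by_year(porcelain: str) -> Counter:
    """Count blamed lines per year from `git blame --line-porcelain` text."""
    counts = Counter()
    for line in porcelain.splitlines():
        # Header fields start at column 0; file content lines start with a
        # tab, so source code containing "author-time" is never miscounted.
        if line.startswith("author-time "):
            timestamp = int(line.split(" ", 1)[1])
            year = datetime.fromtimestamp(timestamp, tz=timezone.utc).year
            counts[year] += 1
    return counts


def blame_file(repo: str, rev: str, path: str) -> Counter:
    """Run git blame for one file at one revision (hypothetical helper)."""
    result = subprocess.run(
        ["git", "blame", "--line-porcelain", rev, "--", path],
        cwd=repo, capture_output=True, check=True,
        encoding="utf-8", errors="replace",
    )
    return lines_by_year(result.stdout)


# Hand-written sample in the porcelain layout (hashes and names are made up).
sample = (
    "1111111111111111111111111111111111111111 1 1 1\n"
    "author Alice\n"
    "author-time 1370000000\n"
    "\tprint('hello')\n"
    "2222222222222222222222222222222222222222 2 2 1\n"
    "author Bob\n"
    "author-time 1700000000\n"
    "\tauthor-time 0 is just file content here\n"
    "1111111111111111111111111111111111111111 3 3 1\n"
    "author Alice\n"
    "author-time 1370000000\n"
    "\tprint('bye')\n"
)
sample_counts = lines_by_year(sample)
```

Summing the counters from `blame_file` over every tracked file at a given monthly revision would give one snapshot of the kind described above.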
Ancient Code Fragments
The oldest line ever written in this repo's history, even if deleted long ago.
The oldest line that is still alive in the codebase today.
Where did this all come from?
Honestly, I'm just a guy who spent a bit too much time reading Plato and not enough time touching grass. This project is basically what happens when you combine a philosophy obsession with a healthy dose of data engineering. I've always felt that data isn't just numbers in a JSON file; it's a living record of evolution, like a digital ancestry.
I wanted to see if I could apply the Ship of Theseus paradox to software. If you haven't heard of it, it's an ancient Greek thought experiment that asks: if you replace every single part of a ship, plank by plank, is it still the same ship? Or is it just a new ship wearing its ancestor's name tag?
We do this to codebases all the time. We refactor, delete, and rewrite until the original 2013 'timber' is long gone. This tool is my way of staring at that identity problem without having to write a 50-page thesis. It gives us a window into how our projects are constantly being reborn. Is it still the same repo? I have no idea, but the data is fascinating, and looking at entropy beats staring at a blank terminal.
If you find this digital paradox as fascinating as I do, consider dropping a ⭐ on GitHub. It helps keep the ship afloat!
Data Scientist who also happens to read far too much philosophy