You Don’t Build Data Lineage — You Earn It
We built lineage to prove control. Somewhere along the way, we started treating lineage as control.
What We Use Lineage For Today
In most organisations, lineage has turned into a thing on its own.There’s a tool, a budget, a roadmap, a slide in every steering deck.Ask what it’s actually used for and you’ll usually hear the same answers:
- impact analysis
- tracing issues
- understanding dependencies
- “getting ready for the audit”
All of that is fine. Useful even.But none of that proves you are in control of your data.It only proves you can follow the problem, not that you’re set up to prevent it.
Somewhere, the industry quietly decided: “If we have lineage, we must be mature.”And everyone nodded and moved on.
What Lineage Was Actually Meant To Be
Lineage was never supposed to be a standalone programme. It was supposed to be a lens. A way to see how a data point comes into existence, changes shape, and ends up in a decision, a report, a model. At its best, lineage gives you a 360° view of a data point:
- where it originates
- how it transforms
- who owns it
- where it’s used
- and what purpose it serves
It joins movement with meaning. Not just “this field goes there”, but “this number exists for this reason, under this responsibility.” If governance is working properly , ownership clear, quality measured, metadata alive, this picture exists anyway.
Lineage is just how you choose to look at it. You don’t need a “lineage project” for that. You need governance that actually works.
What It Has Turned Into
Now compare that to what happens in reality. Most lineage in the wild is technical lineage harvested from code, ETL, SQL, pipelines.
It’s great for:
- “If I change this, what breaks?”
- “Why did this job fail?”
That’s impact analysis and debugging. Engineering work. Important, yes. But that’s not governance. Technical lineage doesn’t tell you:
- who is accountable
- whether the data is fit for purpose
- if the number in the report should be trusted at all
Yet this is where most of the effort and money goes. We treat the plumbing map as if it were a risk control.It looks complex, so it must be valuable.
“Complexity always photographs better than clarity.”
What Lineage Is Not
Lineage has picked up a lot of assumptions it never asked for. Might as well strip those out clearly.
Lineage is not a regulation.
No supervisor is saying “show me end-to-end lineage for every dataset within the company”
They ask much simpler questions:
- Can you explain your numbers?
- Can you show where they come from?
- Can you fix them when they’re wrong?
That’s control. Lineage can help answer it, but end-to-end lineage for every dataset in the company is not the requirement.
Lineage is not completeness.
Total, perfect, end-to-end lineage is a unicorn. Systems change, people move, definitions evolve. By the time you’ve mapped “everything”, the environment has already moved on.
Lineage is not control.
It was meant to demonstrate control. Somewhere along the way, we swapped the labels.
Doing the Boring Things Well
So what’s the alternative? The alternative is to stop chasing lineage as a deliverable, and start doing the boring fundamentals so well that lineage shows up on its own.
Things like:
- Ownership
Someone who can say “this is my data, here is what it means, here is what I do when it’s wrong.” - Quality
Evidence that data is monitored and corrected. - Metadata
Definitions people trust, up to date, referenced in daily work.
If you get these right, lineage exists whether you diagram it or not. You don’t have to “build” it.You just have to reveal it where needed.
That’s what I mean by: you don’t build lineage, you earn it.
The Point
We’ve spent years funding lineage projects, buying tools, drawing maps, chasing “end-to-end” like it’s a finish line.
Maybe the next step is simpler:
- Build control.
- Make ownership real.
- Make quality non-negotiable.
- Make metadata usable.
Lineage will follow.
Because regulators don’t care how pretty your lineage graph looks. They care how quickly you can respond when something breaks, and how confidently you can say,
“We know what this data is, where it comes from, and who owns it. We are in control.”
Lineage is just the reflection of that.
You don’t build it. You earn it.