You Don’t Build Data Lineage — You Earn It

We built lineage to prove control. Somewhere along the way, we started treating lineage as control.

What We Use Lineage For Today

In most organisations, lineage has turned into a thing on its own. There’s a tool, a budget, a roadmap, a slide in every steering deck. Ask what it’s actually used for and you’ll usually hear the same answers:

  • impact analysis
  • tracing issues
  • understanding dependencies
  • “getting ready for the audit”

All of that is fine. Useful even. But none of that proves you are in control of your data. It only proves you can follow the problem, not that you’re set up to prevent it.

Somewhere, the industry quietly decided: “If we have lineage, we must be mature.” And everyone nodded and moved on.


What Lineage Was Actually Meant To Be


Lineage was never supposed to be a standalone programme. It was supposed to be a lens. A way to see how a data point comes into existence, changes shape, and ends up in a decision, a report, a model. At its best, lineage gives you a 360° view of a data point:

  • where it originates
  • how it transforms
  • who owns it
  • where it’s used
  • and what purpose it serves

It joins movement with meaning. Not just “this field goes there”, but “this number exists for this reason, under this responsibility.” If governance is working properly, with ownership clear, quality measured, and metadata alive, this picture exists anyway.

Lineage is just how you choose to look at it. You don’t need a “lineage project” for that. You need governance that actually works.


What It Has Turned Into

Now compare that to what happens in reality. Most lineage in the wild is technical lineage harvested from code, ETL, SQL, pipelines.

It’s great for:

  • “If I change this, what breaks?”
  • “Why did this job fail?”

That’s impact analysis and debugging. Engineering work. Important, yes. But that’s not governance. Technical lineage doesn’t tell you:

  • who is accountable
  • whether the data is fit for purpose
  • if the number in the report should be trusted at all

Yet this is where most of the effort and money goes. We treat the plumbing map as if it were a risk control. It looks complex, so it must be valuable.
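To make the distinction concrete, here is a minimal Python sketch of what technical lineage boils down to: a graph of assets and a traversal that answers “if I change this, what breaks?”. The asset names and the `EDGES` mapping are invented for illustration; a real tool would harvest them from SQL, ETL jobs or pipeline code.

```python
from collections import deque

# Hypothetical technical lineage: each asset maps to the assets it feeds.
EDGES = {
    "crm.customers": ["staging.customers"],
    "staging.customers": ["mart.customer_risk", "mart.customer_360"],
    "mart.customer_risk": ["report.credit_exposure"],
    "mart.customer_360": [],
    "report.credit_exposure": [],
}


def downstream(asset, edges):
    """Return every asset reachable from `asset`, in breadth-first order."""
    seen = {asset}
    queue = deque([asset])
    order = []
    while queue:
        current = queue.popleft()
        for child in edges.get(current, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# Impact analysis: what is affected if staging.customers changes?
print(downstream("staging.customers", EDGES))
```

Notice what the graph does not contain: no owner, no definition, no quality evidence. It answers the engineering question and nothing else.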

“Complexity always photographs better than clarity.”

What Lineage Is Not

Lineage has picked up a lot of assumptions it never asked for. Might as well strip those out clearly.

Lineage is not a regulation.

No supervisor is saying “show me end-to-end lineage for every dataset within the company.”

They ask much simpler questions:

  • Can you explain your numbers?
  • Can you show where they come from?
  • Can you fix them when they’re wrong?

That’s control. Lineage can help answer it, but end-to-end lineage for every dataset in the company is not the requirement.

Lineage is not completeness.

Total, perfect, end-to-end lineage is a unicorn. Systems change, people move, definitions evolve. By the time you’ve mapped “everything”, the environment has already moved on.

Lineage is not control.

It was meant to demonstrate control. Somewhere along the way, we swapped the labels.


Doing the Boring Things Well

So what’s the alternative? The alternative is to stop chasing lineage as a deliverable, and start doing the boring fundamentals so well that lineage shows up on its own.

Things like:

  • Ownership
    Someone who can say “this is my data, here is what it means, here is what I do when it’s wrong.”
  • Quality
    Evidence that data is monitored and corrected.
  • Metadata
    Definitions people trust, up to date, referenced in daily work.

If you get these right, lineage exists whether you diagram it or not. You don’t have to “build” it. You just have to reveal it where needed.

That’s what I mean by: you don’t build lineage, you earn it.
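As a rough illustration of “revealing” lineage rather than building it, the sketch below assumes governance records already exist for each asset, with an owner, a definition, quality checks and declared sources. The `DataAsset` class, the `CATALOG` entries and the `reveal_lineage` helper are all hypothetical and not taken from any particular catalogue product.

```python
from dataclasses import dataclass, field


@dataclass
class DataAsset:
    """A hypothetical governance record for one data asset."""
    name: str
    owner: str
    definition: str
    quality_checks: list = field(default_factory=list)
    sources: list = field(default_factory=list)  # names of upstream assets


# Invented example records; in practice these come from day-to-day governance.
CATALOG = {
    "report.credit_exposure": DataAsset(
        name="report.credit_exposure",
        owner="Head of Credit Risk",
        definition="Total exposure per counterparty at month end",
        quality_checks=["reconciled to ledger"],
        sources=["mart.customer_risk"],
    ),
    "mart.customer_risk": DataAsset(
        name="mart.customer_risk",
        owner="Risk Data Steward",
        definition="One row per customer with current risk rating",
        quality_checks=["completeness", "rating in valid range"],
        sources=["crm.customers"],
    ),
    "crm.customers": DataAsset(
        name="crm.customers",
        owner="Customer Operations",
        definition="Master customer record",
    ),
}


def reveal_lineage(name, catalog, depth=0):
    """Walk upstream from `name`, returning one indented line per asset.

    Assumes the declared sources form no cycles.
    """
    asset = catalog[name]
    checks = ", ".join(asset.quality_checks) or "none recorded"
    lines = [f"{'  ' * depth}{asset.name} (owner: {asset.owner}; checks: {checks})"]
    for source in asset.sources:
        lines.extend(reveal_lineage(source, catalog, depth + 1))
    return lines


print("\n".join(reveal_lineage("report.credit_exposure", CATALOG)))
```

The lineage view here is a by-product. Nobody drew it; it falls out of records that exist because ownership, quality and metadata are maintained, and a gap like “none recorded” is visible immediately.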


The Point

We’ve spent years funding lineage projects, buying tools, drawing maps, chasing “end-to-end” like it’s a finish line.

Maybe the next step is simpler:

  • Build control.
  • Make ownership real.
  • Make quality non-negotiable.
  • Make metadata usable.

Lineage will follow.

Because regulators don’t care how pretty your lineage graph looks. They care how quickly you can respond when something breaks, and how confidently you can say,

“We know what this data is, where it comes from, and who owns it. We are in control.”

Lineage is just the reflection of that.

You don’t build it. You earn it.
