
Data Lineage
Lead Designer • 2021 - present • IBM
Trust your data with visibility into your data pipeline
Overview
Data Lineage is an interactive visualization of your data supply chain that enables enterprise teams to understand how data flows across complex systems — from source to consumption — with confidence and clarity.
Over several years, this work evolved from deprecating a legacy lineage system, to rebuilding a business-first lineage experience, to integrating and eventually unifying Manta’s advanced technical lineage through acquisition. The result is a scalable, automated, and explorable lineage system that supports both business and technical users, improves trust in data, and enables critical workflows such as impact analysis, governance, and compliance.
Impact
Single, interactive interface
Visualize enterprise-wide data in a customizable graph that supports both technical and non-technical users, enabling flexible exploration across complex data ecosystems.
Integrated and enriched with business context
Understand your data through business metadata embedded directly in the lineage graph, and quickly access lineage from anywhere via platform integrations.
Wide spectrum of users and use cases
Leverage lineage across a wide range of users — from data providers to data consumers — and support use cases such as impact analysis, root cause analysis, and compliance.
Automated lineage scanning
With more than 75 technology scanners, automated lineage scanning continuously ingests metadata from multiple data sources to deliver comprehensive, up-to-date lineage.
Delivery
GA release
IBM Cloud Pak for Data 5.1 on SaaS and on premises
Recognition
OTAA 2024
Outstanding Technical Achievement Award for Data Lineage
D&UX review score
B- (Good)
Highest scoring sections in usability, onboarding, and use (2024)
800 hours of manual effort reduced to just 7 hours for cloud dependency mapping
Large North American Bank
Audit reporting cycles shortened from weeks to less than a day
Large North American Bank
Saved over $2 million by eliminating the need to hire 35+ professionals
Leading healthcare company
Context
Problem
Relying on data without clear visibility into its origins and transformations exposes organizations to unpredictable risk, costly errors, and compliance failures.
Legacy lineage tools attempted to increase transparency into data flows but consistently fell short. They produced dense, uncontrolled visualizations that were difficult to interpret, performed poorly at enterprise scale, and were highly technical, making them inaccessible to business users who needed them for compliance reporting and data discovery.
At the same time, organizations were forced to manually assemble lineage information from disparate sources. This labor-intensive process was slow, error-prone, and difficult to maintain, further undermining trust in data and increasing exposure to regulatory and operational risk.
Together, fragmented lineage data and unusable analysis tools left enterprises spending excessive time and effort to achieve basic compliance — while still lacking confidence in how their data was being used.
Users
Data engineers understanding upstream and downstream effects
Data scientists validating transformations and diagnosing issues
Compliance officers tracing data usage for regulatory compliance and AI governance
Data analysts conducting impact analysis and reporting
Data stewards understanding, at a high level, how their systems process and manage data
Use cases
Impact analysis & change management: Data engineers and data stewards understand downstream impact before making changes to data pipelines, schemas, or systems.
Root cause analysis & troubleshooting: Data engineers and data scientists trace issues back to their upstream source, reducing time spent diagnosing data quality issues and pipeline failures.
Data trust & validation: Data analysts and compliance officers gain visibility into how data was sourced and transformed, to assess whether data is compliant and fit for use.
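To make the first two use cases concrete, here is a minimal sketch that treats lineage as a directed graph: impact analysis follows edges downstream, and root cause analysis follows them upstream. This is an illustration only, not how the product is implemented. The asset names, the EDGES list, and the downstream and upstream helpers are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges: (source asset, target asset).
# None of these names come from a real system.
EDGES = [
    ("crm.customers", "staging.customers"),
    ("erp.orders", "staging.orders"),
    ("staging.customers", "warehouse.customer_orders"),
    ("staging.orders", "warehouse.customer_orders"),
    ("warehouse.customer_orders", "reports.revenue_dashboard"),
]


def build_adjacency(edges, reverse=False):
    """Map each asset to its direct neighbors (flip edges when reverse=True)."""
    adjacency = defaultdict(set)
    for source, target in edges:
        if reverse:
            source, target = target, source
        adjacency[source].add(target)
    return adjacency


def reachable(start, adjacency):
    """Breadth-first search: every asset reachable from start, excluding start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    seen.discard(start)  # guards against cycles that lead back to start
    return seen


def downstream(asset, edges=EDGES):
    """Impact analysis: what could be affected if this asset changes?"""
    return reachable(asset, build_adjacency(edges))


def upstream(asset, edges=EDGES):
    """Root cause analysis: which assets feed into this one?"""
    return reachable(asset, build_adjacency(edges, reverse=True))
```

In this toy graph, calling downstream("staging.orders") returns the warehouse table and the dashboard that depend on it, while upstream("reports.revenue_dashboard") returns every asset that feeds the dashboard.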
My role
As the lead product designer, I led the multi-year evolution of data lineage ingestion and visualization, balancing long-term system thinking with pragmatic delivery through organizational and technical change:
Led UX design across multiple lineage initiatives over several years
Partnered closely with engineering on performance and scalability constraints
Collaborated with research to validate mental models and interaction patterns
Helped align design decisions across acquisitions
Project stakeholders
Product Management
Engineering
UX Design
UX Research
Content Design
Enterprise clients and users
Method
Research → Concept → Launch (2022) → Acquisition → Concept → Iteration → Launch (2024) → Iteration
2021
Build & Introduce
Began to transition away from the legacy Information Governance Catalog (IGC) lineage by:
Building a new business data lineage in Watson Knowledge Catalog, later renamed IBM Knowledge Catalog (IKC)
Introducing Manta, the leading independent data lineage vendor at the time, as an OEM integration in IKC
2022
Launch & Integrate
Designed and launched the IKC business data lineage experience, a new, summarized, business-user-first lineage model, in June with Cloud Pak for Data (CPD) 4.5.
Integrated Manta further for automated scanning and advanced visualization for technical users.
2023
Enhance & Acquire
Enhanced the IKC lineage experience with support for more lineage metadata.
In December, IBM acquired Manta with the vision to merge the strengths of Manta’s technical lineage with IKC’s business data lineage in a new, unified experience.
2024
Rebuild & Launch
Designed a unified, performant lineage visualization for business and technical users within 10 months: 6 sprints, 2 design milestone reviews, and 1 DUX review.
Launched in October with CPD 5.1, the GA product:
Integrated business context from IKC
Leveraged automated scanning from Manta
2025
Strengthen & Scale
Scaled Data Lineage for enterprise use by strengthening automated ingestion and deepening analytical capabilities
More complete lineage with expanded data source scanner support, agent management, and alias assignment
Deeper analysis with column-level lineage and historical lineage
Research and validation
IKC Business Data Lineage
We performed 2 phases of research:
Competitive analysis of lineage tools (Project Gemini)
Foundational research to understand expectations via 5 sponsor user interviews with data analysts and business users
Our research revealed that legacy lineage visualizations like the IGC lineage were overwhelming, slow to render, difficult to analyze, and inaccessible to business users. What users needed was greater flexibility and customizability.
Flexibility
Users need to move fluidly between levels of detail
Enterprise users shift between high-level and detailed views depending on context, rather than staying at one level.
Performance and scale are table stakes for trust
Slow rendering, static diagrams, and limited scalability undermined confidence in lineage.
“It’s important to have the flexibility to shuffle between different view levels and get a view from them quickly.”
Data analyst
Customizability
Lineage must be configurable to match organizational needs
Users expected to be able to adapt lineage views to reflect how their organization defines and reasons about data.
Business users need tailored, simpler views
Research highlighted a strong need to translate technical lineage into business-friendly representations.
“It depends a lot on the context of why you’re launching that view. That’s why it’s important to customize it.”
Business analyst / consultant
These insights directly informed the design of the new business lineage experience in IKC, which started from high-level summarized views, incorporated business metadata into the lineage graph, and was tightly integrated with the IKC platform.
When shown the new UI, those same sponsor users were eager to start using it.
“This is solid. The evolution is early. We’ll complain once we have our hands on it.”
Sponsor User
Data Lineage (IBM Manta Data Lineage)
We performed 2 phases of research:
Secondary research to understand key personas
Concept testing to identify core pain points
Overall, data engineers and data analysts gave the concept a business value rating of 4/4. Both user groups found the end-to-end flows, from the landing page into the lineage viewer, easy to comprehend and convenient to navigate.
Additionally, users provided feedback about their expectations by referencing prior experience with other tools and familiar interaction patterns, such as right-click actions. This informed our next design enhancements.
4/4
Business value rating
9
Clients
12
Findings
“This is more interactive than Collibra - When you hover you know and you can easily manage on the side menu. In Collibra, you have to drag and drop and have to do multiple steps.”
Data Engineer
Constraints and complexity
Enterprise scale and performance with thousands of assets and relationships
Heterogeneous data ecosystems spanning many technology types and environments, all of which had to be supported and visualized
Automated vs. manual ingestion reliability with gaps that must be identified and reconciled
Business and technical mental models that vary widely among users who all engage with the same interface
Legacy system deprecation that required tactful transition and parity planning to avoid major disruptions
Organizational change and acquisition that required robust alignment across differing visions, design systems, technical constraints, and ways of working
Incremental delivery across fast release cycles, which felt like sprinting a marathon
Regulatory and compliance requirements that demand access to accurate lineage reports across time and to additional metadata in the graph
Outcome
Successfully designed and delivered lineage through multiple organizational transitions, including OEM integration and acquisition
Helped establish a model approach for post-acquisition design integration, later shared across other M&A teams
Elevated lineage from a specialist, technical tool to a shared enterprise capability
Delivery
IKC Business Data Lineage launched with CPD 4.5 in June 2022, replacing legacy IGC lineage with a clearer, business-first model
Manta OEM integration enabled early access to advanced technical lineage and accelerated deprecation of legacy tooling
Data Lineage (IBM Manta Data Lineage) launched GA with CPD 5.1, on cloud in October 2024 and on premises in December 2024.
UX Quality
DUX Review (Oct 2024):
Overall score: B- (Good)
Strengths: Usability, Get started, Use
Identified accessibility as the primary improvement area
Recognition
IKC Business Data Lineage
3 international design awards
Red Dot
iF
German Design Award
Data Lineage (IBM Manta Data Lineage)
Outstanding Technical Achievement Award (OTAA) — Nov 2024






