An automated and fully managed data catalogue based on Amundsen, the open-source data catalogue built at Lyft.

Data Cataloguing
Open Source

What need does Stemma fulfill?

Everyone has access to data, but few know what exists, what’s trustworthy and how to use it. Stemma makes trustworthy data easy to find and offers an always up-to-date view of how your data is used.

It enables data producers and consumers to discover, understand and trust the data in the organization.

What are the benefits of using Stemma?

    Data users can easily discover trustworthy data for their use and onboard much faster than before
    Data engineers and data owners can understand the impact of changes on their data and triage data issues much more easily
    Data governance users can standardize and evangelize single-source-of-truth data within the organization

What are the core features of Stemma?

    Search for data through a PageRank-style algorithm based on signals such as how often a data set is queried, how many dashboards are built on it and who uses it
    Visualize lineage for data to understand dependencies and perform impact analysis
    Understand how data is commonly used through automated inference of common join and filter conditions
    Read related Slack conversations, which a Slack bot links to the relevant data pages
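
To make the lineage feature above concrete, here is a minimal sketch of impact analysis as a breadth-first walk over a lineage graph. It is an illustration only and not Stemma's or Amundsen's implementation: the `LINEAGE` mapping, the data set names and the `downstream_impact` function are all hypothetical.

```python
from collections import deque

# Hypothetical lineage graph: each key feeds the assets listed as its value.
LINEAGE = {
    "raw.orders": ["staging.orders"],
    "staging.orders": ["marts.revenue", "marts.customer_ltv"],
    "marts.revenue": ["dashboard.weekly_revenue"],
    "marts.customer_ltv": [],
    "dashboard.weekly_revenue": [],
}


def downstream_impact(lineage, changed):
    """Return every asset downstream of `changed`, in breadth-first order."""
    seen = {changed}          # guards against revisiting nodes (and cycles)
    impacted = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                impacted.append(child)
                queue.append(child)
    return impacted


if __name__ == "__main__":
    # Which assets could be affected by a change to staging.orders?
    for asset in downstream_impact(LINEAGE, "staging.orders"):
        print(asset)
```

In this sketch, a change to `staging.orders` would flag both marts and the dashboard built on one of them, which is the kind of dependency view a lineage graph is meant to provide.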

Which teams does Stemma cater to?

Data Engineering
Analytics Engineering
Data Science

Authored By

Mark Grover

CEO of Stemma