Sentry - Snuba Data Model - Moment For Technology

Series

Quick start with the latest Docker Sentry-CLI in 1 minute – creating a version
Quick start with Docker Sentry-CLI – getting started with Source Maps in 30 seconds
Sentry for React
Sentry for Vue
Sentry-CLI usage in detail
Sentry Web Performance Monitoring – Web Vitals
Sentry Web Performance Monitoring – Metrics
Sentry Web Performance Monitoring – Trends
Sentry Web Front-end Monitoring – Best Practices (Official Tutorial)
Sentry Backend Monitoring – Best Practices (Official Tutorial)
Sentry Monitoring – Discover, the big data query and analysis engine
Sentry Monitoring – Dashboards, large-screen data visualization
Sentry Monitoring – Environments, distinguishing event data from different deployment environments
Sentry Monitoring – Security Policy Reports
Sentry Monitoring – Search in practice
Sentry Monitoring – Alerts
Sentry Monitoring – Distributed Tracing
Sentry Monitoring – Distributed Tracing 101 for Full-stack Developers
Sentry Monitoring – Snuba data platform architecture introduction (Kafka + ClickHouse)

This section describes how data is organized in Snuba and how user-facing data is mapped to the underlying database (such as ClickHouse).

The Snuba data model is split horizontally into a logical model and a physical model. The logical data model is what Snuba clients see through the Snuba query language. Elements in this model may or may not map 1:1 to tables in the database. The physical model, by contrast, maps 1:1 to database concepts such as tables and views.

The reasoning behind this division is that it lets Snuba expose a stable interface through the logical data model while performing complex mappings internally: a query can be executed on different tables that are part of the physical model, improving performance in a way that is transparent to the client.

The rest of this section outlines the concepts that make up the two models and how they relate to each other.

The main concepts described below are Dataset, Entity, and Storage.

Datasets

A Dataset is a namespace over Snuba data. It provides its own schema and is independent of other Datasets in terms of both its logical model and its physical model.

Examples of Datasets are Discover, outcomes, and sessions. There is no relationship between them.

A Dataset can be thought of as a container for the components that define its abstract data model and its concrete data model, as described below.

Entities and entity types

The fundamental building block of the logical data model that Snuba exposes to clients is the Entity. In the logical model, an Entity represents an instance of an abstract concept such as a transaction or an error. In practice, an Entity corresponds to a row in a database table. The Entity Type is the class of the Entity (such as Errors or Transactions).

A logical data model consists of a set of Entity Types and their relationships.

Each Entity Type has a schema defined by a list of fields with associated abstract data types. The schemas of all the Entity Types in a Dataset (there can be several) make up the logical data model that is visible to Snuba clients and against which Snuba queries are validated. No lower-level concept is supposed to be exposed.

Every Entity Type belongs unambiguously to one Dataset. An Entity Type cannot appear in more than one Dataset.
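
As a rough illustration of the rules above, the following Python sketch models Datasets and Entity Types as plain data classes. Every name here (`EntityType`, `Dataset`, `Registry`, `unknown_fields`, and the example fields) is hypothetical and is not Snuba's actual code; the sketch only shows the idea of validating a query against the logical schema and of an Entity Type belonging to a single Dataset.

```python
from dataclasses import dataclass, field


@dataclass
class EntityType:
    """A logical entity type: a name plus a schema of abstract field types."""
    name: str
    schema: dict  # field name -> abstract type name (illustrative only)


@dataclass
class Dataset:
    """A namespace that owns a set of entity types."""
    name: str
    entity_types: dict = field(default_factory=dict)


class Registry:
    """Hypothetical helper that remembers which dataset owns each entity type."""

    def __init__(self):
        self._owner = {}  # entity type name -> dataset name

    def register(self, dataset, entity_type):
        owner = self._owner.get(entity_type.name)
        if owner is not None and owner != dataset.name:
            # An entity type cannot appear in more than one dataset.
            raise ValueError(
                f"{entity_type.name} already belongs to dataset {owner}"
            )
        self._owner[entity_type.name] = dataset.name
        dataset.entity_types[entity_type.name] = entity_type


def unknown_fields(dataset, entity_name, fields):
    """Return the requested fields that the logical schema does not define."""
    schema = dataset.entity_types[entity_name].schema
    return [name for name in fields if name not in schema]
```

A query that references a field outside the schema would be rejected at this level, before any physical table is considered.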

Relationships between entity types

The entity types in the dataset are logically related. Two types of relationships are supported:

Entity Set Relationship. This mimics foreign keys and is meant to allow joins between Entity Types. Currently it only supports one-to-one and one-to-many relationships.
Inheritance Relationship. This mimics nominal subtyping. A group of Entity Types can share a parent Entity Type, and each child type inherits the schema of the parent type. Semantically, the parent Entity Type must represent the union of all Entities whose types inherit from it. It must also be possible to query the parent Entity Type; the relationship cannot be purely logical.
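
The two relationship kinds can be sketched in a few lines of Python. This is only an illustration under assumed names: `Cardinality`, `EntitySetRelationship`, `inherit_schema`, and the example fields and join keys are hypothetical and do not come from Snuba itself.

```python
from dataclasses import dataclass
from enum import Enum


class Cardinality(Enum):
    # The only two cardinalities the text says are currently supported.
    ONE_TO_ONE = "one-to-one"
    ONE_TO_MANY = "one-to-many"


@dataclass(frozen=True)
class EntitySetRelationship:
    """Foreign-key-like link that allows two entity types to be joined."""
    left_entity: str
    right_entity: str
    join_keys: tuple  # ((left_field, right_field), ...)
    cardinality: Cardinality


def inherit_schema(parent_schema, child_fields):
    """Child schema = every parent field plus the child's own fields."""
    clash = set(parent_schema) & set(child_fields)
    if clash:
        raise ValueError(f"child redefines inherited fields: {sorted(clash)}")
    return {**parent_schema, **child_fields}


# Hypothetical schemas: Errors inherits everything Events defines.
EVENTS = {"event_id": "String", "project_id": "UInt", "timestamp": "DateTime"}
ERRORS = inherit_schema(EVENTS, {"group_id": "UInt"})

# Hypothetical relationship: one grouped message relates to many errors.
GROUPS_TO_ERRORS = EntitySetRelationship(
    left_entity="groupedmessage",
    right_entity="errors",
    join_keys=(("id", "group_id"),),
    cardinality=Cardinality.ONE_TO_MANY,
)
```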

Entity types and consistency

The Entity Type is the largest unit for which Snuba can provide strong data consistency guarantees. Specifically, it is possible to query a single Entity Type and expect serializable consistency. This does not extend to queries that span multiple Entity Types, where we will have eventual consistency at best.

This also affects subscription queries. They can only work on one Entity Type at a time; otherwise they would require consistency across Entity Types, which Snuba does not support.

Note

To be precise, the unit of consistency can be even smaller than the Entity Type, depending on the Entity Type and on how the data ingestion topics are partitioned (by project_id, for example). The Entity Type is the largest unit that Snuba allows.

Storage

Storages represent and define the physical data model of a Dataset. Each Storage is materialized in a physical database concept such as a table or a materialized view. As a result, each Storage has a schema, defined by fields and their types, that mirrors the physical schema of the database table or view the Storage maps to, and it can provide all the details needed to generate the DDL statements that build the tables in the database.

Storages map the logical concepts of the logical model discussed above to the physical concepts of the database, so each Storage must be associated with an Entity Type. Specifically:

Each Entity Type must be backed by at least one Readable Storage (a Storage we can run queries on), but it can be backed by multiple Storages (for example, a pre-aggregate materialized view). Multiple Storages per Entity Type are meant to allow query optimizations.
Each Entity Type must be backed by one and only one Writable Storage, which is used to ingest data and populate the database tables.
Each Storage backs exactly one Entity Type.
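
The three rules above can be expressed as a small consistency check. The sketch below is a minimal illustration, not Snuba's implementation: the `Storage` class, the `check_storages` function, and the storage names used with it are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Storage:
    """A physical table or view that backs a single entity type."""
    name: str
    entity_type: str
    readable: bool = True
    writable: bool = False


def check_storages(entity_types, storages):
    """Return a list of rule violations; an empty list means the setup is valid."""
    problems = []

    # Rule 3: a storage name may be associated with only one entity type.
    owners = {}
    for storage in storages:
        owners.setdefault(storage.name, set()).add(storage.entity_type)
    for name, backed in sorted(owners.items()):
        if len(backed) > 1:
            problems.append(f"storage {name} backs {len(backed)} entity types")

    for entity_type in entity_types:
        backing = [s for s in storages if s.entity_type == entity_type]
        # Rule 1: at least one readable storage.
        if not any(s.readable for s in backing):
            problems.append(f"{entity_type}: no readable storage")
        # Rule 2: one and only one writable storage.
        writable = sum(1 for s in backing if s.writable)
        if writable != 1:
            problems.append(
                f"{entity_type}: expected 1 writable storage, found {writable}"
            )
    return problems
```

For example, an Entity Type backed by a writable table plus a read-only view passes the check, while an Entity Type with no Storage at all fails rules 1 and 2.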

Examples

This section provides examples of how the Snuba data model can represent some real-world models.

These case studies do not necessarily reflect the current Sentry production model, nor are they necessarily part of the same deployment. They should be read as isolated examples.

Single-entity dataset

This resembles the Outcomes dataset used by Sentry. It does not actually reflect Outcomes as of April 2020, although it is the design Outcomes should move towards.

This Dataset has only one Entity Type, representing an individual Outcome ingested by the Dataset. Querying raw Outcomes is very slow, so there are two Storages: a raw Storage that reflects the data as it is ingested, and a materialized view that computes hourly aggregations and is much more efficient to query. The query planner picks the Storage depending on whether the query can be executed on the aggregated data.
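
The Storage selection described above can be sketched as follows. This is an illustration under stated assumptions, not the real query planner: the storage names, the column set of the hourly view, and the rule that a query "fits" the aggregate when it only uses aggregated columns at a granularity that is a whole number of hours are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical set of columns available in the hourly materialized view.
HOURLY_COLUMNS = frozenset(
    {"org_id", "project_id", "outcome", "timestamp", "times_seen"}
)


@dataclass(frozen=True)
class OutcomeQuery:
    columns: frozenset        # columns referenced by the query
    granularity_seconds: int  # requested time bucket size


def select_storage(query):
    """Pick the hourly view when it can answer the query, else the raw table."""
    fits_columns = query.columns <= HOURLY_COLUMNS
    fits_granularity = (
        query.granularity_seconds > 0
        and query.granularity_seconds % 3600 == 0
    )
    if fits_columns and fits_granularity:
        return "outcomes_hourly"
    return "outcomes_raw"
```

The client issues the same logical query in both cases; only the planner's choice of Storage changes.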

Multi-entity-type dataset

The canonical example of this kind of Dataset is the Discover dataset.

There are three Entity Types: Errors, Transactions, and Events, with Errors and Transactions both inheriting from Events. Together these form the logical data model, so querying the Events Entity Type returns the union of Transactions and Errors, but only fields common to both are allowed in the query.
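
To make the "union restricted to common fields" behavior concrete, here is a minimal Python sketch. The schemas, field names, and the `query_events` function are hypothetical stand-ins chosen for this example; they are not Snuba's real schemas or API.

```python
# Hypothetical logical schemas for the two child entity types.
ERRORS_SCHEMA = {"event_id", "project_id", "timestamp", "message"}
TRANSACTIONS_SCHEMA = {"event_id", "project_id", "timestamp", "duration"}


def common_fields(*schemas):
    """Fields present in every schema passed in."""
    shared = set(schemas[0])
    for schema in schemas[1:]:
        shared &= set(schema)
    return shared


def query_events(fields, error_rows, transaction_rows):
    """Return the union of both row sets, projected onto the requested fields."""
    allowed = common_fields(ERRORS_SCHEMA, TRANSACTIONS_SCHEMA)
    rejected = sorted(set(fields) - allowed)
    if rejected:
        raise ValueError(f"fields not common to all child types: {rejected}")
    return [
        {name: row[name] for name in fields}
        for row in list(error_rows) + list(transaction_rows)
    ]
```

Asking for `message` or `duration` through Events raises an error here, because each of those fields exists on only one of the child types.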

The Errors Entity Type is backed by two Storages for performance reasons. One is the main Errors Storage, which is used to ingest data; the other is a read-only view that puts less load on ClickHouse at query time but offers lower consistency guarantees. Transactions has only one Storage, and a Merge Table serves Events (essentially a view over the union of the two tables).

Joining entity types

This is a simple example of a Dataset that contains multiple Entity Types that can be joined together in a query.

GroupedMessage and GroupAssignee can be part of a left join query with Errors. The rest is similar to what was discussed in the previous example.
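
The semantics of such a left join can be sketched in plain Python. The row shapes and field names below (`group_id`, `status`, `user_id`) are assumptions made for this example, and the sketch assumes at most one group row and one assignee row per group; it is not how Snuba executes joins.

```python
def left_join_errors(errors, groups, assignees):
    """Left join: every error is kept, with None where no match exists."""
    group_by_id = {group["id"]: group for group in groups}
    assignee_by_group = {a["group_id"]: a for a in assignees}

    joined = []
    for error in errors:
        group = group_by_id.get(error["group_id"])
        assignee = assignee_by_group.get(error["group_id"])
        joined.append(
            {
                **error,
                "group_status": group["status"] if group else None,
                "assignee": assignee["user_id"] if assignee else None,
            }
        )
    return joined
```

Because it is a left join, an error whose group has no assignee still appears in the result, with the assignee column left empty.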
