Data Model Design 101: Composite vs Surrogate Keys | by Madison Schott | Feb, 2024

Editor
3 Min Read


When to know which type of key to use in your data models

Photo by Jason D on Unsplash

I’ve recently been writing a data model to represent a new part of our business. The data requires a lot of questions to be asked, as it’s quite difficult to understand intuitively.

The data model requires that I join similar, yet different, datasets from two different sources into one dataset. Any time you merge two datasets, it’s integral for you to think about the unique key that will then act as the primary key of this new dataset.

Unfortunately, you can’t assume that the primary key in each dataset will carry over into the resultant one. This is because these keys, if incrementing integers, will tend to be duplicated from dataset to dataset.

However, you can create a new key.

In this article, we will discuss two options for creating a unique key in a data model—a surrogate key or a composite key. What are the differences between these? When should you use one versus the other?

Composite keys are made up of more than one identifying field, together the fields that make it up are unique. They are created from real-world values and whose meaning can be understood when read

Surrogate keys are generated for the sole purpose of being a primary key and don’t contain any real-world meaning. They are typically hash values that make data retrieval fast and easy.

Composite keys are ideal when you still want to maintain the value of your data. While composite keys are a unique combination of fields, you can generate a new field based on these values to make the unique lookup of records easier.

This is what I recommend when using a composite key in a data model. We will go over an easy way to use SQL or dbt to generate a composite key within any of your data models.

Surrogate keys are ideal when you don’t need to maintain the value of your data and want a fast and efficient way to retrieve your data. These are often used when datasets are unique across 3 or…

Share this Article
Please enter CoinGecko Free Api Key to get this plugin works.