Data Modelling For Data Engineers | by 💡Mike Shakhomirov | Dec, 2023



The definitive guide for beginners

Photo by Sebastian Svenson on Unsplash

Data modelling is an essential part of data engineering. In this story, I would like to talk about different data models, the role of SQL in data transformation and the data enrichment process. SQL is a powerful tool for manipulating data, and with data transformation pipelines we can transform and enrich the data loaded into our data platform. We will discuss various methods of data manipulation, scheduling and incremental table updates. To make this process efficient, we need to know a few essential things about data modelling first.

What is data modelling?

A data model aims to organise elements of your data and standardise how the data elements relate to one another.

Data models help to ensure data quality, consistent semantics and consistent naming conventions. They help to design the database conceptually and to create logical connections between data elements, e.g. primary and foreign keys between tables.
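To make the idea of logical connections concrete, here is a minimal sketch using Python's built-in sqlite3 module. The country and customer tables and their columns are hypothetical names chosen for illustration, and other databases may enforce keys differently.

```python
import sqlite3

# In-memory database, used purely for illustration.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables: each customer row must point at an existing country.
conn.executescript("""
CREATE TABLE country (
    country_id   INTEGER PRIMARY KEY,
    country_name TEXT NOT NULL UNIQUE
);
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL,
    country_id    INTEGER NOT NULL REFERENCES country (country_id)
);
""")

conn.execute("INSERT INTO country VALUES (1, 'United Kingdom')")
conn.execute("INSERT INTO customer VALUES (10, 'Alice', 1)")

# A customer that references a missing country violates the foreign key.
try:
    conn.execute("INSERT INTO customer VALUES (11, 'Bob', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

customer_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

The primary key identifies each row, and the foreign key is what lets the database reject a row that points at nothing.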

A good and thorough data model design is crucial if you need reliable and cost-effective data transformation for your data platform. It helps to ensure that the data is processed without delays and unnecessary steps.

Companies use a technique known as dimensional data modelling to process data. Splitting schemas (datasets) into Source, Production and Analytics levels enables effective data governance and makes sure our data is ready for business intelligence and machine learning.
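One way to picture the split between levels is the sketch below, which uses Python's sqlite3 module and attaches three in-memory databases as stand-ins for schemas. It assumes that the source level holds raw loaded data, the production level holds cleaned data and the analytics level holds aggregates. The table names and cleaning rules are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach one in-memory database per level to mimic separate schemas (datasets).
for layer in ("source", "production", "analytics"):
    conn.execute(f"ATTACH DATABASE ':memory:' AS {layer}")

conn.executescript("""
-- Source: raw data as loaded, everything is text and may contain duplicates.
CREATE TABLE source.transactions (transaction_id TEXT, country TEXT, amount TEXT);
INSERT INTO source.transactions VALUES
    ('1', ' uk ', '10.50'),
    ('2', 'UK',   '4.50'),
    ('2', 'UK',   '4.50'),
    ('3', 'de',   '7.00');

-- Production: typed, trimmed and de-duplicated.
CREATE TABLE production.transactions AS
SELECT DISTINCT
    CAST(transaction_id AS INTEGER) AS transaction_id,
    UPPER(TRIM(country))            AS country,
    CAST(amount AS REAL)            AS amount
FROM source.transactions;

-- Analytics: aggregated and ready for reporting.
CREATE TABLE analytics.revenue_by_country AS
SELECT country, SUM(amount) AS revenue
FROM production.transactions
GROUP BY country;
""")

report = conn.execute(
    "SELECT country, revenue FROM analytics.revenue_by_country ORDER BY country"
).fetchall()
```

Keeping each level in its own schema means access rules and data quality checks can be applied per level, which is the governance benefit described above.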

Any measurable information is stored in fact tables, e.g. transactions, sessions or requests.

Fact tables use foreign keys to connect to dimension tables. Dimension tables hold descriptive data that is linked to the fact table, e.g. brand, product type/code or country.
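The relationship between a fact table and its dimension tables can be sketched with Python's sqlite3 module. All table names, column names and rows here are hypothetical and serve only to illustrate the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes.
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    brand        TEXT NOT NULL,
    product_type TEXT NOT NULL
);
CREATE TABLE dim_country (
    country_id   INTEGER PRIMARY KEY,
    country_name TEXT NOT NULL
);
-- Fact table: one row per measurable event, with foreign keys to dimensions.
CREATE TABLE fact_transaction (
    transaction_id INTEGER PRIMARY KEY,
    product_id     INTEGER NOT NULL REFERENCES dim_product (product_id),
    country_id     INTEGER NOT NULL REFERENCES dim_country (country_id),
    amount         REAL NOT NULL
);
INSERT INTO dim_product VALUES (1, 'Acme', 'laptop'), (2, 'Globex', 'phone');
INSERT INTO dim_country VALUES (1, 'United Kingdom'), (2, 'Germany');
INSERT INTO fact_transaction VALUES
    (100, 1, 1, 900.0),
    (101, 2, 1, 400.0),
    (102, 2, 2, 450.0);
""")

# Measures come from the fact table, labels come from the dimension.
revenue_by_brand = conn.execute("""
    SELECT p.brand, SUM(f.amount) AS revenue
    FROM fact_transaction AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.brand
    ORDER BY p.brand
""").fetchall()
```

The fact table stays narrow (keys plus measures), while descriptive attributes such as brand live once in the dimension and are joined in when needed.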

Facts and dimensions are then tied together into a schema based on business requirements.

The two most popular schema types are Star and Snowflake. They are also among the most frequent topics in data engineering job interviews [1].
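As a rough illustration of the difference, the sketch below builds the same product dimension twice using Python's sqlite3 module. The star version keeps one denormalised dimension table, and the snowflake version normalises brand into its own table. All names and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star: a single denormalised product dimension.
CREATE TABLE star_dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_type TEXT NOT NULL,
    brand        TEXT NOT NULL
);
-- Snowflake: brand is normalised into its own table.
CREATE TABLE snow_dim_brand (
    brand_id INTEGER PRIMARY KEY,
    brand    TEXT NOT NULL
);
CREATE TABLE snow_dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_type TEXT NOT NULL,
    brand_id     INTEGER NOT NULL REFERENCES snow_dim_brand (brand_id)
);
-- The fact table is shared by both layouts.
CREATE TABLE fact_sales (
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL,
    amount     REAL NOT NULL
);
INSERT INTO star_dim_product VALUES
    (1, 'laptop', 'Acme'), (2, 'phone', 'Acme'), (3, 'phone', 'Globex');
INSERT INTO snow_dim_brand VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO snow_dim_product VALUES (1, 'laptop', 1), (2, 'phone', 1), (3, 'phone', 2);
INSERT INTO fact_sales VALUES (1, 1, 900.0), (2, 2, 400.0), (3, 3, 450.0);
""")

# Star: one join from the fact table to the dimension.
star_result = conn.execute("""
    SELECT p.brand, SUM(f.amount)
    FROM fact_sales AS f
    JOIN star_dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.brand ORDER BY p.brand
""").fetchall()

# Snowflake: an extra join to reach the normalised brand table.
snowflake_result = conn.execute("""
    SELECT b.brand, SUM(f.amount)
    FROM fact_sales AS f
    JOIN snow_dim_product AS p ON p.product_id = f.product_id
    JOIN snow_dim_brand AS b ON b.brand_id = p.brand_id
    GROUP BY b.brand ORDER BY b.brand
""").fetchall()
```

Both layouts answer the same question. The star version repeats the brand name on every product row but needs fewer joins, while the snowflake version stores each brand once at the cost of an extra join.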
