Author Highlight: Edward Hayter on dbt

13 December 2024

Over on The Data School blog, data engineering consultant Edward Hayter has been writing about various aspects of dbt.

Modularizing SQL (using dbt)

Introduce yourself to the principles of modularization in software engineering more generally, before focusing on how these principles can be applied to SQL with the help of dbt.

Connecting dbt to Microsoft SQL Server

With many dbt training resources referencing cloud databases such as Snowflake and Databricks, you may feel left out if your primary database is SQL Server. This blog outlines the challenges encountered and lessons learned while setting up dbt with Microsoft SQL Server.

What is CI/CD? How can Azure DevOps Help?

Continuous Integration (CI) and Continuous Deployment (CD) are two pillars of modern software development. Together, they aim to automate the processes of building, testing, and deploying code, ensuring that software is delivered more quickly and with fewer errors.

Unit Testing In Data Engineering

Unit testing is rooted in software development and aims to maintain code quality as a project develops and grows over time. The idea is to isolate individual code modules and test that they behave as expected and perform acceptably.

Introduction to Normalized and Denormalized Data

An awareness of database structures is important contextual knowledge for data engineering. One of the key principles when thinking about database design is normalization, an approach to organizing data. This blog introduces normalization and denormalization, discussing their strengths and weaknesses along with an approach to balancing them.

dbt Command Guide

For SQL users venturing into dbt, writing models often feels intuitive. However, other aspects, like mastering dbt's Jinja macros and using the dbt Command Line Interface (CLI) effectively, can present a steeper learning curve. This guide introduces dbt’s CLI, explaining its role in managing data projects, providing setup tips, and offering principles to maximize its impact.

Documentation with dbt

Documentation is important for giving context to the data (both incoming and outgoing) and explaining transformations. It makes processes easier to debug when they go wrong and helps ensure that the data is not misunderstood or misused. This blog covers two key areas of documentation: metadata for data governance purposes, and applying Don't Repeat Yourself (DRY) principles to documentation. It closes with a brief section offering thoughts on structuring and documenting SQL for handover to a fellow developer.

Using dbt and Jinja to Write Custom SQL - Unioning Data

In data preparation, consolidating multiple datasets with identical structures is a frequent task. SQL's UNION operation is commonly used to stack data from these datasets into one consolidated table. However, as the number of datasets increases, maintaining and updating these SQL scripts becomes cumbersome. This is where dbt and Jinja come into play, revolutionizing how SQL queries are written and maintained.
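To give a feel for the idea, here is a minimal sketch of generating a stacked query from a list of tables rather than writing each SELECT by hand. It is not taken from the blog post: plain Python stands in for a Jinja loop, the function build_union_query is hypothetical, and the table and column names are made up for illustration. It uses UNION ALL, which keeps duplicate rows; swap in UNION if duplicates should be removed.

```python
def build_union_query(tables, columns):
    """Return one SELECT per table, stacked with UNION ALL.

    Assumes every table exposes the same columns, as the summary describes.
    """
    column_list = ", ".join(columns)
    selects = [f"SELECT {column_list} FROM {table}" for table in tables]
    return "\nUNION ALL\n".join(selects)


# Hypothetical table and column names, for illustration only.
query = build_union_query(
    tables=["sales_2022", "sales_2023", "sales_2024"],
    columns=["order_id", "order_date", "amount"],
)
print(query)
```

Adding another dataset then means adding one entry to the list of tables, rather than copying and editing another SELECT block.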

Dynamic Rename Tables in SQL with dbt and Jinja

A macro designed for any case where you want to rename the columns of a table based on a mapping table with the structure new_header | header.
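As a rough illustration of the renaming idea, the sketch below turns mapping rows of the form (new_header, header) into a SELECT with column aliases. It is an assumption-laden stand-in rather than the macro from the post: it is written in plain Python instead of Jinja, build_rename_query is a hypothetical helper, the table and column names are invented, and identifiers are assumed to be valid without quoting.

```python
def build_rename_query(table, mapping_rows):
    """Build a SELECT that aliases each original column to its new name.

    mapping_rows is a list of (new_header, header) pairs, mirroring the
    new_header | header structure of the mapping table.
    """
    select_items = [
        f"{header} AS {new_header}" for new_header, header in mapping_rows
    ]
    return "SELECT\n    " + ",\n    ".join(select_items) + f"\nFROM {table}"


# Hypothetical mapping rows and table name, for illustration only.
mapping = [("customer_name", "CustName"), ("order_total", "OrdTot")]
rename_query = build_rename_query("raw_orders", mapping)
print(rename_query)
```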

Author:

Craig Bloodworth
