Upserting Data using Spark and Iceberg
May 25, 2023 • Jonathan Merlevede
Use Spark and Iceberg’s MERGE INTO syntax to efficiently store daily, incremental snapshots of a mutable source table.
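As a rough illustration of what such an upsert can look like, the sketch below builds a basic `MERGE INTO` statement as a string. It is a minimal example and not necessarily the approach taken in the post: the helper `build_merge_statement`, the table name `catalog.db.customers`, the view name `daily_increment`, and the key column `customer_id` are all hypothetical, and the statement assumes the source and target share the same schema.

```python
from typing import Sequence


def build_merge_statement(target: str, source: str, keys: Sequence[str]) -> str:
    """Build a MERGE INTO statement that upserts rows from `source` into `target`.

    Hypothetical helper for illustration only; it does no identifier quoting.
    """
    if not keys:
        raise ValueError("at least one key column is required")
    # A source row (s) matches a target row (t) when all key columns are equal.
    on_clause = " AND ".join(f"t.{key} = s.{key}" for key in keys)
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON {on_clause}\n"
        # Matching rows are overwritten with the incoming values ...
        "WHEN MATCHED THEN UPDATE SET *\n"
        # ... and rows with unseen keys are appended.
        "WHEN NOT MATCHED THEN INSERT *"
    )


# Assumed names: an Iceberg table and a temporary view holding today's increment.
statement = build_merge_statement(
    "catalog.db.customers", "daily_increment", ["customer_id"]
)
print(statement)
```

In a Spark session that has the Iceberg SQL extensions and a catalog configured, the resulting string could be passed to `spark.sql(statement)`. Building the statement separately keeps the sketch runnable without a cluster.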