Diving Headfirst into Delta Lake
Michael Johnson
Delta Lake is an exciting open table format that retains the benefits of open file formats such as Parquet while adding the ACID (Atomicity, Consistency, Isolation, and Durability) properties that we are used to when working with transactional databases such as SQL Server. With its cheap storage costs, high performance, and reliability, Delta Lake is quickly becoming the default table format in big data processing platforms such as Databricks.
Together, we will examine the internals of the Delta Lake file structure to better understand how transactional guarantees are achieved while supporting both concurrent reads and writes.
Delta Lake, however, is not a silver bullet for all your big data woes and, like any transactional database, requires regular maintenance to keep it running at its best.
Finally, Delta Lake is not the only format looking to support these features, and there are two notable competitors: Apache Iceberg and Apache Hudi. We will briefly discuss some of the advantages of each of these systems and look at when you may consider them over Delta Lake.
