Delta Lake: The Definitive Guide: Modern Data Lakehouse Architectures with Data Lakes 1st Edition

4.6 out of 5 stars (9)

Ready to simplify the process of building data lakehouses and data pipelines at scale? In this practical guide, learn how Delta Lake is helping data engineers, data scientists, and data analysts overcome key data reliability challenges with modern data engineering and management techniques.

Authors Denny Lee, Tristen Wentling, Scott Haines, and Prashanth Babu (with contributions from Delta Lake maintainer R. Tyler Croy) share expert insights on all things Delta Lake, including how to run batch and streaming jobs concurrently and accelerate the usability of your data. You'll also uncover how ACID transactions bring reliability to data lakehouses at scale.

This book helps you:

  • Understand key data reliability challenges and how Delta Lake solves them
  • Explain the critical role of Delta transaction logs as a single source of truth
  • Learn the Delta Lake ecosystem with technologies like Apache Flink, Kafka, and Trino
  • Architect data lakehouses with the medallion architecture
  • Optimize Delta Lake performance with features like deletion vectors and liquid clustering

From the Publisher

From the Preface

Welcome to Delta Lake: The Definitive Guide! Since it became an open source project in 2019, Delta Lake has revolutionized how organizations manage and process their data. Designed to bring reliability, performance, and scalability to data lakes, Delta Lake addresses many of the inherent challenges traditional data lake architectures face.

Over the past five years, Delta Lake has undergone significant transformation. Originally focused on enhancing Apache Spark, Delta Lake now boasts a rich ecosystem with integrations across various platforms, including Apache Flink, Trino, and many more. This evolution has enabled Delta Lake to become a versatile and integral component of modern data engineering and data science workflows.

Who This Book Is For

As a team of production users and maintainers of the Delta Lake project, we’re thrilled to share our collective knowledge and experience with you. Our journey with Delta Lake spans from small-scale implementations to internet-scale production lakehouses, giving us a unique perspective on its capabilities and how to work around any complexities.

The primary goal of this book is to provide a comprehensive resource for both newcomers and experts in data lakehouse architectures. For those just starting with Delta Lake, we aim to elucidate its core principles and help you avoid the common mistakes we encountered in our early days. If you’re already well versed in Delta Lake, you’ll find valuable insights into the underlying codebase, advanced features, and optimization techniques to enhance your lakehouse environment.

Throughout these pages, we celebrate the vibrant Delta Lake community and its collaborative spirit! We’re particularly proud to highlight the development of the Delta Rust API and its widely adopted Python bindings, which exemplify the community’s innovative approach to expanding Delta Lake’s capabilities. Delta Lake has evolved significantly since its inception, growing beyond its initial focus on Apache Spark to embrace a wide array of integrations with multiple languages and frameworks. To reflect this diversity, we’ve included code examples featuring Flink, Kafka, Python, Rust, Spark, Trino, and more. This broad coverage ensures that you’ll find relevant examples regardless of your preferred tools and languages.

While we cover the fundamental concepts, we’ve also included our personal experiences and lessons learned. More importantly, we go beyond theory to offer practical guidance on running a production lakehouse successfully. We’ve included best practices, optimization techniques, and real-world scenarios to help you navigate the challenges of implementing and maintaining a Delta Lake–based system at scale.

Whether you’re a data engineer, architect, or scientist, our goal is to equip you with the knowledge and tools to leverage Delta Lake effectively in your data projects. We hope this guide serves as your companion in building robust, efficient, and scalable lakehouse architectures.

Editorial Reviews

About the Author

Denny Lee is a Staff Developer Advocate at Databricks. He is a hands-on distributed systems and data sciences engineer with extensive experience developing internet-scale infrastructure, data platforms, and predictive analytics systems for both on-premises and cloud environments. He also holds a Master of Biomedical Informatics from Oregon Health & Science University and has architected and implemented powerful data solutions for enterprise healthcare customers. His current technical focuses include distributed systems, Apache Spark, deep learning, machine learning, and genomics.

Tristen Wentling works in machine learning, data engineering, and statistical analysis using Python, Apache Spark, and Scala. He is a machine learning advocate who loves the flexibility of neural networks. Tristen holds an M.S. in Mathematics and a B.S. in Applied Mathematics.

Scott Haines is a Databricks Beacon who has worked with data systems and distributed systems and architectures for over 15 years. He recently wrote a book encapsulating his journey, Modern Data Engineering with Apache Spark: A Hands-On Guide for Building Mission-Critical Streaming Applications. He enjoys teaching people how to simplify data systems and data-intensive services, and he takes to the snow in the winter to pursue his love of snowboarding.

Prashanth Babu is a Databricks Certified Developer who helps guide the design and implementation of customer use cases by building out reference architectures, best practices, frameworks, MVPs, and prototypes that enable customers to succeed in turning their data into value.

Product details

  • Publisher: O'Reilly Media
  • Publication date: December 10, 2024
  • Edition: 1st
  • Language: English
  • Print length: 380 pages
  • ISBN-10: 1098151941
  • ISBN-13: 978-1098151942
  • Item Weight: 1.43 pounds
  • Dimensions: 7 x 0.79 x 9.19 inches
  • Best Sellers Rank: #988,460 in Books
  • Customer Reviews: 4.6 out of 5 stars (9)

Customer reviews

4.6 out of 5 stars
9 global ratings
Top reviews from the United States

  • 5 out of 5 stars
    Learn the Ins and Outs of Delta Lake
    Reviewed in the United States on February 20, 2025

    For anyone wanting to kick-start their Delta Lake journey, this book is a must-have. Not only are you presented with the whys and hows of using this novel open table format, but you’ll also learn a lot from the authors’ own stories. Lastly, there is a rich set of accompanying code written in PySpark, Python, and even Scala and Rust, plus virtual environments running JupyterLab for some of the chapters.

  • 5 out of 5 stars
    Good book on Data Architecture on Modern Data Lakes
    Reviewed in the United States on February 13, 2025

    Good book on Data Architecture on Modern Data Lakes.

    One person found this helpful