Learning Fundamentals of Data Engineering

Fundamentals of Data Engineering

These are my notes from the book Fundamentals of Data Engineering.

Although you can access the content through the GitHub page, it is also served with mkdocs-material 💕

Why? 🤔

This is an amazing book for everyone involved in data.

By the end of the book you'll be better equipped to:

Which is a pretty good deal. 🎉

I thought I could share some of my highlights from it. If you want to learn more about any of the topics, please check out the book.

If you're interested in the book, you can purchase a copy. A free copy was previously available via Redpanda, but it is no longer offered. That link now redirects to a guide, which is still useful.

The Structure 🔨

The book consists of 3 parts, made up of 11 chapters and 2 appendices.

Here is the tree of the book.

Fundamentals of Data Engineering
├── Part 1 – Foundation and Building Blocks
│   ├── 1. Data Engineering Described
│   ├── 2. The Data Engineering Lifecycle
│   ├── 3. Designing Good Data Architecture
│   └── 4. Choosing Technologies Across the Data Engineering Lifecycle
├── Part 2 – The Data Engineering Lifecycle in Depth
│   ├── 5. Data Generation in Source Systems
│   ├── 6. Storage
│   ├── 7. Ingestion
│   ├── 8. Queries, Modeling, and Transformation
│   └── 9. Serving Data for Analytics, Machine Learning, and Reverse ETL
└── Part 3 – Security, Privacy, and the Future of Data Engineering
    ├── 10. Security and Privacy
    └── 11. The Future of Data Engineering

My notes below follow this structure.

So grateful that this book exists. Thanks to Joe Reis and Matt Housley.

Contents

Part 1 - Foundation and Building Blocks

  • 1. Data Engineering Described
  • 2. The Data Engineering Lifecycle
  • 3. Designing Good Data Architecture
  • 4. Choosing Technologies Across the Data Engineering Lifecycle

Part 2 - The Data Engineering Lifecycle in Depth

  • 5. Data Generation in Source Systems
  • 6. Storage
  • 7. Ingestion
  • 8. Queries, Modeling, and Transformation
  • 9. Serving Data for Analytics, Machine Learning, and Reverse ETL

Part 3 - Security, Privacy, and the Future of DE

  • 10. Security and Privacy
  • 11. The Future of Data Engineering

Appendices

  • Appendix A - Serialization and Compression
  • Appendix B - Cloud Networking