GoogleCloudPlatform/data-science-on-gcp

O'Reilly book code that actually stays maintained

A textbook repo whose author keeps the notebooks current with GCP changes instead of letting them turn into abandonware.

1.4k stars · Jupyter Notebook · Learning
Velocity (7d): +0.4 ★/day · Trend: steady

What it does This is the companion code for Valliappa Lakshmanan’s O’Reilly book, covering data pipelines, machine learning, and visualization on Google Cloud Platform. The repo tracks two editions: a 2022 second edition (main branch) and an obsolete 2017 first edition branch. Notebooks and scripts walk through BigQuery, Cloud Functions, TensorFlow 2.0, and BigQuery ML.

The interesting bit Most book repos fossilize within a year. This one is explicitly kept in sync with Qwiklabs quests that are “continually tested,” and the author updated the whole stack for TensorFlow 2.0 in 2019. It’s essentially a living textbook with a bug-report pipeline.

Key highlights

  • Covers end-to-end GCP data science: ingestion, processing, ML, visualization
  • Two Qwiklabs quests map directly to the chapters for hands-on practice
  • “Open in Cloud Shell” button spins up the environment without local setup
  • 2017 edition branch exists but is explicitly marked obsolete and unmaintained
  • Author directs broken-code issues to Qwiklabs first, then GitHub issues

Caveats

  • The 2017 edition branch is dead; don’t accidentally start there
  • No local setup instructions visible; Cloud Shell or Qwiklabs appear to be the intended paths

Verdict Worth bookmarking if you’re learning GCP data engineering or following the book. Skip it if you want a standalone framework — this is curriculum, not a library.
