rw-book-cover

Metadata

Highlights

  • Versioning GBs of datasets is practically impossible with GitHub because it imposes an upper limit on the file size we can push to its remote repositories. (View Highlight)
  • The core idea is to integrate another version controlling system with Git, specifically used for large files. (View Highlight)
  • To help you skill up to build and deliver large data projects, we covered this in a dedicated deep dive here: You Cannot Build Large Data Projects Until You Learn Data Version Control! (View Highlight)
  • This 32-minute deep dive will teach you everything you need to know about building 100% reproducible ML projects. (View Highlight)
  • As we transition to real-world data projects, we need data version control. (View Highlight)