Reproducible data science over data lakes: replayable data pipelines with Bauplan and Nessie.
Bauplan and Nessie enable reproducible data science by providing time-travel and branching semantics on data lakes while decoupling compute from data management.
arxiv.org
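
To make "time-travel and branching semantics" concrete, here is a minimal in-memory sketch of the idea. It is a toy model under stated assumptions, not the Bauplan or Nessie API: the class `ToyCatalog` and its methods `write`, `read`, and `create_branch` are hypothetical names invented for illustration. Each write produces a new immutable snapshot (a commit), and a branch is just a named pointer to a commit.

```python
class ToyCatalog:
    """Hypothetical toy catalog; NOT the Bauplan or Nessie API."""

    def __init__(self):
        # Commit id -> snapshot mapping table name to its rows.
        # Commit 0 is the empty starting state.
        self._commits = [{}]
        # Branch name -> id of the commit it currently points to.
        self._branches = {"main": 0}

    def create_branch(self, name, from_branch="main"):
        # A new branch starts at the same commit as its source branch;
        # no data is copied.
        self._branches[name] = self._branches[from_branch]

    def write(self, branch, table, rows):
        # Copy the branch's current snapshot, change one table, and
        # record the result as a new commit. Old commits stay untouched.
        snapshot = dict(self._commits[self._branches[branch]])
        snapshot[table] = tuple(rows)
        self._commits.append(snapshot)
        self._branches[branch] = len(self._commits) - 1
        return self._branches[branch]

    def read(self, table, branch="main", commit=None):
        # Read from the branch head, or "time travel" to an explicit commit.
        commit_id = self._branches[branch] if commit is None else commit
        return self._commits[commit_id][table]


catalog = ToyCatalog()
first = catalog.write("main", "trips", [1, 2, 3])

# Experiment on a branch without affecting main.
catalog.create_branch("experiment")
catalog.write("experiment", "trips", [1, 2, 3, 4])

# Overwrite main, then read the table as it was at the first commit.
catalog.write("main", "trips", [9])
old_rows = catalog.read("trips", commit=first)
```

Because commits are never modified, re-running a pipeline against a fixed commit id always sees the same input data, which is the property that makes a run replayable in this simplified picture.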