The Architecture of Grab's Data Lake
(1) The separation between storage and compute, as encouraged by data lake architectures (e.g. the implementation of P would look different in a traditional database like PostgreSQL, or a cloud warehouse like Snowflake). This architecture is the focus of the current system, and it is prevalent in most mid-to-large enterprises (its benefits that...
Jacopo Tagliabue • Reproducible data science over data lakes: replayable data pipelines with Bauplan and Nessie.
Behind AWS S3’s Massive Scale
highscalability.com
The commonly agreed on benefits of this style include: increase in agility, developer productivity, resilience, scalability, reliability, maintainability, separation of concerns, and ease of deployment. However, those benefits come with challenges, such as discovering services over the network, security management, communication optimization, data ...
... you’ve got all of these disparate functions, which all basically run off the same underlying CAD [Computer-Aided Design] data but are just transformed in lots of different ways. So because of those transformations, you end up having the same data represented in different data silos, which then leads to error.
- James Proud, interview with Ben