Breaking Down the Modern Data Stack: Practical Insights for Leveraging Analytics Progress
You need to be able to draw insight and to structure that information such that people can then act on it. What is doing that? Today, transformation tools like dbt are doing that, if you take the lens of the data team really owning everything end-to-end, but I think also applications that are able to plug into the data warehouse, consume this raw i...
Jan-Erik Asplund • Earl Lee, co-founder and CEO of HeadsUp, on the modern data stack value chain
Ted Glasnow added
despite the incredible shift of data to cloud services (improving accessibility/latency) and the venture-fueled explosion of the "Modern Data Stack" over the past 5y, the vast majority of companies are still... not very good at working with their data to make business decisions
Mo Shafieeha added
The platform needs to facilitate integrating new data, ad hoc queries, and visualization to accelerate human understanding. As valuable insights emerge from this platform, they become the requirements for changes to production systems and processes.
Thomas H. Davenport • Big Data at Work: Dispelling the Myths, Uncovering the Opportunities
Why is data integration so hard? The data is often in different formats that aren’t easily analyzed by computers – PDFs, notebooks, Excel files (my god, so many Excel files) and so on. But often what really gets in the way is organizational politics: a team, or group, controls a key data source, the reason for their existence is that they are the g...
Nabeel S. Qureshi • Reflections on Palantir
The next step is data transformation. Data transformation is the process of converting data from one format to another. Reasons for doing this could be to optimize the data for a different use case than it was originally intended or to meet the requirements for storing data in a different system. Data transformation may involve steps such as cleans...
Data Engineering • The Open Data Stack Distilled into Four Core Tools
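As a rough illustration of "converting data from one format to another", here is a minimal sketch that reads CSV text and emits JSON records. It is not taken from the quoted article: the function name `transform_csv_to_json`, the column names `customer` and `amount`, and the trimming and rounding rules are all hypothetical choices made for this example.

```python
import csv
import io
import json


def transform_csv_to_json(csv_text: str) -> str:
    """Convert CSV text into a JSON array of normalized records.

    Hypothetical example: the column names and normalization rules
    below are assumptions, not part of any particular tool.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        records.append(
            {
                # Normalize the text field: trim whitespace, lowercase.
                "customer": row["customer"].strip().lower(),
                # Convert the string amount to a number for the target system.
                "amount_usd": round(float(row["amount"]), 2),
            }
        )
    return json.dumps(records)


raw = "customer,amount\n Acme ,19.990\nGlobex,5\n"
print(transform_csv_to_json(raw))
```

In a warehouse-centric stack this kind of step would more likely be expressed as SQL models (for example in dbt, mentioned in the first quote) than as a standalone script; the script form is used here only to keep the sketch self-contained.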
Nicolay Gerold added
My top issues with the foundations of the "modern data stack":
- Snowflake - you may only pay for what you use, but you pay through the nose for it. The cost is so high that in many cases it can exceed the cost of binders full of dbas.
- Fivetran - convenient, but they tried to triple my licensing cost last year, they have almost constant 15 minute o
r/dataengineering - Reddit
Nicolay Gerold added