Data Storage
Who will this data model serve? These are the stakeholders and users of the data model.
Why does this data model need to be built? What is the purpose and objective of the data model?
What are the data model's core entities, attributes, and relationships? This is where you learn to see the business through the lens of data.
When is the timeframe for t...
Shortwave - rajhesh.panchanadhan@gmail.com [Gmail alternative]
A serverless vector database
built from first principles on object storage: 10-100x cheaper, usage-based pricing, massive scalability
turbopuffer
With Quary, engineers can:
- Connect to their database
- Write SQL queries to transform, organize, and document tables in a database
- Create charts, dashboards and reports (in development)
- Test, collaborate & refactor iteratively through version control
- Deploy the organised, documented model back up to the database
View the documentation.
GitHub - quarylabs/quary: Open-source BI for engineers
- Scalability is crucial - systems need to be designed with the assumption that query volume, document corpus size, indexing complexity etc. could increase by 10x. What works at one scale may completely break at a higher scale.
- Sharding the index, either by document or by word, is important to distribute the indexing and querying load across machines.
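The two sharding choices mentioned above can be sketched with a toy inverted index. This is an illustrative sketch only: the shard count, the CRC32-based placement, and all function names are assumptions, not taken from any particular search system.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 3  # hypothetical shard count


def shard_for(key: str) -> int:
    """Map a key (document id or word) to a shard with a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS


def build_document_sharded(docs):
    """Sharding by document: each shard indexes every word of its own documents."""
    shards = [defaultdict(set) for _ in range(NUM_SHARDS)]
    for doc_id, text in docs.items():
        shard = shards[shard_for(doc_id)]
        for word in text.lower().split():
            shard[word].add(doc_id)
    return shards


def build_term_sharded(docs):
    """Sharding by word: each shard holds the full posting list of its own words."""
    shards = [defaultdict(set) for _ in range(NUM_SHARDS)]
    for doc_id, text in docs.items():
        for word in text.lower().split():
            shards[shard_for(word)][word].add(doc_id)
    return shards


def search_document_sharded(shards, word):
    """A query must fan out to every shard and union the partial results."""
    hits = set()
    for shard in shards:
        hits |= shard.get(word, set())
    return hits


def search_term_sharded(shards, word):
    """A single-word query touches only the shard that owns the word."""
    return set(shards[shard_for(word)].get(word, set()))


docs = {"d1": "red apple", "d2": "green apple", "d3": "red car"}
doc_shards = build_document_sharded(docs)
term_shards = build_term_sharded(docs)
print(sorted(search_document_sharded(doc_shards, "apple")))
print(sorted(search_term_sharded(term_shards, "apple")))
```

Both layouts return the same documents. The difference is where the work lands: document sharding spreads indexing evenly but every query hits every shard, while word sharding sends a single-word query to one shard at the cost of uneven load for very common words.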
Claude
For high-throughput data, Grab uses Apache Avro with a strategy called Merge on Read (MOR).
Here are the main operations with Merge on Read:
- Write Operations - When data is written, it's appended to the end of a log file. This is much more efficient than merging it in the current data and reduces the latency of writes.
- Read Operations - When you need
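A toy in-memory sketch of the merge-on-read idea follows. The write path mirrors the description above (append to a log instead of merging). The read path is cut off in the text, so the merge-at-read and compaction steps here follow the general merge-on-read pattern and are assumptions, not Grab's implementation. The class and method names are hypothetical.

```python
class MergeOnReadTable:
    """Toy merge-on-read table: a compacted base plus an append-only log."""

    def __init__(self, base=None):
        self.base = dict(base or {})  # last compacted snapshot
        self.log = []                 # append-only (key, value) records

    def write(self, key, value):
        # Writes only append to the log; the base is not touched,
        # which keeps write latency low.
        self.log.append((key, value))

    def read(self, key):
        # Assumption: reads merge on the fly, so the newest log record wins
        # and the base is consulted only if the log has nothing for the key.
        for logged_key, value in reversed(self.log):
            if logged_key == key:
                return value
        return self.base.get(key)

    def snapshot(self):
        # Full merged view: base overlaid with log records in write order.
        merged = dict(self.base)
        for key, value in self.log:
            merged[key] = value
        return merged

    def compact(self):
        # Assumption: a periodic job folds the log into a new base.
        self.base = self.snapshot()
        self.log = []


table = MergeOnReadTable({"ride-1": "requested"})
table.write("ride-1", "completed")
table.write("ride-2", "requested")
print(table.read("ride-1"))
print(table.snapshot())
```

The trade-off is visible in the sketch: `write` is a constant-time append, while `read` pays for the merge until `compact` runs.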
The Architecture of Grab's Data Lake
At the current pace of media content creation, Reddit expects their media metadata to be roughly 50 terabytes. This means they need to implement sharding and partition their tables across multiple Postgres instances.
Reddit shards their tables based on post_id where they use range-based partitioning. All posts with a post_id in a certain range will ...
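Range-based partitioning on post_id can be sketched as a sorted list of range boundaries plus a binary search. The boundaries and shard names below are made up for illustration; the text does not give Reddit's actual ranges or instance names.

```python
import bisect

# Hypothetical exclusive upper bounds of each shard's post_id range.
SHARD_UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]
# Hypothetical Postgres instance names, one per range.
SHARD_NAMES = ["pg-shard-0", "pg-shard-1", "pg-shard-2"]


def shard_for_post(post_id: int) -> str:
    """Return the shard whose post_id range contains post_id."""
    if post_id < 0:
        raise ValueError("post_id must be non-negative")
    # bisect_right counts the bounds that are <= post_id, which is
    # exactly the index of the first range whose upper bound is larger.
    index = bisect.bisect_right(SHARD_UPPER_BOUNDS, post_id)
    if index >= len(SHARD_NAMES):
        raise ValueError(f"no shard covers post_id {post_id}")
    return SHARD_NAMES[index]


print(shard_for_post(42))
print(shard_for_post(1_000_000))
```

Because neighbouring post_id values land on the same shard, a scan over a contiguous id range stays on one instance, and adding capacity means appending a new boundary and shard name.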
Shortwave - rajhesh.panchanadhan@gmail.com [Gmail alternative]
Denormalization
Another way Reddit minimizes joins is by using denormalization.
They took all the metadata fields required for displaying an image post and put them together into a single JSONB field. Instead of fetching different fields and combining them, they can just fetch that single JSONB field.
This made it much more efficient to fetch all the ...
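The difference between joining separate metadata tables and reading one denormalized field can be sketched as follows. This is only an illustration: SQLite with a JSON text column stands in for Postgres with JSONB so the example runs anywhere, and the table and field names are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Normalized layout: metadata spread over several tables.
    CREATE TABLE post_dimensions (post_id INTEGER PRIMARY KEY, width INTEGER, height INTEGER);
    CREATE TABLE post_urls (post_id INTEGER PRIMARY KEY, url TEXT);
    -- Denormalized layout: everything needed for display in one field.
    CREATE TABLE post_media (post_id INTEGER PRIMARY KEY, metadata TEXT);
    """
)

conn.execute("INSERT INTO post_dimensions VALUES (1, 800, 600)")
conn.execute("INSERT INTO post_urls VALUES (1, 'https://example.com/cat.png')")
display_fields = {"width": 800, "height": 600, "url": "https://example.com/cat.png"}
conn.execute("INSERT INTO post_media VALUES (?, ?)", (1, json.dumps(display_fields)))


def fetch_normalized(post_id):
    """Fetch display metadata by joining the separate tables."""
    row = conn.execute(
        "SELECT d.width, d.height, u.url "
        "FROM post_dimensions d JOIN post_urls u ON u.post_id = d.post_id "
        "WHERE d.post_id = ?",
        (post_id,),
    ).fetchone()
    return {"width": row[0], "height": row[1], "url": row[2]}


def fetch_denormalized(post_id):
    """Fetch the same metadata with a single-row, single-column lookup."""
    row = conn.execute(
        "SELECT metadata FROM post_media WHERE post_id = ?", (post_id,)
    ).fetchone()
    return json.loads(row[0])


print(fetch_normalized(1) == fetch_denormalized(1))
```

The denormalized read needs no join, at the cost of keeping the combined field in sync whenever one of its source values changes.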
Shortwave - rajhesh.panchanadhan@gmail.com [Gmail alternative]
This README is under construction as we work to build a new community-driven, high-performance key-value store.
This project was forked from the open source Redis project right before the transition to their new source-available licenses.
This README is just a fast quick start document. We are currently working on a more permanent documentation page.
W...
GitHub - valkey-io/valkey: A new project to resume development on the formerly open-source Redis project. We're calling it Valkey, like a Valkyrie.
SQLite Studio
Single binary, single command SQLite database explorer.
sqlite-studio <sqlite_db>
Features
- Overview page with common metadata.
- Tables page with each table's metadata, including the disk size being used by each table.
- Infinite scroll rows view.
- A custom query page that gives you more access to your db.
More features available ...