Data Storage
Our Goals
We made it lightweight and kept the efficiency in mind:
- Self-contained
We ship a single dependency-free binary that runs on all Linux distributions
- Fast to deploy, safe to operate
We are sysadmins, we know the value of operator-friendly software
- Deploy everywhere on every machine
We do not have a dedicated backbone, and neither do you, so
Garage - An open-source distributed object storage service
We can't share the exact formula for our search ranking, but here are the few parameters we consider:
- Exact match (rank #1)
- Frequency of matching lexemes using ts_rank
- Similarity score using similarity
- Type of record
- Popularity of the search result
- Similarity between the result’s alias and query
- Inverse of the result’s string length
How Levels.fyi Built Scalable Search with PostgreSQL
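The excerpt above lists ranking signals without giving the formula, so the following is only a rough sketch of how such signals might be combined, not the Levels.fyi method. The weights, the `Candidate` fields, the `TYPE_WEIGHT` table, and the `score` and `rank` helpers are all hypothetical. `lexeme_rank` stands in for a value that PostgreSQL's `ts_rank` would return, and the local `similarity` function is a simplified imitation of the trigram similarity from `pg_trgm` (it splits on whitespace only).

```python
from dataclasses import dataclass


def trigrams(text: str) -> set[str]:
    """Trigram set in the style of pg_trgm: lowercase each word and pad it
    with two leading spaces and one trailing space before slicing."""
    grams = set()
    for word in text.lower().split():
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


@dataclass
class Candidate:
    name: str
    alias: str
    record_type: str
    popularity: float   # assumed to be normalized to 0..1
    lexeme_rank: float  # assumed to come from ts_rank in the database


# Hypothetical per-type boosts; the source does not say which types exist.
TYPE_WEIGHT = {"company": 1.0, "title": 0.8, "location": 0.6}


def score(query: str, c: Candidate) -> float:
    """Combine the listed signals with made-up weights."""
    if c.name.lower() == query.lower():
        return float("inf")  # an exact match always ranks first
    return (
        2.0 * c.lexeme_rank
        + 2.0 * similarity(query, c.name)
        + 1.0 * similarity(query, c.alias)
        + 0.5 * TYPE_WEIGHT.get(c.record_type, 0.0)
        + 0.5 * c.popularity
        + 1.0 / max(len(c.name), 1)  # shorter names get a small boost
    )


def rank(query: str, candidates: list[Candidate]) -> list[Candidate]:
    """Return candidates ordered from best to worst score."""
    return sorted(candidates, key=lambda c: score(query, c), reverse=True)
```

In a real deployment these terms would more likely be computed inside the SQL query itself, so that PostgreSQL can sort and limit the results before they leave the database.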
This README is under construction as we work to build a new community driven high performance key-value store.
This project was forked from the open source Redis project right before the transition to their new source available licenses.
This README is just a fast quick start document. We are currently working on a more permanent documentation page.
GitHub - valkey-io/valkey: A new project to resume development on the formerly open-source Redis project. We're calling it Valkey, like a Valkyrie.
A serverless vector database
built from first principles on object storage: 10-100x cheaper, usage-based pricing, massive scalability
turbopuffer
ReadySet is a transparent database cache for Postgres & MySQL that gives you the performance and scalability of an in-memory key-value store without requiring that you rewrite your app or manually handle cache invalidation. ReadySet sits between your application and database and turns even the most complex SQL reads into lightning-fast lookups.
readysettech • GitHub - readysettech/readyset: Readyset is a MySQL and Postgres wire-compatible caching layer that sits in front of existing databases to speed up queries and horizontally scale read throughput. Under the...
Expose Delta Tables via REST APIs
Git repo to test 3 architectures to expose delta tables via REST APIs. See also my blogpost here. Architectures can be described as follows:
- Architecture A: Direct, Web App with DuckDB. In this architecture, APIs are directly connecting to the delta table and there is no layer in between. This implies that all data
GitHub - rebremer/expose-deltatable-via-restapi
Overview
pg_lakehouse is an extension that transforms Postgres into an analytical query engine over object stores like S3 and table formats like Delta Lake. Queries are pushed down to Apache DataFusion, which delivers excellent analytical performance. Combinations of the following object stores, table formats, and file formats are supported.
https://github.com/paradedb/paradedb/tree/dev/pg_l...
Data bases have gotten so good at this, that the term is almost misleading now. “Base” suggests something rigid, without which the data would slip away. But the data is always there, just bits on a nameless hard disk. The structure and the accessibility that a modern database provides exist completely independently from that hard disk.
DuckDB Doesn’t Need Data To Be a Database
Datasette is a tool for exploring and publishing data. It helps people take data of any shape, analyze and explore it, and publish it as an interactive website and accompanying API.
Datasette is aimed at data journalists, museum curators, archivists, local governments, scientists, researchers and anyone else who has data that they wish to share with...
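Datasette publishes SQLite database files, so one way to try it is to build a small database with Python's standard library first. The sketch below assumes a hypothetical `exhibits` table and a `build_database` helper; neither comes from the Datasette documentation.

```python
import sqlite3


def build_database(path: str, rows: list[tuple[str, int]]) -> int:
    """Create the (hypothetical) exhibits table at `path`, insert `rows`,
    and return the number of rows now stored in the table."""
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS exhibits ("
                "id INTEGER PRIMARY KEY, title TEXT NOT NULL, year INTEGER)"
            )
            conn.executemany(
                "INSERT INTO exhibits (title, year) VALUES (?, ?)", rows
            )
        return conn.execute("SELECT COUNT(*) FROM exhibits").fetchone()[0]
    finally:
        conn.close()
```

After calling `build_database("exhibits.db", rows)`, running `datasette exhibits.db` should serve the file as a browsable site on a local port, with a JSON view of each table available by adding `.json` to the table's URL.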