Data Storage
SQL Studio
Single binary, single command SQL database explorer. SQL Studio supports SQLite, libSQL, PostgreSQL, MySQL and DuckDB.
Local SQLite DB File
sql-studio sqlite [sqlite_db]
Remote libSQL Server
sql-studio libsql [url] [auth_token]
PostgreSQL Server
sql-studio postgres [url]
MySQL/MariaDB Server
sql-studio mysql [url]
Local DuckDB File
sq...
frectonz • GitHub - frectonz/sql-studio: SQL Database Explorer [SQLite, libSQL, PostgreSQL, MySQL/MariaDB, DuckDB, ClickHouse]
WebDataset
WebDataset is a library for writing I/O pipelines for large datasets. Its sequential I/O and sharding features make it especially useful for streaming large-scale datasets to a DataLoader.
The WebDataset format
A WebDataset file is a TAR archive containing a series of data files. All successive data files with the same prefix are consider...
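The layout described above can be sketched with nothing but the Python standard library. This is a minimal illustration, not the WebDataset library's own API: it assumes that consecutive files sharing a prefix form one sample (which is what the truncated sentence appears to say), and the helper names `write_shard` and `read_samples`, the `__key__` field, and the split at the first dot are choices made for this example only.

```python
import io
import tarfile


def write_shard(samples):
    """Pack samples into an in-memory TAR; each file is named '<key>.<ext>'."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for key, fields in samples:
            for ext, payload in fields.items():
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_samples(shard_bytes):
    """Read the TAR sequentially, grouping consecutive files by prefix."""
    samples = []
    # "r|" opens the archive as a forward-only stream (no seeking back).
    with tarfile.open(fileobj=io.BytesIO(shard_bytes), mode="r|") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Assumption: the prefix is everything before the first dot.
            key, ext = member.name.split(".", 1)
            data = tar.extractfile(member).read()
            # A new prefix starts a new sample; the same prefix extends it.
            if not samples or samples[-1]["__key__"] != key:
                samples.append({"__key__": key})
            samples[-1][ext] = data
    return samples


shard = write_shard([
    ("sample0", {"txt": b"hello", "cls": b"0"}),
    ("sample1", {"txt": b"world", "cls": b"1"}),
])
for sample in read_samples(shard):
    print(sample["__key__"], sorted(k for k in sample if k != "__key__"))
```

Because the reader only moves forward through the archive, the same loop would work on a shard arriving over a network stream, which is the property the excerpt highlights.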
WebDataset
For high-throughput data, Grab uses Apache Avro with a strategy called Merge on Read (MOR).
Here are the main operations with Merge on Read:
- Write Operations - When data is written, it's appended to the end of a log file. This is much more efficient than merging it in the current data and reduces the latency of writes.
- Read Operations - When you need
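A toy sketch of the idea follows. It is not Grab's implementation: the excerpt cuts off before describing reads, so the read path here (overlay the log on the compacted base data at query time) is an assumption based on the name of the strategy. The class `MergeOnReadTable`, its methods, and the use of JSON lines in memory instead of Avro files are all hypothetical stand-ins.

```python
import json


class MergeOnReadTable:
    """Toy merge-on-read store: cheap appends, merging deferred to reads."""

    def __init__(self):
        self.base = {}  # compacted records, keyed by record key
        self.log = []   # append-only log of serialized writes

    def write(self, key, value):
        # Writes only append to the log; the base data is not touched.
        self.log.append(json.dumps({"key": key, "value": value}))

    def read(self, key):
        # Assumed read path: start from the base, then apply log entries
        # in order so that the most recent write wins.
        result = self.base.get(key)
        for line in self.log:
            entry = json.loads(line)
            if entry["key"] == key:
                result = entry["value"]
        return result

    def compact(self):
        # Fold the log into the base so later reads have less to merge.
        for line in self.log:
            entry = json.loads(line)
            self.base[entry["key"]] = entry["value"]
        self.log.clear()


table = MergeOnReadTable()
table.write("ride-1", {"status": "requested"})
table.write("ride-1", {"status": "completed"})
print(table.read("ride-1"))
table.compact()
print(table.read("ride-1"))
```

The trade-off is visible in the sketch: `write` is a single append, while `read` pays for scanning the log until `compact` runs.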
The Architecture of Grab's Data Lake
Our Goals
We made it lightweight and kept the efficiency in mind:
- Self-contained - We ship a single dependency-free binary that runs on all Linux distributions
- Fast to deploy, safe to operate - We are sysadmins, we know the value of operator-friendly software
- Deploy everywhere on every machine - We do not have a dedicated backbone, and neither do you, so...
Garage - An open-source distributed object storage service
SQL has limitations as it is built on relational concepts and relies on binary joins.
The future of databases is shifting towards relational knowledge graphs, allowing the flexibility to work with various data structures beyond tables.
Businesses are moving towards explicitly modeling business semantics and logic, which are often stored in document...
Nicolay Gerold • Tweet
pg_vectorize: a VectorDB for Postgres
A Postgres extension that automates the transformation and orchestration of text to embeddings and provides hooks into the most popular LLMs. This allows you to do vector search and build LLM applications on existing data with as little as two function calls.
This project relies heavily on the work by pgvector f...
GitHub - tembo-io/pg_vectorize: The simplest way to orchestrate vector search on Postgres
Data bases have gotten so good at this, that the term is almost misleading now. “Base” suggests something rigid, without which the data would slip away. But the data is always there, just bits on a nameless hard disk. The structure and the accessibility that a modern database provides exist completely independently from that hard disk. That’s right...
DuckDB Doesn’t Need Data To Be a Database
filesystem_spec
A specification for pythonic filesystems.
Install
pip install fsspec
would install the base fsspec. Various optionally supported features might require specification of custom extra require, e.g. pip install fsspec[ssh] will install dependencies for ssh backends support. Use pip install fsspec[full] for installation of all known extr...
fsspec • GitHub - fsspec/filesystem_spec: A specification that python filesystems should adhere to.
Unlike some other popular algorithms, DiskANN is designed to keep memory usage to a minimum. This makes it a great match for use cases where Turso already excels.
Multitenancy
Turso allows for an easy implementation of a database-per-tenant pattern, where databases can be cheaply created on-demand. Keeping memory consumption at bay is critical f...
Turso brings Native Vector Search to SQLite
memary: Open-Source Longterm Memory for Autonomous Agents
memary demo
Why use memary?
Agents use LLMs that are currently constrained to finite context windows. memary overcomes this limitation by allowing your agents to store a large corpus of information in knowledge graphs, infer user knowledge through our memory modules, and only retrieve relevan...
GitHub - kingjulio8238/memary: Longterm Memory for Autonomous Agents.
Data