Data Storage
SQL has limitations as it is built on relational concepts and relies on binary joins.
The future of databases is shifting towards relational knowledge graphs, allowing the flexibility to work with various data structures beyond tables.
Businesses are moving towards explicitly modeling business semantics and logic, which are often stored in...
Nicolay Gerold • Tweet
- Scalability is crucial - systems need to be designed with the assumption that query volume, document corpus size, indexing complexity etc. could increase by 10x. What works at one scale may completely break at a higher scale.
- Sharding the index, either by document or by word, is important to distribute the indexing and querying load across machines.
Claude
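The two sharding strategies mentioned above can be sketched with a toy inverted index. This is a minimal illustration, not a production design: `NUM_SHARDS`, the sample documents, and all function names are hypothetical, and each "machine" is just an in-process dictionary.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 3  # hypothetical cluster size


def tokenize(text):
    return text.lower().split()


def shard_of(key, num_shards=NUM_SHARDS):
    # crc32 is stable across runs, unlike Python's built-in hash() on strings.
    return zlib.crc32(key.encode("utf-8")) % num_shards


def build_document_sharded(docs, num_shards=NUM_SHARDS):
    """Each shard holds a complete inverted index for a subset of documents."""
    shards = [defaultdict(set) for _ in range(num_shards)]
    for doc_id, text in docs.items():
        index = shards[shard_of(doc_id, num_shards)]
        for term in tokenize(text):
            index[term].add(doc_id)
    return shards


def build_term_sharded(docs, num_shards=NUM_SHARDS):
    """Each shard holds the full posting list for a subset of terms."""
    shards = [defaultdict(set) for _ in range(num_shards)]
    for doc_id, text in docs.items():
        for term in tokenize(text):
            shards[shard_of(term, num_shards)][term].add(doc_id)
    return shards


def query_document_sharded(shards, term):
    # Fan out to every shard and merge the partial posting lists.
    result = set()
    for index in shards:
        result |= index.get(term, set())
    return result


def query_term_sharded(shards, term):
    # Only the shard that owns the term needs to be contacted.
    return set(shards[shard_of(term, len(shards))].get(term, set()))


DOCS = {
    "d1": "sharding the index by document",
    "d2": "sharding by word spreads posting lists",
    "d3": "the index must scale",
}
doc_shards = build_document_sharded(DOCS)
term_shards = build_term_sharded(DOCS)
```

Both layouts return the same answer for a single-term query. The trade-off is where the work lands: document sharding touches every shard per query but keeps writes local to one shard, while word sharding touches one shard per query term but spreads each document's writes across many shards.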
Data bases have gotten so good at this, that the term is almost misleading now. “Base” suggests something rigid, without which the data would slip away. But the data is always there, just bits on a nameless hard disk. The structure and the accessibility that a modern database provides exist completely independently from that hard disk. That’s right...
DuckDB Doesn’t Need Data To Be a Database
WebDataset
WebDataset is a library for writing I/O pipelines for large datasets. Its sequential I/O and sharding features make it especially useful for streaming large-scale datasets to a DataLoader.
The WebDataset format
A WebDataset file is a TAR archive containing a series of data files. All successive data files with the same prefix are...
WebDataset
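The TAR layout described above can be illustrated with the standard library alone. This is a sketch of the idea, not the WebDataset library's API: it assumes that consecutive members sharing a prefix (everything before the first dot) form one sample, and that member names contain no directory components. `write_shard`, `read_samples`, and the sample data are hypothetical.

```python
import io
import tarfile
from itertools import groupby


def split_name(name):
    # "sample0000.txt" -> ("sample0000", "txt")
    key, ext = name.split(".", 1)
    return key, ext


def write_shard(samples):
    """Write samples to an in-memory TAR, one member per (key, extension)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for key, fields in samples:
            for ext, payload in fields.items():
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


def read_samples(fileobj):
    """Yield (key, {extension: bytes}) by grouping consecutive members."""
    with tarfile.open(fileobj=fileobj, mode="r") as tar:
        members = (m for m in tar if m.isfile())
        for key, group in groupby(members, key=lambda m: split_name(m.name)[0]):
            yield key, {split_name(m.name)[1]: tar.extractfile(m).read() for m in group}


SAMPLES = [
    ("sample0000", {"txt": b"a cat", "cls": b"0"}),
    ("sample0001", {"txt": b"a dog", "cls": b"1"}),
]
decoded = list(read_samples(write_shard(SAMPLES)))
```

Because samples are recovered by walking the archive front to back, the reader never needs random access, which is what makes the layout suitable for sequential streaming.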
- Always use BUFFERS when running an EXPLAIN. It gives some data that may be crucial for the investigation.
- Always, always try to get an Index Cond (called Index range scan in MySQL) instead of a Filter.
- Always, always, always assume PostgreSQL and MySQL will behave differently. Because they do.
Making a Postgres query 1,000 times faster
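One way to act on the first two tips is to request the plan as JSON, for example with `EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) <query>` in PostgreSQL (note that ANALYZE actually executes the query), and then scan it for nodes that discard rows through a Filter. The sketch below assumes the plan has already been fetched and parsed. `SAMPLE_PLAN` is a hand-written illustration rather than real output, and `find_filter_nodes` is a hypothetical helper.

```python
def find_filter_nodes(node, path=()):
    """Walk a PostgreSQL JSON plan and collect every node that has a Filter."""
    here = path + (node["Node Type"],)
    found = []
    if "Filter" in node:
        found.append({
            "path": " > ".join(here),
            "filter": node["Filter"],
            "rows_removed": node.get("Rows Removed by Filter", 0),
            "index_cond": node.get("Index Cond"),  # None if nothing used the index
            "shared_read_blocks": node.get("Shared Read Blocks", 0),
        })
    for child in node.get("Plans", []):
        found.extend(find_filter_nodes(child, here))
    return found


# Hand-written example shaped like EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) output.
SAMPLE_PLAN = [{
    "Plan": {
        "Node Type": "Limit",
        "Plans": [{
            "Node Type": "Index Scan",
            "Index Name": "orders_created_at_idx",
            "Index Cond": "(created_at > '2024-01-01'::date)",
            "Filter": "(status = 'open'::text)",
            "Rows Removed by Filter": 9990,
            "Shared Hit Blocks": 120,
            "Shared Read Blocks": 4500,
        }],
    }
}]

suspects = find_filter_nodes(SAMPLE_PLAN[0]["Plan"])
for s in suspects:
    print(f'{s["path"]}: {s["filter"]} removed {s["rows_removed"]} rows')
```

A node with a large "Rows Removed by Filter" count and many blocks read is a candidate for an index that covers the filtered column, so the predicate can move into the Index Cond.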
memary: Open-Source Longterm Memory for Autonomous Agents
memary demo
Why use memary?
Agents use LLMs that are currently constrained to finite context windows. memary overcomes this limitation by allowing your agents to store a large corpus of information in knowledge graphs, infer user knowledge through our memory modules, and only retrieve...
GitHub - kingjulio8238/memary: Longterm Memory for Autonomous Agents.
Data
filesystem_spec
A specification for pythonic filesystems.
Install
pip install fsspec
would install the base fsspec. Various optionally supported features might require specification of custom extra require, e.g. pip install fsspec[ssh] will install dependencies for ssh backends support. Use pip install fsspec[full] for installation of all known...
fsspec • GitHub - fsspec/filesystem_spec: A specification that python filesystems should adhere to.
pgmock
Demo — Discord
pgmock is an in-memory PostgreSQL mock server for unit and E2E tests. It requires no external dependencies and runs entirely within WebAssembly on both Node.js and the browser.
Installation
npm install pgmock
If you'd like to run pgmock in a browser, see the Browser support section for detailed instructions.
stackframe-projects • GitHub - stackframe-projects/pgmock: In-memory Postgres for unit/E2E tests
SQLite Studio
Single binary, single command SQLite database explorer.
sqlite-studio <sqlite_db>
Features
- Overview page with common metadata.
- Tables page with each table's metadata, including the disk size being used by each table.
- Infinite scroll rows view.
- A custom query page that gives you more access to your db.
More features available on the r...