Data Storage
Getting Started
Keyv is a simple key-value storage system that supports multiple backends. It's designed to be a simple and consistent way to work with key-value stores.
To learn how to use Keyv, check out the keyv README. To learn how to use a specific storage adapter, check out the README for that adapter under Storage Adapters.
jaredwray • GitHub - jaredwray/keyv: Simple key-value storage with support for multiple backends
SQL Studio
Single binary, single command SQL database explorer. SQL Studio supports SQLite, libSQL, PostgreSQL, MySQL and DuckDB.
Local SQLite DB File
sql-studio sqlite [sqlite_db]
Remote libSQL Server
sql-studio libsql [url] [auth_token]
PostgreSQL Server
sql-studio postgres [url]
MySQL/MariaDB Server
sql-studio mysql [url]
Local DuckDB File
sq...
frectonz • GitHub - frectonz/sql-studio: SQL Database Explorer [SQLite, libSQL, PostgreSQL, MySQL/MariaDB, DuckDB, ClickHouse]
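To try the "Local SQLite DB File" command above, you need a SQLite file to point it at. The following is a minimal sketch that builds one with Python's standard sqlite3 module. The file name, table, and rows are made up for illustration and are not part of SQL Studio.

```python
import sqlite3


def build_sample_db(path: str) -> int:
    """Create a tiny demo table at `path` and return its row count.

    The table and column names are hypothetical sample data.
    """
    conn = sqlite3.connect(path)
    try:
        # `with conn` commits on success and rolls back on error.
        with conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS posts ("
                "post_id INTEGER PRIMARY KEY, title_text TEXT NOT NULL)"
            )
            conn.executemany(
                "INSERT OR REPLACE INTO posts (post_id, title_text) VALUES (?, ?)",
                [(1, "hello"), (2, "world")],
            )
        return conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    print(build_sample_db("sample.db"))
```

After running the script, the resulting file should be usable as the [sqlite_db] argument, for example sql-studio sqlite sample.db.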
Databases have gotten so good at this, that the term is almost misleading now. “Base” suggests something rigid, without which the data would slip away. But the data is always there, just bits on a nameless hard disk. The structure and the accessibility that a modern database provides exist completely independently from that hard disk. That’s right...
DuckDB Doesn’t Need Data To Be a Database
filesystem_spec
A specification for pythonic filesystems.
Install
pip install fsspec
installs the base fsspec. Optional features may need additional dependencies, which are requested as extras: for example, pip install fsspec[ssh] installs the dependencies for SSH backend support. Use pip install fsspec[full] for installation of all known...
fsspec • GitHub - fsspec/filesystem_spec: A specification that python filesystems should adhere to.
At the current pace of media content creation, Reddit expects their media metadata to be roughly 50 terabytes. This means they need to implement sharding and partition their tables across multiple Postgres instances.
Reddit shards their tables based on post_id where they use range-based partitioning. All posts with a post_id in a certain range will...
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
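The range-based partitioning on post_id described above can be sketched as a lookup over sorted range boundaries. This is only an illustration of the general technique: the boundaries and shard names below are hypothetical, and the source does not say how Reddit actually routes queries.

```python
from bisect import bisect_right

# Hypothetical boundaries: each value is the first post_id of the NEXT shard.
SHARD_UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]

# One more shard than boundaries; the last shard takes every id past the final bound.
SHARD_NAMES = ["pg-shard-0", "pg-shard-1", "pg-shard-2", "pg-shard-3"]


def shard_for_post(post_id: int) -> str:
    """Return the name of the shard whose id range contains post_id."""
    if post_id < 0:
        raise ValueError("post_id must be non-negative")
    # bisect_right counts how many boundaries are <= post_id,
    # which is exactly the index of the owning shard.
    return SHARD_NAMES[bisect_right(SHARD_UPPER_BOUNDS, post_id)]


if __name__ == "__main__":
    for pid in (0, 999_999, 1_000_000, 3_500_000):
        print(pid, shard_for_post(pid))
```

Because the boundaries are sorted, the lookup is a binary search, and adding a shard for new ids only means appending one boundary and one name.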
Expose Delta Tables via REST APIs
Git repo to test 3 architectures to expose delta tables via REST APIs. See also my blogpost here. Architectures can be described as follows:
- Architecture A: Direct, Web App with DuckDB. In this architecture, APIs are directly connecting to the delta table and there is no layer in between. This implies that all data
GitHub - rebremer/expose-deltatable-via-restapi
Datasette is a tool for exploring and publishing data. It helps people take data of any shape, analyze and explore it, and publish it as an interactive website and accompanying API.
Datasette is aimed at data journalists, museum curators, archivists, local governments, scientists, researchers and anyone else who has data that they wish to share with...
Datasette
Classwords are suffixes added to database column names to indicate the type of data they contain. This improves readability and makes it easier to understand the database schema. Base classwords include text, calendar, numeric and domain-specific types. It is best to avoid redundancy in column names, as this can lead to unnecessary verbosity. Using...
Gemini - chat to supercharge your ideas
Text Classwords
identifier (or id)
code[_<standard>]
name
description (or desc)
indicator (or ind)
number
text
Calendar Classwords
date
datetime[<timezone>] (or dt[<timezone>])
timestamp[<timezone>] (or ts[<timezone>])
Numeric Classwords
count
amount[_<currency>]
<quantity_property>[_<unit_of_measure>]
ratio
factor
percent (or pct)
Domain-Specific Classwords
uri
address
email
sku
json
geojson
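As a rough sketch of how the classword lists above could be applied, the helper below looks for a classword at the end of a column name. It assumes snake_case column names and assumes that optional qualifiers (standard, currency, timezone) follow the classword after an underscore. The open-ended <quantity_property> pattern is not covered. The function name and the sample column names are hypothetical.

```python
from typing import Optional

# Classwords taken from the lists above, including the short forms.
CLASSWORDS = {
    "identifier", "id", "code", "name", "description", "desc",
    "indicator", "ind", "number", "text",
    "date", "datetime", "dt", "timestamp", "ts",
    "count", "amount", "ratio", "factor", "percent", "pct",
    "uri", "address", "email", "sku", "json", "geojson",
}

# Classwords that may carry one trailing qualifier, e.g. amount_usd or ts_utc.
# Treating the qualifier as a separate underscore token is an assumption.
QUALIFIED = {"code", "amount", "datetime", "dt", "timestamp", "ts"}


def classword_of(column: str) -> Optional[str]:
    """Return the classword a column name ends with, or None if there is none."""
    tokens = column.lower().split("_")
    if tokens[-1] in CLASSWORDS:
        return tokens[-1]
    if len(tokens) >= 2 and tokens[-2] in QUALIFIED:
        return tokens[-2]
    return None


if __name__ == "__main__":
    for col in ("customer_id", "order_amount_usd", "created_ts_utc", "customer"):
        print(col, classword_of(col))
```

A check like this could run in a schema review script to flag columns such as customer that carry no classword.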
Rottnest : Data Lake Indices
You don't need ElasticSearch or some vector database to do full text search or vector search. Parquet + Rottnest is all you need. Rottnest is like Postgres indices for Parquet. Read more on what it can do for e.g. logs here.
Installation
Local installation: pip install rottnest.
Rottnest supports many different index...