Sublime
An inspiration engine for ideas
Zapper
@zapper-a453b21417df437a
(1) The separation between storage and compute, as encouraged by data lake architectures (e.g. the implementation of P would look different in a traditional database like PostgreSQL, or a cloud warehouse like Snowflake). This architecture is the focus of the current system, and it is prevalent in most mid-to-large enterprises (its benefits...
Jacopo Tagliabue • Reproducible data science over data lakes: replayable data pipelines with Bauplan and Nessie.

Some practices do the data entry of codes via internal staff, such as a biller or a coder. Others send the paperwork out to a third party, known as a billing service, that does the data entry.
David Uhlman • Hacking Healthcare: A Guide to Standards, Workflows, and Meaningful Use
Messy data in SFDC? Auto-compare against signed Order Forms and update SFDC records
tryklarity.com
This text can be written material like an e-mail, transcribed material such as a medical dictation, or even text that has been scanned from a hard copy and converted to electronic form, like old courthouse records.
Bill Franks • Taming The Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics (Wiley and SAS Business Series)
Special K
@codingatnight
DataTrove
DataTrove is a library to process, filter and deduplicate text data at a very large scale. It provides a set of prebuilt commonly used processing blocks with a framework to easily add custom functionality.
DataTrove processing pipelines are platform-agnostic, running out of the box locally or on a Slurm cluster. Its (relatively) low memory...
huggingface • GitHub - huggingface/datatrove: Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
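The card above describes DataTrove's core idea: a pipeline of composable processing blocks that filter and deduplicate a stream of text documents. The sketch below is not the real `datatrove` API — it is a minimal, self-contained illustration of that block-pipeline pattern, with a hypothetical length filter and an exact-hash deduplicator.

```python
# Hedged sketch of a DataTrove-style block pipeline (NOT the datatrove API):
# each block consumes a stream of documents and yields a filtered stream.
import hashlib
from typing import Callable, Iterable, Iterator, List

Block = Callable[[Iterable[str]], Iterator[str]]

def length_filter(min_chars: int) -> Block:
    """Drop documents shorter than min_chars (a common quality filter)."""
    def block(docs: Iterable[str]) -> Iterator[str]:
        for doc in docs:
            if len(doc) >= min_chars:
                yield doc
    return block

def exact_dedup() -> Block:
    """Remove exact duplicates by hashing each document's bytes."""
    def block(docs: Iterable[str]) -> Iterator[str]:
        seen: set[str] = set()
        for doc in docs:
            digest = hashlib.sha256(doc.encode("utf-8")).hexdigest()
            if digest not in seen:
                seen.add(digest)
                yield doc
    return block

def run_pipeline(docs: Iterable[str], blocks: List[Block]) -> List[str]:
    """Thread the document stream through each block in order."""
    stream: Iterable[str] = docs
    for b in blocks:
        stream = b(stream)
    return list(stream)

docs = ["hello world", "hi", "hello world", "another document here"]
result = run_pipeline(docs, [length_filter(5), exact_dedup()])
# result == ["hello world", "another document here"]
```

Because blocks are just generator transformations, documents stream through lazily; this is the property that lets pipelines like DataTrove's keep memory use low even at large scale.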
The low-level textuality of hashing accords with Hayles's conception of a "flexible chain of markers bound together by the arbitrary relations specified by the relevant codes."