updated 3mo ago
notes.billmill.org
- ETL
The part of the system I'm most proud of, and on which I spent the most effort, is the ETL process.
We had a series of shell scripts for each data source we ingested (there were many), which would pull the data and put it in an s3 bucket.
Then, early in the morning, a cron job would spin up an EC2 instance, which would pull in the latest ETL code...
from notes.billmill.org by Bill Mill
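The per-source pull step described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the shell scripts the author used. The names `object_key` and `ingest_all`, the `raw/<source>/<date>.dat` key layout, and the injected `put` callable are all assumptions made for the example, not details from the original post.

```python
from datetime import date
from typing import Callable, Dict, List

# Hypothetical: each data source is represented by a function returning raw bytes.
Fetcher = Callable[[], bytes]


def object_key(source: str, day: date) -> str:
    """Build a date-partitioned object key (assumed layout, not from the post)."""
    return f"raw/{source}/{day.isoformat()}.dat"


def ingest_all(
    fetchers: Dict[str, Fetcher],
    put: Callable[[str, bytes], None],
    day: date,
) -> List[str]:
    """Pull every source and hand each payload to `put`; return the keys written."""
    keys = []
    # Sort by source name so runs are deterministic and easy to compare.
    for source, fetch in sorted(fetchers.items()):
        key = object_key(source, day)
        put(key, fetch())
        keys.append(key)
    return keys
```

Passing the storage call in as `put` keeps the sketch testable without a bucket: an in-memory dict works locally, and a thin wrapper around an S3 client's upload call could stand in for it in a real run.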
Nicolay Gerold added 3mo ago
- We had a CLI tool, written mostly as a bunch of shell scripts, with a ton of available commands that performed all kinds of utility functions related to observability and operations.
It was mostly written by an excellent coworker, and working with it is where I learned to write effective shell scripts. (Thanks to Nathan and shellcheck for this vital...
- Local database for development
Each table in the database had an accompanying script that would generate a subset of the data for use in local development, since the final database was too large to run on a developer's machine.
This let each developer work with a live, local, copy of the database and enabled efficient development of changes.
I highly...
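A per-table subset script of the kind described above might look like the sketch below. It assumes SQLite purely so the example is self-contained (the post does not say which database was used), and `copy_subset` is a hypothetical helper name. It copies the first rows of one table and does not try to keep foreign keys consistent across tables.

```python
import sqlite3


def copy_subset(
    src: sqlite3.Connection,
    dst: sqlite3.Connection,
    table: str,
    limit: int,
) -> int:
    """Copy the first `limit` rows of `table` from src into dst; return rows copied."""
    # Look up the table's CREATE statement so the destination gets the same schema.
    row = src.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    if row is None:
        raise ValueError(f"no such table: {table}")
    dst.execute(row[0])

    # The table name was confirmed to exist above, so quoting it here is safe.
    rows = src.execute(
        f'SELECT * FROM "{table}" ORDER BY rowid LIMIT ?', (limit,)
    ).fetchall()
    if rows:
        placeholders = ", ".join("?" for _ in rows[0])
        dst.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    dst.commit()
    return len(rows)
```

Running one such function per table against a production snapshot would yield a small local database with the same schema. A real subset script would likely pick rows more deliberately than "the first N".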