
Data Machina #222

(1) Tasks (a) Data collection, cleaning & labeling: human annotators, exploratory data analysis (b) Embeddings & feature engineering: normalization, bucketing / binning, word2vec (c) Data modeling & experimentation: accuracy, F1-score, precision, recall (d) Testing: scenario testing, A/B testing, adaptive test-data (2) Biz/Org Ma...
Shreya Shankar • "We Have No Idea How Models will Behave in Production until Production": How Engineers Operationalize Machine Learning.
The three dynamics above can help us understand DeepSeek's recent releases. About a month ago, DeepSeek released a model called "DeepSeek-V3" that was a pure pretrained model, the first stage described in #3 above. Then last week, they released "R1", which added a second stage. It's not possible to determine everything about these models from th...
Dario Amodei • On DeepSeek and Export Controls
LiteLLM
litellm.ai
fal.ai
DeepEval: a tool for easy and efficient LLM testing. DeepEval aims to make writing tests for LLM applications (such as RAG) as easy as writing Python unit tests.
Testing framework for LLM Part
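To make the "LLM tests as unit tests" idea concrete, here is a minimal sketch in plain Python. It does not use DeepEval's own API. The names `keyword_coverage`, `fake_rag_app`, and the 0.5 threshold are hypothetical, chosen only for illustration, and the keyword check is a crude stand-in for the richer metrics a real framework would provide.

```python
def keyword_coverage(output: str, expected_keywords: list[str]) -> float:
    """Return the fraction of expected keywords found in the output (case-insensitive)."""
    if not expected_keywords:
        return 1.0
    text = output.lower()
    hits = sum(1 for kw in expected_keywords if kw.lower() in text)
    return hits / len(expected_keywords)


def fake_rag_app(question: str) -> str:
    # Hypothetical stand-in for a real LLM or RAG pipeline call.
    return "You can return the shoes within 30 days for a full refund."


def test_refund_answer_mentions_policy():
    # Written like an ordinary unit test: call the app, score the output, assert.
    answer = fake_rag_app("What is the return policy?")
    score = keyword_coverage(answer, ["30 days", "refund"])
    assert score >= 0.5, f"keyword coverage too low: {score}"
```

A test runner such as pytest would pick up `test_refund_answer_mentions_policy` like any other test function. In practice the scoring function would be swapped for a proper evaluation metric.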
One concern with the development of these types of machine learning technologies is the issue of bias — ensuring the programs work equitably for all patients, regardless of age, gender, ethnicity, nationality and other demographic criteria.