Data Machina #222
- Large variety of ready-to-use LLM evaluation metrics (all with explanations) powered by ANY LLM of your choice, statistical methods, or NLP models that run locally on your machine
- G-Eval
- Summarization
- Answer Relevancy
- Faithfulness
- Contextual Recall
- Contextual Precision
- RAGAS
- Hallucination
- Toxicity
- Bias
- etc.
from GitHub - confident-ai/deepeval: The LLM Evaluation Framework
Nicolay Gerold added
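Most of the metrics listed above (G-Eval, Answer Relevancy, Faithfulness, etc.) are LLM-as-judge metrics: a judge model scores the output against a rubric and the score is compared to a threshold. Below is a minimal sketch of that pattern, not deepeval's actual API; the prompt wording, function names, and the stub judge are all illustrative assumptions.

```python
import re
from typing import Callable

# Illustrative rubric prompt; deepeval's real prompts are more elaborate.
JUDGE_PROMPT = (
    "On a scale of 0 to 10, how relevant is the answer to the question?\n"
    "Question: {question}\nAnswer: {answer}\n"
    "Reply with a single integer."
)

def answer_relevancy(
    question: str,
    answer: str,
    judge: Callable[[str], str],  # any callable mapping prompt -> model reply
    threshold: float = 0.7,
) -> tuple[float, bool]:
    """Return a score in [0, 1] and pass/fail against the threshold."""
    reply = judge(JUDGE_PROMPT.format(question=question, answer=answer))
    m = re.search(r"\d+", reply)  # first integer in the judge's reply
    score = min(int(m.group()), 10) / 10 if m else 0.0
    return score, score >= threshold

# Stub judge standing in for a real LLM call:
stub = lambda prompt: "9"
score, passed = answer_relevancy(
    "What is RAG?", "Retrieval-augmented generation.", stub
)
# score == 0.9, passed is True
```

The same shape covers the other judge-based metrics: only the rubric prompt and the inputs (context passages for Contextual Recall/Precision, source documents for Faithfulness) change.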
- I am using my own hardware at home to infer, train, and fine-tune (or trying to; my training efforts have been pretty disastrous so far, but inference works very well).
My current uses of LLM inference are:
- Asking questions of a RAG system backed by a locally indexed Wikipedia dump, mainly with Marx-3B and PuddleJumper-13B-v2
- Code co-pilot with Rift-C
from r/LocalLLaMA - Reddit
Nicolay Gerold added
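The RAG loop described above (retrieve a passage from a local index, stuff it into the model prompt) can be sketched in a few lines. This is a toy illustration, not the poster's actual setup: the "index" is a plain list of strings scored by word overlap, where a real system would use embeddings over the Wikipedia dump.

```python
# Toy RAG sketch: retrieve the best-matching passage, build the prompt.

def retrieve(query: str, passages: list[str]) -> str:
    """Return the passage sharing the most words with the query."""
    q = set(query.lower().split())
    return max(passages, key=lambda p: len(q & set(p.lower().split())))

def build_prompt(query: str, passages: list[str]) -> str:
    """Assemble the context + question prompt for a local model."""
    context = retrieve(query, passages)
    return f"Context: {context}\n\nQuestion: {query}\nAnswer:"

# Hypothetical stand-in for the locally indexed Wikipedia dump:
wiki = [
    "Karl Marx was a philosopher and economist born in Trier.",
    "A puddle jumper is a small aircraft used for short flights.",
]
prompt = build_prompt("Where was Karl Marx born?", wiki)
# prompt now carries the Marx passage; feed it to the local model
```

Swapping the overlap scorer for an embedding similarity search is the only structural change needed to get to the real thing.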
Anthropic has started building out team collaboration tools. One big question for the next couple of years might be whether the general LLMs become useful for specific, vertical tasks faster than the vertical tools get LLM features.
from Benedict's Newsletter: No. 547 by Benedict Evans
Jimmy Cerone added
Multiplayer AI sounds genuinely exciting. Waveform (a social startup) was doing this before shutting down recently.