A large variety of ready-to-use LLM evaluation metrics (all with explanations), powered by ANY LLM of your choice, statistical methods, or NLP models that run locally on your machine:
- G-Eval
- Summarization
- Answer Relevancy
- Faithfulness
- Contextual Recall
- Contextual Precision
- RAGAS
- Hallucination
- Toxicity
- Bias
- etc.
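
To make "statistical methods" and "all with explanations" concrete, here is a minimal sketch of a purely statistical metric that returns both a score and a human-readable reason. It is a toy token-overlap stand-in for something like Contextual Recall, not the implementation any particular library uses. The names `MetricResult` and `toy_contextual_recall`, and the `threshold` parameter, are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class MetricResult:
    """Hypothetical result type: a score plus the explanation behind it."""
    score: float
    passed: bool
    reason: str


def _tokens(text: str) -> set[str]:
    # Lowercase and split on non-alphanumeric characters.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def toy_contextual_recall(
    expected_output: str,
    retrieval_context: list[str],
    threshold: float = 0.5,  # assumed pass/fail cutoff, illustrative only
) -> MetricResult:
    """Fraction of expected-output tokens found anywhere in the retrieved context."""
    expected = _tokens(expected_output)
    if not expected:
        return MetricResult(0.0, False, "Expected output contains no tokens.")

    # Union of tokens across all retrieved chunks (empty set if none were retrieved).
    context = set().union(*(_tokens(chunk) for chunk in retrieval_context))
    missing = sorted(expected - context)
    covered = len(expected) - len(missing)
    score = covered / len(expected)

    reason = f"{covered} of {len(expected)} expected tokens appear in the retrieved context"
    if missing:
        reason += f"; missing: {', '.join(missing)}"
    return MetricResult(score, score >= threshold, reason + ".")


result = toy_contextual_recall(
    expected_output="Paris is the capital of France",
    retrieval_context=["France's capital city is Paris."],
)
print(f"{result.score:.2f}", result.passed, result.reason)
```

An LLM-powered metric would replace the token overlap with a model judgment, but the shape of the result (a score, a pass/fail decision, and a reason) can stay the same.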
Nicolay Gerold added 4mo ago