• Large variety of ready-to-use LLM evaluation metrics (all with explanations) powered by ANY LLM of your choice, statistical methods, or NLP models that run locally on your machine:
    • G-Eval
    • Summarization
    • Answer Relevancy
    • Faithfulness
    • Contextual Recall
    • Contextual Precision
    • RAGAS
    • Hallucination
    • Toxicity
    • Bias
    • etc.

from GitHub - confident-ai/deepeval: The LLM Evaluation Framework
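To make the "statistical methods that run locally" idea concrete, here is a toy sketch of a faithfulness-style check in plain Python. This is not DeepEval's implementation (its metrics are richer and mostly LLM-judged); it is a hypothetical illustration: score an answer by the fraction of its sentences whose content words all appear in the retrieval context.

```python
def faithfulness_score(answer: str, context: str) -> float:
    """Toy faithfulness metric: fraction of answer sentences fully
    supported by the context, judged by crude word overlap."""
    context_words = set(context.lower().split())
    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    if not sentences:
        return 0.0
    supported = 0
    for sentence in sentences:
        # Only compare "content" words (length > 3) to skip stopwords.
        words = [w for w in sentence.lower().split() if len(w) > 3]
        if words and all(w in context_words for w in words):
            supported += 1
    return supported / len(sentences)

context = "the eiffel tower is located in paris and was completed in 1889"
print(faithfulness_score("The Eiffel Tower is located in Paris.", context))  # 1.0
print(faithfulness_score("The Eiffel Tower is located in Berlin.", context))  # 0.0
```

Real metrics (RAGAS faithfulness, DeepEval's hallucination metric) replace the word-overlap step with claim extraction and verification, but the shape is the same: decompose the answer, check each piece against the context, aggregate a score.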

Nicolay Gerold added

  • from r/LocalLLaMA - Reddit

    Nicolay Gerold added

  • from Benedict's Newsletter: No. 547 by Benedict Evans

    Jimmy Cerone added