DeepSpeed-FastGen

Prompt Engineering for LLMs (oreilly.com)

GitHub - microsoft/LLMLingua: Compresses prompts and the KV cache to speed up LLM inference and improve the model's perception of key information, achieving up to 20x compression with minimal performance loss.
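For context on what that compression looks like in practice, here is a minimal sketch of LLMLingua's `PromptCompressor` interface (assuming `pip install llmlingua`; the placeholder prompt and the `target_token` budget are illustrative, not prescriptive):

```python
# Minimal LLMLingua sketch: compress a long prompt before sending it to an LLM.
# Assumes `pip install llmlingua`; the prompt text and token budget are illustrative.
from llmlingua import PromptCompressor

compressor = PromptCompressor()  # loads the default compression model on first use

long_prompt = "Background document text goes here. " * 200  # stand-in for a long context

result = compressor.compress_prompt(
    long_prompt,
    target_token=200,  # rough budget for the compressed prompt
)

print(result["compressed_prompt"])  # shortened prompt to pass to the target LLM
print(result["ratio"])              # reported compression ratio, e.g. "11.2x"
```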

Introducing GPT-4.1 in the API

Introduction to Generative AI

Laura Hartenberger: What AI Teaches Us About Good Writing

Evan Armstrong: What Actually Matters (And What Doesn’t) for DeepSeek