Before paying for high-end GPUs for LLM inference, understand your numbers first.
For example, you can deploy most 7B models on AWS EC2 G5 or Azure NVadsA10v5 instances, but would you actually saturate the GPU?
To clarify this, I've created a simple visualization.
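"Understand your numbers" can start with a back-of-envelope memory budget. The sketch below estimates what fits on a 24 GB GPU (the A10/A10G found in G5 and NVadsA10v5 instances) for a 7B model; the layer count, hidden size, and 10% headroom are illustrative assumptions (roughly a Llama-2-7B-shaped model at FP16), not measured figures.

```python
# Rough serving-memory estimate for a 7B model on a 24 GB GPU.
# All figures are assumptions for illustration, not benchmarks.

GIB = 1024**3

# Model weights at FP16/BF16 (2 bytes per parameter)
params = 7e9
weights_gib = params * 2 / GIB          # ~13 GiB

# KV cache per token = 2 (K and V) * layers * hidden_size * bytes_per_value
# Assumed 7B-style config: 32 layers, hidden size 4096, FP16 cache
layers, hidden, kv_bytes = 32, 4096, 2
kv_per_token = 2 * layers * hidden * kv_bytes   # 512 KiB per token

# Budget left for KV cache after weights, keeping ~10% headroom
gpu_gib = 24
budget_gib = gpu_gib * 0.9 - weights_gib
max_tokens = int(budget_gib * GIB / kv_per_token)

print(f"weights:        {weights_gib:.1f} GiB")
print(f"KV per token:   {kv_per_token / 1024:.0f} KiB")
print(f"token budget:   ~{max_tokens} (summed over all concurrent requests)")
```

The token budget is shared across every concurrent sequence, so it bounds how much batching, and therefore how much GPU utilization, you can achieve before paying for a bigger card.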