
Gradually, then Suddenly: Upon the Threshold


I was given early access to Grok 3 earlier today, making me, I think, one of the first few who could run a quick vibe check.
Thinking
✅ First, Grok 3 clearly has a roughly state-of-the-art thinking model ("Think" button) and did great out of the box on my Settlers of Catan...
Okay so I didn't super expect the results of the GPT4 vs. GPT4.5 poll from earlier today 😅, of this thread:
https://t.co/9A3nsWh8BG
✅ Question 1: GPT4.5 is A; 56% of people prefer it.
❌ Question 2: GPT4.5 is B; 43% of people prefer it.
❌ Question 3: GPT4.5 is A; 35% of people...
Andrej Karpathy, x.com
Gemini Paper Flipthrough
TLDR: one step closer to AGI, but the Ultra model requires max nerfing because it’s way too dangerous for Google to release. They have complex multi-step reasoning, with tool use, search, comprehension working. Don’t know reliability, latency, cost.
> 3 models...
Prakash (Ate-a-Pi), x.com