Sublime
An inspiration engine for ideas

man, scientists working on optimizing matrix multiplications have oppenheimer level of aura
- use a RL agent to spit out heckload of bilinear products
- slap two MILP to combine and filter those
- iterate on top of a Large Neighborhood Search flow until it’s fast...
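For a feel of what "bilinear products" means in the post above, here is a minimal sketch using Strassen's classic 2x2 scheme as a stand-in. It is not the RL + MILP + Large Neighborhood Search pipeline the post describes; it only shows the kind of object such a search looks for: 7 products of linear combinations of entries instead of the naive 8. The function names are made up for illustration.

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with Strassen's 7 bilinear products."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B

    # Each m_i is one bilinear product: (linear combo of A) * (linear combo of B).
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # The output entries are linear combinations of the 7 products.
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


def naive_2x2(A, B):
    """Reference implementation with the usual 8 multiplications."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(strassen_2x2(A, B) == naive_2x2(A, B))
```

Saving one multiplication looks small at 2x2, but applied recursively to block matrices it lowers the asymptotic cost, which is why finding schemes with fewer products matters.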

Quantum Monte Carlo Algorithm for Solving Black-Scholes PDEs for High-Dimensional Option Pricing in Finance & Complexity Analysis https://t.co/qETffafL2Z
Okay okay, spent my weekend gooning around learning GRPO math. Here's some takes.
Essentially, this is me yapping through a recap of smaller details on how GRPO is implemented, what Dr. GRPO changes, why, DAPO, connections to PPO, aggregating batches...
Reading list below....
Nathan Lambert • x.com
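To make the GRPO vs. Dr. GRPO point above concrete, here is a rough sketch of just the advantage step: GRPO scores each sampled completion relative to its group (subtract the group mean, divide by the group standard deviation), while Dr. GRPO drops the standard-deviation division. This is an illustration under assumptions, not the post's code: the function names and `eps` are hypothetical, the population standard deviation is an arbitrary choice (implementations differ), and the length-normalization change in the loss is not shown.

```python
from statistics import mean, pstdev


def grpo_advantages(rewards, eps=1e-6):
    """Group-relative advantages: (r - group mean) / (group std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; some codebases use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


def dr_grpo_advantages(rewards):
    """Dr. GRPO-style advantages: mean-centered only, no std normalization."""
    mu = mean(rewards)
    return [r - mu for r in rewards]


# Rewards for one prompt's group of 4 sampled completions (made-up numbers).
group = [1.0, 0.0, 0.0, 1.0]
print(grpo_advantages(group))
print(dr_grpo_advantages(group))
```

Dividing by the group standard deviation inflates advantages for prompts where nearly all samples get the same reward, which is one of the biases Dr. GRPO is meant to remove.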
even if you're working exclusively with API models, you still should be learning about reward shaping for RL.
o4-mini is a great model, and is very popular among agent builders. you can make it even better for your task with RFT.
but you have to bring your own reward function....
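As a rough idea of what "bring your own reward function" can look like, here is a toy shaped reward in plain Python. Everything in it is hypothetical: the `<answer>` tag convention, the function name, and the 0.1 partial-credit value are invented for illustration, and it does not follow any particular provider's grader format.

```python
import re


def shaped_reward(completion: str, expected: str) -> float:
    """Toy shaped reward (hypothetical format and weights).

    0.0 -> no <answer>...</answer> block at all
    0.1 -> well-formed answer block, wrong content (small credit for format)
    1.0 -> well-formed answer block with the expected content
    """
    match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    if match is None:
        return 0.0
    answer = match.group(1).strip()
    if answer == expected.strip():
        return 1.0
    return 0.1


print(shaped_reward("Reasoning... <answer>42</answer>", "42"))
print(shaped_reward("Reasoning... <answer>41</answer>", "42"))
print(shaped_reward("The answer is 42", "42"))
```

The small format-only credit is the "shaping" part: it gives the policy a gradient toward parseable outputs before it gets the task right, at the risk of rewarding well-formatted wrong answers if set too high.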
We should use soft-max to mean "log of sum of exponentials."
What is often called soft-max should really be called soft-argmax.
Even John Bridle, who coined the word soft-max, agrees.
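A quick numeric check of the naming point above, as a sketch in plain Python (function names chosen here to match the proposed terminology): log-sum-exp behaves like a smooth max, and the thing usually called softmax behaves like a smooth one-hot argmax, and is also the gradient of log-sum-exp.

```python
import math


def logsumexp(xs):
    """Smooth approximation of max(xs): log(sum(exp(x))), computed stably."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def soft_argmax(xs):
    """What is usually called softmax: a smooth one-hot encoding of argmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


xs = [1.0, 2.0, 5.0]
print(max(xs), logsumexp(xs))   # logsumexp sits slightly above the true max
print(soft_argmax(xs))          # probability mass concentrates on index 2

# Finite-difference check that soft_argmax is the gradient of logsumexp.
h = 1e-6
bumped = [1.0, 2.0, 5.0 + h]
print((logsumexp(bumped) - logsumexp(xs)) / h, soft_argmax(xs)[2])
```

The bound max(x) <= logsumexp(x) <= max(x) + log(n) is what makes "soft-max" a fair name for log-sum-exp.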
Yann LeCun • x.com
Professor Howard Raiffa’s maximum bid of others (or MBOO, pronounced “maboo”) analysis, which captures the fundamental trade-off in graphical form.
Guhan Subramanian • Dealmaking: The New Strategy of Negotiauctions (Second Edition)
Utility maximization was first developed by utilitarian philosophers Jeremy Bentham and John Stuart Mill. In microeconomics, the utility maximization problem is the problem consumers face: "How should I spend my money in order to maximize my utility?" It is a type of optimal decision problem. It consists of choosing how much of each available good...
Jonathan Levin • Utility maximization problem
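A small worked instance of the problem described above, assuming a two-good Cobb-Douglas utility u(x, y) = x^alpha * y^(1 - alpha) and a linear budget px*x + py*y = income. The utility form, the numbers, and the function names are illustrative choices, not taken from the quoted text. For this case the optimum has a well-known closed form: spend the share alpha of income on x and the rest on y.

```python
def utility(x, y, alpha):
    """Cobb-Douglas utility for two goods (illustrative choice)."""
    return x ** alpha * y ** (1 - alpha)


def cobb_douglas_demand(alpha, income, px, py):
    """Utility-maximizing bundle: spend share alpha on x and 1 - alpha on y."""
    x = alpha * income / px
    y = (1 - alpha) * income / py
    return x, y


alpha, income, px, py = 0.5, 100.0, 2.0, 5.0
x_star, y_star = cobb_douglas_demand(alpha, income, px, py)
print(x_star, y_star)

# Brute-force sanity check: no other bundle on the budget line does better.
best = utility(x_star, y_star, alpha)
for k in range(1, 50):
    x = float(k)
    y = (income - px * x) / py
    assert utility(x, y, alpha) <= best + 1e-12
```

With these numbers half of the income goes to each good, so the consumer buys 25 units of x and 10 units of y.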

MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
Outperforms LoRA on memory-intensive tasks and achieves comparable performance on other tasks
repo: https://t.co/EV3CSsYpKq
abs: https://t.co/4WpHBl4EPt https://t.co/Gl8yBzeobi