IMO we have hit, or are close to hitting, the LLM plateau, or at least a period of stagnation. OpenAI fumbled massively. Sonnet 3.7 seems like a small improvement at best, and maybe even a regression. DeepSeek R1 is great, Grok 3 is great; however, thinking models in general seem to be a band-aid.