GPT-4.1 benchmark results:
GPT-4.1 scores worse than GPT-4, Opus and Llama-3.1-70B
GPT-4.1 API version is WORSE than Optimus Alpha and Quasar Alpha
GPT-4.1 mini scores worse than Qwen2.5 32B, Llama-4 Maverick and Claude 3 Haiku
GPT-4.1 nano, which is OpenAI's Gemini 2.0 Flash Lite competitor, gets crushed