Evaluation results show that gpt-oss-120b matches or outperforms OpenAI’s o4-mini on several benchmarks, including competition-level coding, mathematics, and health-related tasks. The smaller 20b ...