Don't Worry About the Vase • 1881 implied HN points • 04 Mar 26
- Gemini 3.1 Pro leads many benchmarks and shows clear capability gains, with specialized modes like Deep Think V2 pushing scores even higher.
- Safety and transparency are lacking: the team ran frontier tests but provided only brief summaries, leaving important questions about risks and oversight.
- Real-world impressions are mixed: it’s excellent at visuals and one-shot reasoning, but it can be flaky in agentic workflows and inconsistent at coding, and the rollout had access and API issues.