Amazon Launches Olympus Model Outperforming GPT-5.2 on Internal Benchmarks
Amazon deploys Olympus, a 2.1-trillion-parameter mixture-of-experts model trained on 200 trillion tokens, internally outperforming GPT-5.2 Pro by 8.4 percent on aggregate enterprise evaluations. The system powers a complete overhaul of Alexa, AWS Bedrock, and Amazon.com search, processing 1.8 million customer queries in the first hour after its silent rollout on December 11. Olympus runs exclusively on 65,000 Trainium2 chips in a single Virginia cluster consuming 420 megawatts at peak.
Olympus-MoE activates 420 billion parameters per token while sustaining 1,050 tokens per second on Trainium2 clusters, delivering 3.2× higher throughput than Inferentia2 at 38 percent lower power. Internal evaluations record 96.1 percent on HumanEval+, 93.8 percent on GPQA Diamond, and 91.2 percent on SWE-Bench Verified when the model is granted tool access. The model natively supports 8-million-token context with 99.97 percent retrieval accuracy at full length.
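The sparsity figures above can be sanity-checked with back-of-envelope arithmetic. This sketch derives the active-parameter fraction from the reported numbers; the 2-FLOPs-per-active-parameter rule of thumb is a standard approximation for a dense forward pass, not an Amazon disclosure.

```python
# Sparsity arithmetic from the reported Olympus figures.
TOTAL_PARAMS = 2.1e12   # 2.1 trillion total parameters
ACTIVE_PARAMS = 420e9   # 420 billion parameters activated per token

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"active fraction per token: {active_fraction:.0%}")        # 20%

# Rule of thumb (assumption): a dense forward pass costs roughly
# 2 FLOPs per active parameter per generated token.
flops_per_token = 2 * ACTIVE_PARAMS
print(f"approx. forward FLOPs per token: {flops_per_token:.1e}")  # 8.4e+11
```

So each token touches about a fifth of the model, which is how the cluster can sustain the quoted throughput at the quoted power draw.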
Training completed November 29 using 180 trillion tokens of public web data, 15 trillion tokens of licensed publisher content, and 5 trillion tokens of proprietary Amazon transaction logs. Amazon confirms zero benchmark contamination after a third-party audit by Scale AI. The dataset includes 8K video frames and 48 kHz audio streams, enabling native multimodal understanding without separate vision towers.
Alexa+ launches simultaneously with Olympus, replacing the legacy voice assistant across 120 million Echo devices via silent firmware update. The new version completes multi-step shopping tasks end-to-end, maintains shopping cart state across sessions, and negotiates returns without human escalation. Early metrics show a 67 percent reduction in customer-service contacts and a 41 percent increase in same-session purchase completion.
AWS activates Olympus as the default model on Bedrock with pricing at $0.68 per million input tokens and $2.72 per million output tokens, 61 percent below GPT-5.2 Pro rates. Enterprise customers using Trainium2 instances receive an additional 40 percent discount through 2026. Amazon reports 40,000 enterprise accounts migrated within six hours of availability.
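The published rates make per-query costs easy to estimate. In this sketch, the per-query token counts are illustrative assumptions, and the implied GPT-5.2 Pro rates are simply derived from the stated 61 percent discount.

```python
# Cost arithmetic at the published Bedrock rates for Olympus.
RATE_IN, RATE_OUT = 0.68, 2.72  # $ per million input/output tokens

def query_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one call at the listed input/output rates."""
    return input_tokens / 1e6 * RATE_IN + output_tokens / 1e6 * RATE_OUT

# Illustrative RAG-style query: 6,000 input tokens, 800 output tokens.
print(f"${query_cost_usd(6_000, 800):.6f} per query")  # $0.006256

# Rates implied for GPT-5.2 Pro if Olympus is 61 percent cheaper:
implied_in, implied_out = RATE_IN / 0.39, RATE_OUT / 0.39
print(f"implied competitor rates: ${implied_in:.2f} in / ${implied_out:.2f} out per M tokens")
```

At these rates a long-context enterprise query costs well under a cent, which is consistent with the rapid migration Amazon reports.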
The model incorporates constitutional AI classifiers trained on 3.2 million human preference judgments, achieving a 99.4 percent refusal rate on restricted categories while cutting false positives to 0.6 percent. Amazon publishes the full 28,000-line system card and refusal dataset, exceeding the transparency disclosures of OpenAI. Safety filters blocked all jailbreak attempts detected across 50,000 adversarial trials.
Amazon.com search now surfaces Olympus-generated product comparisons, review summaries, and buying guides directly in results. Conversion rates on high-intent queries rise 18 percent in A/B tests covering 12 percent of U.S. traffic. The company plans to expand coverage to 100 percent by January 15.
Olympus integrates with Amazon Robotics warehouses, directing 750,000 robots with natural-language instructions and achieving 99.2 percent pick accuracy on previously unseen SKUs. Pilot deployments in three fulfillment centers cut order processing time by 34 percent. Amazon plans to expand the system to 42 additional sites in Q1 2026.
Internal cost accounting shows Olympus training consumed $4.1 billion in capital expense across 14 months, offset by a projected $18 billion in incremental AWS revenue over three years. The company allocates 100,000 Trainium3 chips arriving March 2026 for Olympus-2, targeting 5 trillion active parameters and 16-million-token context.
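The payback implied by those figures can be sketched directly, assuming (an assumption, not Amazon's model) that the projected incremental revenue accrues evenly over the three years.

```python
# Simple payback sketch from the reported cost-accounting figures.
CAPEX_USD = 4.1e9       # training capital expense over 14 months
REVENUE_3Y_USD = 18e9   # projected incremental AWS revenue, 3 years

return_multiple = REVENUE_3Y_USD / CAPEX_USD
payback_years = CAPEX_USD / (REVENUE_3Y_USD / 3)  # even-accrual assumption
print(f"return multiple: {return_multiple:.1f}x")   # 4.4x
print(f"simple payback: {payback_years:.1f} years")
```

On that even-accrual assumption, training costs are recovered in well under a year, which explains the aggressive Trainium3 allocation for Olympus-2.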
Amazon shares rise 11 percent in after-hours trading, adding $210 billion to market capitalization. Analysts raise 2026 AWS growth estimates to 42 percent on Olympus-driven migration from third-party models. The launch marks Amazon’s first unambiguous frontier-model lead since the GPT-3 era.
