Major Milestones in Generative AI
A reverse-chronological timeline of the most significant developments in generative AI—from the latest model releases and agent capabilities back to the launch of ChatGPT. Covers multimodality, MCP adoption, reasoning models, and more.
-
February 2026
Claude Opus 4.6 & agent teams
Anthropic releases Claude Opus 4.6 with "agent teams"—multiple Claude instances working in parallel to tackle complex, long-running tasks. Notably demonstrated by a 16-agent run that produced a ~100k-line C compiler. Features 1M-token context (beta), hybrid reasoning, stronger coding and debugging, and adaptive thinking for autonomous workflows. Anthropic
-
February 2025
OpenAI GPT-4.5 Orion
OpenAI releases GPT-4.5 (codenamed Orion), its largest AI model yet. Introduces new scalable alignment techniques, broader world knowledge, improved natural conversation, and significantly lower hallucination rates. Significant efficiency gains over prior models. TechCrunch
-
January 2025
OpenAI o3-mini
Smaller, cost-efficient reasoning model with three reasoning-effort settings (low/medium/high). Supports function calling, structured outputs, and streaming. Replaces o1-mini with stronger STEM and coding performance. OpenAI
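The reasoning-effort settings are exposed as a request parameter. A minimal sketch of the request body, assuming the published Chat Completions schema; actually sending it needs an API key and an HTTP client, which are omitted here:

```python
import json

# Build a Chat Completions request body for o3-mini with a chosen
# reasoning_effort (low/medium/high). The parameter trades latency and
# cost for deeper reasoning on hard problems.
def build_o3_mini_request(prompt: str, effort: str = "medium") -> str:
    if effort not in ("low", "medium", "high"):
        raise ValueError("reasoning_effort must be low, medium, or high")
    body = {
        "model": "o3-mini",
        "reasoning_effort": effort,
        "messages": [{"role": "user", "content": prompt}],
    }
    return json.dumps(body)

print(build_o3_mini_request("Prove that sqrt(2) is irrational.", "high"))
```

Higher effort asks the model to spend more reasoning tokens before answering; "medium" is the default in OpenAI's API.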
-
December 2024
Sora text-to-video
OpenAI releases Sora, its text-to-video model, to ChatGPT Plus and Pro users. Users can generate realistic video from natural language, with support for storyboards, widescreen/vertical/square outputs, and C2PA metadata. OpenAI · Reuters
-
December 2024
Google Gemini 2.0
Google announces Gemini 2.0, positioned for the "agentic era" with native multimodal understanding and improved reasoning for complex, multi-step tasks. Google
-
November 2024
Model Context Protocol (MCP) introduced
Anthropic announces and open-sources the Model Context Protocol—an open standard to connect LLMs to external data and tools ("USB-C for LLMs"). Claude Desktop gets local MCP server support; pre-built servers for Google Drive, Slack, GitHub, Postgres, and more. Block and Apollo among early adopters. Anthropic
-
Late 2024 – 2025
MCP adopted by AI models and tools
MCP becomes widely adopted: Cursor, Claude Code, Zed, Replit, Codeium, Sourcegraph, and other development platforms integrate MCP. Claude 4 API adds MCP connector. Models can now discover and invoke external tools through a standardized protocol. Claude
-
October 2024
Claude 3.5 Sonnet computer use
Anthropic adds "computer use" (beta) to Claude 3.5 Sonnet—the model can control a user's screen, navigate apps, and perform actions. Represents a major step toward AI that can act in the world, not just respond. Anthropic
-
September 2024
OpenAI o1 reasoning model
OpenAI releases o1-preview, a new "reasoning" model trained to "think longer" for harder STEM, math, and coding problems. Large gains over prior models on complex reasoning; improved safety and jailbreak resistance. OpenAI
-
June 2024
Apple Intelligence
Apple announces Apple Intelligence at WWDC—a personal AI system integrated into iOS 18, iPadOS 18, and macOS Sequoia. Systemwide Writing Tools, image generation, app actions, and personalized assistance. Uses on-device models plus Private Cloud Compute for privacy. Apple
-
June 2024
Claude 3.5 Sonnet
Claude 3.5 Sonnet launches with major improvements in coding, analysis, and creative writing. Strong performance on benchmarks; availability on Claude.ai, API, Bedrock, and Vertex AI. Anthropic
-
May 2024
GPT-4o: native multimodal
OpenAI releases GPT-4o ("o" for omni)—a single model that natively handles text, audio, and vision. Real-time voice conversation, vision in ChatGPT, and DALL·E 3 integration. Faster and more capable than GPT-4 Turbo across modalities. OpenAI
-
April 2024
Meta Llama 3
Meta releases Llama 3 (8B and 70B) with open weights. Powers Meta AI across Facebook, Instagram, WhatsApp, and Messenger. Strong benchmark performance; text-only at launch, with multimodal variants arriving later. Meta
-
March 2024
Claude 3 family
Anthropic releases Claude 3 Opus, Sonnet, and Haiku—a full lineup with strong performance across coding, analysis, and long-context tasks. Multimodal (vision) support. Anthropic
-
February 2024
Gemini 1.5
Google announces Gemini 1.5 with 1M-token context window. Breakthrough in long-context; Pro and Flash variants; native audio understanding, system instructions, JSON mode. Google
-
December 2023
Google Gemini 1.0
Google announces Gemini 1.0 (Ultra, Pro, Nano)—native multimodal from the ground up. Powers Bard (later Gemini); NotebookLM and Vertex AI integration. Google
-
November 2023
GPT-4 Turbo
OpenAI announces GPT-4 Turbo at DevDay—128K context window, lower pricing, improved function calling. DALL·E 3, TTS, and vision integrated into ChatGPT. OpenAI
-
October 2023
DALL·E 3 in ChatGPT
OpenAI integrates DALL·E 3 into ChatGPT Plus and Enterprise. Text-to-image with stronger fidelity, better prompt adherence, and multi-tier safety. OpenAI
-
March 2023
GPT-4: multimodal foundation
OpenAI releases GPT-4—a large multimodal model accepting image and text inputs. Major leap in reasoning, coding, and complex tasks. Image-input capability rolled out via partners; text available in ChatGPT and API. OpenAI
-
November 30, 2022
ChatGPT launched
OpenAI releases ChatGPT to the public, built on GPT-3.5. A conversational AI that answers questions, writes code, and assists with everyday tasks. The release that ignited the generative AI boom and changed how millions interact with AI. OpenAI