The new Mercury 2 AI model uses diffusion reasoning to generate 1,000 tokens per second; it runs about 5x faster than Haiku, speed limits are ...
Inception, the company behind the first commercial diffusion large language models (dLLMs), today announced the launch of Mercury 2, the fastest reasoning LLM and first reasoning dLLM. Mercury 2 ...
Mercury 2 targets structured tasks with schema-aligned JSON output; supports OpenAI API drop-in integration, for simpler deployment.