Testing Grok 4 On Math

xAI is testing Grok 4.20 to take on GPT-5, may launch this month

Elon Musk-owned xAI is testing Grok 4.20, a new model update to Grok 4, which already competes with GPT-5 in some benchmarks, such as ARC-AGI 2. GPT-5 is one of the best models for coding, and it ...

Gemini 3 Flash Crushes ChatGPT-5.2 in Accuracy Test – ORCA Benchmark Update

New ORCA results show Gemini leading in practical math, but no AI matches the consistency of a simple calculator.

Hosted on MSN

Which AI chatbot is the best at simple math? Gemini, ChatGPT, Grok put to the test

Artificial Intelligence (AI) is becoming an integral part of daily life, including everyday calculations. But how well do these systems actually handle basic math? And how much should users trust them ...

NextBigFuture

xAI Grok 4 Scoring Poorly in Some Real-World Tests

Overfitting to benchmarks is a common problem for all AI companies. xAI's Grok 4 has some problems with prompt adherence. The overfitting could have resulted from the reinforcement ...

Grok 4.2 Beginner Guide: Reasoning Traces & Supports Source Priority for Research

Grok 4.2 has no memory, so each prompt needs full context; use reasoning traces and source priority for clearer results.

The best AI chatbots of 2026: I tested ChatGPT, Copilot, and others to find the top tools around

I pushed eight free AI chatbots to their limits to find the best AI chatbots of 2026. To explore our top picks, check out ZDNET's chatbot-by-chatbot guide.
