Google’s new AI model doubles reasoning performance
MLCommons today released AILuminate, a new benchmark test for evaluating the safety of large language models. Launched in 2020, MLCommons is an industry consortium backed by several dozen tech firms. It primarily develops benchmarks for measuring the speed ...
AI companies regularly tout their models' performance on benchmark tests as a sign of technological and intellectual superiority. But those results, widely used in marketing, may not be meaningful.… A study [PDF] from researchers at the Oxford Internet ...
Google DeepMind researchers introduce new benchmark to improve LLM factuality, reduce hallucinations
Hallucinations, or factually inaccurate responses, continue to plague large language models (LLMs). Models falter particularly when they are given more complex tasks and when users are looking for specific and highly detailed responses. It’s a ...
AWS Premier Tier Partner leverages its AI Services Competency and expertise to help founders cut LLM costs using
Simbian today announced the “AI SOC LLM Leaderboard,” a comprehensive benchmark to measure LLM performance in Security Operations Centers (SOCs). The new benchmark compares LLMs across a diverse range of attacks and SOC tools in a realistic IT ...
Cerebras Systems upgrades its inference service with record performance for Meta’s largest LLM
Cerebras Systems Inc., an ambitious artificial intelligence computing startup and rival chipmaker to Nvidia Corp., said today that its cloud-based AI large language model inference service can run Meta Platforms Inc.’s largest model at almost 1,000 ...
Taalas has launched an AI accelerator that puts the entire AI model into silicon, delivering 1-2 orders of magnitude greater performance. Seriously.
In today's crowded AI landscape, organizations looking to leverage AI models are faced with an overwhelming number of options. But how to choose? An obvious starting point is the many AI leaderboards that have sprung up. However, while AI ...
Sarvam AI launches two advanced LLMs, 30B and 105B, that outperform competitors in key benchmarks and focus on Indian language support.