AI agent task success jumps from 12% to 66% in a year, but deployment gap widens at SMBs, nexos.ai warns

AI agent task success on real computer work has climbed from 12% to 66% in a single year, putting autonomous agents within six percentage points of human-level performance on tasks such as opening files, navigating applications and running multi-step workflows, according to Stanford's 2026 AI Index. Vilnius-based AI platform nexos.ai has published commentary on the finding, arguing that the bottleneck has shifted from model quality to deployment readiness, particularly at small and mid-sized firms.

Stanford's data sits alongside figures showing that organisational AI adoption reached 88% in 2025 and that generative AI reached 53% of the global population within three years, faster than either the personal computer or the internet. Against that, McKinsey classifies just 6% of companies as high performers, meaning those able to attribute meaningful bottom-line impact to their AI investments.

"Stanford has confirmed what our customers already see: AI agents can now do real business work. The challenge has shifted. It is no longer about whether the model is good enough. It is about whether the people closest to the work can build and run agents themselves, safely, without waiting for IT. The companies that win in 2026 will be the ones that give their business teams a governed operating layer to build inside, not just another tool to play with."

Zilvinas Girenas, head of product, nexos.ai

nexos.ai argues that the deployment gap is sharpest among SMBs, which lack the dedicated IT and engineering teams larger enterprises use to stand up agent platforms. Industry survey data the company cites shows 76% of small businesses now use AI in some form but only 14% have integrated it into daily operations — a gap that tracks the divide between experimentation and production value.

The argument the vendor makes is familiar in the agent-platform category: that the limiting factor is no longer model capability but governance, identity, and who in a business is permitted to build and run agents. Whether that positioning reflects buyer reality or vendor interest will depend on how customers decide to buy agent capability in 2026 — either as standalone platforms or as features inside the tooling they already use.
