AT&T has significantly enhanced the performance of its internal ‘Ask AT&T’ AI assistant by redesigning its orchestration layer and transitioning workloads from large language models (LLMs) to small language models (SLMs), VentureBeat reported on February 26.
The strategic overhaul produced marked improvements in latency and response times, cut operational costs by 90%, and tripled the system's token processing capacity.
“I believe the future of agentic AI is many, many, many small language models,” stated AT&T Chief Data Officer Andy Markus. “We find small language models to be just about as accurate, if not as accurate, as a large language model on a given domain area.”
Small language models are streamlined versions of their larger counterparts, featuring fewer parameters. While they may lack the broad general knowledge of LLMs, SLMs are often faster, less expensive, and provide greater control. For domain-specific tasks, they can perform equivalently or even outperform larger models, particularly when trained on specialized industry data.
This approach aligns with industry research, including findings from Nvidia, which indicate that SLMs offer a more practical and profitable enterprise solution. They are sufficiently powerful for numerous real-world applications, incur lower operational costs, and can be deployed at scale without the substantial infrastructure demands of LLMs.
In complex AI agent systems, where multiple steps are orchestrated to complete assignments, the majority of tasks do not require the most computationally intensive models. A hybrid strategy, utilizing smaller models for routine operations and reserving LLMs for critical, high-stakes steps, optimizes efficiency and cost.
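A minimal sketch of such a hybrid router might look like the following. This is illustrative only, not AT&T's actual orchestration layer: the step fields, tier names, and routing rule are assumptions standing in for whatever criteria a real system would use (task type, confidence thresholds, business risk).

```python
# Hypothetical hybrid router: routine agent steps go to a small language
# model, high-stakes steps to a large one. All names here are illustrative,
# not drawn from AT&T's implementation.
from dataclasses import dataclass

@dataclass
class Step:
    name: str
    high_stakes: bool  # e.g., customer-facing output or an irreversible action

def route(step: Step) -> str:
    """Pick a model tier for a single agent step."""
    return "large-llm" if step.high_stakes else "small-slm"

def run_pipeline(steps: list[Step]) -> dict[str, int]:
    """Dispatch each step and count how many land on each tier."""
    counts = {"small-slm": 0, "large-llm": 0}
    for step in steps:
        counts[route(step)] += 1
    return counts
```

In a pipeline shaped like this, the cost savings follow from the routing ratio: if only the final, user-facing step is marked high-stakes, the expensive model handles one call while the small model absorbs the rest.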
The evolution of enterprise AI is increasingly focused on efficiency—developing models that are smaller, faster, and more economical to operate without compromising performance. This shift is critical as nearly 47% of businesses cite cost as the primary barrier to generative AI adoption, making cost-effective strategies like SLM deployment essential for scalable implementation.