AT&T has significantly enhanced the performance of its internal ‘Ask AT&T’ AI assistant by redesigning its orchestration layer and transitioning workloads from large language models (LLMs) to small language models (SLMs), VentureBeat reported on February 26.
The strategic overhaul produced marked improvements in latency and response times, cut operational costs by 90%, and tripled the system's token processing capacity.
“I believe the future of agentic AI is many, many, many small language models,” stated AT&T Chief Data Officer Andy Markus. “We find small language models to be just about as accurate, if not as accurate, as a large language model on a given domain area.”
Small language models are streamlined versions of their larger counterparts, featuring fewer parameters. While they may lack the broad general knowledge of LLMs, SLMs are often faster, less expensive, and provide greater control. For domain-specific tasks, they can perform equivalently or even outperform larger models, particularly when trained on specialized industry data.
This approach aligns with industry research, including findings from Nvidia, which indicate that SLMs offer a more practical and profitable enterprise solution. They are sufficiently powerful for numerous real-world applications, incur lower operational costs, and can be deployed at scale without the substantial infrastructure demands of LLMs.
In complex AI agent systems, where multiple steps are orchestrated to complete assignments, the majority of tasks do not require the most computationally intensive models. A hybrid strategy, utilizing smaller models for routine operations and reserving LLMs for critical, high-stakes steps, optimizes efficiency and cost.
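A minimal sketch of such a hybrid router might look like the following. This is illustrative only, not AT&T's actual orchestration layer: the step fields, tier names, and routing rule are assumptions standing in for whatever criteria a real system would use (task type, confidence thresholds, business risk).

```python
# Hypothetical hybrid router: routine agent steps go to a small language
# model, high-stakes steps to a large one. All names here are illustrative,
# not drawn from AT&T's implementation.
from dataclasses import dataclass

@dataclass
class Step:
    name: str
    high_stakes: bool  # e.g., customer-facing output or an irreversible action

def route(step: Step) -> str:
    """Pick a model tier for a single agent step."""
    return "large-llm" if step.high_stakes else "small-slm"

def run_pipeline(steps: list[Step]) -> dict[str, int]:
    """Dispatch each step and count how many land on each tier."""
    counts = {"small-slm": 0, "large-llm": 0}
    for step in steps:
        counts[route(step)] += 1
    return counts
```

In a pipeline shaped like this, the cost savings follow from the routing ratio: if only the final, user-facing step is marked high-stakes, the expensive model handles one call while the small model absorbs the rest.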
The evolution of enterprise AI is increasingly focused on efficiency—developing models that are smaller, faster, and more economical to operate without compromising performance. This shift is critical as nearly 47% of businesses cite cost as the primary barrier to generative AI adoption, making cost-effective strategies like SLM deployment essential for scalable implementation.