Some Technologies Just Don’t Scale Down

Published on: October 6, 2025

Long before AI, Google tried to sell its search technology to the enterprise.

It was called the Google Search Appliance (GSA)—a bright yellow box that brought Google’s famous web search into corporate data centers. The pitch was simple: “If Google can find anything on the internet, imagine what it could do for your company data.”

That was the early 2000s—before Google Cloud, before G Suite. But for those of us who were there, the lesson stuck.

The GSA Problem

The problem wasn’t technical. It was context.

Google’s algorithms thrived on the open web—billions of pages, constant user feedback, and endless training signals. Enterprises had none of that. Inside the firewall, there were thousands of documents, not billions. There were no clickstreams to learn from. And the IT teams asked to manage relevance tuning didn’t have the time or resources to do it.

The technology worked—but it didn’t scale down.

Foundation Models Feel Familiar

Fast forward 20 years. We’re watching history rhyme.

Large foundation models like ChatGPT and Gemini are astonishing feats of engineering. They can answer anything from “what’s the square root of a whale” to “write me a Kubernetes operator.”

But inside enterprises, I’m starting to see the same pattern: these models are too big, too broad, and too detached from domain reality. Most companies don’t need a model that knows everything. They need one that understands their business. Their acronyms. Their documents. Their process quirks.

That’s a much smaller world, and one where using a trillion-parameter model starts to feel like lighting a candle with a flamethrower.

The Problem of Surface Area and Guardrails

The challenge with these monolithic models comes down to a basic IT principle: unnecessary capability is a security and operational liability.

We don’t run an SSH server on every client desktop. For the same reason, we shouldn’t use a general-purpose LLM that carries knowledge of everything for a narrow task like managing VMware vSphere in a manufacturing environment.

Every additional model capability widens the compliance and operational attack surface—something every IT leader understands intuitively. When a model knows everything, its surface area for misuse, hallucination, and data leakage becomes massive. A small, domain-specific model tuned only on vSphere logs, manufacturing procedures, and internal documentation is inherently easier to govern and audit. More is not better; more is a risk vector.

Cost also follows size: every token, every API call, and every day of management adds to the total cost of ownership (TCO). And whatever value a large model delivers, value without governance is technical debt in disguise.

IBM’s Bet on Small Models

That’s why IBM’s approach, on display here at IBM TechXchange, feels noteworthy.

Instead of chasing ever-larger general models, IBM is building Granite—a family of smaller, open, transparent models tuned for business contexts. The focus isn’t just on intelligence. It’s on fit: models that run securely inside enterprise systems, can be audited, and don’t require a hyperscaler’s budget to fine-tune.

This movement is about closing the AI delivery gap. The vast, hyperscale-optimized foundation models don’t easily map to the internal developer platforms, compliance regimes, and isolated data environments of the Fortune 2000. Practical AI requires a path to production that is as repeatable and auditable as any other enterprise application.

It’s a reminder that sometimes progress means scaling down, not up.

The Next Phase of Enterprise AI

The first phase of AI was about capability—look what it can do. The next phase will be about context—does it fit the way enterprises work?

Big models proved what’s possible. Small models may prove what’s practical.

For CIOs, the takeaway isn’t to pick a model—it’s to design an AI operating model. The question isn’t “which foundation model,” but “which architecture fits your governance, data, and cost posture.” Scaling down is only powerful if it scales predictably and securely inside your enterprise.
