Small Models Are Having a Moment — and the UAE Wants to Lead It

Small language models may not deliver the same level of spectacle as their larger counterparts, but they can drive real-world business impact.

    The idea that bigger is better doesn’t hold true in every case, especially for AI systems. Large AI models need massive processing power—resources that only a few global organizations can afford.

    Currently, only a handful of companies, such as Google, Meta, OpenAI, and Microsoft, dominate the AI landscape. Their large language models (LLMs), which can contain hundreds of billions or even trillions of parameters, rely on supercomputers, extensive energy resources, and enormous budgets.

    However, the bigger a model gets, the more opaque its inner workings become. Even AI scientists who build them describe them as “black boxes.” With every layer of complexity, transparency slips further away. 

    Yet a shift is now underway.

    A recent study by Nvidia researchers argued that small language models (SLMs) could become the backbone of the next generation of intelligence.

    Less Is More

    To grasp the concept of small models, it’s helpful to revisit how generative AI rapidly grew in size and impact.

    LLMs are trained to predict the next token in a sequence based on the patterns they learn from vast amounts of text. The more they ingest, the better they regurgitate. OpenAI’s GPT-3, released in 2020, was trained on roughly 300 billion tokens. Two years later, DeepMind’s Chinchilla consumed 1.4 trillion.
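
    At its core, next-token prediction means learning a probability distribution over what comes next. A toy sketch in Python makes the idea concrete by counting word bigrams in a miniature corpus; real LLMs learn the same objective with neural networks over billions of tokens:

        from collections import Counter, defaultdict

        # A miniature "training corpus"; real models ingest billions of tokens.
        corpus = "the model reads text . the model predicts the next word .".split()

        # Count how often each word follows each other word.
        bigrams = defaultdict(Counter)
        for prev, nxt in zip(corpus, corpus[1:]):
            bigrams[prev][nxt] += 1

        def predict_next(word: str) -> dict[str, float]:
            # Turn raw counts into a probability distribution over next tokens.
            counts = bigrams[word]
            total = sum(counts.values())
            return {w: c / total for w, c in counts.items()}

        print(predict_next("the"))  # {'model': ~0.67, 'next': ~0.33}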

    While bigger models became capable of handling general tasks, they also grew expensive, environmentally taxing, and inaccessible due to the massive resource and data requirements. Training the latest models requires energy equivalent to powering thousands of homes for a year. 

    In response, a group of young natural language processing (NLP) researchers launched the BabyLM Challenge in 2023. It was a call to the global AI community to build models on strictly limited datasets, capped at roughly 100 million words, about the amount of language a child encounters while growing up. The idea was to see how close models trained on such a sliver of data could come to LLM-level reasoning.

    The path to intelligence, experts believe, lies in building systems that learn like humans rather than just memorize like machines.

    The UAE’s Case for Small Models

    Few countries have moved as quickly as the UAE in translating this idea into action. Its unique position, in terms of infrastructure, policy framework, and institutional scaffolding, has given the country an edge in capitalizing on SLMs.

    In July, MBZUAI and G42 unveiled K2 Think, a compact, open-source reasoning model with 32 billion parameters — a fraction of GPT-4’s size. The model was co-developed with Cerebras, a US-based chipmaker specializing in high-speed AI hardware, and stands as one of the UAE’s most consequential contributions to open-source.

    “K2 Think reflects how open innovation and close public–private partnerships can position Abu Dhabi as a global leader in AI,” says Hector Liu, Director of the Institute of Foundation Models at MBZUAI’s Silicon Valley Lab. “It demonstrates that the UAE can develop advanced systems locally and share them globally. The future of reasoning will be shaped not by size, but by ingenuity and collaboration.”

    Inside K2 Think

    What makes K2 Think stand out is not just its size but what it does with that size.

    It emphasizes reasoning rather than relying solely on computational power. The model incorporates four essential design principles: long chain-of-thought supervised fine-tuning, reinforcement learning with verifiable rewards, agentic planning, and test-time scaling.

    These allow K2 Think to deliver high reasoning performance while maintaining transparency and reproducibility. “Every element was built for parameter efficiency,” says Liu. “It proves that smarter models can outperform bigger ones.”
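
    One of those principles, test-time scaling, can be pictured as spending extra compute at inference: sample several candidate answers and keep the one a verifier scores highest. Below is a minimal sketch of that pattern, with hypothetical generate() and score() functions standing in for the model and the verifier; K2 Think’s actual pipeline is more elaborate.

        import random

        def generate(prompt: str) -> str:
            # Hypothetical stand-in for one model call that returns a
            # candidate chain-of-thought answer.
            return f"candidate-{random.randint(0, 99)}"

        def score(prompt: str, answer: str) -> float:
            # Hypothetical verifier: in practice, a reward model or a
            # checker with verifiable rewards (unit tests, math checkers).
            return random.random()

        def best_of_n(prompt: str, n: int = 8) -> str:
            # Test-time scaling: more samples at inference, better answers.
            candidates = [generate(prompt) for _ in range(n)]
            return max(candidates, key=lambda ans: score(prompt, ans))

        print(best_of_n("Show that the sum of two even numbers is even."))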

    Running on Cerebras’ inference-optimized hardware, K2 Think processes up to 2,000 tokens per second per query, dramatically reducing response time and energy use. 

    Tokens are the basic units of text that language models read, interpret, and generate. A token typically corresponds to a fragment of a word, about three-quarters of an English word on average, so 2,000 tokens per second works out to roughly 1,500 words of processed text each second. This metric defines how quickly a system can reason through complex prompts or deliver responses. By comparison, open-source models such as LLaMA 3 and Mistral 7B typically process between 200 and 500 tokens per second, depending on hardware.
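
    The conversion from tokens to words is simple arithmetic, assuming the common rule of thumb of roughly three-quarters of an English word per token:

        # Back-of-the-envelope throughput conversion; 0.75 words per token
        # is a rough average for English text, not an exact constant.
        WORDS_PER_TOKEN = 0.75

        def words_per_second(tokens_per_second: float) -> float:
            return tokens_per_second * WORDS_PER_TOKEN

        print(words_per_second(2000))  # 1500.0 -- K2 Think on Cerebras hardware
        print(words_per_second(350))   # 262.5  -- a mid-range open-source model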

    “With just 32 billion parameters, it consumes a fraction of the energy used by models 20 times larger,” Liu says. “This is how we make AI both accessible and sustainable.” 

    Applications That Matter

    Since its launch, K2 Think has established a presence across multiple sectors, particularly in domains that demand reasoning rather than raw linguistic diversity: mathematics, biology, and physics. Researchers are using it to simulate biological processes, financial analysts are applying it to model market dynamics, and energy companies are deploying it for predictive maintenance.

    MBZUAI also launched the K2 Think Hackathon, which concluded in November 2025. The winning solutions included a self-correcting AI mathematician, an AI performance trainer, and an AI-run system for drug repurposing through biological evidence integration.

    This open-source ethos aligns with the UAE’s larger AI vision. As Liu puts it, “Our partnership model — between MBZUAI, G42, and Cerebras — combines academic excellence with industrial capability. It’s how the UAE translates ambition into action.”

    The Enterprise Equation

    While the UAE is leading the charge in AI sovereignty, enterprises globally are discovering similar advantages in smaller models.

    At ServiceNow, for instance, the shift from general-purpose to domain-specific AI is already underway. The software company recently introduced Apriel 5B, a small enterprise language model co-developed with Hugging Face, Nvidia, and IBM’s Watson team.

    “When we talk about a large language model being ‘smart,’ we mean its ability to reason, interpret context, and recognize uncertainty,” says Jessica Constantinidis, Innovation Officer for EMEA at ServiceNow. “But smartness isn’t always about scale. A smaller, fine-tuned model that deeply understands your business processes can often outperform a generic, massive one.” In other words, a smart model recognizes the limits of its knowledge and avoids fabricating answers.

    Each enterprise LLM ServiceNow builds is trained per customer, per instance, meaning it learns only from that client’s own workflows and systems, without sharing data externally. “This ensures true data sovereignty,” Constantinidis says. “It’s not just about efficiency; it’s about trust.”
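
    The isolation Constantinidis describes can be pictured as one model artifact per tenant, trained only on that tenant’s own data and never pooled with anyone else’s. Here is a minimal sketch of that logic, with a hypothetical train_slm() standing in for the actual fine-tuning job; ServiceNow’s real pipeline is not public.

        from pathlib import Path

        def train_slm(base_model: str, documents: list[str]) -> bytes:
            # Hypothetical stand-in for a fine-tuning job that returns
            # trained weights for a small base model.
            return f"{base_model} tuned on {len(documents)} docs".encode()

        def train_for_tenant(tenant_id: str, documents: list[str]) -> Path:
            # One artifact per customer instance: weights are derived only
            # from this tenant's workflows and stored under its own path.
            weights = train_slm("small-base-model", documents)
            out = Path("models") / tenant_id / "model.bin"
            out.parent.mkdir(parents=True, exist_ok=True)
            out.write_bytes(weights)
            return out

        train_for_tenant("acme-corp", ["incident workflow", "change policy"])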

    “Efficiency comes from recognizing that not all tasks require the same level of intelligence,” she adds. “It’s about smart allocation of intelligence and cost rather than a race to use the biggest model.”

    This domain-specific approach dovetails neatly with the UAE’s broader AI objectives. By developing localized, smaller models, especially those trained in Arabic and regional dialects, the UAE can simultaneously preserve data independence and accelerate digital transformation.

    On Prioritizing Sovereignty

    AI sovereignty has become a defining issue of the decade. As global models consolidate around Western data and linguistic norms, many nations are asking: Who writes the future of intelligence?

    For the UAE, the answer lies in building local, open, and efficient models that serve regional priorities.

    The country’s open-source family of models, including Jais, NANDA, Sherkala, and now K2 Think, forms the foundation of a sovereign AI stack. 

    Liu believes this local focus will enable the UAE to set new standards in responsible AI. “Our roadmap includes multilingual and domain-specific K2 variants, particularly strengthening Arabic,” he says. “We are extending our methodology to sectors like healthcare, energy, and public services — all under an open science framework.”

    Smaller models are also inherently more governable. Their scale makes it easier for policymakers and developers to monitor outputs, trace errors, and build in ethical guardrails.

    The Bottom Line

    For decades, the world has equated technological progress with size, whether in terms of data centers, models, or their parameters. But as AI begins to mirror the limits of human cognition, the frontier is shifting inward, toward precision, reasoning, and adaptability.

    The compact nature of SLMs allows them to run directly on edge devices or local servers, eliminating the need for massive cloud infrastructure. In industries where AI must operate across thousands of devices or users, this capability can dramatically reshape the economics of deployment.
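
    Running a small model locally is now a few lines of code. A minimal sketch using the open-source Hugging Face transformers library on CPU follows; the model name is illustrative, and any small open checkpoint behaves similarly.

        from transformers import pipeline

        # Load a small open model entirely on-device; no cloud GPU needed.
        generator = pipeline(
            "text-generation",
            model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # illustrative small model
            device=-1,  # -1 runs inference on the local CPU
        )

        print(generator("Summarize today's maintenance alerts:", max_new_tokens=60))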

    Although SLMs may not provide the same level of spectacle as their larger counterparts, they are poised to drive substantial, real-world business impact over the coming years.

    As Liu states, “The future won’t be defined by who builds the biggest model, but by who builds the smartest one.”
