Inside the UAE’s Push to Build AI That Thinks in Arabic

Dr. Hakim Hacid, Chief Researcher at TII’s AI and Digital Science Research Center, argues that AI’s future will be defined not by model size alone, but by cultural authenticity, accessibility, openness, and real-world value.

While the Middle East invests billions to expand AI infrastructure, develop AI-enabled services, and build powerful AI systems, an important question arises: What happens if AI does not really understand the people it interacts with?

This question is especially urgent for the Arabic-speaking world, which includes over 450 million people in more than 20 countries.

Despite Arabic being one of the world’s most widely spoken languages, it remains significantly underrepresented in global AI development. Research suggests Arabic accounts for less than 0.5% of AI language datasets worldwide, creating a profound technological and cultural gap.

Leading efforts to address this gap is the UAE’s Technology Innovation Institute (TII). Its Arabic-native Falcon models are challenging prevailing assumptions about how language AI should be built. In a recent conversation with MIT Sloan Management Review Middle East, Dr. Hakim Hacid, Chief Researcher at TII’s AI and Digital Science Research Center, outlined a vision for AI that values cultural authenticity, accessibility, openness, and practical benefits over ever-larger models.

Why Arabic AI Must Think Natively

For Dr. Hacid, the issue with many existing multilingual AI systems is not simply linguistic performance; it is philosophy.

“A model speaking Arabic while thinking in English is not the right thing to do,” he says.

Many AI systems today are trained mainly on English data, with content then translated into other languages. While this approach is functional, it often strips out the cultural nuances, historical context, and social subtleties inherent in native-language expression.

Arabic is even more challenging because it is not just one language. In addition to Modern Standard Arabic, there are many regional dialects, local phrases, and unique cultural expressions that differ across the Arab world.

“The dialects are transporting cultural aspects related to specific populations,” Dr. Hacid says. “When you start translating, you lose those particularities.”

This way of thinking guided the creation of Falcon H1 Arabic, TII’s first large language model built to think in Arabic from the start, instead of just translating from English.

For TII, preserving cultural context was not simply about improving user experience. It was about ensuring that Arabic speakers interact with AI systems that can understand their realities in their own linguistic terms.

The Technical Challenge of Building Arabic AI

Building such systems, however, required overcoming major structural and technical barriers.

One of the biggest obstacles was data scarcity. Although hundreds of millions of people worldwide speak Arabic, high-quality Arabic digital content remains limited compared to English-language content available online.

“The challenge is not only the amount of data,” Dr. Hacid says. “It is the quality of the data.”

To address this, TII invested in developing high-quality native Arabic datasets rather than relying on translated content. The institute also created data augmentation pipelines to expand Arabic-language material while maintaining linguistic authenticity.

TII also went beyond the usual transformer-only models, which Dr. Hacid says struggle with efficiency and with handling long sequences of information. Falcon H1 uses a hybrid of transformers and state-space models, which helps it process longer texts while using less computing power.

The result is a model capable of deeper reasoning, improved contextual understanding, and more efficient real-time performance, all while operating within the linguistic complexity of Arabic.

AI Adoption Must Be Driven by Real Use Cases

While much of the global AI industry remains focused on building ever-larger models, Dr. Hacid believes the next stage of AI development must shift toward practical application and business value.

“One of the things the community has done in AI is everybody rushed to build these models,” he says. “Then we asked ourselves what to do with them later.”

That reversal, he argues, has become one of the industry’s defining challenges. Although conversational AI applications generated enormous excitement, organizations are now under pressure to demonstrate meaningful enterprise impact and return on investment.

Dr. Hacid believes the future of AI lies in solving real problems across healthcare, transportation, security, law, and industry, not just in making better chatbots.

“Adoption is not an easy step,” he says. “After the hype, bringing AI to business is a huge challenge.”

This emphasis on utility also extends to sustainability. Dr. Hacid questions whether current approaches to training massive AI models remain viable in the long term, given the immense computational resources and energy consumption involved.

“There is a need to rethink the underlying principles and methodologies,” he says. “We may be missing something somewhere.”

Open Source: A Strategy and a Responsibility

A key part of TII’s AI strategy is its commitment to open-source development, which Dr. Hacid sees as both a smart move and a matter of principle.

When Falcon was first made publicly available, it put the UAE among the first countries to share advanced language models. Beyond drawing global attention, openness is, in Dr. Hacid’s view, key to building trust in AI.

“AI should somehow be a right for people to use,” he says. “And for them to use it, they need to understand what’s inside.”

As public concerns around data privacy, bias, hallucinations, and safety continue to grow, open-source models offer greater transparency and community oversight than closed systems.

Dr. Hacid acknowledges that hallucinations and bias remain unresolved problems across the AI industry. While some researchers believe larger datasets can gradually reduce these issues, others argue they are fundamentally embedded within current probabilistic architectures.

His perspective sits somewhere in between. Domain-specific AI systems, he explains, can significantly reduce hallucinations by combining models with external knowledge structures and tightly controlled environments.

However, fully resolving such limitations may ultimately require entirely new AI architectures built around more deterministic reasoning frameworks.

The Next Step: Smaller and More Accessible AI

Dr. Hacid believes AI adoption and accessibility will define the industry’s next major phase.

Not every organization or individual can afford the massive GPU infrastructure required to run frontier AI systems. As a result, TII is increasingly focused on building highly efficient compact models capable of operating on laptops, mobile phones, and even Internet-of-Things devices.

Among these efforts is Falcon Tiny, a lightweight model with fewer than 100 million parameters, designed for specialized tasks on minimal infrastructure.

“The models should not only be seen on the big side,” Dr. Hacid says. “They also need to exist on the small side.”

This vision reflects the UAE’s broader approach to AI, which focuses not just on being the biggest, but also on making AI accessible, culturally relevant, and useful, while retaining control over its own technology.

In a world where AI is often about speed and big achievements, this different approach could become one of the region’s most important contributions to the future of artificial intelligence.
