Traversaal Team/February 18, 2026/2 min read

Announcing Alif 1.0: Our First Urdu LLM Outperforming Other Open Source LLMs

We are thrilled to announce Alif 1.0, our first-ever Urdu-English LLM, setting a new benchmark in multilingual AI. Specifically optimized for Urdu, Alif addresses critical challenges in Urdu NLP and brings major advances in reasoning, fluency, and cultural alignment. This launch marks a significant milestone in making AI more accessible and accurate for the roughly 250 million Urdu speakers worldwide.

Why Alif Matters

Developing a high-performing Urdu LLM presents several hurdles:

  • Inconsistent Urdu Generation: Most multilingual LLMs struggle with Urdu, often producing inconsistent or heavily hallucinated responses, and sometimes inserting foreign characters during Urdu text generation.
  • Lack of High-Quality Datasets: Urdu lacks a reliable, instruction-tuned dataset for effective training.
  • Translation Limitations: Direct translation is not enough, often resulting in fluency loss and cultural misalignment, highlighting the need for native Urdu data generation.
  • Reasoning & Safety Challenges: Urdu's right-to-left script conflicts with left-to-right reasoning tasks, while existing safety frameworks fail to align with regional requirements.
  • Culturally-Aware AI is Crucial: There's a critical need for AI models that understand and respect the nuances of low-resource languages.

Our Meta-funded initiative, LARGE, tackles these challenges head-on, supporting robust Urdu-language LLM development.

How Our Approach Solves These Challenges

To overcome these challenges, we designed Alif-1.0-8B-Instruct, a powerful Urdu-English model built using multilingual synthetic data distillation:

First High-Quality Urdu Alpaca Dataset

Alif is trained on a high-quality Urdu Alpaca dataset, generated through multilingual synthetic data techniques and human feedback refinement. The dataset includes:

  • Classification
  • Sentiment Analysis
  • Logical Reasoning with Urdu Chain-of-Thought (CoT)
  • Question Answering (QA)
  • Text Generation
  • Bilingual Translations
  • Ethics & Safety Assessments

Additionally, we have developed a human-annotated Urdu evaluation suite, including Urdu red-teaming datasets to assess safety and robustness.
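As a rough illustration, a single record in an Alpaca-style instruction dataset might look like the sketch below. The `instruction`/`input`/`output` field names follow the standard Alpaca schema; the `task` field and the Urdu example text are hypothetical and not taken from the actual Alif dataset.

```python
import json

# Hypothetical Alpaca-style Urdu instruction record (illustrative only;
# the field layout follows the public Alpaca schema, not Alif's internals).
record = {
    "instruction": "درج ذیل جملے کا جذباتی تجزیہ کریں۔",  # "Analyze the sentiment of the following sentence."
    "input": "یہ کتاب بہت دلچسپ ہے۔",                     # "This book is very interesting."
    "output": "مثبت",                                     # "Positive"
    "task": "sentiment_analysis",                         # assumed task tag, not confirmed by the post
}

# ensure_ascii=False keeps the Urdu script stored natively (readable JSONL)
# instead of escaping every character as \uXXXX sequences.
line = json.dumps(record, ensure_ascii=False)
print(line)
```

Storing records as one JSON object per line (JSONL) is the common convention for instruction-tuning corpora and keeps the Urdu text human-inspectable.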

Enhanced Urdu Reasoning Capabilities

We have integrated Urdu-native CoT prompts and improved logical reasoning tasks to enhance the model's understanding. This approach also ensures better contextual comprehension, making sentiment analysis and classification more precise.
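A minimal sketch of what an Urdu-native CoT cue could look like is shown below. The template and the Urdu wording ("Let's think step by step") are our own illustrative placeholders, not the actual prompts used to train Alif.

```python
# Illustrative Urdu chain-of-thought prompt template (hypothetical wording,
# not the exact prompt format used for Alif).
COT_SUFFIX = "آئیے قدم بہ قدم سوچتے ہیں۔"  # "Let's think step by step."

def build_urdu_cot_prompt(question: str) -> str:
    """Append an Urdu CoT cue so the model is nudged to reason in Urdu
    before producing its final answer."""
    return f"{question}\n{COT_SUFFIX}"

# "If Ali has 3 apples and buys 2 more, how many does he have in total?"
prompt = build_urdu_cot_prompt(
    "اگر علی کے پاس 3 سیب ہیں اور وہ 2 مزید خریدتا ہے تو کل کتنے ہوئے؟"
)
print(prompt)
```

Keeping the reasoning cue itself in Urdu, rather than an English "think step by step", is the point: it encourages the model to produce its intermediate reasoning in the target language.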

Optimized Training Pipeline for Efficiency

Our efficient and cost-effective training approach includes:

  • Continued Pretraining: We leveraged Urdu Wikipedia and other curated data sources to strengthen foundational knowledge of the Urdu language.
  • Fine-Tuning: We merged the synthetic dataset with translated Urdu datasets and a small portion of English data to maintain bilingual capability.
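The fine-tuning mix described above can be sketched as a simple weighted combination of the three sources. The 10% English fraction below is a made-up placeholder for illustration; the post does not state the actual proportions.

```python
import random

def mix_datasets(synthetic, translated, english, english_fraction=0.1, seed=0):
    """Combine synthetic and translated Urdu data with a small slice of
    English data, then shuffle. Proportions are illustrative placeholders,
    not Alif's actual training recipe."""
    rng = random.Random(seed)
    # Cap the English portion relative to the total Urdu data so the mix
    # stays Urdu-dominant while preserving bilingual capability.
    n_english = int(english_fraction * (len(synthetic) + len(translated)))
    mixed = list(synthetic) + list(translated) + list(english[:n_english])
    rng.shuffle(mixed)
    return mixed

corpus = mix_datasets(
    synthetic=[{"src": "synthetic"}] * 80,
    translated=[{"src": "translated"}] * 20,
    english=[{"src": "english"}] * 50,
)
print(len(corpus))  # 80 + 20 + int(0.1 * 100) = 110
```

In practice one would interleave full dataset objects (e.g. with a data-loading library) rather than Python lists, but the weighting idea is the same.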

Alif-1.0-8B-Instruct

State-of-the-Art Performance on a Budget: by employing high-quality synthetic data distillation, we significantly enhanced Meta Llama 3.1 8B's Urdu capabilities. Alif now outperforms Meta Llama 3.1 8B Instruct on Urdu-specific tasks while maintaining strong English fluency. It also outperforms many open-source multilingual LLMs, including Gemma 2 9B, Llama 3.1 8B, Mistral Nemo 12B, Qwen 2.5 7B, and Cohere Aya Expanse 8B, all within a budget of under $100.

What's Next

  • Gather more data to enhance the model's knowledge and understanding.
  • Apply model merging and reinforcement learning techniques to improve bilingual and reasoning capabilities.
  • Conduct further evaluations and benchmarking.

Alif is a monumental step forward for Urdu NLP, ensuring cultural and linguistic alignment while expanding bilingual AI capabilities. Stay tuned for more updates as we continue to push the boundaries of AI innovation.

Model Card: Alif-1.0-8B-Instruct on Hugging Face

Traversaal Team

Former Senior Research Manager at Google and Walmart Labs, leading teams in optimization, NLP, recommender systems, and time series forecasting.