
IIT Roorkee Develops World's First AI Model To Read And Convert Ancient Modi Script

IIT Roorkee AI Model 2025: Transliterating documents written in the Modi script could provide valuable insights into India's rich medieval history and scientific heritage.

Dataset comprising over 2,000 images of Modi script paired with corresponding Devanagari text was used

IIT Roorkee 2025: The Indian Institute of Technology (IIT) Roorkee has developed the world's first Artificial Intelligence (AI) model capable of transliterating the historic Modi script into the Devanagari script. Transliteration is the process of converting text from one writing system to another while prioritising the representation of the original language's sounds.

The AI model can convert previously unseen Marathi text written in the Modi script into Devanagari, supporting digitisation, transcription, and academic research. The Devanagari script is widely used for languages such as Hindi, Sanskrit, Marathi and Nepali.
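To make the idea of transliteration concrete, here is a minimal Python sketch that maps already-digitised Modi characters to Devanagari by matching their Unicode character names. This is only an illustration and not IIT Roorkee's method: the actual model reads images of handwritten documents, while this sketch assumes the text is already encoded in Unicode and that a Modi character and a Devanagari character with the same name stand for the same sound. The function name modi_to_devanagari is hypothetical.

```python
import unicodedata

def modi_to_devanagari(text):
    """Map each Modi character to the Devanagari character with the same Unicode name.

    Assumption: a shared name (e.g. 'LETTER KA') means the same sound.
    Characters without a Devanagari counterpart are left unchanged.
    """
    out = []
    for ch in text:
        name = unicodedata.name(ch, "")
        if name.startswith("MODI "):
            try:
                ch = unicodedata.lookup("DEVANAGARI " + name[len("MODI "):])
            except KeyError:
                pass  # no same-named Devanagari character; keep the original
        out.append(ch)
    return "".join(out)

# Look the sample letter up by name instead of hard-coding a code point.
modi_ka = unicodedata.lookup("MODI LETTER KA")
print(modi_to_devanagari(modi_ka))
```

The sketch needs a Python build whose Unicode database includes the Modi block (added in Unicode 7.0). Recognising cursive handwriting from a scanned page is the hard part of the problem, and it is not addressed here.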

According to IIT Roorkee's research paper on the breakthrough, researchers used a dataset comprising over 2,000 images of Modi script paired with corresponding Devanagari text to train the AI model. These examples help the model learn patterns necessary for accurate transliteration.

What Is Modi Script?

The Modi script was used to write the Marathi language during the medieval period. It was commonly used in domains such as land records, property documentation, yoga, and medieval science. According to IIT Roorkee's research paper, 40 million documents written in the Modi script have not yet been transliterated, and only a few experts in this domain can transliterate them into English or Devanagari.

Transliterating these documents could provide valuable insights into India's rich medieval history and scientific heritage.

Challenges In Developing the AI Model

The research team at IIT Roorkee faced several challenges while developing the AI model, including:

  • The script's cursive nature: a horizontal line is drawn first, and each letter is then written so that it starts and ends at that line.
  • The script's diverse writing styles, along with issues such as angular strokes, broken lines, and blurring, which make accurate recognition difficult.
  • The lack of a larger dataset, which would have supported better model performance.

How the Research Can Be Further Improved

The current dataset used to train the AI model includes documents from three medieval periods: Shivkali, Peshwekali, and Anglakali. Incorporating documents from other historical eras, such as the Adyakalin and Yadavkalin periods, could enhance the model further. These additions would help the AI learn more complex and robust patterns, minimising overfitting, where the model performs well on training data but poorly on new, unseen data.
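Overfitting can be illustrated with a deliberately extreme toy example in Python. The sketch below is not the IIT Roorkee model: it assumes a hypothetical "model" that simply memorises its training pairs, and it uses placeholder strings in place of real document images and Devanagari text. All names in it (accuracy, make_memorizer, the sample pairs) are illustrative.

```python
def accuracy(model, pairs):
    """Fraction of (input, expected) pairs the model reproduces exactly."""
    if not pairs:
        return 0.0
    correct = sum(1 for x, y in pairs if model(x) == y)
    return correct / len(pairs)

def make_memorizer(training_pairs):
    """Hypothetical 'model' that only remembers the examples it was trained on."""
    table = dict(training_pairs)
    return lambda x: table.get(x, "")  # unknown inputs get an empty answer

# Placeholder identifiers stand in for document images and their transliterations.
train = [("image_001", "text_a"), ("image_002", "text_b"), ("image_003", "text_c")]
unseen = [("image_101", "text_d"), ("image_102", "text_e")]

model = make_memorizer(train)
train_acc = accuracy(model, train)    # perfect on memorised examples
unseen_acc = accuracy(model, unseen)  # fails on documents it has never seen
gap = train_acc - unseen_acc          # a large gap is the signature of overfitting

print(f"train={train_acc:.2f} unseen={unseen_acc:.2f} gap={gap:.2f}")
```

A real model falls somewhere between these extremes. Comparing accuracy on training data with accuracy on held-out documents, ideally from periods not represented in training, is a common way to gauge how well it generalises.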
