Zeerak Ahmed (MDE ’18) brings software innovation to Urdu
In this essay, Zeerak Ahmed (MDE ’18) writes about his journey to modernize software for the Urdu language and the Arabic script. An additional article, “The Fight to Preserve the Urdu Script in the Digital World,” has been featured in TIME magazine.
Across Pakistan, computers and smartphones are used in English. Most of the population does not speak English, yet cannot access technology in Urdu. Even when people have neither the desire nor the ability to communicate in English, text in local languages is transliterated into the Latin script instead of being typed in its native form.
This is a situation that has unraveled over many decades, and if you look more closely, perhaps even many centuries. The truth is that modern technology, and especially software, has been built with the Latin script in mind. As it extends to the rest of the world, it is slowly adapted to fit local needs. But the complexity of the Urdu language, the strong cultural traditions of the area, and economic and political conditions have left local languages in Pakistan on the brink of being non-functional in today’s professional world.
Where technology is used to publish Urdu text, it exists in an almost parallel reality. Urdu newspapers use specialized desktop publishing software that was built in the 1990s and has not since received the advances made in software for other languages. Nearly all Urdu text is published in one font, which in its early years couldn’t be used to type new technical vocabulary. And average users do not jump through the many hoops needed to type in Urdu effectively. The technological barriers are so high that the culture is adapting faster than software is iterating. If the necessary investments aren’t made to halt this trend, we may relegate an entire population that does not speak English natively to remain outside the benefits of modern technology, not to speak of the irreversible loss of cultural heritage.
Zeerak Ahmed collaborated with engineers, designers and researchers across disciplines. He works with professionals in Human-Computer Interaction and Urdu literature, language and history scholars, UX designers, and engineers to continue his work on Urdu software. Many of these collaborations were built during the MDE program. Photo: Massan Photography.
Encouraged by colleagues in the MDE program to address this issue, I chose it as the subject of my Independent Design Engineering Project. I conducted an extensive study of the issues surrounding non-Latin technology: I interviewed dozens of Urdu speakers, collated the development of language technology, and unraveled the technological layers used to reproduce language in modern software.
It became clear through this study that modern software does not really apply human-centered design principles to large parts of the world. A human-centered design process focuses its attention not on metrics of cost, aesthetic, existing technology or timeline, but rather on usefulness to humans. This process is conducted by observing the difficulties of users in their particular setting, ideating ways to alleviate these issues, and conducting iterative tests.
Software built in the West understands the needs of Western users well. Unfortunately, as it spreads to other parts of the world, the process of understanding the needs of non-Western users is not repeated. Instead, already built technology is retrofitted with translated language or simple transformations. Inevitably, this lack of focus on non-Western users means that the nuances of their traditions are not accurately represented in published text.
A few examples can illustrate these issues. Existing standards for fonts cannot accommodate the rules and nuances of calligraphy in the Arabic script, which date back 1000 years. Pakistani audiences are particular about the exact calligraphic form that Urdu text is published in, and as a result Pakistani newspapers were written by hand until the late 1980s rather than typeset in subpar fonts. The most commonly used Urdu keyboard layout in Pakistan phonetically maps Urdu letters to the English keyboard. In both cases, the end user was envisioned as someone already familiar with Western technology, rather than a user with their own social context and needs. Further research revealed that these were hardly the only examples, and that such compromises could be seen all the way back to the very first developments in printing the Arabic script.
Our research showed that the failures of modern technology existed in multiple areas. Addressing these issues with a systems approach was critical. I chose my interventions specifically to create ripple effects not just in one branch of technology but in as many as possible.
Given the cultural importance of Arabic calligraphy, much research and development interest had already turned to typography. While gaps remained in this area, I chose not to focus on the output of the Urdu language but instead on input. By building better ways to input Urdu text we were opening the doors for many more customers to participate in new technology. A realm of AI that relies on existing human-produced content, as well as utilities such as search engines, would benefit further from new bodies of Urdu text.
Much of South Asia will access the internet for the first time through smartphones. These new users carry little technological baggage from past decisions, and so we were free to explore radical departures from existing technological infrastructure. By focusing not on the small percentage of users that are already fluent with technology, but rather on the bigger fraction that is new, we found a blank canvas to innovate.
This is how we began work on Matnsaz, a breakthrough keyboard technology for Urdu. This keyboard provides a number of innovations in interaction design.
By studying the design of the Arabic script itself, we found that many letters share the same basic shape. We took this learning and compressed the Urdu keyboard layout from 39 to 21 keys. Users select the basic shape they want to type, and software selects the correct underlying character. This simplification significantly reduces the learning cost of the keyboard layout, encouraging more users to type and increasing the accuracy and speed of their typing.
A graphic shows how the shape-based Matnsaz keyboard for the Urdu language compares to the default Android Urdu keyboard. The shape-based layout requires fewer keys, and is hence easier to learn and use. Image courtesy Shanasai LLC.
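The shape-based idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the groupings below are a small hand-picked subset of letters that share a base shape, not the actual Matnsaz layout, and the names SHAPE_GROUPS, to_shapes and candidates are hypothetical. The essay does not say how the software picks the underlying character, so a plain word-list lookup stands in for that step here.

```python
# Illustrative subset of Urdu letters grouped by shared base shape.
# Assumption: these groups are hand-picked for the example and are NOT
# the real Matnsaz key layout. Each key is the undotted/plain member.
SHAPE_GROUPS = {
    "ب": ["ب", "پ", "ت", "ٹ", "ث"],
    "ج": ["ج", "چ", "ح", "خ"],
    "د": ["د", "ڈ", "ذ"],
    "ر": ["ر", "ڑ", "ز", "ژ"],
    "س": ["س", "ش"],
}

# Reverse lookup: every letter points at the shape key that represents it.
LETTER_TO_SHAPE = {
    letter: shape
    for shape, letters in SHAPE_GROUPS.items()
    for letter in letters
}


def to_shapes(word):
    """Reduce a word to the sequence of shape keys a user would press.

    Letters outside the illustrative groups map to themselves.
    """
    return [LETTER_TO_SHAPE.get(ch, ch) for ch in word]


def candidates(shape_keys, lexicon):
    """Return the words in `lexicon` whose shape sequence matches the keys.

    Assumption: a simple word list replaces whatever language model a real
    keyboard would use to choose the intended characters.
    """
    wanted = list(shape_keys)
    return [word for word in lexicon if to_shapes(word) == wanted]


if __name__ == "__main__":
    print(len(LETTER_TO_SHAPE), "letters covered by", len(SHAPE_GROUPS), "keys")
    print(candidates(["ب", "ب", "ا"], ["پتا", "بات", "جب"]))
```

In this toy subset, eighteen letters collapse onto five keys, which is the same kind of reduction the essay describes for the full layout. When several words share a shape sequence, a real keyboard would need some ranking step to choose among them; the sketch simply returns every match.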
The Arabic script is also cursive, meaning letters join together and change shape. We created another technical innovation which lets the keyboard show the shape letters will take before they are typed in a word. This is particularly useful for those learning the Urdu language.
The Matnsaz keyboard allows users to see the shapes that the Arabic script’s cursive letters will take in any word. The context of the already typed letters is shown in light gray on the keys, and the letters that will be typed when a key is pressed have their upcoming shape indicated in black. Image courtesy Shanasai LLC.
Both of these innovations depend on the touchscreens and computing power of modern smartphones, and would not have been possible on older technology.
By understanding the historical failures of language technology design, the economic reality of the computing market in the region, and the technology stack available, we were able to create a new set of innovations previously overlooked by software manufacturers.
Building Matnsaz was, however, made difficult by the lack of technical infrastructure in the Urdu language.
Our innovations in interaction design were built on how letters are shaped in Arabic script. Unfortunately, the software stacks on which modern smartphone applications are built are not aware of these character shapes. So in order to group characters into buckets of similar shape, or identify the right way a character may appear in the context of a word, we had to build new string processing software from scratch.
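To make the second task concrete, here is a minimal sketch of how the contextual form of each letter in a word might be worked out from its neighbours. It is not Naqqash's API or implementation; the function names are hypothetical. It assumes a word made only of base letters (no diacritics or other marks), and it hard-codes a short, incomplete list of letters that connect only to the letter before them.

```python
# Letters that join to the preceding letter but never to the following one.
# Assumption: a partial list, sufficient for the examples below.
RIGHT_JOINING = {"ا", "د", "ڈ", "ذ", "ر", "ڑ", "ز", "ژ", "و", "ے"}

# Letters that do not connect on either side (standalone hamza).
NON_JOINING = {"ء"}


def joins_forward(ch):
    """True if `ch` can connect to the letter that follows it."""
    return ch not in RIGHT_JOINING and ch not in NON_JOINING


def joins_backward(ch):
    """True if `ch` can connect to the letter that precedes it."""
    return ch not in NON_JOINING


def contextual_forms(word):
    """Label each letter as isolated, initial, medial or final.

    Simplification: every character is treated as a base letter, so
    diacritics and other non-letter marks are not handled.
    """
    forms = []
    last = len(word) - 1
    for i, ch in enumerate(word):
        # Connected to the previous letter only if that letter joins
        # forward and this one accepts a join from behind.
        joined_prev = i > 0 and joins_forward(word[i - 1]) and joins_backward(ch)
        # Connected to the next letter only if this letter joins forward
        # and the next one accepts a join from behind.
        joined_next = i < last and joins_forward(ch) and joins_backward(word[i + 1])
        if joined_prev and joined_next:
            forms.append("medial")
        elif joined_prev:
            forms.append("final")
        elif joined_next:
            forms.append("initial")
        else:
            forms.append("isolated")
    return forms


if __name__ == "__main__":
    for word in ["کتاب", "اردو"]:
        print(word, contextual_forms(word))
```

For the word کتاب the sketch labels the letters initial, medial, final and isolated in typing order, while every letter of اردو comes out isolated because none of its letters connects forward. A production library would also have to handle marks, ligatures and font-specific behaviour, which is presumably part of why this had to be built as dedicated string-processing software.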
As software development kits are released alongside modern operating systems, many functions thought to be core to the development of new applications are abstracted into libraries. These libraries provide easy ways to conduct common operations, allowing application developers to focus on problems unique to their work. To build Matnsaz, we had to wrangle with and extend the limited functionality of existing libraries when it came to the Arabic script. We open-sourced our new library for handling the Arabic script under the name Naqqash.
A bigger problem awaited us when building the autocorrect system that underpins Matnsaz. Smartphone keyboards require corrective abilities to allow users to type effectively. To create corrective software, or any Artificial Intelligence based on language features, a good training dataset is needed. Unfortunately for Urdu, existing datasets were difficult to access and had no attributions. We could not audit data sources for quality or bias. Furthermore, existing published text across the Internet is riddled with errors due to the failings of existing publishing software and fonts.
Our aim was to enable not just the development of Matnsaz, but to catalyze new technology for Urdu broadly. So we took on the goal of creating a new, high-quality, open-sourced dataset of Urdu text that was free to use for commercial and non-commercial purposes. We partnered with multiple organizations that had high editorial standards for Urdu, to get donations of Urdu text in their purview. We then assembled a cross-border team of researchers to annotate this text, fix errors, and build new software tools that help assemble our dataset and continually improve its quality. Today this dataset, called Makhzan, is good enough to be the go-to starting point for building new AI for the Urdu language.
Going into this project, we were not completely unaware of these hurdles. In fact, the decision to work on a keyboard was catalyzed by the realization that building a keyboard would allow us to work on intermediary technological infrastructure that itself could enable other applications downstream.
As a result of our systems analysis and multi-layered approach to technological design, we have been able to build a strong community of collaborators and supporters. The Matnsaz keyboard itself was beta tested by hundreds of users before launch, providing critical feedback on our language model, app design and functionality, and creating strong word of mouth. The exercise of using our community of beta testers to build a high-quality software project was an important output of our work. Because of the history of compromised technology for the Urdu language, many customers in Pakistan believe that using Urdu will forever relegate them to subpar software that will not engage well with the modern software they are using in other languages. Matnsaz poses a fundamental challenge to this belief, and sets us up as an authoritative force that can be trusted to pursue further innovations in this field.
Through the development of Matnsaz we have built critical partnerships with other collaborators as well: technology teams based in Pakistan, type designers, publishers, academics in technology and the Urdu language, as well as Pakistani interaction designers and technology companies building new software for the local market. This is in addition to corporate infrastructure of legal, public relations and marketing.
Our hope is to carry this culturally aware, deeply technical, and innovative approach to solving a broad cultural issue forward. For now, Matnsaz serves as evidence that solving difficult, entangled technological problems requires a refreshed approach. But large, underserved populations across the world have reason to hope that technology can serve them better than it ever has.