Will the AI era “swallow up” languages with a small number of speakers?
At a time when global AI development concentrates mainly on major languages, Baltic language technology company Tilde is doing the opposite: ensuring that languages with a small number of speakers are also visible, of high quality, and competitive in the digital age.
Tilde is convinced that the vitality of European languages in the AI era must not depend primarily on English-based technology or on US and Chinese companies. “Our focus is on Estonian, Latvian, Lithuanian and other languages spoken in Europe. Where our goal used to be to offer better translation quality than Google Translate, today we can outperform ChatGPT and other large language models,” says the company’s CEO in Estonia, Kalle Kuusik.
TildeOpen LLM is a foundational model, specifically designed for European languages
Tilde has not ignored the development of large language models either. It has created its own foundational model, TildeOpen LLM, which underpins specific applications such as machine translation and text summarisation and improves translation quality across 34 European languages within a wider range of supported languages.
For the most common US models, about 90% of the training data is in English, so solutions built on them are strongly biased towards English. In Estonian, for example, this can show up as both foreign-sounding sentence structure and words invented by the model. TildeOpen was built on the principle that all 34 languages are equally represented.
Tilde’s strength is also its scientific competence – the company employs more than ten language technology experts with PhD degrees. “We are the only company in the Baltics to develop a new large language model from scratch,” says Kuusik.
The development of the model was supported by the European Commission, which enabled the use of the LUMI supercomputer in Finland and nearly two million GPU hours. The model was created after Tilde won the European Commission’s Large AI Grand Challenge competition.
Machine translation that actually works
“For the average user, general models like ChatGPT are usually enough, but in a professional environment, their quality is often insufficient. Our strength lies in customised models that take the company’s terminology, specifics, and style into account. Companies can create their own terminology dictionary or train the model based on previous translations. All of this is very difficult to achieve with global large-scale solutions,” adds Kuusik.
In addition to linguistic accuracy, the technical quality of the solution is also important. With Tilde’s solution, you can be sure that the translation will retain the original formatting, fonts, and layout of Word and PDF documents. These are the details that are often lost or mixed up when using general models.
The same logic applies online. “Automatic translation of websites is still not so common in Estonia, but it allows you to keep the content in one language and automatically offer the visitor a version in their native language,” Kuusik explains. This has a direct impact on business, as people trust content in their native language more, which increases the likelihood of making a purchase.
“Even cybercriminals and fraudsters understand that addressing people in their own language makes fraud more successful. In their case, of course, these steps cannot be condoned.”
Service provider reliability and control over data
In a rapidly changing world, data security is also becoming increasingly important. “In the case of US or Chinese service providers, companies have no real control over their data, but we are right here – a local company that takes clear responsibility for managing the information,” says Kuusik.
Knowing that the information does not leave Europe is especially important for the public sector and organisations working with sensitive information. “The solution has been developed in accordance with the European Union’s Artificial Intelligence Act and is part of a broader goal to keep critical technologies under European control,” Kuusik adds.
The customer can also install the machine translation platform on their own IT infrastructure, where all data is fully under their control. Few service providers offer this option. While the prevailing approach thus far has been to move all services to the cloud, for many customers, the only acceptable solution is their own IT infrastructure.
A European solution
“Our goal is to offer solutions that really work in a professional environment and deliver results that can be trusted. And as I said, this is critical, especially for languages with a small number of speakers,” emphasises Kuusik.
The aim is to give European companies and institutions a real alternative to global AI solutions: secure, adaptable, and sensitive to local language nuances.
Tilde has been working in depth with language technology for over ten years. Today, the company’s focus is on machine translation, speech technologies (transcription and speech synthesis), and artificial intelligence assistants and chatbots.
The article was initially published at aripaev.ee