TildeLM: Transforming AI for a multilingual Europe
We are developing TildeLM, an open foundational large language model (LLM) with over 30 billion parameters covering all European languages, with a focus on Baltic and Eastern European languages. Supported by the European Commission, TildeLM is set to revolutionise the AI landscape, ensuring our region benefits from cutting-edge technology.
THE CHALLENGE
Championing language equity
THE SOLUTION
Building an open model for Europe
TildeLM is being developed to represent a broad range of European languages, including Bulgarian, Latvian, Ukrainian, and others. This model is more than just a technological achievement; it’s a commitment to creating a resource that is fully open and serves as the foundation for a wide array of AI applications, benefiting over 155 million Europeans.
USE CASES AND APPLICATIONS
Powering meaningful innovations across sectors
National Language Models
Research and Development
Researchers can use TildeLM to study languages, enhance translation systems, and create novel language technology applications.
Technological Innovation
Industry-Specific Solutions
COMPUTING RESOURCES
Excellence driven by Europe’s most advanced supercomputer
The development of TildeLM is being accelerated by the LUMI supercomputer, with access awarded as part of the Large AI Grand Challenge. With 2 million GPU hours at our disposal, LUMI’s immense computational power is crucial for efficiently executing this ambitious project.
OUR PROMISE
Committing to open collaboration
We are dedicated to open science principles and ethical data handling, making TildeLM freely available. We believe that collaboration and shared knowledge are key to innovation, and we invite researchers, developers, and data providers to join us in this mission.
Open access
Integrity and security
Contribute to a multilingual future
To build a robust multilingual language model with over 30B parameters, we need contributions of language data from across Europe. We welcome involvement from authors, publishers, state libraries, and others who can provide valuable content, with flexible terms to accommodate your needs. This platform is where we share our progress and invite you to be part of this groundbreaking initiative.
Your involvement is essential to ensuring that every language has a voice in the digital age.
Frequently asked questions
What is TildeLM?
Why is language equity in LLMs important?
What languages does the TildeLM project focus on?
The project targets Eastern European and Baltic languages such as Bulgarian, Croatian, Czech, Estonian, Finnish, Latvian, Lithuanian, Macedonian, Montenegrin, Polish, Serbian, Slovak, Slovene, and Ukrainian. The model will also include larger languages such as English, French, German, and Russian in balanced proportions to support translation and related multilingual tasks.