TildeLM

How TildeLM is built

A closer look at key moments, breakthroughs, and what’s next.

Notes from the Lab 📋

Behind-the-scenes commentary, insights, and updates from our research team on TildeLM development.

09.06.2025

We’re proud to be among the first companies to test JUPITER, Europe’s first exascale supercomputer! With 1.2 million GPU hours granted to us, we’ll adapt TildeLM for real-world use – including multilingual enterprise search, context-aware assistants, and other secure AI tools.

27.05.2025

Great news! We’ve secured an additional 140,000 GPU hours on LUMI through EuroHPC JU. These resources will be used to instruction-tune the model as part of the FFplus-funded project, focusing on key multilingual tasks such as translation, summarisation, and question answering.

12.05.2025

We are halfway through pretraining! Reaching one trillion tokens took longer than anticipated due to monkey-patching bug fixes and waiting for GPU allocations.
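For context, "monkey patching" means swapping out a library's function or method at runtime instead of forking its source – handy for hot-fixing bugs in a framework you don't control, though fragile across versions. Below is a minimal, self-contained Python sketch of the idea; the Trainer class and its step method are illustrative stand-ins, not actual GPT-NeoX internals.

```python
# Illustrative sketch of a runtime monkey patch. Trainer stands in for a
# third-party class; it is NOT actual GPT-NeoX code.

class Trainer:                         # imagine this lives in a library
    def step(self, loss):
        return loss                    # and that this has a bug upstream

_original_step = Trainer.step          # keep a handle on the original

def patched_step(self, loss):
    loss = max(loss, 0.0)              # apply the hypothetical fix
    return _original_step(self, loss)  # then delegate to the original

Trainer.step = patched_step            # swap the method in at runtime

print(Trainer().step(-1.5))            # prints 0.0 -- the patch is live
```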

05.05.2025

We’ve introduced a more efficient example-packing strategy for supervised instruction tuning in EleutherAI’s GPT-NeoX. Early profiling shows roughly 90% packing efficiency, keeping LUMI’s GPUs almost as fully utilised as during pretraining. Another improvement is a multi-turn instruction-masking strategy, which lets the model learn from lengthy multi-turn conversations.
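To make the packing and masking ideas concrete, here is a minimal, self-contained Python sketch; the sequence length, token ids, and data layout are all illustrative assumptions, not the actual GPT-NeoX implementation.

```python
# Illustrative sketch: greedily pack whole conversations into fixed-length
# sequences, masking the loss so only assistant tokens are trained on.
# Names and data are toy examples, not GPT-NeoX internals.

SEQ_LEN = 16  # context window (tiny here, for readability)
PAD_ID = 0    # hypothetical padding token id

def pack_examples(examples, seq_len=SEQ_LEN):
    """Pack tokenized conversations into fixed-length sequences.

    Each example is a list of (token_id, is_assistant) pairs and is
    assumed to fit within seq_len. The loss mask is 1 only on assistant
    tokens, so user turns in a multi-turn conversation never contribute
    to the loss, and padding waste drops because each sequence holds as
    many whole examples as fit.
    """
    sequences, tokens, mask = [], [], []
    for example in examples:
        if tokens and len(tokens) + len(example) > seq_len:  # flush
            pad = seq_len - len(tokens)
            sequences.append((tokens + [PAD_ID] * pad, mask + [0] * pad))
            tokens, mask = [], []
        tokens += [tok for tok, _ in example]
        mask += [int(is_asst) for _, is_asst in example]
    if tokens:  # flush the final, partially filled sequence
        pad = seq_len - len(tokens)
        sequences.append((tokens + [PAD_ID] * pad, mask + [0] * pad))
    return sequences

# Two toy "conversations": user tokens (False) and assistant tokens (True).
convos = [
    [(11, False), (12, False), (21, True), (22, True)],
    [(13, False), (23, True), (14, False), (24, True), (25, True)],
]
packed = pack_examples(convos)
used = sum(len(c) for c in convos)
print(f"packing efficiency: {used / (len(packed) * SEQ_LEN):.0%}")
for toks, m in packed:
    print(toks)  # token ids, padded to SEQ_LEN
    print(m)     # loss is computed only where the mask is 1
```

On these toy inputs the efficiency printout is low, but the idea is that filling each sequence with several whole conversations pushes the non-padding fraction toward figures like the roughly 90% quoted above.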

15.04.2025

We’ve now completed roughly one-third of pretraining. Getting there meant hammering out a stack of quirks, bugs, and some truly artisan code in EleutherAI’s GPT-NeoX – plus a couple of our own blunders. However, this required only a single, very early restart, so almost no GPU time was lost!

15.03.2025

We have finally started the long-awaited TildeLM pretraining. Borrowing from Mark Twain: “Quitting smoking is the easiest thing in the world; I’ve done it thousands of times.” Let’s hope this run is not a false start and that it delivers the results we’ve been working towards for so long!

See how LLMs really perform

Created by our researchers, TildeBench is a public leaderboard tracking how various LLMs handle tasks like machine translation, in-context question answering, and grammar-sensitive text generation – all in languages that are often overlooked. It’ll be updated with new tasks and models over time.

Stay in the loop

Leave your email to get notified when TildeLM goes live on Hugging Face.