Small multilingual LLMs for annotating and curating LLM training data.
AI & ML interests
Open, Multilingual, European, Generative, Foundational LLM
Organization Card
Europe's leading AI companies and research institutions are combining their expertise in the OpenEuroLLM project, an unprecedented collaboration to develop next-generation open-source language models and advance European AI capabilities.
models 11
openeurollm/tokenizer-256k • 1
openeurollm/tokenizer-128k
openeurollm/datamix-2b-80-20 • 4
openeurollm/datamix-2b-50-50 • 2
openeurollm/datamix-2b-60-40 • 555
openeurollm/datamix-2b-70-30 • 4
openeurollm/datamix-2b-90-10 • 1
openeurollm/datamix-2b-en-80pct-DPO-HelpSteer3-16k • Text Generation • 2B • 1
openeurollm/datamix-9b-60-40 • 468
openeurollm/datamix-2b-en • 1
datasets 8
openeurollm/propella-annotations • 5.85B • 11.6k • 12
openeurollm/evaluation_singularity_images • 398
openeurollm/battle-annotations • 163 • 21
openeurollm/contaminated-documents • 40.3k • 11
openeurollm/ArenaHard-EU-v0-bis • 30 • 11
openeurollm/ArenaHard-EU-v0 • 360 • 152
openeurollm/nemotron-cc-10K-sample-translated-judged • 9.07M • 9
openeurollm/nemotron-cc-10K-sample-translated • 450k • 92