MLRS/korpus_malti
Viewer • Updated • 21.3M • 158 • 7
How to use MLRS/mBERTu with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("fill-mask", model="MLRS/mBERTu") # Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("MLRS/mBERTu")
model = AutoModelForMaskedLM.from_pretrained("MLRS/mBERTu")A Maltese multilingual model pre-trained on the Korpus Malti v4.0 using multilingual BERT as the initial checkpoint.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Permissions beyond the scope of this license may be available at https://mlrs.research.um.edu.mt/.
This work was first presented in Pre-training Data Quality and Quantity for a Low-Resource Language: New Corpus and BERT Models for Maltese. Cite it as follows:
@inproceedings{BERTu,
title = "Pre-training Data Quality and Quantity for a Low-Resource Language: New Corpus and {BERT} Models for {M}altese",
author = "Micallef, Kurt and
Gatt, Albert and
Tanti, Marc and
van der Plas, Lonneke and
Borg, Claudia",
booktitle = "Proceedings of the Third Workshop on Deep Learning for Low-Resource Natural Language Processing",
month = jul,
year = "2022",
address = "Hybrid",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.deeplo-1.10",
doi = "10.18653/v1/2022.deeplo-1.10",
pages = "90--101",
}