Carballo-Legal
Model description
Carballo-Legal is a specialized 7B-parameter instruction-tuned model designed for legal text understanding and generation in Galician (GL) and Spanish (ES).
It is based on the foundation model BSC-LT/salamandra-7b-instruct and has been further trained on high-quality legal corpora extracted from official public institutions.
The additional training extends Salamandra’s instruction-following abilities to the legal language, terminology, document structure, and reasoning patterns found in administrative and legislative texts.
Intended uses and limitations
Intended uses
- Legal-oriented text generation (summaries, rephrasing, explanations).
- Chat-style legal assistance (non-professional).
- Downstream fine-tuning for specific legal domains or tasks.
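As a rough illustration of the uses listed above, the sketch below builds single-turn chat messages in the `role`/`content` format that the How to use section passes to the chat template. The task names, the Galician instruction wording, and the `build_messages` helper are assumptions made for this example, not prompts recommended by the model authors.

```python
# Hypothetical instruction templates (Galician); the wording is illustrative only.
TASK_INSTRUCTIONS = {
    "summary": "Resume o seguinte texto legal:",
    "rephrase": "Reformula o seguinte texto legal en linguaxe sinxela:",
    "explain": "Explica o significado do seguinte texto legal:",
}


def build_messages(task: str, legal_text: str) -> list[dict[str, str]]:
    """Return a single-turn chat message list for one of the sketched tasks."""
    if task not in TASK_INSTRUCTIONS:
        raise ValueError(f"Unknown task: {task!r}")
    # Instruction first, then the legal text, separated by a blank line.
    content = f"{TASK_INSTRUCTIONS[task]}\n\n{legal_text.strip()}"
    return [{"role": "user", "content": content}]


# Placeholder input text; replace it with the document to process.
messages = build_messages("summary", "Artigo 1. Obxecto da orde ...")
print(messages[0]["content"])
```

The resulting list can be passed to `tokenizer.apply_chat_template` in place of a hand-written message.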
Limitations
- Not a substitute for professional legal interpretation.
- May produce incomplete or incorrect legal statements.
- Not suitable for high-stakes or judicial decision-making.
- Works best for GL and ES; other languages are not reinforced in this checkpoint.
How to use
from datetime import datetime

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "proxectonos/Carballo-Legal"
text = "Qué sabes sobre o Proxecto Nós?"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Build the prompt with the model's chat template, passing the current date to it.
message = [{"role": "user", "content": text}]
date_string = datetime.today().strftime("%Y-%m-%d")
prompt = tokenizer.apply_chat_template(
    message,
    tokenize=False,
    add_generation_prompt=True,
    date_string=date_string,
)

# The templated prompt is encoded without adding further special tokens.
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
outputs = model.generate(input_ids=inputs.to(model.device), max_new_tokens=200)

# Keep only the newly generated tokens and cut the text at the reserved marker.
generated_tokens = outputs[0][len(inputs[0]):]
response = tokenizer.decode(generated_tokens, skip_special_tokens=False).strip()
response = response.split("<|reserved_token_1|>")[0].strip()
print(response)
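The last steps of the snippet above cut the decoded text at `<|reserved_token_1|>`, since decoding keeps special tokens. A minimal sketch of that post-processing as a reusable helper follows. The function name `trim_at_marker` and the sample string are hypothetical; the default marker is simply the one used in the snippet.

```python
def trim_at_marker(decoded: str, marker: str = "<|reserved_token_1|>") -> str:
    """Keep only the text before the first occurrence of `marker`."""
    # partition() returns the whole string as `head` when the marker is absent.
    head, _, _ = decoded.partition(marker)
    return head.strip()


# Made-up decoded output, used only to show the behaviour.
raw = " Resposta de exemplo. <|reserved_token_1|> trailing tokens"
print(trim_at_marker(raw))
```

If the marker never appears, the helper returns the stripped input unchanged, which matches the behaviour of `split(...)[0]` in the original snippet.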
Training
Training data
The model was trained on a mixture of general instructions and domain-specific legal texts.
| Dataset Type | Languages | Sources |
|---|---|---|
| Instruction set | GL, ES, PT, CAT, EN | Galician Instruction Datasets |
| Legal corpus | GL, ES | DOGA, BOP Pontevedra, BOP A Coruña |
Training hyperparameters
- epochs: 0.5
- dtype: bf16
- block size: 2048
- total batch size: 128
- learning rate: 2e-6
- scheduler: Linear
- optimizations:
  - gradient checkpointing: True
  - flash attention: True
  - liger kernels: True
  - DeepSpeed stage: 2
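To make the numbers above concrete, the sketch below computes the tokens processed per optimizer step (assuming every sequence fills the 2048-token block) and a linear learning-rate decay from the listed rate of 2e-6. The total step count and the absence of warmup are assumptions for illustration; the card does not state them.

```python
BLOCK_SIZE = 2048        # tokens per sequence (from the list above)
TOTAL_BATCH_SIZE = 128   # sequences per optimizer step (from the list above)
PEAK_LR = 2e-6           # learning rate (from the list above)


def tokens_per_step(batch_size: int = TOTAL_BATCH_SIZE, block_size: int = BLOCK_SIZE) -> int:
    """Upper bound on tokens per optimizer step, assuming fully packed blocks."""
    return batch_size * block_size


def linear_lr(step: int, total_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Linear decay from peak_lr at step 0 to 0 at total_steps (no warmup assumed)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / total_steps


print(tokens_per_step())     # 262144
# Halfway through a hypothetical 1000-step run the rate is half the peak.
print(linear_lr(500, 1000))
```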
Framework
Training was performed at the Galician Supercomputing Center (CESGA) on 2 nodes, each with 2× NVIDIA A100 40GB GPUs (4 GPUs in total), and took 2 days.
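With 4 GPUs and a total batch size of 128, each GPU contributes 32 sequences per optimizer step. The sketch below shows how that could split into a per-device micro-batch and gradient-accumulation steps. The micro-batch size of 4 is a hypothetical value, since the card does not report it.

```python
NODES = 2
GPUS_PER_NODE = 2
TOTAL_BATCH_SIZE = 128


def grad_accum_steps(total_batch: int, world_size: int, micro_batch: int) -> int:
    """Accumulation steps so that micro_batch * steps * world_size == total_batch."""
    per_step = world_size * micro_batch
    if total_batch % per_step != 0:
        raise ValueError("total_batch must be divisible by world_size * micro_batch")
    return total_batch // per_step


world_size = NODES * GPUS_PER_NODE          # 4 GPUs
per_gpu = TOTAL_BATCH_SIZE // world_size    # 32 sequences per GPU per optimizer step
# micro_batch=4 is a hypothetical choice, not a reported setting.
steps = grad_accum_steps(TOTAL_BATCH_SIZE, world_size, micro_batch=4)
print(world_size, per_gpu, steps)  # 4 32 8
```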
Evaluation
Formal evaluation is in progress. Early observations show improved handling of legal terminology, structured documents, and administrative phrasing in GL and ES.
Additional information
Funding
This work is funded by the Ministerio para la Transformación Digital y de la Función Pública and by the EU through NextGenerationEU, within the framework of the project Desarrollo de Modelos ALIA.
Cite this model
Please cite the model as follows:
@misc{carballo_legal_2025,
  title        = {Carballo-Legal: A Legal Domain Instruction-Tuned Model for Galician and Spanish},
  author       = {Proxecto Nós Team},
  year         = {2025},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/proxectonos/Carballo-Legal}},
}