Hugging Face Hub models matching the filter "quantllm":
| Model | Task | Params | Downloads | Likes |
|---|---|---|---|---|
| codewithdark/Llama-3.2-3B-4bit | | 3B | 6 | |
| codewithdark/Llama-3.2-3B-GGUF-4bit | | 3B | 8 | |
| codewithdark/Llama-3.2-3B-4bit-mlx | Text Generation | 3B | 67 | |
| QuantLLM/Llama-3.2-3B-4bit-mlx | Text Generation | 3B | 21 | |
| QuantLLM/Llama-3.2-3B-2bit-mlx | Text Generation | 3B | 32 | |
| QuantLLM/Llama-3.2-3B-8bit-mlx | Text Generation | 3B | 37 | |
| QuantLLM/Llama-3.2-3B-5bit-mlx | Text Generation | 3B | 22 | |
| QuantLLM/Llama-3.2-3B-5bit-gguf | | 3B | 16 | |
| QuantLLM/Llama-3.2-3B-2bit-gguf | | 3B | 9 | |
| QuantLLM/functiongemma-270m-it-8bit-gguf | | 0.3B | 8 | 1 |
| QuantLLM/functiongemma-270m-it-4bit-gguf | | 0.3B | 6 | |
| QuantLLM/functiongemma-270m-it-4bit-mlx | Text Generation | 0.3B | 11 | |
| QuantLLM/Meta-Llama-3-70B-Instruct-4bit-gguf | Text Generation | 71B | 110 | 1 |
| QuantLLM/Qwen3-0.6B-2bit-gguf | | 0.6B | 184 | |
| QuantLLM/Qwen3-0.6B-4bit-gguf | | 0.6B | 170 | |
| QuantLLM/Qwen3-0.6B-8bit-gguf | | 0.6B | 11 | |
| QuantLLM/TinyLlama-1.1B-Chat-GGUF | | 1B | 96 | |
| QuantLLM/SmolLM2-135M-QuantLLM | Text Generation | 0.1B | 298 | |
| QuantLLM/SmolLM2-135M-GGUF | | 0.1B | 189 | |
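Several of the checkpoints above ship the same model at 2-, 4-, 5-, and 8-bit precision. As a rough, back-of-envelope sketch of why those bit widths matter, the following estimates weight-storage size for a 3B-parameter model; it deliberately ignores quantization scales, metadata, and per-format (GGUF/MLX) overhead, which add real size on disk, and `approx_gib` is a hypothetical helper, not part of any listed repo.

```python
# Approximate weight storage for a model quantized to a given bit width.
# This is a lower bound: real GGUF/MLX files also store scales and metadata.
PARAMS = 3_000_000_000  # a "3B" model, as in the Llama-3.2-3B variants above

def approx_gib(params: int, bits: int) -> float:
    """Bytes needed to store `params` weights at `bits` bits each, in GiB."""
    return params * bits / 8 / 2**30

for bits in (2, 4, 5, 8):
    print(f"{bits}-bit: ~{approx_gib(PARAMS, bits):.2f} GiB")
# → 2-bit: ~0.70 GiB, 4-bit: ~1.40 GiB, 5-bit: ~1.75 GiB, 8-bit: ~2.79 GiB
```

This is the trade-off the listing reflects: the 2-bit variants fit in a fraction of the memory of the 8-bit ones, at the cost of quantization error.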