Is it okay to use for Flux LoRA training?
#8
by WilsonModt - opened
Would it be okay to use this model as a replacement for CLIP-L?
Of course! It has an MIT license, same as the original CLIP-L by OpenAI. :)
Assuming you mean "training CLIP and the diffusion model together": yes, technically, that is possible. However, it's not the best approach, because CLIP is trained with contrastive learning, which benefits from having many negative examples in each batch, so a bigger batch_size is better (up to a certain point). In my experience, it is best to:
- Fine-tune the CLIP model in full (text and image encoders), standalone.
- Then use the fine-tuned CLIP with the diffusion model for training, but KEEP CLIP FROZEN and train only the diffusion model, so that it aligns with the fine-tuned CLIP.
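The second step can be sketched roughly as follows. This is a minimal illustration, not a Flux training script: it assumes PyTorch and Transformers, and it assumes the checkpoint loads with the standard `CLIPTextModel` class. The helper names `freeze`, `trainable_parameters`, and `load_frozen_text_encoder` are hypothetical, introduced only for this example.

```python
import torch
from torch import nn


def freeze(module: nn.Module) -> nn.Module:
    """Disable gradients and switch to eval mode so the module is not updated."""
    module.requires_grad_(False)
    module.eval()
    return module


def trainable_parameters(*modules: nn.Module) -> list:
    """Collect only the parameters that still require gradients."""
    return [p for m in modules for p in m.parameters() if p.requires_grad]


def load_frozen_text_encoder(repo_id: str = "zer0int/CLIP-GmP-ViT-L-14") -> nn.Module:
    """Load the fine-tuned CLIP text encoder and freeze it.

    Assumption: the repository is loadable via CLIPTextModel.from_pretrained.
    """
    from transformers import CLIPTextModel  # imported lazily; only needed here

    return freeze(CLIPTextModel.from_pretrained(repo_id))
```

With these helpers, a training script would pass only the diffusion-side weights to the optimizer, for example `torch.optim.AdamW(trainable_parameters(text_encoder, diffusion_model), lr=1e-4)`, where `diffusion_model` stands in for whatever is actually being trained (such as LoRA adapters) and the learning rate is a placeholder. Because the frozen encoder contributes no parameters to that list, it stays unchanged throughout training.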
Hope that helps!