TIGER-Lab/ABC-Qwen2VL-Instruct




	---
	base_model: Qwen/Qwen2-VL-7B-Instruct
	language:
	- en
	library_name: peft
	license: mit
	tags:
	- LLM
	- VLM
	- Embedding
	- Multimodal
	pipeline_tag: image-text-to-text
	---

	## Model Details

	Instruction-finetuned adapter for ABC: Achieving Better Control of Multimodal Embeddings using VLMs.

	### Model Sources

	This model is a PEFT adapter trained on top of [Qwen2-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct).

	### Paper and Website

	For more information, please refer to the [paper](https://arxiv.org/abs/2503.00329) and the [website](https://tiger-ai-lab.github.io/ABC/).

	## Citation

	```
	@misc{schneider2025abcachievingbettercontrol,
	title={ABC: Achieving Better Control of Multimodal Embeddings using VLMs},
	author={Benjamin Schneider and Florian Kerschbaum and Wenhu Chen},
	year={2025},
	eprint={2503.00329},
	archivePrefix={arXiv},
	primaryClass={cs.CV},
	url={https://arxiv.org/abs/2503.00329},
	}
	```
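
The card's metadata lists `library_name: peft` and `base_model: Qwen/Qwen2-VL-7B-Instruct`, so loading the adapter might look like the sketch below. It is not the authors' official loading code. It assumes this repository stores a standard PEFT adapter (an `adapter_config.json` plus weights) and that your installed `transformers` version provides `Qwen2VLForConditionalGeneration`. The helper name `load_abc_model` is hypothetical, and extracting embeddings from the loaded model likely needs the project's own code, which the website links to.

```python
"""Hedged sketch: attach the ABC adapter to its base model with PEFT."""

# Taken from the card's `base_model` metadata field.
BASE_MODEL_ID = "Qwen/Qwen2-VL-7B-Instruct"
# This repository; assumed to hold a standard PEFT adapter layout.
ADAPTER_ID = "TIGER-Lab/ABC-Qwen2VL-Instruct"


def load_abc_model(device_map="auto"):
    """Hypothetical helper: return (model, processor) with the adapter attached."""
    # Heavy imports stay local so importing this file does not require them.
    import torch
    from peft import PeftModel
    from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

    # Load the base vision-language model named in the card metadata.
    base = Qwen2VLForConditionalGeneration.from_pretrained(
        BASE_MODEL_ID,
        torch_dtype=torch.bfloat16,
        device_map=device_map,
    )
    # Wrap the base model with the adapter weights from this repository.
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    model.eval()

    # Assumption: the base model's processor is the right one for the adapter.
    processor = AutoProcessor.from_pretrained(BASE_MODEL_ID)
    return model, processor


if __name__ == "__main__":
    # Downloads the 7B base model on first run; needs a suitable GPU or ample RAM.
    model, processor = load_abc_model()
    print(type(model).__name__)
```

Keeping the imports inside the function is a convenience for this sketch only: the file can be imported and inspected without `torch`, `peft`, or `transformers` installed.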