
	---
	license: apache-2.0
	tags:
	- onnx
	- embeddings
	- medical
	- transformers.js
	- sentence-transformers
	library_name: transformers.js
	pipeline_tag: feature-extraction
	---

	# AleksanderObuchowski/medembed-small-onnx

	This is an ONNX export of [abhinand/MedEmbed-small-v0.1](https://huggingface.co/abhinand/MedEmbed-small-v0.1) optimized for use with [transformers.js](https://huggingface.co/docs/transformers.js).

	## Model Description

	This is a medical text embedding model converted to ONNX format for efficient inference in web browsers and on edge devices. The repository includes both a full-precision and a quantized version, so you can trade accuracy against size and speed.

	## Files

	- `model.onnx` - Full precision ONNX model
	- `model_quantized.onnx` - Quantized ONNX model (recommended for web deployment)
	- `tokenizer.json` - Tokenizer configuration
	- `config.json` - Model configuration
	- Other tokenizer files for full compatibility

	## Usage

	### With transformers.js

	```javascript
	import { pipeline } from '@xenova/transformers';

	// Load the model (quantized version for better performance)
	const extractor = await pipeline('feature-extraction', 'AleksanderObuchowski/medembed-small-onnx', {
	  quantized: true
	});

	// Generate embeddings
	const text = "This patient shows symptoms of diabetes.";
	const embeddings = await extractor(text, { pooling: 'mean', normalize: true });
	console.log(embeddings);
	```

	### With Python (ONNX Runtime)

	```python
	import onnxruntime as ort
	from transformers import AutoTokenizer
	import numpy as np

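	# Load the tokenizer from the Hub and the ONNX model from a local file
	# (assumes model_quantized.onnx has already been downloaded to the working directory)
	tokenizer = AutoTokenizer.from_pretrained('AleksanderObuchowski/medembed-small-onnx')
	session = ort.InferenceSession('model_quantized.onnx')

	# Tokenize input
	text = "This patient shows symptoms of diabetes."
	inputs = tokenizer(text, return_tensors="np")

	# Run inference, passing only the inputs the exported graph declares
	input_names = {i.name for i in session.get_inputs()}
	outputs = session.run(None, {k: v for k, v in inputs.items() if k in input_names})

	# First model output; for encoder exports this is usually token-level hidden states,
	# so pool over tokens (e.g. mean) to get a single sentence vector
	embeddings = outputs[0]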
	```
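
The transformers.js example requests `pooling: 'mean', normalize: true`, while the ONNX Runtime example stops at the raw model output. Assuming `outputs[0]` holds token-level hidden states of shape `(batch, seq_len, hidden)` (typical for encoder exports, but not confirmed for this model), a minimal sketch of the equivalent post-processing in NumPy could look like the following. The helpers `mean_pool` and `l2_normalize` are hypothetical names, and the toy arrays stand in for real model output.

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors per sequence, ignoring padded positions."""
    # Expand the mask to (batch, seq_len, 1) so it broadcasts over the hidden dimension
    mask = attention_mask[..., np.newaxis].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    # Guard against division by zero for sequences with no real tokens
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts

def l2_normalize(vectors):
    """Scale each row to unit length."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)

# Toy stand-in for outputs[0]: 1 sequence, 3 token positions, hidden size 2
token_embeddings = np.array(
    [[[1.0, 0.0], [3.0, 4.0], [100.0, 100.0]]], dtype=np.float32
)
attention_mask = np.array([[1, 1, 0]])  # third position is padding and is ignored

sentence_embedding = l2_normalize(mean_pool(token_embeddings, attention_mask))
print(sentence_embedding.shape)  # (1, 2)
```

With real inputs, you would pass `outputs[0]` and `inputs["attention_mask"]` from the example above. Because the rows are unit length, the dot product of two sentence embeddings equals their cosine similarity.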

	## Performance

	Compared with the full-precision model, the quantized model offers:
	- Smaller file size (typically 50-75% smaller)
	- Faster inference on CPU
	- Lower memory usage
	- Comparable accuracy for most use cases
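
To check the size claim against your own copies of the files, a small sketch like the one below may help. The helper `size_reduction_percent` is a hypothetical name, and the demo writes placeholder files with made-up sizes so that it runs anywhere. The printed figure is not a measurement of this model.

```python
import os
import tempfile

def size_reduction_percent(full_path, quantized_path):
    """Return how much smaller the quantized file is, as a percentage of the full file."""
    full_size = os.path.getsize(full_path)
    quantized_size = os.path.getsize(quantized_path)
    return 100.0 * (1.0 - quantized_size / full_size)

# Demo with placeholder files (sizes are made up for illustration)
with tempfile.TemporaryDirectory() as tmp:
    full = os.path.join(tmp, "model.onnx")
    quant = os.path.join(tmp, "model_quantized.onnx")
    with open(full, "wb") as f:
        f.write(b"\0" * 1000)
    with open(quant, "wb") as f:
        f.write(b"\0" * 250)
    reduction = size_reduction_percent(full, quant)

print(f"{reduction:.1f}% smaller")  # 75.0% smaller for these placeholder files
```

Pointing the two paths at your downloaded `model.onnx` and `model_quantized.onnx` gives the actual reduction for this export.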

	## Original Model

	This model is based on [abhinand/MedEmbed-small-v0.1](https://huggingface.co/abhinand/MedEmbed-small-v0.1), which is designed for medical text embeddings.

	## License

	This export is released under the Apache 2.0 license declared in the metadata above, following the original model. Check the original model's page for the full license details.