How to use nomic-ai/nomic-embed-text-v1-unsupervised with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("nomic-ai/nomic-embed-text-v1-unsupervised", trust_remote_code=True)
sentences = [
    "That is a happy person",
    "That is a happy dog",
    "That is a very happy person",
    "Today is a sunny day",
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

How to use nomic-ai/nomic-embed-text-v1-unsupervised with Transformers:
# Load model directly
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("nomic-ai/nomic-embed-text-v1-unsupervised", trust_remote_code=True)
model = AutoModel.from_pretrained("nomic-ai/nomic-embed-text-v1-unsupervised", trust_remote_code=True)

How to use nomic-ai/nomic-embed-text-v1-unsupervised with Transformers.js:
// npm i @huggingface/transformers
import { pipeline } from '@huggingface/transformers';
// Allocate pipeline
// 'feature-extraction' is the Transformers.js task that returns embeddings
const pipe = await pipeline('feature-extraction', 'nomic-ai/nomic-embed-text-v1-unsupervised');

nomic-embed-text-v1-unsupervised is a text encoder with an 8192-token context length. It is the checkpoint produced by the contrastive pretraining stage of the multi-stage contrastive training used for the final model. We are releasing this checkpoint to open-source the training artifacts from our Nomic Embed Text tech report.
If you want to use a model to extract embeddings, we suggest using nomic-embed-text-v1.
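
For readers following that suggestion, here is a minimal sketch of extracting embeddings with nomic-embed-text-v1 through sentence-transformers. It assumes the task-prefix convention described on the nomic-embed-text-v1 model card (`search_document`, `search_query`, `clustering`, `classification`), which this page does not spell out. The helper names `with_task_prefix` and `embed` are hypothetical and exist only for illustration.

```python
from typing import List

# Task prefixes as listed on the nomic-embed-text-v1 model card.
# Assumption: they are not documented in this section itself.
TASK_PREFIXES = ("search_document", "search_query", "clustering", "classification")


def with_task_prefix(texts: List[str], task: str = "search_document") -> List[str]:
    """Prepend '<task>: ' to every text, rejecting unknown task names."""
    if task not in TASK_PREFIXES:
        raise ValueError(f"unknown task {task!r}; expected one of {TASK_PREFIXES}")
    return [f"{task}: {text}" for text in texts]


def embed(texts: List[str], task: str = "search_document"):
    """Encode texts with nomic-embed-text-v1 (downloads the model on first use)."""
    # Imported lazily so the prefix helper works without the library installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("nomic-ai/nomic-embed-text-v1", trust_remote_code=True)
    return model.encode(with_task_prefix(texts, task))


# Example usage (requires sentence-transformers and network access):
# doc_vectors = embed(["That is a happy person", "Today is a sunny day"])
# query_vectors = embed(["Who is happy?"], task="search_query")
```

Keeping the prefixing in its own function makes it easy to check that documents and queries are tagged consistently before any model is loaded.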