Embedding
Embeddings Can Be Used For:
Similarity Search/Retrieval: find the items/lines/docs closest to a query by vector distance, e.g. for RAG and recommendation systems
Classification/Clustering/Anomaly Detection
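A minimal sketch of the similarity-search use case, assuming the embeddings have already been computed. The 3-dimensional vectors, the document labels, and the helper names `cosine_similarity` and `top_k` are all made up for illustration; real models return vectors with hundreds or thousands of dimensions, and a vector database would normally replace the linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, corpus, k=2):
    """Return the k (doc_id, score) pairs most similar to query_vec, best first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy, hand-written "embeddings" (hypothetical values, not from any model).
corpus = {
    "river bank": [0.9, 0.1, 0.0],
    "savings bank": [0.1, 0.9, 0.1],
    "loan office": [0.2, 0.8, 0.3],
}
query = [0.0, 1.0, 0.2]  # pretend this is the embedding of a finance question

results = top_k(query, corpus, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

The same scores can drive the other uses listed above: assigning the label of the nearest labeled example for classification, or flagging items whose best similarity falls below a threshold as anomalies.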
Embeddings can be computed at the word, sentence, paragraph, or document level, and can span media such as images and audio (e.g. CLIP). They can also be contextual (BERT, GPT), so a word like "bank" gets a different embedding depending on its context.
Options
Comparing:
MTEB (Massive Text Embedding Benchmark)
Evaluate embedding models against your own use case and language, as no model is SOTA for all tasks including
Also see
Recommendation 9/21/24:
Other Options
https://replicate.com/collections/embedding-models
Voyage AI
js
py
OpenAI
js