Embedding
Last updated
Embeddings Can Be Used For:
Similarity Search/Retrieval: find similar items/lines/docs by vector distance, for RAG and recommender systems
Classification/Clustering/Anomaly Detection
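A minimal sketch of the similarity search use case, assuming embeddings are already computed and stored as plain lists of floats. The document ids, the 3-dimensional toy vectors, and the helper names `cosine_similarity` and `top_k` are made up for illustration; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity to anything
    return dot / (norm_a * norm_b)


def top_k(query, corpus, k=2):
    """Return the k (doc_id, score) pairs most similar to the query vector."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy corpus: ids and vectors are illustrative, not model output.
corpus = {
    "doc_finance": [0.9, 0.1, 0.0],
    "doc_river": [0.1, 0.9, 0.2],
    "doc_cooking": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]

results = top_k(query, corpus, k=2)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

This brute-force scan is fine for small collections; larger ones usually rely on a vector database or an approximate nearest-neighbor index instead.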
Embeddings can be computed at the word, sentence, paragraph, or document level, and across media such as images and audio too. They can also be contextual (BERT & GPT), so the word "bank" gets a different embedding depending on its context.
Comparing:
Evaluate embedding models against your own use case and language, as no model is SOTA for all tasks.
Recommendation 9/21/24:
voyage-3 and voyage-3-lite
First 200 million tokens
voyage-3-lite: 3.82% better retrieval than OpenAI v3 large, at 6x lower cost and with a 6x smaller embedding size
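To make the embedding size difference concrete, here is a rough storage estimate. It assumes 3072 dimensions for OpenAI v3 large and 512 for voyage-3-lite (the default output sizes as I understand them, not stated in these notes), float32 values, and a hypothetical collection of one million vectors; the helper name `storage_bytes` is made up for illustration.

```python
def storage_bytes(num_vectors, dims, bytes_per_value=4):
    """Raw storage for dense vectors, ignoring index and metadata overhead."""
    return num_vectors * dims * bytes_per_value


NUM_VECTORS = 1_000_000      # hypothetical corpus size
OPENAI_V3_LARGE_DIMS = 3072  # assumed default dimension
VOYAGE_3_LITE_DIMS = 512     # assumed default dimension

large = storage_bytes(NUM_VECTORS, OPENAI_V3_LARGE_DIMS)
lite = storage_bytes(NUM_VECTORS, VOYAGE_3_LITE_DIMS)

print(f"OpenAI v3 large: {large / 1e9:.3f} GB")
print(f"voyage-3-lite:   {lite / 1e9:.3f} GB")
print(f"ratio:           {large / lite:.1f}x")
```

Smaller vectors also mean less memory and faster distance computations at query time, so the size ratio affects more than disk usage.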
Other Options
https://replicate.com/collections/embedding-models
js
py
Js