Vector representations of text or data that capture semantic meaning, enabling similarity search, clustering, and ranking.
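Similarity search over embeddings typically ranks documents by cosine similarity to the query vector. A minimal sketch with toy 3-dimensional vectors (real models emit hundreds of dimensions; the document texts and values here are hypothetical):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" for two support articles (hypothetical values).
docs = {
    "reset password": [0.9, 0.1, 0.0],
    "billing refund": [0.1, 0.9, 0.1],
}
query = [0.85, 0.15, 0.05]  # embedding of "I can't log in"

# Similarity search: rank documents by cosine similarity to the query.
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]),
                reverse=True)
```

Clustering and ranking use the same primitive: distance in vector space stands in for semantic relatedness.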
Embedding choice affects recall, speed, and cost. PMs balance vector quality against storage/compute costs and latency. Good embeddings reduce hallucinations by surfacing the right context; poor ones create noise and frustrate users. Versioning matters because vectors from different models or preprocessing pipelines are not comparable, so a change invalidates existing indexes.
Pick a model sized for your latency and domain; test multilingual if needed. Normalize text, remove boilerplate, and store metadata for filtering. Track embedding drift: re-index when you change models or preprocessing. In 2026, prefer cheaper domain-tuned small vectors plus re-ranking for precision.
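The practices above can be sketched as a small indexing pipeline: normalize text before embedding, store metadata alongside each vector for filtering, and tag every record with the embedding model version so drift is detectable. Everything here is illustrative: `fake_embed` is a deterministic stand-in for a real embedding API, and the version string and record schema are assumptions.

```python
import hashlib

EMBED_VERSION = "support-small-v2"  # hypothetical model version tag

def normalize(text: str) -> str:
    # Strip boilerplate casing/whitespace before embedding.
    return " ".join(text.lower().split())

def fake_embed(text: str, dim: int = 8) -> list[float]:
    # Stand-in for a real embedding model call; hash-based, deterministic.
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:dim]]

index: list[dict] = []  # vector + metadata + model version per record

def add_document(doc_id: str, text: str, product: str) -> None:
    index.append({
        "id": doc_id,
        "vector": fake_embed(normalize(text)),
        "product": product,              # metadata enables filtered search
        "embed_version": EMBED_VERSION,  # re-index when this changes
    })

def stale_records(current_version: str) -> list[str]:
    # Drift check: records embedded under an older model must be re-indexed,
    # because their vectors are not comparable to new ones.
    return [r["id"] for r in index if r["embed_version"] != current_version]
```

A re-ranking stage for precision would sit downstream of this index: retrieve a broad candidate set by vector similarity, then re-score the top candidates with a heavier model.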
Switching from a generic 1536-dim model to a domain-tuned 768-dim one cut storage 45% and improved top-3 recall on support tickets from 74% to 86%, dropping answer edits per ticket by 18%.