Choosing a Vector Database in 2025: A Decision Framework
Pinecone vs Weaviate vs Qdrant vs Chroma vs pgvector — when to use each, what the real cost looks like at scale, and how to avoid choosing the wrong one.
Choosing the wrong vector database costs you 3–6 months of migration work, and it usually happens because the team picked based on hype or a quick tutorial rather than actual requirements. The right choice depends on four questions: how many vectors, who is hosting it, how complex is your filtering, and what is already in your stack.
This guide cuts through the marketing. No database is universally best — each has a real use case where it wins. The goal is to find yours.
The decision framework: 4 questions first
- Scale: how many vectors now, and in 12 months? Under 1M, 1-50M, and over 50M are meaningfully different regimes.
- Hosting preference: fully managed (you pay for convenience), self-hosted (you pay with ops burden), or hybrid?
- Filtering complexity: do you need metadata filters on vector search? Simple equality filters or complex multi-field queries?
- Existing stack: do you already run Postgres? Already use Kubernetes? Switching costs are real — factor them in.
Comparison table
| DB | Hosting | Index type | Metadata filtering | Cost at 1M vectors | Best for |
|---|---|---|---|---|---|
| Pinecone | Fully managed | Proprietary (HNSW-based) | Good (serverless indexes) | $70–100/mo | Zero ops, fast time-to-production |
| Weaviate | Managed + self-hosted | HNSW | Excellent — rich GraphQL | $25/mo cloud or free self-hosted | Complex filtering, hybrid search, schema-rich data |
| Qdrant | Managed + self-hosted | HNSW | Very good — payload filtering | $25/mo cloud or free self-hosted | High-performance self-hosted, Rust reliability |
| Chroma | Self-hosted only | HNSW (hnswlib) | Basic | Free | Local dev and prototyping only |
| pgvector | Wherever Postgres runs | IVFFlat + HNSW | Full SQL | Cost of your Postgres instance | Teams on Postgres with under 2M vectors |
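
The four questions and the "Best for" column can be combined into a rough shortlisting helper. The sketch below is only an illustration of the framework, not a scoring model: the `Requirements` fields and the `shortlist` function are hypothetical names, and the 2M threshold is taken from the pgvector row of the table.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    vectors_in_12_months: int   # expected vector count a year from now
    want_fully_managed: bool    # True = pay for convenience, no ops burden
    complex_filtering: bool     # multi-field filters or hybrid search needed
    already_on_postgres: bool   # Postgres is already part of the stack
    production: bool = True     # False = local dev or prototype only


def shortlist(req: Requirements) -> list[str]:
    """Return candidate databases suggested by the comparison table."""
    if not req.production:
        # Local dev and prototyping only.
        return ["Chroma"]

    candidates: list[str] = []
    # pgvector goes first because it adds no new infrastructure.
    if req.already_on_postgres and req.vectors_in_12_months < 2_000_000:
        candidates.append("pgvector")
    if req.want_fully_managed:
        candidates.append("Pinecone")   # zero ops, fast time-to-production
    if req.complex_filtering:
        candidates.append("Weaviate")   # complex filtering, hybrid search
    if not req.want_fully_managed:
        candidates.append("Qdrant")     # high-performance self-hosted
    return candidates


print(shortlist(Requirements(500_000, False, False, True)))
print(shortlist(Requirements(20_000_000, True, True, False)))
```

Treat the output as a list of options to evaluate, not a verdict. The cost and migration sections below still apply to whatever it returns.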
Real cost math at scale
Costs at 1M vectors tell a misleading story. The real decision happens at 10M and 100M vectors, where managed databases get expensive fast.
| DB | Monthly cost at 1M vectors | Monthly cost at 10M vectors | Monthly cost at 100M vectors |
|---|---|---|---|
| Pinecone (serverless) | $70–100 | $500–700 | $3,000–5,000 |
| Weaviate Cloud | $25 | $200 | $1,500+ |
| Qdrant Cloud | $25 | $150 | $1,000+ |
| Qdrant self-hosted (AWS) | $50 (EC2) | $150 | $500 (vertical scale) |
| pgvector (RDS Postgres) | $50 | $200 (starts degrading) | Not recommended |
Managed databases scale cost roughly in proportion to vector count. Self-hosted cost scales with instance size instead, and can be 3–5x cheaper beyond 10M vectors if you have the ops capability to run it. The break-even between paying for DevOps time to run self-hosted Qdrant and paying Pinecone fees typically falls around $2,000–3,000/month in database spend.
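
To make the comparison concrete, the table's figures can be reduced to a cost per million vectors and a managed-to-self-hosted ratio. This is a back-of-the-envelope sketch: it assumes the midpoint of each quoted range, takes "+" figures as lower bounds at face value, and uses hypothetical helper names (`cost_per_million`, `savings_factor`).

```python
# Monthly cost in USD at 1M, 10M and 100M vectors, copied from the table.
# Assumption: ranges are replaced by their midpoint, "1,500+" is read as 1500.
COSTS = {
    "Pinecone (serverless)": (85, 600, 4000),
    "Weaviate Cloud": (25, 200, 1500),
    "Qdrant Cloud": (25, 150, 1000),
    "Qdrant self-hosted (AWS)": (50, 150, 500),
}
SCALES_MILLIONS = (1, 10, 100)


def cost_per_million(db: str) -> list[float]:
    """Monthly USD per million vectors at each of the three scales."""
    return [cost / scale for cost, scale in zip(COSTS[db], SCALES_MILLIONS)]


def savings_factor(managed: str, self_hosted: str, scale_index: int) -> float:
    """How many times cheaper self-hosting is at the given scale."""
    return COSTS[managed][scale_index] / COSTS[self_hosted][scale_index]


for name in COSTS:
    print(name, cost_per_million(name))

pinecone, qdrant = "Pinecone (serverless)", "Qdrant self-hosted (AWS)"
for i, scale in enumerate(SCALES_MILLIONS):
    print(f"{scale}M vectors: {savings_factor(pinecone, qdrant, i):.1f}x")
```

On these midpoints, self-hosted Qdrant comes out about 1.7x cheaper than Pinecone at 1M vectors, 4x at 10M, and 8x at 100M. None of these figures include the ops time needed to run it.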
pgvector: when it is actually good enough
pgvector gets dismissed as 'not a real vector database' but it is the right choice in three specific scenarios:
- You already run Postgres and have under 2M vectors: adding pgvector costs nothing and removes an entire infrastructure dependency
- Your filtering is complex: pgvector lets you write arbitrary SQL joins — metadata filtering that would require workarounds in Pinecone is trivial in SQL
- Transactional consistency matters: of the five options compared here, only pgvector lets you read and write vectors and relational data in the same ACID transaction
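
As an illustration of the filtering point, here is a minimal sketch of a filtered similarity query against pgvector. The `documents` and `chunks` tables, their columns, and the `build_search` helper are hypothetical. The `<=>` operator is pgvector's cosine distance, and the query assumes a psycopg-style driver that accepts named `%(name)s` placeholders.

```python
# Assumed schema (hypothetical names), shown for context only:
#   CREATE EXTENSION IF NOT EXISTS vector;
#   CREATE TABLE documents (id bigint PRIMARY KEY, tenant_id bigint,
#                           published_at date);
#   CREATE TABLE chunks (id bigint PRIMARY KEY,
#                        document_id bigint REFERENCES documents(id),
#                        content text, embedding vector(1536));
#   CREATE INDEX ON chunks USING hnsw (embedding vector_cosine_ops);

SEARCH_SQL = """
SELECT c.id, c.content,
       c.embedding <=> %(query_vec)s::vector AS cosine_distance
FROM chunks AS c
JOIN documents AS d ON d.id = c.document_id
WHERE d.tenant_id = %(tenant_id)s
  AND d.published_at >= %(published_after)s
ORDER BY c.embedding <=> %(query_vec)s::vector
LIMIT %(k)s
"""


def to_pgvector_literal(vec: list[float]) -> str:
    """Format a Python list as pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in vec) + "]"


def build_search(query_vec: list[float], tenant_id: int,
                 published_after: str, k: int = 10) -> tuple[str, dict]:
    """Return the SQL text and parameters for a filtered nearest-neighbour query."""
    params = {
        "query_vec": to_pgvector_literal(query_vec),
        "tenant_id": tenant_id,
        "published_after": published_after,
        "k": k,
    }
    return SEARCH_SQL, params


sql, params = build_search([0.1, 0.2, 1], tenant_id=42,
                           published_after="2025-01-01", k=5)
print(params["query_vec"])
```

With a live connection, the result would be passed to something like `cursor.execute(sql, params)`. The join and both filters are ordinary SQL, so adding a further condition is just another `AND` clause.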
pgvector's weakness is query performance at scale. At 5M+ vectors with over 100 QPS, IVFFlat index performance degrades and the HNSW index requires significant memory. For high-traffic semantic search at scale, purpose-built vector databases win on latency.
Practical rule: use pgvector if you are under 2M vectors and already on Postgres. Evaluate dedicated vector DBs when you cross 2M vectors OR when vector search latency becomes a user-facing issue. Do not pre-optimise to Pinecone at 50K vectors.
Pinecone: the managed convenience tax
Pinecone is genuinely good at one thing: getting you to production fast with zero ops. Serverless indexes, automatic scaling, solid SDKs, good documentation. The tax you pay is cost at scale and vendor lock-in.
Pinecone uses a proprietary index format with no standard export. Migrating off Pinecone means reading every vector back out through the API, or re-embedding from source, and then re-indexing everything from scratch. That is a non-trivial project at 10M+ vectors.
Weaviate vs. Qdrant: the real differences
Both are excellent open-source vector databases with managed and self-hosted options. The real differences:
- Weaviate: stronger on hybrid search (BM25 + vector out of the box), richer schema/ontology system, GraphQL query interface — better for data with complex structure and cross-object relationships
- Qdrant: faster raw performance in benchmarks, written in Rust (lean operational profile), excellent payload filtering, simpler surface area — better if you want a focused, high-performance vector store with fewer moving parts
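
For readers unfamiliar with hybrid search, the idea is to merge a keyword (BM25) ranking and a vector ranking into one result list. The sketch below uses reciprocal rank fusion, a common generic technique for this. It is an illustration only, not a claim about how Weaviate or Qdrant implement fusion internally, and the constant `k = 60` is a conventional default assumed here.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked highly by both retrievers rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["a", "b", "c"]      # keyword ranking (hypothetical IDs)
vector_hits = ["b", "d", "a"]    # vector-similarity ranking
print(reciprocal_rank_fusion([bm25_hits, vector_hits]))
# -> ['b', 'a', 'd', 'c']
```

Documents `a` and `b` appear in both lists, so they outrank `c` and `d`, which each appear in only one. A database with hybrid search built in saves you from maintaining this merging step yourself.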
Chroma: dev tool, not production tool
Chroma is excellent for local development and prototyping. It is the default in most LangChain tutorials precisely because it requires zero setup. Do not use it in production: no built-in persistence guarantees, no clustering, no managed option, and performance degrades significantly above 500K vectors.
If your team is using Chroma in production with over 200K vectors, migrate now. Not because Chroma is bad software — it was never designed for production load.
Migration pain points
If you need to migrate between vector databases, the main costs are: re-embedding all documents if you change the embedding model along the way (otherwise existing vectors can be copied across, provided the target supports the same dimension and distance metric), rewriting all query code for the new SDK, and re-validating retrieval quality metrics on the new system. Plan for 2–4 weeks for a production migration at 1M+ vectors.
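
One simple way to approach the re-validation step is to replay a fixed set of queries against both systems and measure how much of the old top-k the new system still returns. The sketch below assumes you have already collected ranked document IDs per query from each system. The function names and the choice of overlap as the metric are illustrative, not a prescribed method.

```python
def overlap_at_k(old_results: list[str], new_results: list[str], k: int) -> float:
    """Fraction of the old system's top-k IDs that the new system also returns."""
    old_top = set(old_results[:k])
    new_top = set(new_results[:k])
    if not old_top:
        return 1.0  # nothing to preserve for this query
    return len(old_top & new_top) / len(old_top)


def mean_overlap(old_by_query: dict[str, list[str]],
                 new_by_query: dict[str, list[str]], k: int = 10) -> float:
    """Average overlap@k across all queries recorded for the old system."""
    if not old_by_query:
        raise ValueError("need at least one query to compare")
    total = sum(
        overlap_at_k(results, new_by_query.get(query, []), k)
        for query, results in old_by_query.items()
    )
    return total / len(old_by_query)


old = {"q1": ["a", "b", "c", "d"], "q2": ["x", "y"]}
new = {"q1": ["a", "c", "e", "b"], "q2": ["y", "x"]}
print(round(mean_overlap(old, new, k=3), 3))
```

A low overlap does not necessarily mean the new system is worse, since approximate indexes legitimately differ. It does tell you which queries to inspect by hand before cutting over.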
Red flags in each option
- Pinecone: you are at 5M+ vectors and the bill is over $1,500/month with no efficiency path forward
- Weaviate: team has no GraphQL experience and the query interface is causing friction
- Qdrant: you chose it for the self-hosted cost savings but lack the ops experience to run it yourself
- Chroma: any production deployment with over 100K vectors
- pgvector: over 5M vectors or over 100 QPS, at which point you will hit query latency walls