Migrating from Pinecone to Redis Cloud: A Complete Guide to Hybrid Search Migration
Pinecone is a popular managed vector database, but teams running production AI agents increasingly hit its limits — vendor lock-in, opaque pricing at scale, limited query flexibility, and no co-located data processing. Redis Cloud offers a compelling alternative: vector search, full-text search, JSON storage, and caching in a single engine with sub-millisecond latency.
This isn't a database swap. It's an architectural upgrade — from a single-purpose vector store to a unified AI data platform you actually control.
This guide walks through the complete migration path, with particular focus on hybrid search — where the architectural differences matter most.
Why Teams Migrate from Pinecone to Redis Cloud
- Cost predictability — Pinecone's usage-based pricing scales with vectors stored and queries served, which makes bills hard to forecast. At scale (10M+ vectors), Redis Cloud's fixed subscription model is more predictable and typically significantly cheaper.
- Co-located data — Redis stores vectors alongside JSON documents, session data, and cache in one engine. No separate vector DB to manage.
- Full query flexibility — Redis supports full-text search (BM25), vector similarity (KNN/range), tag filters, numeric filters, and geo queries in a single index.
- Network latency — With Redis Cloud on AWS via VPC peering, your vectors live in the same network as your application. No cross-network hops.
- Operational control — Redis Cloud gives you subscription-level control over sharding, replication, persistence, and backup. Pinecone abstracts this away entirely.
Hybrid Search: The Core Architectural Difference
Hybrid search combines text-based relevance (keyword matching) with semantic relevance (vector similarity) in a single query. This is where Pinecone and Redis Cloud diverge fundamentally — in how they represent sparse signals, how they score, and how you control the blend.
Pinecone: Sparse + Dense Vectors with Alpha Blending
Pinecone models hybrid search as two parallel vector spaces. Every document gets a dense vector (from an embedding model like OpenAI or Cohere) and a sparse vector (typically from a BM25 or SPLADE encoder). At query time, you send both a dense query vector and a sparse query vector, and Pinecone blends the scores using an alpha parameter.
The alpha parameter controls the weighting: alpha=1.0 means pure semantic (dense only), alpha=0.0 means pure keyword (sparse only), and alpha=0.5 is an equal 50/50 blend. The final score is: score = alpha × dense_score + (1 - alpha) × sparse_score. In practice, the blend is applied client-side: the dense and sparse query vectors are scaled by alpha and (1 - alpha) before the query, so the index's dot-product score equals the blended score. This is simple to tune but comes with trade-offs — the sparse vectors must be pre-computed and stored alongside the dense vectors, adding meaningful storage overhead. Sparse vectors from BM25 encoders like Pinecone's sparse encoder or SPLADE models produce high-dimensional, sparse representations that consume significant index space.
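The blend above can be sketched in a few lines. This is a minimal illustration of the convex combination and the client-side vector weighting it implies, not Pinecone's actual client code; the function names are ours.

```python
def blend_scores(alpha, dense_score, sparse_score):
    """Convex combination used for hybrid ranking: alpha=1.0 is pure dense."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * dense_score + (1 - alpha) * sparse_score


def weight_query_vectors(alpha, dense_vec, sparse_values):
    """Client-side weighting: scale both query vectors so the index's
    dot-product score equals blend_scores() above."""
    hdense = [v * alpha for v in dense_vec]
    hsparse = {idx: v * (1 - alpha) for idx, v in sparse_values.items()}
    return hdense, hsparse
```

Because the blend is linear, sweeping alpha over a validation query set is cheap: score each query once for dense and once for sparse, then recombine offline.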
Redis Cloud: Dense Vectors + Native BM25 Full-Text Search
Redis takes a fundamentally different approach. Instead of encoding keyword relevance as a sparse vector, Redis has a native full-text search engine built in (RediSearch). This means BM25 scoring runs natively on the text fields — no sparse vector encoding needed. Your documents store the dense vector (embeddings) plus the raw text, and Redis handles keyword scoring at query time.
For hybrid queries, Redis supports two patterns. First, you can run a vector KNN search with a pre-filter — filter by text match, tag, or numeric range, then rank by vector similarity. Second, you can combine BM25 text scoring with vector scoring in a single query using RediSearch's scoring pipeline. The key advantage: no sparse vectors to compute, store, or maintain. The text is the source of truth for keyword relevance.
Pinecone freezes your keyword relevance at index time. Redis computes it live. When your corpus grows, Pinecone's scores go stale — Redis stays accurate.
Redis Tag Filters: The Metadata Advantage
Where Pinecone uses metadata filters (key-value pairs passed at query time), Redis uses TAG fields in the index schema. Tags are indexed as inverted indexes, making exact-match filtering extremely fast. You can combine tag filters with vector search in a single query — for example, find the 10 nearest vectors where category='finance' AND status='published'. This is more expressive and faster than Pinecone's metadata filtering, especially at high cardinality.
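The combined tag-filter-plus-KNN query is just a string in RediSearch syntax, so the translation from a Pinecone metadata filter is mechanical. A small sketch, with field names (`category`, `status`, `vector`) as illustrative assumptions; the resulting string is passed to FT.SEARCH with a PARAMS clause and query dialect 2:

```python
def tag_filtered_knn(tags, k=10, vector_field="vector", param="query_vec"):
    """Build a RediSearch query: exact-match TAG pre-filter, then KNN ranking.
    Clauses inside the parentheses are implicitly ANDed."""
    prefilter = " ".join(f"@{field}:{{{value}}}" for field, value in tags.items())
    return f"({prefilter})=>[KNN {k} @{vector_field} ${param}]"
```

The pre-filter runs against the inverted tag indexes first, so the KNN search only ranks documents that already match.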
Scoring Deep Dive: BM25 vs Pinecone's Sparse Scoring
BM25 (Best Matching 25) is the industry-standard text relevance algorithm. It scores documents based on term frequency (how often the query term appears in the document), inverse document frequency (how rare the term is across all documents), and document length normalization. Redis implements BM25 natively in its full-text search engine with tunable parameters (k1 for term frequency saturation, b for length normalization).
Pinecone does not run BM25 at query time. Instead, it relies on pre-computed sparse vectors that approximate BM25-like scoring. If you use Pinecone's built-in sparse encoder, it generates a sparse representation at index time. If you use SPLADE, the sparse vector captures learned term importance. The critical difference: Redis BM25 scores are computed live against the actual text at query time, while Pinecone's sparse scores are frozen at index time. If your corpus grows or term frequencies shift, Pinecone's sparse scores become stale unless you re-encode.
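To make "computed live against the actual text" concrete, here is textbook Okapi BM25 for a single query term. This is a teaching sketch of the formula with its `k1` and `b` parameters, not RediSearch's internal implementation:

```python
import math


def bm25_score(term, doc, corpus, k1=1.2, b=0.75):
    """Okapi BM25 for one query term. doc and corpus entries are token lists."""
    n = sum(1 for d in corpus if term in d)          # docs containing the term
    N = len(corpus)
    idf = math.log(1 + (N - n + 0.5) / (n + 0.5))    # smoothed inverse doc freq
    tf = doc.count(term)                             # term frequency in this doc
    avgdl = sum(len(d) for d in corpus) / N          # average document length
    norm = k1 * (1 - b + b * len(doc) / avgdl)       # length normalization
    return idf * (tf * (k1 + 1)) / (tf + norm)
```

Note that `idf` and `avgdl` depend on the whole corpus at scoring time, which is exactly why index-time sparse encodings drift as the corpus changes.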
For the dense (vector similarity) component, both platforms use cosine similarity, dot product, or L2 distance — no meaningful difference there. The scoring divergence is entirely in how keyword/text relevance is handled.
Migration Steps: A Production Playbook
Here is the step-by-step approach we use for production Pinecone-to-Redis migrations. Each step is designed for zero-downtime cutover.
Step 1: Redis Cloud Subscription Creation
Create a Redis Cloud Pro subscription on the same AWS region as your application. Choose the memory tier based on your vector count and dimensionality — a rough formula: memory ≈ (vector_count × dimensions × 4 bytes) × 2.5 (for index overhead). Enable Redis Stack modules (RediSearch + RedisJSON) on the subscription. Configure persistence (AOF or snapshot) based on your durability requirements.
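The sizing formula above is easy to script when planning the subscription tier. A rough estimator, assuming FLOAT32 vectors; the 2.5× overhead factor is the heuristic from the text, so treat the result as a starting point rather than a guarantee:

```python
def estimate_memory_bytes(vector_count, dimensions, bytes_per_dim=4, overhead=2.5):
    """memory ~= vector_count * dims * 4 bytes * 2.5 (HNSW graph + index overhead)."""
    return int(vector_count * dimensions * bytes_per_dim * overhead)


# e.g. 10M vectors at 1536 dimensions (a common embedding size)
gb = estimate_memory_bytes(10_000_000, 1536) / 1e9
```

Leave headroom on top of this for JSON payloads, replication, and fragmentation, and re-check against actual memory usage during the bulk load in Step 4.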
Step 2: VPC Peering with AWS
Set up VPC peering between your AWS VPC and the Redis Cloud VPC. This ensures your application communicates with Redis over private networking — no public internet, no extra latency. In the Redis Cloud console, go to your subscription's Connectivity tab, initiate peering, accept the request in the AWS VPC console, and update your route tables. Verify connectivity from your application's subnet before proceeding.
VPC peering is non-negotiable for production vector search. Every millisecond of network overhead compounds across thousands of queries per second.
Step 3: Index Schema Design and Creation
This is the most critical step. You need to map your Pinecone index schema to a Redis index schema. In Pinecone, you have a namespace, a dimension, a metric (cosine/dotproduct/euclidean), and metadata fields. In Redis, you create an index using FT.CREATE with a VECTOR field (specifying algorithm HNSW or FLAT, dimension, distance metric), TEXT fields (for BM25-searchable content), TAG fields (for filterable metadata), and NUMERIC fields (for range queries).
A typical mapping: Pinecone dense vector → Redis VECTOR field (HNSW, same dimensions, same metric). Pinecone sparse vector → not needed; use TEXT fields on the raw text instead. Pinecone metadata string fields → Redis TAG fields. Pinecone metadata numeric fields → Redis NUMERIC fields. Using RedisVL (Redis Vector Library), schema creation is declarative via YAML.
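The mapping above can be captured as a RedisVL YAML schema. A sketch under assumptions: field names (`content`, `category`, `status`, `published_at`, `embedding`) and the 1536-dimension cosine setup are illustrative, and the exact YAML keys should be checked against the RedisVL version you install:

```yaml
version: "0.1.0"
index:
  name: docs-idx
  prefix: "docs:"
  storage_type: json

fields:
  - name: content          # raw text -> BM25-searchable TEXT field
    type: text
  - name: category         # Pinecone metadata string -> TAG
    type: tag
  - name: status
    type: tag
  - name: published_at     # Pinecone metadata number -> NUMERIC
    type: numeric
  - name: embedding        # Pinecone dense vector -> VECTOR
    type: vector
    attrs:
      algorithm: hnsw
      dims: 1536
      distance_metric: cosine
      datatype: float32
```

Version this file in git alongside your application code so the schema mapping is reviewable and reproducible.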
Step 4: Embedding and Data Migration
Export your vectors and metadata from Pinecone using the fetch or list+fetch API (Pinecone doesn't have a bulk export — you'll need to iterate over IDs or use the list endpoint with pagination). For each record, extract the dense vector, the metadata, and the original text (if stored in metadata). Write each record to Redis as a JSON document using RedisVL's load or direct Redis JSON.SET commands. The dense vector goes into the VECTOR field, text into TEXT fields, and metadata into TAG/NUMERIC fields.
For large migrations (millions of vectors), batch the writes using Redis pipelines — typically 500-1000 records per pipeline batch. This keeps throughput high without overwhelming the cluster. Monitor memory usage during the load and scale the Redis subscription if needed.
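The batching itself is a few lines; the write loop is sketched in comments because it needs a live cluster and your exported record shape. Key names and metadata fields below are illustrative assumptions, not a fixed contract:

```python
def batches(records, size=500):
    """Yield successive fixed-size chunks for pipelined writes."""
    for i in range(0, len(records), size):
        yield records[i:i + size]


# Sketch of the write loop (requires redis-py and a live connection):
#
# import redis
# r = redis.Redis(host="<redis-cloud-endpoint>", port=6379, password="...")
# for chunk in batches(exported_records, size=500):
#     pipe = r.pipeline(transaction=False)     # no MULTI/EXEC, just batching
#     for rec in chunk:
#         pipe.json().set(f"docs:{rec['id']}", "$", {
#             "embedding": rec["values"],                    # Pinecone dense vector
#             "content": rec["metadata"].get("text", ""),    # raw text -> TEXT field
#             "category": rec["metadata"].get("category", ""),  # -> TAG field
#         })
#     pipe.execute()
```

Non-transactional pipelines keep round trips low without holding locks; 500-1000 records per batch is a reasonable default before tuning against observed throughput.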
Step 5: Query Translation and Mapping
Translate your Pinecone query patterns to Redis query syntax. RedisVL makes this straightforward. A Pinecone hybrid query like index.query(vector=query_embedding, sparse_vector=sparse_query, top_k=10, filter={'category': 'finance'}) (with alpha applied by pre-scaling the two query vectors) becomes a Redis query that combines KNN vector search with a TAG filter: FT.SEARCH idx '(@category:{finance})=>[KNN 10 @vector $query_vec]' PARAMS 2 query_vec <blob> DIALECT 2 — the KNN parameter syntax requires query dialect 2. For hybrid text+vector queries, use: '(@content:search terms)=>[KNN 10 @vector $query_vec]' where @content is the BM25 text field.
With RedisVL in Python, this is cleaner. Create a VectorQuery with a filter expression, or use HybridQuery to combine text and vector scoring. RedisVL handles the RediSearch syntax generation. The library also supports result ranking, score normalization, and re-ranking patterns.
Step 6: Validation and Benchmarking
Before cutting over, run your top 100-200 production queries against both Pinecone and Redis, and compare results. Check recall (are the same relevant documents in the top-K?), latency (Redis should be faster due to VPC-local networking), and score distributions. Pay special attention to hybrid queries — the scoring blend will differ because Redis uses live BM25 vs Pinecone's pre-computed sparse scores. You may need to adjust your result ranking or re-ranking logic.
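A simple recall@k comparison is enough to quantify the overlap between the two systems' results. A minimal sketch; it assumes you have each system's ranked result IDs for the same query:

```python
def recall_at_k(reference_ids, candidate_ids, k=10):
    """Fraction of the reference top-k also present in the candidate top-k."""
    ref = set(reference_ids[:k])
    cand = set(candidate_ids[:k])
    if not ref:
        return 0.0
    return len(ref & cand) / len(ref)
```

Run this per query with Pinecone's results as the reference, then inspect the low-recall queries by hand; hybrid queries will naturally show more divergence than pure vector queries because the keyword scoring differs.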
Using RedisVL: The Migration Accelerator
RedisVL (redisvl) is the official Python client library for vector search on Redis. It abstracts away the RediSearch query syntax and provides a clean, Pythonic interface for index management, document loading, and querying. For migrations, three features are particularly valuable.
- Schema-as-code — Define your index schema in YAML or Python, version it in git, and create/update indexes programmatically. This makes the Pinecone-to-Redis schema mapping reproducible.
- Bulk loading — RedisVL's SearchIndex.load() accepts lists of dictionaries and handles batching, serialization, and error handling. Point it at your exported Pinecone data and let it write.
- Query abstraction — VectorQuery, FilterQuery, RangeQuery, and HybridQuery classes map directly to common Pinecone query patterns. Swap the client, keep the logic.
The best migration tools don't just move data — they translate intent. RedisVL lets you swap the vector database without rewriting your search logic.
Other Migration Considerations
Embedding Model Compatibility
Your embeddings are model-specific, not database-specific. The same OpenAI, Cohere, or open-source embeddings that work in Pinecone work identically in Redis — just ensure the dimension and distance metric match. No re-embedding is needed.
Namespace to Key-Prefix Mapping
Pinecone uses namespaces to partition data within an index. Redis uses key prefixes — your index is created on a specific key prefix (e.g., docs:), and all documents under that prefix are indexed. Map each Pinecone namespace to a distinct Redis key prefix or a TAG field if you want a single index with namespace filtering.
TTL and Memory Lifecycle
Pinecone doesn't support TTL on vectors. Redis does. If your use case involves expiring embeddings (e.g., session-scoped context, time-sensitive content), you can set per-key TTLs on Redis documents. The index automatically reflects additions and expirations — no manual cleanup.
Multi-tenancy
If you're running multi-tenant workloads, Pinecone uses namespaces or separate indexes per tenant. In Redis, you can use TAG fields for tenant isolation (single index, filter by tenant_id) or separate key prefixes with distinct indexes. The TAG approach is more memory-efficient; the prefix approach gives stronger isolation.
Monitoring and Observability
Redis Cloud provides built-in metrics for memory usage, query latency, throughput, and index size. Integrate with Datadog, Grafana, or New Relic for production dashboards. Monitor FT.INFO for index statistics and FT.PROFILE for per-query latency breakdowns — something Pinecone's managed service doesn't expose.
Rollback Strategy
Keep Pinecone running in read-only mode during the migration. Route writes to both systems during the transition window. Once Redis validation passes and production traffic is stable for 48-72 hours, decommission the Pinecone index. This dual-write pattern ensures zero-risk cutover.
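The dual-write pattern can be wrapped in a thin shim so application code doesn't change during the transition. This is an illustrative pattern, not a library API: the store objects and their upsert/query methods are assumptions standing in for your Pinecone and Redis clients.

```python
class DualWriter:
    """Write to both stores during cutover; serve reads from the primary.
    Secondary failures are reported, not raised, so an outage in the
    retiring store cannot break the primary write path."""

    def __init__(self, primary, secondary, on_error=print):
        self.primary = primary
        self.secondary = secondary
        self.on_error = on_error

    def upsert(self, doc_id, record):
        self.primary.upsert(doc_id, record)           # must succeed
        try:
            self.secondary.upsert(doc_id, record)     # best-effort mirror
        except Exception as exc:
            self.on_error(f"secondary write failed for {doc_id}: {exc}")

    def query(self, *args, **kwargs):
        return self.primary.query(*args, **kwargs)
```

During the transition window Redis is the primary and Pinecone the secondary; flipping the roles (or removing the shim) is a one-line change at cutover.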
The Bottom Line
Migrating from Pinecone to Redis Cloud is not just a database swap — it's an architectural upgrade. You move from a single-purpose vector store to a unified data platform that handles vectors, full-text search, JSON documents, caching, and session storage in one engine. The hybrid search model is more powerful (live BM25 vs frozen sparse vectors), the cost model is more predictable, and the operational control is significantly deeper.
For teams running production AI agents, the question isn't whether to consolidate your vector search — it's how fast you can move from a managed black box to infrastructure you actually control.