MongoDB Atlas Data Transfer Costs: Why Your Transfer Bill Exceeds Your Cluster Fee — and How to Fix It
MongoDB Atlas pricing looks simple on the surface — pick a tier (M10, M30, M40...), choose a region, and pay a predictable monthly fee for compute and storage. What catches teams off guard is the second line item: data transfer. It's metered separately, charged per GB, and in many production deployments it exceeds the cluster fee itself.
We've audited Atlas deployments where the M40 cluster cost $480/month and the data transfer bill was $1,400/month. The cluster was right-sized. The data transfer was not.
How Atlas Charges for Data Transfer
Atlas runs on the underlying cloud provider (AWS, GCP, or Azure). Data transfer charges follow the cloud provider's pricing model, passed through to your Atlas bill. The key principle: data entering Atlas is free; data leaving Atlas costs money. The cost depends on where the data goes.
| Transfer Type | Direction | Typical Cost (AWS) |
|---|---|---|
| Data in (ingestion) | App → Atlas | Free |
| Same region, same AZ | Atlas → App (same AZ) | Free |
| Same region, cross-AZ | Atlas → App (different AZ) | $0.01/GB each direction |
| Cross-region | Atlas (us-east-1) → App (eu-west-1) | $0.02/GB |
| Internet egress | Atlas → public internet | $0.09/GB (first 10TB) |
| Atlas → S3 (same region) | Backup/export to S3 | Free (same region) |
| Atlas → S3 (cross-region) | Backup/export to S3 | $0.02/GB |
The numbers look small per GB. They are not small at scale. A cluster returning 500GB/day of query results across availability zones accumulates $150/month in cross-AZ fees alone. Add internet egress for APIs, analytics exports, and backup replication — and the transfer bill quietly doubles the cluster cost.
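The arithmetic is easy to sanity-check. A minimal sketch, using the illustrative AWS rates from the table above (not authoritative Atlas pricing — adjust for your provider and region):

```javascript
// Rough monthly transfer cost: GB per day, times 30 days, times the per-GB rate.
function monthlyTransferCost(gbPerDay, ratePerGb) {
  return gbPerDay * 30 * ratePerGb;
}

// 500 GB/day of cross-AZ query results at $0.01/GB: ~$150/month.
const crossAz = monthlyTransferCost(500, 0.01);
```

Run the same function against your own daily volumes before and after an optimization to estimate the savings.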
Why Data Transfer Costs Exceed Cluster Fees
There are five common patterns that cause data transfer to balloon beyond the subscription cost. Most teams hit at least two of these.
1. Reading from Secondaries Across Availability Zones
Atlas deploys replica set members across availability zones for high availability. When your application reads from secondaries (readPreference: secondaryPreferred), the secondary is often in a different AZ than your application server. Every read response crosses an AZ boundary and incurs $0.01/GB per direction. For read-heavy AI workloads doing thousands of context retrievals per minute, this adds up fast.
2. Oversized Query Results (SELECT * Syndrome)
The most common and most expensive mistake. Applications query MongoDB without projections — returning entire documents when they only need 3-4 fields. A document with 50 fields and embedded arrays might be 8KB. If the application only needs the name and status fields (200 bytes), it's transferring 40x more data than necessary on every query. Multiply by millions of queries per day.
Projections are the single highest-ROI change for data transfer costs. Adding { projection: { name: 1, status: 1 } } to your queries costs nothing and can cut transfer volume by 80-90%.
3. Multi-Region Replication
Atlas makes it easy to add read replicas in other regions for low-latency global reads. What's less obvious: every write to the primary is replicated to every secondary in every region. If your primary is in us-east-1 and you have read replicas in eu-west-1 and ap-southeast-1, every document write is transferred cross-region twice. For write-heavy workloads (event logging, telemetry, AI context updates), replication transfer can dwarf the query transfer.
4. Analytics and ETL Pipelines
Teams run nightly exports, analytics queries, or ETL jobs that scan entire collections. A 200GB collection scanned nightly for a reporting pipeline generates 6TB/month of data transfer — at $0.09/GB for internet egress, that's $540/month just for the export. If the analytics tool is in a different region, it's still $0.02/GB = $120/month.
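As a sanity check on those numbers — a sketch using the same illustrative rates as the table earlier:

```javascript
// A 200 GB collection exported every night for a month.
const monthlyExportGb = 200 * 30; // 6,000 GB = 6 TB

const internetEgress = monthlyExportGb * 0.09; // ~$540/month to the public internet
const crossRegion    = monthlyExportGb * 0.02; // ~$120/month to another region
```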
5. Change Streams Without Filters
MongoDB Change Streams are powerful for real-time sync (to Redis, to Elasticsearch, to downstream services). But an unfiltered change stream on a busy collection sends every insert, update, and delete — including the full document — over the network. If the consumer is in a different AZ or region, every change event is a data transfer charge.
The Optimization Playbook
Here are the concrete fixes, ordered by impact. The first three typically reduce data transfer by 50-70% with minimal application changes.
Use Projections on Every Query
Never return full documents unless you need every field. Use MongoDB projections to return only the fields the application consumes. For AI context retrieval, you likely need the text content and metadata — not the audit trail, internal flags, or embedded history. This is the fastest fix and often the biggest single reduction.
- Before: db.contexts.find({ agentId: '123' }) — returns full 12KB documents.
- After: db.contexts.find({ agentId: '123' }, { projection: { text: 1, metadata: 1 } }) — returns 1.5KB.
- Reduction: 87.5% less data transferred per query.
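With the Node.js driver, the fix is a one-line options object. A sketch — the collection and field names here are illustrative, not from any particular schema:

```javascript
// Request only the fields the application actually consumes.
// _id is returned by default; exclude it too if nothing reads it.
const contextFields = { projection: { text: 1, metadata: 1, _id: 0 } };

// Usage against a connected driver (sketched as a comment):
// const docs = await db.collection('contexts')
//   .find({ agentId: '123' }, contextFields)
//   .toArray();
```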
Co-locate Application and Atlas in the Same AZ
If your application runs on EKS or EC2, deploy it in the same availability zone as the Atlas primary. Same-AZ data transfer is free. Cross-AZ is $0.01/GB each way. For a cluster handling 1TB/month of read responses, same-AZ placement saves $20/month per TB — and the savings scale linearly with volume.
Atlas shows which AZ hosts the primary in the cluster configuration. Pin your application's Kubernetes node group or EC2 placement group to that AZ. If you use readPreference: primaryPreferred (which you should for cost optimization), all reads hit the primary in the same AZ.
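Read preference can be set in the connection string so every client inherits it. A sketch — host, credentials, and database name below are placeholders:

```javascript
// primaryPreferred keeps reads on the primary (and its AZ) unless it is unavailable.
const uri =
  'mongodb+srv://user:pass@cluster0.example.mongodb.net/mydb' +
  '?readPreference=primaryPreferred&retryWrites=true';
```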
Filter Change Streams Aggressively
Use pipeline filters on change streams to only receive the events you need. Specify operationType, restrict to specific fields with $project, and filter by document criteria. A Debezium connector or change stream consumer that only needs inserts on the orders collection should not be receiving updates on every collection in the database.
- Filter by operation: { $match: { operationType: 'insert' } } — ignores updates and deletes.
- Filter by collection: Target specific collections instead of watching the entire database.
- Project fields: { $project: { 'fullDocument.name': 1, 'fullDocument.status': 1 } } — reduces payload per event (note the quoted dotted keys).
- Skip fullDocument: 'updateLookup' when you don't need the full document on updates — the default update event carries only the changed fields (updateDescription), not the whole document.
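Put together, a change stream pipeline implementing the filters above might look like this (collection name and fields are illustrative):

```javascript
// Keep only inserts on `orders`, and trim each event to two fields.
const pipeline = [
  { $match: { operationType: 'insert' } },
  { $project: { 'fullDocument.name': 1, 'fullDocument.status': 1 } },
];

// Usage against a live replica set (sketched as a comment):
// const stream = db.collection('orders').watch(pipeline);
// stream.on('change', (event) => { /* forward the trimmed event downstream */ });
```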
Limit Multi-Region Replication
Only replicate to regions where you have active users who need low-latency reads. If 95% of your traffic is from us-east-1, the eu-west-1 replica is a cost center. Consider removing it and serving EU traffic from the US with slightly higher latency, or use a CDN/cache layer in EU instead of a full Atlas replica.
| Scenario | Replicas | Monthly Transfer Cost (100GB writes/mo) |
|---|---|---|
| Single region, 3-node | 2 replicas, same region | ~$2 (cross-AZ) |
| Two regions | 2 + 2 remote replicas | ~$200 (cross-region replication) |
| Three regions | 2 + 2 + 2 remote replicas | ~$400 (cross-region replication) |
| Single region + Redis cache in EU | 2 replicas + Redis read-through | ~$15 (Redis cache fill only) |
Adding a remote region in Atlas is one click. The data transfer cost of that click can be $200+/month for a moderately active cluster. Always do the math before adding regions.
Move Analytics to Atlas Data Federation or Same-Region S3
Instead of querying your live cluster for analytics (which generates egress), use Atlas Data Federation to query data directly from S3 — where your backups already live. Or export to S3 in the same region (free transfer) and run analytics there. This moves the expensive full-scan queries off your cluster and off your transfer bill.
Enable Compression
MongoDB drivers support wire protocol compression (Snappy, zlib, zstd). Enable compressors in your connection string: mongodb+srv://...?compressors=zstd. The driver compresses data before sending and decompresses on receipt. For text-heavy AI workloads (context chunks, conversation history, embeddings metadata), zstd compression typically reduces wire transfer size by 60-70% with minimal CPU overhead.
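Enabling compression is a connection-string change (host and credentials below are placeholders). Listing a fallback is a good habit, since some drivers need an extra package for zstd:

```javascript
// zstd first, snappy as a fallback for drivers or servers without zstd support.
const uri =
  'mongodb+srv://user:pass@cluster0.example.mongodb.net/mydb' +
  '?compressors=zstd,snappy';
```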
Use $limit and Pagination Correctly
Queries without $limit return all matching documents. If a collection has 10 million matching documents and your application only renders 50, you're transferring 9,999,950 unnecessary documents. Always use $limit in your aggregation pipelines or limit() in find queries. Combine with cursor-based pagination (filtering on _id or another indexed field) instead of skip-based pagination — skip still forces the server to scan past every skipped document, so deep pages get slower and more expensive as the offset grows.
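A minimal cursor-based pagination sketch, assuming pages are ordered by _id (function and collection names are illustrative):

```javascript
// Build the filter/options for the next page from the last _id already seen.
// Passing null for lastId yields the first page.
function nextPage(lastId, pageSize = 50) {
  return {
    filter: lastId ? { _id: { $gt: lastId } } : {},
    options: { sort: { _id: 1 }, limit: pageSize },
  };
}

// Usage (sketched): db.collection('items').find(page.filter, page.options)
```

Each request transfers at most pageSize documents, regardless of how deep into the result set the client has paged.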
How to Monitor Data Transfer Costs
Atlas doesn't surface per-query data transfer costs directly. But you can track transfer at four levels:
- Atlas billing dashboard — Shows total data transfer as a line item, broken down by type (same-region, cross-region, internet). Check this monthly.
- Atlas Profiler — Shows slow queries with bytes returned. Sort by bytesReturned to find the most expensive queries. These are your optimization targets.
- Cloud provider cost explorer — AWS Cost Explorer (or GCP/Azure equivalent) shows NAT gateway charges and data transfer by service. If your Atlas VPC peering routes through a NAT gateway, that's an additional $0.045/GB charge on top of Atlas transfer fees.
- MongoDB driver metrics — Track bytes sent/received in your application's MongoDB driver metrics. Most drivers expose connection pool statistics that include transfer volume.
The Atlas Data Transfer Checklist
- 1. Add projections to every query — return only the fields you use.
- 2. Co-locate application and Atlas primary in the same AZ.
- 3. Use readPreference: primaryPreferred for cost-sensitive workloads.
- 4. Enable wire compression (zstd) in your connection string.
- 5. Filter change streams by operation type and project only needed fields.
- 6. Audit multi-region replicas — remove regions that don't justify the transfer cost.
- 7. Move analytics/ETL to Atlas Data Federation or same-region S3.
- 8. Set $limit on every query; use cursor-based pagination.
- 9. Monitor bytesReturned in the Atlas Profiler to find expensive queries.
- 10. Check for NAT gateway charges — VPC peering avoids the $0.045/GB NAT fee.
Atlas cluster sizing gets all the attention. Data transfer optimization gets none. In our experience, fixing transfer costs saves more money than downsizing the cluster — because most teams already picked the right tier but never looked at how much data leaves it.