Redis Cloud v2 Metrics: Complete Reference for Monitoring, Alerting, and Production Observability
Redis Cloud v2 metrics provide deep, production-grade observability into every layer of your Redis deployment. Unlike basic INFO-command metrics that most monitoring setups rely on, the v2 metric set includes true histogram latency distributions, memory allocator internals, keyspace distribution by data structure and size, Active-Active CRDT synchronization tracking, and node-level infrastructure metrics. It's the difference between knowing Redis is slow and knowing exactly why.
Redis Cloud exports these metrics natively — no agent required. Connect your monitoring platform using an API key, select your region, and metrics start flowing within minutes. The metrics follow standard conventions: Gauge (current value), Count (monotonically increasing counter), Histogram (latency distribution), and Info (metadata). They arrive pre-tagged with cluster, database, shard, region, and role labels for instant filtering.
Basic Redis monitoring tells you 'latency is high.' v2 metrics tell you 'P99 write latency crossed 8ms, memory fragmentation is at 1.7, the allocator is holding 40% more resident memory than active, and 3 keys in the large sorted-set bucket are blocking the event loop.' That's the depth difference.
What the v2 Metric Set Covers
80+ metrics organized across 12 categories — from Redis process internals to node hardware.
- Configuration and metadata — database config state, throughput limits, cluster metadata
- Memory — 13 metrics covering usage, limits, allocator internals, fragmentation, and background process indicators
- Latency — true histogram distributions for read, write, and other operations (P50/P90/P95/P99, not averages)
- Traffic — request/response counts by type, ingress/egress bytes, backpressure indicators
- Connections — connected clients, blocked clients, connection churn, proxy disconnections, establishment failures
- Network — Redis-level input/output bytes
- CPU — process and per-thread CPU consumption
- Keyspace — key counts, expiry counts, evictions, hit/miss ratios separated by read and write
- Keyspace distribution — key counts by data structure type (String, Hash, List, Set, Sorted Set) and size bucket
- Replication and syncer — replication offsets, Active-Active lag, syncer status, byte-level sync tracking
- Client tracking — server-assisted caching metrics for Redis 6+ client-side caching
- Node-level infrastructure — CPU, memory, network I/O, packet counts on the underlying hardware (Pro subscriptions)
Configuration and Metadata Metrics
These metrics expose database and cluster configuration state. Use them to track config drift, detect unexpected changes, and correlate configuration modifications with performance shifts.
| Metric | Type | Description | Unit |
|---|---|---|---|
| db_config | Info | Database configuration metadata — TLS mode, Redis version, port. Track changes over time to detect config drift. | N/A |
| bdb_max_throughput | Gauge | Maximum configured throughput for the database. If ops/sec approaches this limit, requests will be throttled. | Ops/sec |
| bdb_data | Info | Database-level metadata and configuration data. | N/A |
| cluster_data | Info | Cluster-level metadata — account_id, subscription, region, maintenance flags. The identity record for the cluster. | N/A |
Memory Metrics (13 Metrics)
Memory is the most critical monitoring category for Redis. These 13 metrics cover the full stack: from how much data you're storing, to how the allocator is managing physical memory, to whether background persistence operations are spiking latency.
| Metric | Type | What It Tells You |
|---|---|---|
| redis_server_used_memory | Gauge | Total memory consumed by data. The primary capacity metric. Alert at 80% of maxmemory to leave headroom; evictions begin once the limit itself is reached. |
| redis_server_maxmemory | Gauge | Configured maxmemory limit. The ceiling. Compare with used_memory for utilization percentage. |
| db_memory_limit_bytes | Gauge | Database-level memory limit configured in Redis Cloud. Different from maxmemory when overcommit is enabled. |
| redis_server_used_memory_overhead | Gauge | Memory consumed by Redis internals — buffers, metadata, data structures overhead. Not your data, but still your bill. |
| redis_server_mem_fragmentation_ratio | Gauge | Ratio of RSS (physical memory) to used_memory (logical data). Above 1.5 means 50%+ waste from fragmentation. Below 1.0 means Redis is swapping to disk — critical. |
| redis_server_allocator_allocated | Gauge | Bytes allocated from jemalloc. Includes internal fragmentation within allocated pages. |
| redis_server_allocator_active | Gauge | Bytes in allocator active pages. Includes external fragmentation. Compare with allocated to quantify fragmentation. |
| redis_server_allocator_resident | Gauge | Resident memory held by allocator. The actual OS-level memory footprint Redis occupies. |
| redis_server_active_defrag_running | Gauge | 1 if active defragmentation is running. Correlate with latency — defrag can cause micro-spikes during compaction. |
| redis_server_mem_aof_buffer | Gauge | Memory consumed by the AOF (Append-Only File) buffer. Spikes during heavy write bursts as commands queue for persistence. |
| redis_server_mem_replication_backlog | Gauge | Memory used by the replication backlog. Sized to handle replica reconnection without triggering a full resync. |
| redis_server_rdb_bgsave_in_progress | Gauge | 1 if RDB background save is running. The fork operation can cause latency spikes proportional to dataset size. |
| redis_server_aof_rewrite_in_progress | Gauge | 1 if AOF rewrite is in progress. Another fork-based operation that can spike latency. |
Three memory metrics matter most: used_memory (how full you are), maxmemory (the ceiling before evictions), and mem_fragmentation_ratio (how efficiently you're using what you have). Everything else is diagnostic — reach for them when those three raise a flag.
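Those three signals can be checked mechanically. A minimal sketch, assuming hypothetical sample byte values; the thresholds are the ones discussed above, not official Redis Cloud defaults:

```python
def memory_flags(used, maxmemory, rss, warn_pct=0.80):
    """Evaluate the three headline memory signals.

    used, maxmemory, rss are bytes (used_memory, maxmemory, and
    physical RSS). Thresholds are illustrative, not defaults.
    """
    utilization = used / maxmemory
    fragmentation = rss / used  # mem_fragmentation_ratio: RSS over logical data
    return {
        "near_capacity": utilization >= warn_pct,  # evictions start at 100%
        "fragmented": fragmentation > 1.5,         # >50% physical-memory waste
        "swapping": fragmentation < 1.0,           # RSS below logical size: OS swap
    }

# Hypothetical sample: 850 MB used of a 1 GB limit, 1.4 GB resident
flags = memory_flags(used=850_000_000, maxmemory=1_000_000_000, rss=1_400_000_000)
print(flags)  # {'near_capacity': True, 'fragmented': True, 'swapping': False}
```

The same function works for alert evaluation or for annotating a dashboard panel; only the input source changes.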
Latency Metrics — True Histogram Distributions (9 Metrics)
The v2 metrics provide true histogram latency — not averages. This is the single most important difference from basic Redis monitoring. Averages hide outliers: a distribution with a P50 of 1ms and a P99 of 50ms produces an average that looks fine while 1% of your requests are unacceptably slow. Histograms give you the full distribution: P50, P90, P95, P99.
Latency is broken into three operation categories — read, write, and other — each with count, sum, and bucket metrics.
| Metric | Type | Description |
|---|---|---|
| endpoint_read_requests_latency_histogram_count | Count | Total number of read latency observations. Rate this for read throughput. |
| endpoint_read_requests_latency_histogram_sum | Count | Sum of all read latency values (microseconds). Divide by count for average — but prefer percentiles. |
| endpoint_read_requests_latency_histogram_bucket | Histogram | Read latency distribution across buckets. The raw data for computing P50/P90/P95/P99 read latency. |
| endpoint_write_requests_latency_histogram_count | Count | Total write latency observations. |
| endpoint_write_requests_latency_histogram_sum | Count | Sum of write latency values (microseconds). |
| endpoint_write_requests_latency_histogram_bucket | Histogram | Write latency distribution. The most important metric for AI context store write performance. |
| endpoint_other_requests_latency_histogram_count | Count | Total other command latency observations (admin commands, Pub/Sub, module commands). |
| endpoint_other_requests_latency_histogram_sum | Count | Sum of other command latency values. |
| endpoint_other_requests_latency_histogram_bucket | Histogram | Other command latency distribution. Watch for FT.SEARCH and vector search operations here. |
To compute P99 read latency, use the standard histogram percentile formula: histogram_quantile(0.99, sum(rate(endpoint_read_requests_latency_histogram_bucket[5m])) by (le)). The result is in microseconds — divide by 1000 for milliseconds. This works in any monitoring platform that supports PromQL-style queries.
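For intuition, the interpolation that histogram_quantile performs can be sketched in plain Python: find the cumulative bucket containing the target rank, then linearly interpolate within it. The bucket boundaries and counts below are hypothetical, not actual Redis Cloud bucket edges:

```python
def histogram_quantile(q, buckets):
    """Approximate a quantile from cumulative histogram buckets.

    buckets: list of (le_upper_bound_us, cumulative_count), sorted by bound.
    Mirrors the Prometheus-style linear interpolation inside the target bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return float("nan")
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if count == prev_count:  # empty bucket: no interpolation possible
                return bound
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return buckets[-1][0]

# Hypothetical read-latency buckets in microseconds (le bound -> cumulative count)
buckets = [(500, 9000), (1000, 9500), (5000, 9900), (10000, 10000)]
p99_us = histogram_quantile(0.99, buckets)
print(f"P99 read latency: {p99_us / 1000:.2f} ms")  # P99 read latency: 5.00 ms
```

Note how a healthy-looking median (most observations land under 500 microseconds) coexists with a 5 ms P99 — exactly the outlier-hiding the text describes.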
Traffic Metrics (10 Metrics)
Traffic metrics separate reads, writes, and other commands — and further separate requests from responses. This lets you detect asymmetries: if requests consistently exceed responses, commands are being dropped, timing out, or queuing.
| Metric | Type | Description |
|---|---|---|
| endpoint_read_requests | Count | Total read requests received. Rate this for reads/sec. |
| endpoint_write_requests | Count | Total write requests received. |
| endpoint_other_requests | Count | Non-read/write commands — PING, CONFIG, SUBSCRIBE, module commands, etc. |
| endpoint_read_responses | Count | Responses sent for read requests. Compare with read_requests to detect drops. |
| endpoint_write_responses | Count | Responses sent for write requests. |
| endpoint_other_responses | Count | Responses for other commands. |
| endpoint_ingress | Count | Total bytes transferred into the database. Track for data transfer cost estimation. |
| endpoint_egress | Count | Total bytes transferred out of the database. The primary driver of data transfer cost. |
| endpoint_egress_pending | Gauge | Pending outgoing bytes waiting to be sent. Sustained non-zero values indicate network backpressure — the client can't consume fast enough. |
| endpoint_egress_pending_discarded | Count | Pending bytes discarded because the client disconnected before receiving them. Indicates clients timing out. |
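The request/response asymmetry check described above reduces to a ratio over counter deltas taken on the same window. A sketch with hypothetical sample numbers:

```python
def traffic_asymmetry(requests_delta, responses_delta):
    """Fraction of requests in an interval that never produced a response.

    Inputs are deltas of the endpoint_*_requests / endpoint_*_responses
    counters over the same window. A sustained positive gap suggests
    drops, timeouts, or queuing.
    """
    if requests_delta == 0:
        return 0.0
    return max(0, requests_delta - responses_delta) / requests_delta

# Hypothetical: 120k requests, 118.8k responses in one 5-minute window
gap = traffic_asymmetry(requests_delta=120_000, responses_delta=118_800)
print(f"{gap:.1%} of requests unanswered this interval")  # 1.0% ...
```

Responses can slightly exceed requests at window edges (a request counted in one window, its response in the next), which is why the gap is clamped at zero.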
Connection and Client Metrics (8 Metrics)
| Metric | Type | What It Tells You |
|---|---|---|
| redis_server_connected_clients | Gauge | Current connected clients. Alert at 80% of maxclients to prevent connection exhaustion. |
| redis_server_blocked_clients | Gauge | Clients blocked on BLPOP/BRPOP/WAIT. Sustained non-zero values indicate a consumer bottleneck. |
| redis_server_instantaneous_ops_per_sec | Gauge | Real-time operations per second. The headline throughput metric. |
| endpoint_client_connections | Count | New client connection establishment events. High rate means high connection churn — a sign of missing or broken connection pooling. |
| endpoint_client_disconnections | Count | Client-initiated disconnections. Normal during scale-down or deployment. |
| endpoint_proxy_disconnections | Count | Proxy-initiated disconnections. Non-zero means the Redis Cloud proxy is actively dropping connections — investigate maxclients or proxy resource limits. |
| endpoint_client_connection_expired | Count | Connections expired due to idle TTL. Expected behavior for connection lifecycle management. |
| endpoint_client_establishment_failures | Count | Failed connection attempts. Non-zero means clients are failing to connect — check DNS resolution, TLS certificate validity, maxclients limit, or network connectivity. |
The most overlooked connection metric: endpoint_client_connections rate. If you see hundreds of new connections per minute while connected_clients stays low, your application is connecting and disconnecting on every request. You're paying the TCP+TLS handshake cost — 2-5ms — on every single operation. Fix the connection pool.
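One way to make that churn check concrete, as a rough sketch (the 600-per-minute and 20-client figures are hypothetical):

```python
def churn_ratio(new_connections_per_min, connected_clients):
    """New connections per minute relative to the steady-state client count.

    A ratio well above ~1 means clients reconnect far faster than the pool
    size would require: the connect-per-request anti-pattern. The threshold
    you alert on is a deployment-specific choice.
    """
    return new_connections_per_min / max(connected_clients, 1)

# Hypothetical: 600 new connections/min against only 20 steady clients
ratio = churn_ratio(600, 20)
print(f"churn ratio: {ratio:.0f}x")  # 30x: pooling is almost certainly broken
```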
Network and CPU Metrics
Network (Redis-Level)
| Metric | Type | Description |
|---|---|---|
| redis_server_total_net_input_bytes | Count | Total bytes received by the Redis process. Rate this for inbound bandwidth utilization. |
| redis_server_total_net_output_bytes | Count | Total bytes sent by Redis. Rate this for outbound bandwidth. Also the basis for estimating data transfer costs. |
CPU (Process-Level)
| Metric | Type | Description |
|---|---|---|
| namedprocess_namegroup_cpu_seconds_total | Count | Total CPU seconds consumed by the Redis process. Rate this for overall CPU utilization. |
| namedprocess_namegroup_thread_cpu_seconds_total | Count | CPU seconds per Redis thread. Identifies hot threads — critical for diagnosing I/O thread saturation in Redis 6+ multi-threaded I/O. |
Keyspace Metrics (10 Metrics)
Keyspace metrics tell you what's inside your database: how many keys, how many expire, and whether your reads are hitting or missing. The v2 metrics separate hits and misses by read and write — more granular than the combined keyspace_hits/keyspace_misses in basic Redis INFO.
| Metric | Type | What It Tells You |
|---|---|---|
| redis_server_db_keys | Gauge | Total keys in the database. Track the growth rate to predict when memory limits will be reached. |
| redis_server_db_expires | Gauge | Keys with expiration set. If db_expires is much less than db_keys, many keys are permanent and will never be reclaimed by TTL. |
| redis_server_expired_keys | Count | Keys expired by TTL. Expected behavior — rate indicates how fast your data is cycling through. |
| redis_server_evicted_keys | Count | Keys evicted by the maxmemory eviction policy. Every eviction is data loss. For AI context stores, this means lost agent memories. Alert on any non-zero rate. |
| redis_server_keys_trimmed | Count | Keys trimmed (stream MAXLEN enforcement). Indicates Redis Streams length management is active. |
| redis_server_up | Gauge | Database availability: 1 = up, 0 = down. The most fundamental health check. Alert immediately on 0. |
| redis_server_keyspace_read_hits | Count | Successful read lookups — the key existed. Use with read_misses for read hit ratio calculation. |
| redis_server_keyspace_write_hits | Count | Successful write lookups — the key existed before the write operation. |
| redis_server_keyspace_read_misses | Count | Failed read lookups — the requested key did not exist. High miss rate indicates cold cache, wrong keys, or expired data. |
| redis_server_keyspace_write_misses | Count | Write operations to keys that didn't previously exist (new key creation). |
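The read hit ratio referenced in the table is computed from the split counters. A minimal sketch using hypothetical counter deltas over one window:

```python
def read_hit_ratio(read_hits, read_misses):
    """Read hit ratio from the split v2 counters.

    Inputs are deltas of redis_server_keyspace_read_hits and
    redis_server_keyspace_read_misses over the same window.
    """
    lookups = read_hits + read_misses
    return read_hits / lookups if lookups else float("nan")

# Hypothetical deltas over a 5-minute window
ratio = read_hit_ratio(read_hits=95_000, read_misses=5_000)
print(f"read hit ratio: {ratio:.1%}")  # read hit ratio: 95.0%
```

The same formula with the write counters yields the new-key-creation rate instead of a cache signal, since a write "miss" simply means the key did not previously exist.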
Keyspace Distribution by Data Structure (15 Metrics)
These metrics are unique to the v2 metric set — you won't find them in basic Redis monitoring. They break down your key population by data structure type and size bucket. This is how you find the oversized keys that are silently degrading performance.
| Data Structure | Small Bucket | Medium Bucket | Large Bucket |
|---|---|---|---|
| Strings | strings_sizes_under_128M (< 128MB) | strings_sizes_128M_to_512M | strings_sizes_over_512M (> 512MB) |
| Sorted Sets | zsets_items_under_1M (< 1M items) | zsets_items_1M_to_8M | zsets_items_over_8M (> 8M items) |
| Sets | sets_items_under_1M (< 1M items) | sets_items_1M_to_8M | sets_items_over_8M |
| Lists | lists_items_under_1M (< 1M items) | lists_items_1M_to_8M | lists_items_over_8M |
| Hashes | hashes_items_under_1M (< 1M items) | hashes_items_1M_to_8M | hashes_items_over_8M |
All metric names are prefixed with redis_server_. If any 'large' bucket is non-zero, investigate immediately. A single sorted set with 10 million items or a string exceeding 512MB will cause latency spikes on every operation that touches it. Large keys block the Redis event loop during serialization, deletion, and persistence — affecting all other operations on that shard.
The keyspace distribution table is the fastest way to find the keys that will break your system at scale. If the 'Large' column has any non-zero value, you have a problem — even if latency looks fine today. It will degrade as traffic grows.
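A quick automated check over this table might look like the sketch below; the scrape dict and its single non-zero value are hypothetical, and the metric names follow the bucket table above (redis_server_ prefix included):

```python
# 'Large' bucket metrics from the keyspace distribution table
LARGE_BUCKETS = [
    "redis_server_strings_sizes_over_512M",
    "redis_server_zsets_items_over_8M",
    "redis_server_sets_items_over_8M",
    "redis_server_lists_items_over_8M",
    "redis_server_hashes_items_over_8M",
]

def oversized_key_alerts(scrape):
    """Return the large-bucket metrics with non-zero key counts."""
    return {m: scrape[m] for m in LARGE_BUCKETS if scrape.get(m, 0) > 0}

# Hypothetical scrape: three oversized sorted sets, everything else clean
scrape = {"redis_server_zsets_items_over_8M": 3}
print(oversized_key_alerts(scrape))  # {'redis_server_zsets_items_over_8M': 3}
```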
Replication and Syncer Metrics (9 Metrics)
These metrics cover two replication modes: standard primary-replica synchronization and Active-Active (CRDT) cross-region database synchronization. If you run geo-distributed AI agent deployments with Active-Active databases, the syncer metrics are essential for detecting cross-region lag before it causes stale context retrieval.
| Metric | Type | What It Tells You |
|---|---|---|
| redis_server_master_repl_offset | Gauge | Replication offset on the primary. Compare with slave_offset to compute lag in bytes. |
| redis_server_slave_offset | Gauge | Replication offset on the replica. The difference (master_repl_offset - slave_offset) = replication lag in bytes. |
| database_syncer_dst_lag | Gauge | Lag between the syncer and the destination (milliseconds). The primary health metric for Active-Active sync. |
| database_syncer_current_status | Gauge | Syncer status indicator. 0 = not running. Monitor for unexpected state transitions. |
| database_syncer_total_requests | Count | Total write operations delivered to the destination by the syncer. Rate this for sync throughput. |
| database_syncer_ingress_bytes | Count | Bytes read from the source shard by the syncer. |
| database_syncer_ingress_bytes_decompressed | Count | Decompressed bytes received by the syncer. Compare with ingress_bytes to measure wire compression effectiveness. |
| database_syncer_syncer_repl_offset | Gauge | The syncer's own replication tracking offset. |
| database_syncer_dst_repl_offset | Gauge | The destination's replication offset. Compare with syncer_repl_offset for sync position lag. |
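The byte-level lag formula from the table is a subtraction, and pairing it with write throughput gives a rough catch-up estimate. A sketch with hypothetical offsets and rates:

```python
def replication_lag_bytes(master_repl_offset, slave_offset):
    """Replica lag in bytes: primary offset minus replica offset."""
    return master_repl_offset - slave_offset

def catchup_seconds(lag_bytes, replica_apply_rate, write_rate):
    """Rough time for the replica to catch up, assuming it applies bytes
    faster than new writes arrive. Rates in bytes/sec; values hypothetical."""
    headroom = replica_apply_rate - write_rate
    return float("inf") if headroom <= 0 else lag_bytes / headroom

lag = replication_lag_bytes(master_repl_offset=1_048_576_000,
                            slave_offset=1_048_320_000)
print(lag, catchup_seconds(lag, replica_apply_rate=10_000_000,
                           write_rate=2_000_000))
# 256000 bytes behind; ~0.03 s to drain at 8 MB/s of headroom
```

When headroom is zero or negative the lag never closes — that is the condition worth alerting on, not any particular absolute byte count.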
Client Tracking and Caching Metrics (4 Metrics)
Redis 6+ introduced server-assisted client-side caching — where the server tracks which keys a client has cached locally and sends invalidation messages when those keys change. These metrics show adoption and correctness of that protocol.
| Metric | Type | What It Tells You |
|---|---|---|
| endpoint_client_tracking_on_requests | Count | CLIENT TRACKING ON commands issued. Shows how many clients are using server-assisted caching. |
| endpoint_client_tracking_off_requests | Count | CLIENT TRACKING OFF commands. Clients opting out of tracking. |
| endpoint_disposed_commands_after_client_caching | Count | Commands disposed due to client caching protocol misuse. Non-zero means a client library bug — investigate. |
| endpoint_client_expiration_refresh | Count | Client connection expiration TTL refresh events. |
Node-Level Infrastructure Metrics (9 Metrics — Pro Only)
Pro subscriptions expose the underlying node hardware metrics — the physical machine running your Redis shards. These are invisible on Essentials plans. When Redis-level metrics look fine but performance is degraded, node-level metrics reveal whether you're hitting hardware ceilings: CPU saturation, memory exhaustion at the OS level, or network interface limits.
| Metric | Type | What It Tells You |
|---|---|---|
| node_available_memory_bytes | Gauge | Available memory on the node. If this approaches zero, the OS OOM-killer will terminate processes. |
| node_memory_MemFree_bytes | Gauge | Strictly unallocated memory on the node. Lower than available memory, which also counts reclaimable cache and buffers. |
| node_cpu_seconds_total | Count | Total CPU seconds consumed per mode (user, system, iowait, idle). Rate by mode for CPU utilization breakdown. |
| node_network_receive_bytes_total | Count | Bytes received on the node's network interface. Rate for inbound bandwidth utilization. |
| node_network_transmit_bytes_total | Count | Bytes transmitted. Rate for outbound bandwidth. |
| node_ingress_bytes | Count | Total incoming traffic across all processes on the node. |
| node_egress_bytes | Count | Total outgoing traffic across all processes on the node. |
| node_network_receive_packets_total | Count | Network packets received. High packet rate with low byte rate means small-payload inefficiency — batch your operations. |
| node_network_transmit_packets_total | Count | Network packets transmitted. |
Labels and Tags: The Complete Taxonomy
Every v2 metric arrives pre-tagged with rich labels for filtering, grouping, and dashboard segmentation. There are 9 distinct label categories. Understanding them is the difference between a dashboard that shows 'average across everything' and one that shows 'P99 latency on shard-3 of database-prod-context in us-east-1.'
Default System Tags (Auto-Attached to All Metrics)
| Tag | Description | Use Case |
|---|---|---|
| cluster | Redis Cloud cluster identifier | Filter metrics to a specific cluster |
| db | Database identifier | Scope dashboards to one database |
| shard | Shard identifier | Per-shard analysis for clustered databases — find hot shards |
| region | Cloud region where the database is deployed | Regional dashboards, multi-region latency comparison |
| role | Node role: master or replica | Compare primary vs replica performance |
| account_id | Redis Cloud account identifier | Multi-account environments |
| subscription_id | Subscription identifier | Group metrics by subscription for cost allocation |
| syncer_type | crdt or replica (present only on syncer metrics) | Distinguish Active-Active from standard replication |
Redis Enterprise Core Labels
| Category | Label | Description |
|---|---|---|
| Identity | cluster | Cluster FQDN |
| Identity | bdb / bdb_id | Redis Enterprise database ID |
| Identity | bdb_name | Database name (human-readable) |
| Identity | db | Database ID (CSE) or Redis logical DB (0-15) |
| Identity | node | Node identifier |
| Identity | redis | Shard identifier |
| Topology | role | master/primary or slave |
| Topology | slots | Hash slot range assigned to this shard |
| Topology | status | Shard operational status |
| Topology | shard_type | ram, flash, or total — indicates storage tier |
Active-Active (CRDT) Labels
Available only for Active-Active databases — geo-distributed conflict-free replicated databases across multiple regions.
| Label | Description |
|---|---|
| crdt_guid | Active-Active database GUID — the unique identifier across all participating regions |
| crdt_replica_id | Replica ID (1-10) — identifies which geographic instance within the Active-Active group |
| crdt_peer | Peer ID — the remote region this metric relates to |
| crdt_backlog | Backlog indicator — pending sync data not yet delivered |
| src_id | Source ID for syncer operations |
| dst_id | Destination ID for syncer operations |
Account and Subscription Labels
Attached via cluster_data metrics. Essential for multi-tenant environments, cost allocation dashboards, and subscription-level capacity planning.
| Label | Description |
|---|---|
| account_id | Redis Cloud account ID |
| account_name | Account name |
| cluster_id | Cluster ID |
| cluster_name | Cluster name (human-readable) |
| subscription | Subscription ID |
| vip | Virtual IP assigned to the cluster |
| shared | Shared cluster flag (true for multi-tenant) |
| under_maintenance | Maintenance status — flag active maintenance windows |
CSE Labels (job=rlec_v2)
| Label | Context | Why It Matters |
|---|---|---|
| db_port | db_config metric | Database port — useful for multi-database-per-cluster identification |
| db_version | db_config metric | Redis version running on this database. Track for version drift across databases. |
| tls_mode | db_config metric | TLS enabled or disabled. Flag any production database without TLS enabled. |
Node and Proxy Labels
| Label | Description |
|---|---|
| addr | Node IP address |
| cnm_version | Cluster Node Manager version |
| proxy | Proxy ID |
| endpoint | Endpoint ID |
| port | Listener port |
| driver | Storage driver (e.g., speedb) |
Alerting Labels (job=rlec_node)
Used for infrastructure alert metrics emitted at the node level.
| Label | Description |
|---|---|
| alertname | Alert name identifier |
| alertstate | Alert state (firing, pending, resolved) |
| severity | Alert severity level |
| cloud | Cloud provider |
| region | Cloud region |
| zone_id / zone_name | Availability zone identification |
| machine_type | Instance type (e.g., r6g.xlarge) |
| process | Process name |
| disk_path / directory_path / file_path | Storage path identifiers |
Critical: The db vs bdb Naming Difference
There is a naming inconsistency between metric sources that will break your dashboards and monitors if you mix them.
| Concept | CSE Metrics (rlec_v2) | Standard Metrics |
|---|---|---|
| Database ID | db | bdb |
| Database Name | db_name | bdb_name |
Always check which job the metric originates from before building queries. Mixing db and bdb labels in the same query produces empty results or incorrect aggregations. This is the #1 debugging issue when building Redis Cloud dashboards.
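One defensive pattern is to normalize labels at relabeling or query-building time so a single dashboard template works against both jobs. A sketch — the mapping direction here is a design choice, not a Redis Cloud feature:

```python
def normalize_db_label(labels):
    """Copy the CSE-style 'db'/'db_name' labels into 'bdb'/'bdb_name'
    so queries written against standard jobs also match rlec_v2 series.
    Existing bdb values are never overwritten."""
    out = dict(labels)
    if "db" in out and "bdb" not in out:
        out["bdb"] = out["db"]
    if "db_name" in out and "bdb_name" not in out:
        out["bdb_name"] = out["db_name"]
    return out

# Hypothetical label set from an rlec_v2 (CSE) series
print(normalize_db_label({"db": "42", "db_name": "prod-context"}))
# {'db': '42', 'db_name': 'prod-context', 'bdb': '42', 'bdb_name': 'prod-context'}
```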
Query Grouping Best Practices
How you group metrics determines the granularity of your monitoring. Choose the right level for each dashboard panel.
| Scope | Group By | When to Use |
|---|---|---|
| Account-level | sum by (account_id) | Executive dashboards, multi-account cost views |
| Subscription-level | sum by (subscription) | Subscription cost tracking, capacity planning |
| Database-level | sum by (bdb) or sum by (db) | Per-database monitoring. Use bdb for standard jobs, db for CSE. |
| Shard-level | sum by (shard) | Diagnosing hot shards, detecting uneven data distribution |
| Role-level | sum by (role) | Comparing primary vs replica latency and throughput |
- Avoid grouping by instance unless you are specifically debugging metric collection issues.
- Never aggregate histogram _bucket metrics without preserving the le (bucket boundary) label — you will destroy the distribution and get meaningless percentile values.
- Use custom database tags in Redis Cloud (team, environment, service) — they flow automatically into exported metrics as labels for business-context filtering.
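The le rule in practice: when summing _bucket series across shards, group only by the boundary label and drop everything else. A minimal sketch over a hypothetical two-shard scrape:

```python
from collections import defaultdict

def sum_buckets_preserving_le(series):
    """Aggregate histogram _bucket series across shards while keeping the
    le (bucket boundary) label. 'series' is a list of
    (label_dict, cumulative_count) pairs from one scrape."""
    out = defaultdict(int)
    for labels, count in series:
        out[labels["le"]] += count  # group ONLY by le; shard et al. are dropped
    return dict(out)

# Hypothetical bucket samples from two shards of the same database
series = [
    ({"shard": "1", "le": "1000"}, 500),
    ({"shard": "2", "le": "1000"}, 700),
    ({"shard": "1", "le": "5000"}, 520),
    ({"shard": "2", "le": "5000"}, 730),
]
print(sum_buckets_preserving_le(series))  # {'1000': 1200, '5000': 1250}
```

Summing the same four samples without the le split would collapse the distribution into a single meaningless number, which is exactly what the rule above warns against.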
Building Production Dashboards
Organize your Redis Cloud monitoring dashboard into six sections, each answering one operational question. For Pro subscriptions, add a seventh.
- Section 1 — Health Overview: redis_server_up status indicator, connected_clients gauge, instantaneous_ops_per_sec timeseries, memory utilization percentage. The 10-second glance that answers 'is everything OK right now?'
- Section 2 — Latency Distribution: Histogram percentile graphs for read, write, and other latency. Show P50, P90, P95, P99 on the same chart. For AI context retrieval, P99 read latency is the metric that determines user-perceived agent performance.
- Section 3 — Memory Deep Dive: used_memory vs maxmemory as a utilization gauge. Fragmentation ratio timeseries. Allocator breakdown (allocated vs active vs resident) to quantify fragmentation. Eviction rate. RDB/AOF save indicators overlaid on the latency chart to correlate fork-induced spikes.
- Section 4 — Traffic and Throughput: Read/write/other request rates as stacked area. Ingress/egress bytes for data transfer cost visibility. Request vs response count comparison to detect command drops. egress_pending for backpressure detection.
- Section 5 — Connections: Connected clients timeseries. New connection rate (endpoint_client_connections). Establishment failures. Proxy disconnections. Blocked clients. If new connections per minute significantly exceeds connected_clients, connection pooling is broken.
- Section 6 — Keyspace Intelligence: Key count and expiry count over time. Eviction rate (should be zero for context stores). Read hit ratio as a computed metric (read_hits / (read_hits + read_misses)). Keyspace distribution by data structure — any non-zero 'large' bucket should be a red indicator.
- Section 7 (Pro only) — Node Infrastructure: Node available memory, CPU utilization by mode (user/system/iowait), network packet rates and bandwidth utilization.
Alerting: 6 Monitors That Cover Production Failure Modes
Six alerts. Not twenty. Every additional alert beyond the critical set dilutes team attention and causes alert fatigue — the state where every alert is ignored because most are noise.
| Alert | Condition | Severity | Why It Matters |
|---|---|---|---|
| Database down | redis_server_up = 0 | P1 Critical | Complete outage. All reads and writes fail. Every second counts. |
| Memory critical | used_memory > 85% of maxmemory for 5 minutes | P1 High | Evictions are imminent. Context data, session data, and cached embeddings will be dropped. |
| Evictions active | redis_server_evicted_keys rate > 0 | P1 High | Data is actively being lost. For AI agent context stores, this means lost memories and degraded agent quality. |
| Latency spike | Read or write P95 > configured threshold for 5 minutes | P2 Warning | Context retrieval or write performance degrading. End-user response times will increase. |
| Replication lag | database_syncer_dst_lag > threshold for 5 minutes | P2 Warning | Replicas serving stale data. Active-Active regions are out of sync. Cross-region reads return outdated context. |
| Blocked clients | redis_server_blocked_clients > baseline for 10 minutes | P3 Info | Consumer bottleneck on blocking list/stream operations. Investigate consumer throughput. |
If your Redis monitoring has 20 alerts, you effectively have zero. The team learned to ignore them two weeks after they were created. Six well-tuned alerts with clear severity levels and runbook links outperform fifty noisy ones every time.
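The six conditions can be expressed as one evaluation function. A sketch, where the latency, lag, and blocked-client thresholds are hypothetical placeholders for your own baselines:

```python
def evaluate_alerts(m):
    """Evaluate the six alert conditions against one sample of
    pre-computed metric values 'm'. Duration-based conditions
    ("for 5 minutes") would be enforced by the alerting platform,
    not shown here."""
    return {
        "database_down": m["up"] == 0,
        "memory_critical": m["used_memory"] > 0.85 * m["maxmemory"],
        "evictions_active": m["evicted_keys_rate"] > 0,
        "latency_spike": m["p95_write_ms"] > 8,          # hypothetical threshold
        "replication_lag": m["syncer_dst_lag_ms"] > 500,  # hypothetical threshold
        "blocked_clients": m["blocked_clients"] > 5,      # hypothetical baseline
    }

# Hypothetical sample: memory nearly full and evictions already running
sample = {"up": 1, "used_memory": 900, "maxmemory": 1000,
          "evicted_keys_rate": 2.5, "p95_write_ms": 3.1,
          "syncer_dst_lag_ms": 40, "blocked_clients": 0}
firing = [name for name, on in evaluate_alerts(sample).items() if on]
print(firing)  # ['memory_critical', 'evictions_active']
```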
v2 Metrics vs Basic Redis INFO Monitoring
Many teams monitor Redis using the basic INFO command output — 30-40 metrics covering memory, clients, stats, and replication. The v2 metric set exported by Redis Cloud goes significantly deeper.
| Capability | v2 Metrics (Redis Cloud Export) | Basic INFO Monitoring |
|---|---|---|
| Metric count | 80+ metrics across 12 categories | ~30-40 from INFO sections |
| Latency measurement | True histogram buckets — P50/P90/P95/P99 | Average only (or none without Slowlog parsing) |
| Keyspace distribution | By data structure type and size bucket (15 metrics) | Total keys and expires only |
| Active-Active sync | Full syncer metrics — lag, offsets, bytes, status | Not available |
| Client tracking | Server-assisted caching metrics | Not available |
| Node infrastructure | CPU, memory, network at OS level (Pro) | Not available on managed services |
| Request/response split | Separate read, write, other — requests vs responses | Combined cmdstat only |
| Memory allocator | jemalloc allocated, active, resident | Basic used_memory and RSS only |
| Connection lifecycle | Connect, disconnect, expire, proxy drop, establishment failure | connected_clients count only |
| Labeling depth | 9 label categories — cluster, db, shard, role, region, account, CRDT, node, alerting | None (flat metrics) |
For production AI workloads running on Redis Cloud, the v2 metric set is the monitoring foundation. Basic INFO monitoring was adequate when Redis was a simple cache. When Redis holds your agent's context, vector indexes, session state, and real-time feature stores — you need the full picture.