Observability · Prometheus · Redis Cloud · Redis Enterprise

Scraping Redis Metrics with Prometheus: Redis Cloud & Redis Enterprise

Polystreak Team · 2026-04-02 · 7 min read

Your Redis cluster is emitting metrics right now — hundreds of them. Memory usage, command latency, connections, replication state, eviction counts. The data is there. The question is how you pull it out, where you store it, and which of those hundreds actually matter.

Redis Enterprise exposes a Prometheus-compatible endpoint on port 8070. Add one scrape job to your config and you have visibility into your entire cluster.

How Prometheus Scraping Works

Prometheus is a pull-based monitoring system. Instead of your services pushing metrics to a central server, Prometheus reaches out and pulls them on a schedule. Every 15-30 seconds (configurable via scrape_interval), the Prometheus server sends an HTTP GET request to a /metrics endpoint on your target. The target responds with plain-text metrics in the Prometheus exposition format — lines of metric_name{labels} value.
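To make the exposition format concrete, here's what a few lines of a scrape response might look like. The metric names and label values below are illustrative, not an exact dump from a Redis Enterprise node:

```text
# HELP redis_memory_used_bytes Memory allocated by Redis
# TYPE redis_memory_used_bytes gauge
redis_memory_used_bytes{db="db:1",node="redis-1"} 536870912
# HELP redis_commands_processed_total Total commands processed
# TYPE redis_commands_processed_total counter
redis_commands_processed_total{cmd="get",node="redis-1"} 1027354
```

Each line is a metric name, an optional set of `{label="value"}` pairs, and the current sample value. Prometheus parses this on every scrape.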

This pull model is what makes Prometheus so reliable. If a target goes down, Prometheus simply records a failed scrape — it doesn't lose data from other targets. And because targets expose metrics over HTTP, there's no agent to install on the target itself. Redis Enterprise already speaks this protocol natively on port 8070.

Time Series Data in Prometheus

When Prometheus scrapes a target, it stores each metric as a time series — a sequence of timestamped values identified by a metric name and a set of key-value labels. For example, redis_commands_duration_seconds_total{cmd="get", node="redis-1"} becomes a unique time series. Every scrape appends a new data point to that series.

Prometheus stores these time series in its own local TSDB (time series database) on disk, using efficient block-based compression. By default, it retains 15 days of data. For longer retention, teams use Thanos or Cortex as a remote write backend. The key architectural point: Prometheus is both the collector and the store. There's no separate database — the server itself holds the time series and serves PromQL queries.
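Wiring up longer retention is a small addition to prometheus.yml. This is a minimal sketch: the URL is a placeholder for your own Thanos Receive or Cortex (now Mimir) ingestion endpoint:

```yaml
# prometheus.yml — forward samples to a long-term store via remote write.
# The URL is a placeholder; point it at your Thanos/Cortex ingestion endpoint.
remote_write:
  - url: "http://thanos-receive.example.internal:19291/api/v1/receive"
    queue_config:
      max_samples_per_send: 2000   # samples batched per outgoing request
```

Prometheus keeps serving PromQL from its local TSDB; remote write only adds a copy of each sample to the long-term backend.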

  • Scrape interval — How often Prometheus pulls. 15s is standard. Shorter intervals give finer resolution but cost more storage.
  • Scrape target — The host:port/path Prometheus hits. For Redis Enterprise: port 8070. For Redis Cloud: available via the Redis Cloud API or Prometheus integration.
  • Labels — Key-value pairs that uniquely identify a time series. Labels let you slice data in Grafana by node, shard, database, or command type.
  • Retention — How long data lives in the local TSDB. Default 15 days. Extend via --storage.tsdb.retention.time flag or use remote storage.

Redis Enterprise: Port 8070

Redis Enterprise exposes a Prometheus-compatible /metrics endpoint on port 8070 of every cluster node. Hit http://your-redis-node:8070/metrics and you'll see metrics for every database, shard, and node — memory, throughput, latency, connections, replication, and more. No exporter needed. No sidecar process. The cluster speaks Prometheus natively.

In your prometheus.yml, add a scrape job pointing to your nodes. If you're running multiple nodes, list them all or use DNS service discovery. For EKS clusters, Kubernetes SD can auto-discover Redis Enterprise pods.
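A static-target version of that job might look like the sketch below. The hostnames are placeholders — list each node in your cluster. Depending on how your cluster terminates TLS, you may also need the commented-out HTTPS settings:

```yaml
# prometheus.yml — scrape every Redis Enterprise node on port 8070.
scrape_configs:
  - job_name: "redis-enterprise"
    scrape_interval: 15s
    static_configs:
      - targets:
          - "redis-node-1.example.internal:8070"
          - "redis-node-2.example.internal:8070"
          - "redis-node-3.example.internal:8070"
    # If your cluster serves metrics over TLS, uncomment:
    # scheme: https
    # tls_config:
    #   insecure_skip_verify: true  # or set ca_file to the cluster CA cert
```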

Port 8070 is your single source of truth for Redis Enterprise health. One endpoint, every metric, no middleware.

Redis Cloud: Prometheus Integration

Redis Cloud (the fully managed offering) also supports Prometheus metric export, but the mechanism differs. Instead of scraping a node directly, you enable the Prometheus integration in the Redis Cloud console. Redis Cloud generates a dedicated scrape endpoint that your Prometheus server can pull from. The metrics are the same — command latency, memory, throughput, connections — but delivered through Redis Cloud's managed infrastructure.

For Redis Cloud subscriptions with VPC peering, ensure your Prometheus server can reach the scrape endpoint over the peered network. The endpoint is authenticated — you'll configure the bearer token in your prometheus.yml scrape config.
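A hedged sketch of that scrape job, assuming the endpoint hostname and port you copy from the Redis Cloud console (both placeholders here), with the token kept in a file rather than inline:

```yaml
# prometheus.yml — scrape the Redis Cloud managed metrics endpoint.
# Target and token are placeholders from your Redis Cloud console.
scrape_configs:
  - job_name: "redis-cloud"
    scheme: https
    metrics_path: /metrics
    authorization:
      type: Bearer
      credentials_file: /etc/prometheus/redis-cloud-token  # keeps the token out of the config
    static_configs:
      - targets: ["metrics.example-subscription.redis-cloud.example:9443"]
```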

Which Metrics Matter for AI Workloads

Redis exposes 200+ metrics. For AI agent infrastructure, focus on these:

  • redis_commands_duration_seconds — Command latency. The single most important metric. Break it down by command type (GET, SET, FT.SEARCH) to isolate vector search latency from key-value operations.
  • redis_commands_processed_total — Throughput. Use PromQL rate() to get ops/sec. Track separately for read and write commands.
  • redis_memory_used_bytes / redis_memory_max_bytes — Memory saturation. Alert at 80%. When Redis hits maxmemory, evictions start and your AI agent loses cached context.
  • redis_keyspace_hits_total / redis_keyspace_misses_total — Cache hit ratio. For semantic cache workloads, this directly maps to LLM API cost savings. A 90% hit ratio means 90% of lookups avoid an expensive embedding or completion call.
  • redis_connected_clients — Connection count. Sudden spikes indicate connection leaks or a traffic surge. Approaching maxclients means requests will be rejected.
  • redis_rejected_connections_total — Should always be zero. If it's not, you're dropping client connections.
  • redis_evicted_keys_total — Evictions mean data loss. For AI context stores, an eviction means your agent lost a piece of memory or cache. Track and alert on any non-zero rate.
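To make the hit-ratio-to-cost link concrete, here's a back-of-envelope calculation. All numbers are illustrative assumptions, not measurements:

```python
# Back-of-envelope: what a semantic-cache hit ratio saves in LLM calls.
# Every number here is an illustrative assumption, not a measurement.

def llm_cost_saved(lookups: int, hit_ratio: float, cost_per_call: float) -> float:
    """Cost avoided because cache hits skip the LLM or embedding call."""
    hits = lookups * hit_ratio
    return hits * cost_per_call

# Assume 1M lookups/day, a 90% hit ratio, and $0.002 per avoided call:
saved = llm_cost_saved(1_000_000, 0.90, 0.002)
print(f"${saved:,.2f}/day avoided")  # → $1,800.00/day avoided
```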

Once you're scraping, build Grafana dashboards around these 7-10 core metrics. Use PromQL rate() for throughput, histogram_quantile() for P50/P90/P99 latency, and ratio expressions for cache hit rates. Everything else is noise until these are healthy.
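The queries behind those panels might look like the following. Metric names follow the article's examples and assume the latency metric is a Prometheus histogram (i.e., it has `_bucket` series):

```promql
# Throughput: ops/sec over the last 5 minutes
rate(redis_commands_processed_total[5m])

# P99 command latency per command type
histogram_quantile(0.99,
  sum by (le, cmd) (rate(redis_commands_duration_seconds_bucket[5m])))

# Cache hit ratio
rate(redis_keyspace_hits_total[5m])
/
(rate(redis_keyspace_hits_total[5m]) + rate(redis_keyspace_misses_total[5m]))
```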

200+ metrics available. 7-10 that matter. The discipline to ignore the rest is what separates a useful dashboard from a wall of numbers.