Instrumenting Your Applications with Prometheus Client Libraries
- Eglis Alvarez
- Sep 14
- 2 min read
In our previous article, we explored how to monitor infrastructure metrics with Prometheus and Grafana. That gave us visibility into servers, containers, and databases. But infrastructure alone doesn’t tell the full story.
If your CPU is at 90%, is that necessarily bad? It depends. If it happens while processing thousands of successful user requests, maybe not. If it happens during login failures, then it’s critical.
To truly understand business impact, we need to look inside our applications. This is where Prometheus client libraries come in.
Why Instrument Applications?

System metrics answer “how healthy is my infrastructure?” but application metrics answer “how healthy is my business?”
Example:
Infrastructure metric: CPU usage at 90%.
Application metric: Login API processing 1,200 requests per second with a 20% failure rate.
Only the second one points to an actionable business problem.
By exposing metrics directly from code, we can measure what truly matters: transactions processed, queue lengths, response times, error rates, or even domain-specific KPIs (like payments approved).
Prometheus Client Libraries
Prometheus provides official libraries in multiple languages:
Go → prometheus/client_golang
Java / Scala → prometheus/client_java (simpleclient)
Python → prometheus_client
Ruby → prometheus-client
With these libraries, you can define four main metric types:
Counter → Cumulative, only increases (e.g., number of orders processed).
Gauge → Goes up or down (e.g., number of active sessions).
Histogram → Groups observations into buckets (e.g., request latency distribution).
Summary → Calculates quantiles directly (e.g., 95th percentile response time).
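As a quick sketch of the four types in Python (the metric names here are invented for illustration), defining and updating each one with prometheus_client looks like this:

```python
from prometheus_client import Counter, Gauge, Histogram, Summary, generate_latest

# Counter: cumulative, only increases
ORDERS = Counter("orders_processed_total", "Orders processed")
# Gauge: can go up or down
SESSIONS = Gauge("active_sessions", "Currently active sessions")
# Histogram: buckets observations (e.g., latency distribution)
LATENCY = Histogram("checkout_latency_seconds", "Checkout latency")
# Summary: tracks count and sum of observations
RESPONSE = Summary("response_seconds", "Response time")

ORDERS.inc()            # +1 order
SESSIONS.set(42)        # 42 sessions right now
LATENCY.observe(0.3)    # one checkout took 0.3s
RESPONSE.observe(0.2)   # one response took 0.2s

# Render the current state in Prometheus exposition format
output = generate_latest().decode()
print(output)
```

Calling `generate_latest()` produces the same text format that the `/metrics` endpoint serves, which makes it handy for checking instrumentation locally.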
Example: A Python Web Service
Let’s walk through a simple Flask service exposing Prometheus metrics:
```python
from flask import Flask
from prometheus_client import (
    Counter,
    Histogram,
    generate_latest,
    CONTENT_TYPE_LATEST,
)
import time, random

app = Flask(__name__)

# Define metrics
REQUEST_COUNT = Counter("http_requests_total", "Total HTTP requests")
REQUEST_LATENCY = Histogram("http_request_duration_seconds", "Request latency")

@app.route("/")
def hello():
    start = time.time()
    REQUEST_COUNT.inc()
    time.sleep(random.uniform(0.1, 0.5))  # simulate work
    latency = time.time() - start
    REQUEST_LATENCY.observe(latency)
    return "Hello, world!"

@app.route("/metrics")
def metrics():
    return generate_latest(), 200, {"Content-Type": CONTENT_TYPE_LATEST}

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```
Once running, visit http://localhost:5000/metrics to see the raw exposition format. Prometheus can scrape this endpoint and store your application metrics.
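To wire this up, a minimal scrape job in `prometheus.yml` might look like the sketch below (the job name and target address are illustrative; adjust them to wherever the service actually runs):

```yaml
scrape_configs:
  - job_name: "flask-app"            # illustrative job name
    scrape_interval: 15s
    static_configs:
      - targets: ["localhost:5000"]  # host:port where the Flask service listens
```

Prometheus appends `/metrics` by default, so no extra path configuration is needed for this example.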
Visualizing in Grafana
Once the data flows into Prometheus, Grafana can help visualize trends. For example:
Requests per second:

```
rate(http_requests_total[5m])
```

95th percentile latency:

```
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
```
These visualizations make it easy to correlate system stress with business outcomes.
Best Practices
Expose metrics via a dedicated /metrics endpoint.
Use descriptive, domain-oriented names (e.g., payment_requests_total).
Avoid overloading with labels—stick to the ones that matter (status codes, endpoint, region).
Start small: counters and gauges cover 80% of use cases.
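The naming and labeling advice above can be sketched in code. Assuming a hypothetical payments endpoint, a domain-oriented counter with a small, bounded label set might look like this:

```python
from prometheus_client import Counter, generate_latest

# Descriptive, domain-oriented name; only labels with low,
# bounded cardinality (status code and endpoint, not user IDs)
PAYMENTS = Counter(
    "payment_requests_total",
    "Payment requests processed",
    ["status", "endpoint"],
)

# Each distinct label combination becomes its own time series
PAYMENTS.labels(status="200", endpoint="/charge").inc()
PAYMENTS.labels(status="500", endpoint="/charge").inc()

output = generate_latest().decode()
print(output)
```

Keeping labels bounded matters because every unique combination creates a separate series in Prometheus; unbounded values such as user IDs or request IDs can quickly overwhelm storage.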
Wrapping Up
By instrumenting your applications with Prometheus client libraries, you bridge the gap between infrastructure health and business performance.
This unlocks true observability: not just “is my server up?” but “is my business running smoothly?”
What’s Next?
In the next article, we’ll go beyond manual metrics and dashboards. We’ll explore how Artificial Intelligence (AI) can supercharge observability—helping detect anomalies, predict failures, and even suggest fixes before users notice an issue.