
What Are AI Agents (and Why They Matter in Operations)

  • Writer: Eglis Alvarez
  • Oct 12
  • 3 min read
An ecosystem of AI agents working in harmony, exchanging data, reasoning collectively, and driving the evolution of intelligent automation.

Introduction

A few years ago, “automation” meant running scripts, scheduling jobs, and watching dashboards. Today, something new is happening: systems are beginning to think and act on their own.

These systems are called AI agents. They interpret signals, make decisions, and sometimes even take action without waiting for human input. In IT operations, this is a big shift: instead of manually responding to alerts, we can now design agents that monitor, reason, and react autonomously.


From Scripts to Agents

Traditional automation is linear. You write a script, define conditions, and it executes — predictable but rigid.

AI agents are different:

  • They observe your environment (metrics, logs, events).

  • They reason about what’s happening.

  • They act — taking the next step safely, following rules or goals.

Think of them as digital teammates that never get tired. They handle repetitive, predictable decisions, so humans can focus on high-value work.


Why They Matter in IT Operations

Modern infrastructures are too dynamic for manual oversight. Even with observability tools, the real challenge is interpreting and prioritizing signals.

AI agents bring:

  1. Proactive detection – spotting anomalies before alerts fire.

  2. Contextual reasoning – explaining why something is happening.

  3. Automated response – triggering safe recovery actions.

  4. Continuous learning – adapting based on feedback or results.

They’re not replacing operators — they’re giving them superpowers.


A Practical Example: The AI Ops Agent

To make this real, I built a small open-source project called AI Ops Agent – Ollama Edition. It follows the Perception → Reasoning → Action loop and runs entirely on your local machine, with no API keys and no cloud dependencies.

It works like this:

  • Perception: scans a folder called inbox/ and counts files (simulating workload).

  • Reasoning: asks a local AI model (Mistral) what action to take.

  • Action: restarts a service or sends a notification — safely, in dry-run mode by default.

You can clone the repo from here: https://github.com/eglisal/ai-ops-agent-ollama.git


How to Run It

  1. Install Ollama (free and open source)

    • Go to https://ollama.ai/download

    • Choose Download for Windows, macOS, or Linux

    • After installation, restart your terminal and verify with:

      ollama --version

  2. Pull a model (for example, Mistral):

    ollama pull mistral

  3. Run the agent from the cloned project folder:

    python agent.py

  4. Simulate activity by adding a few files to the inbox/ folder, then re-run. The AI agent will analyze the “load” and decide what to do next.

This tiny example shows how accessible AI-driven automation has become — you can build intelligent operational workflows without relying on external APIs or expensive services.


What to Expect When You Run It

The agent runs in three simple phases:

1. Perception – Observe

It reads signals from its environment. In this example:

  • Counts how many files are in the inbox/ folder.

  • Checks if an outbox/ folder exists.

status = {"queue_length": 12, "has_outbox": True}

This simulates how a real operations agent might pull metrics, check job queues, or read service health data.
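As a rough illustration, the perception step could look like the sketch below. The function name observe and the base_dir parameter are assumptions made for this example, not necessarily what the repository uses.

```python
from pathlib import Path


def observe(base_dir="."):
    """Build a status dict from the local folders (hypothetical helper)."""
    base = Path(base_dir)
    inbox = base / "inbox"
    outbox = base / "outbox"

    # Count only regular files; a missing inbox/ is treated as an empty queue.
    if inbox.is_dir():
        queue_length = sum(1 for entry in inbox.iterdir() if entry.is_file())
    else:
        queue_length = 0

    return {"queue_length": queue_length, "has_outbox": outbox.is_dir()}
```

Treating a missing inbox/ as an empty queue keeps the first run from crashing before any files have been added.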


2. Reasoning – Decide

Next, the agent decides what to do based on the observed state. It sends a prompt to Ollama (running a local Mistral model) asking for a JSON plan of actions.

Example output from the model:

{
  "plan": [
    {"action": "notify", "params": {"message": "Queue length warning: 12."}}
  ]
}
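One way to sketch this step, assuming Ollama is listening on its default local port (11434) and using its /api/generate endpoint. The prompt wording and the helper names build_prompt, ask_model, and parse_plan are illustrative assumptions, not taken from the project.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_prompt(status):
    """Describe the observed state and ask for a JSON-only plan (assumed wording)."""
    return (
        "You are an operations agent. Reply with JSON only, shaped like "
        '{"plan": [{"action": "...", "params": {}}]}.\n'
        f"Status: {json.dumps(status)}"
    )


def ask_model(status, model="mistral", timeout=60):
    """Send the prompt to the local model and return its raw text reply."""
    payload = json.dumps({
        "model": model,
        "prompt": build_prompt(status),
        "stream": False,   # one complete reply instead of streamed chunks
        "format": "json",  # ask Ollama to constrain the reply to JSON
    }).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body.get("response", "")


def parse_plan(raw):
    """Return the list of plan steps, or None if the reply is unusable."""
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        return None
    plan = data.get("plan") if isinstance(data, dict) else None
    if not isinstance(plan, list) or not plan:
        return None
    for step in plan:
        if not isinstance(step, dict) or not isinstance(step.get("action"), str):
            return None
    return plan
```

Having parse_plan return None for anything malformed gives the caller a single, simple signal that the model's answer should not be trusted.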

If the model call fails or its response is ambiguous, the agent falls back to a safe heuristic:

  • ≥ 50 files → restart service + notify

  • ≥ 10 files → notify only

  • else → report system healthy
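The thresholds above translate almost directly into code. This is a minimal sketch: the function name fallback_plan, the message texts, and the empty params for restart_service are assumptions, and "report system healthy" is modeled here as a notify step.

```python
def fallback_plan(status):
    """Rule-based plan used when the model's answer cannot be trusted."""
    queue_length = status.get("queue_length", 0)

    if queue_length >= 50:
        # Heavy backlog: restart the service and tell someone about it.
        return [
            {"action": "restart_service", "params": {}},
            {"action": "notify",
             "params": {"message": f"Queue length critical: {queue_length}."}},
        ]
    if queue_length >= 10:
        # Moderate backlog: notify only.
        return [
            {"action": "notify",
             "params": {"message": f"Queue length warning: {queue_length}."}},
        ]
    # Nothing to do: report a healthy system.
    return [{"action": "notify", "params": {"message": "System healthy."}}]
```

Checking the larger threshold first matters, because a queue of 50 or more also satisfies the 10-file rule.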


3. Action – Execute

Each step in the plan corresponds to a Python action:

  • restart_service (uses sc on Windows or systemctl on Linux)

  • notify (prints messages to stdout for now — easy to extend to Slack, Teams, or email)
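A hedged sketch of what these two actions might look like. The helper restart_commands only builds the command lines and does not run them; the exact commands the project issues are not shown in this article, so the stop/start pair for sc (which has no single restart verb) is an assumption.

```python
import platform


def restart_commands(service, system=None):
    """Return the command lines that would restart a service (not executed here)."""
    system = system or platform.system()
    if system == "Windows":
        # sc has no "restart" verb, so stop and start are issued separately.
        return [["sc", "stop", service], ["sc", "start", service]]
    if system == "Linux":
        return [["systemctl", "restart", service]]
    raise ValueError(f"Unsupported platform: {system}")


def notify(message):
    """Print the message to stdout; a Slack, Teams, or email sender could replace this."""
    print(f"[notify] {message}")
```

Returning the commands as argument lists makes them easy to log in dry-run mode and to hand to subprocess.run later without involving a shell.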

Before any action executes:

  • The agent checks an allowlist in config.json

  • If dry_run = true, it logs the intent but doesn’t execute anything
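These two safety checks could be sketched as follows. The config key allowed_actions and the function name execute_plan are assumptions; only dry_run and the existence of an allowlist in config.json come from the description above.

```python
def execute_plan(plan, config, handlers):
    """Run each plan step through the allowlist and dry-run gate.

    handlers maps an action name to a Python callable (hypothetical wiring).
    Returns a list of (action, outcome) pairs for logging.
    """
    allowed = set(config.get("allowed_actions", []))
    dry_run = config.get("dry_run", True)  # default to the safe mode
    results = []

    for step in plan:
        action = step.get("action")
        params = step.get("params", {})

        if action not in allowed or action not in handlers:
            results.append((action, "blocked"))
            continue
        if dry_run:
            # Record the intent without touching the system.
            results.append((action, "dry-run"))
            continue

        handlers[action](**params)
        results.append((action, "executed"))

    return results
```

Defaulting dry_run to True when the key is missing means a broken or incomplete config fails safe rather than live.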

[Example log: output not recoverable from the source]


Understanding the Loop

Each time you run python agent.py, the agent:

  1. Loads config.json

  2. Observes the environment

  3. Asks the model for reasoning (or applies fallback logic)

  4. Filters allowed actions

  5. Executes or logs the result
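Put together, one pass of the loop might look like this sketch. The steps are passed in as plain functions so the example stays self-contained; run_once, load_config, and the allowed_actions key are assumed names, not the project's actual API.

```python
import json


def load_config(path="config.json"):
    """Step 1: read the agent's settings from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def run_once(config, observe, ask_model, fallback, execute):
    """Steps 2 to 5: observe, reason (or fall back), filter, then act."""
    status = observe()

    try:
        plan = ask_model(status)
    except Exception:
        plan = None  # any model failure routes to the rule-based fallback
    if not plan:
        plan = fallback(status)

    # Keep only the actions the config explicitly allows.
    allowed = set(config.get("allowed_actions", []))
    plan = [step for step in plan if step.get("action") in allowed]

    return execute(plan, dry_run=config.get("dry_run", True))
```

Because every stage is an argument, each one can be swapped or tested on its own, which is the same property the modular design described next relies on.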

It’s a simple, modular architecture — designed so you can plug in:

  • More perception modules (disk usage, HTTP health checks)

  • More actions (restart jobs, send alerts, open tickets)

  • More reasoning options (different models or rule engines)
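As an example of the first extension point, here is a sketch of an extra perception module for disk usage, plus a small helper that merges the output of several modules into one status dict. Both disk_usage_percent and observe_all are hypothetical names for illustration.

```python
import shutil


def disk_usage_percent(path="."):
    """Perception module: report how full the disk holding `path` is."""
    usage = shutil.disk_usage(path)
    return {"disk_used_percent": round(100 * usage.used / usage.total, 1)}


def observe_all(modules):
    """Call each perception module and merge the results into one status dict."""
    status = {}
    for module in modules:
        status.update(module())
    return status
```

Adding a new signal then comes down to writing one function that returns a small dict and appending it to the list passed to observe_all.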


A New Way to Think About Automation

Once you start designing with agents, the mindset changes: you stop writing static rules and start defining outcomes. You replace scripts with behavior: systems that can reason and act safely within defined boundaries.

The goal isn’t to remove humans — it’s to make operations more autonomous, predictable, and intelligent.


What’s Next

In the next article, we’ll go deeper into the Anatomy of an AI Agent, understanding how perception, reasoning, memory, and action connect, and how to design a safe, observable agent that can handle real infrastructure scenarios.

 
 
 