Agentic DevOps: MCP Servers for Infrastructure APIs

Part 2 of 5 | ← Part 1 | Part 3 → | Part 4 → | Part 5 →

How MCP Servers Connect AI Models to Infrastructure APIs

An MCP server is a lightweight service that exposes infrastructure data and capabilities to AI agents through the Model Context Protocol, an open standard introduced by Anthropic. It acts as a translation layer between your systems and the LLM: the model expresses its intent as a structured tool call, and the server turns that call into API requests.

Instead of giving an agent raw kubectl access, you build an MCP server that exposes carefully scoped tools: get_pod_logs, list_deployments, restart_deployment. The agent discovers these tools dynamically and invokes them via JSON-RPC.
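On the wire, discovery and invocation are plain JSON-RPC 2.0 requests. The sketch below builds the two messages an MCP client would send for the tools named above. The helper name, the pod name, and the argument values are illustrative, and the initialize handshake a real client performs first is omitted.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # JSON-RPC request ids only need to be unique within a session


def jsonrpc_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request object (hypothetical helper)."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Step 1: ask the server which tools it exposes.
discover = jsonrpc_request("tools/list")

# Step 2: invoke one tool by name with structured arguments.
# The pod name is a made-up example value.
invoke = jsonrpc_request(
    "tools/call",
    {
        "name": "get_pod_logs",
        "arguments": {"pod_name": "nginx-7c4b9c6f5-x2v1p", "tail_lines": 50},
    },
)

print(json.dumps(discover))
print(json.dumps(invoke, indent=2))
```

The agent never sees kubectl or cluster credentials here: it only names a tool and passes arguments, and the server decides what that means for the cluster.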

The Three Primitives of MCP Servers

An MCP server can expose three kinds of primitives, which together define the agent’s capabilities:

Resources: Read-only data the agent can query (logs, metrics, configs, documentation)
Tools: Functions the agent can invoke (restart pods, scale deployments, query databases)
Prompts: Pre-written templates that guide the agent’s reasoning and ensure consistent behavior

The protocol is transport-agnostic. MCP servers can run locally via stdio, remotely over HTTP (Streamable HTTP in current versions of the specification, HTTP/SSE in older ones), or inside your cluster as sidecars. This flexibility makes them a good fit for connecting AI agents to existing infrastructure without major architectural changes.
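For the stdio transport, each JSON-RPC message travels as a single line of UTF-8 JSON terminated by a newline. A minimal sketch of that framing follows, using an in-memory stream in place of a real subprocess pipe; the helper names are hypothetical.

```python
import io
import json
from typing import Iterator


def encode_message(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the frame stays on one line.
    return json.dumps(message, separators=(",", ":")).encode("utf-8") + b"\n"


def decode_messages(stream: io.BufferedIOBase) -> Iterator[dict]:
    """Yield one parsed message per non-empty line read from the stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Simulate a pipe carrying a request and a response whose text contains a newline.
pipe = io.BytesIO(
    encode_message({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
    + encode_message({"jsonrpc": "2.0", "id": 1, "result": {"text": "line1\nline2"}})
)
received = list(decode_messages(pipe))
```

Because the framing lives entirely in the transport, the same message dictionaries could be sent over HTTP without touching the tool logic.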

Why MCP Servers Matter for Agentic DevOps

Without MCP servers, every AI agent needs custom integration code for every API it touches. With MCP servers, you write the integration once and any compatible agent can use it. This decoupling is what lets Agentic DevOps scale across large, heterogeneous environments.

For example, if you’re deploying Ollama on Kubernetes, an MCP server can expose GPU metrics and model serving status to your AI agent, enabling it to detect when a model endpoint is failing and trigger a rolling restart.
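A rolling restart is commonly triggered the way kubectl rollout restart does it: by stamping a kubectl.kubernetes.io/restartedAt annotation onto the Deployment's pod template, which causes new pods to be rolled out. The sketch below only builds that patch body and a simple failure check. The threshold and function names are assumptions for illustration, and sending the patch (for example with the Kubernetes client's AppsV1Api.patch_namespaced_deployment) is left out.

```python
from datetime import datetime, timezone
from typing import Optional, Sequence


def should_restart(recent_checks: Sequence[bool], max_failures: int = 3) -> bool:
    """Return True when the last `max_failures` health checks all failed.

    The default of 3 is an illustrative value, not a recommendation.
    """
    if len(recent_checks) < max_failures:
        return False
    return not any(recent_checks[-max_failures:])


def rollout_restart_patch(now: Optional[datetime] = None) -> dict:
    """Build a patch body that changes the pod template annotation."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    return {
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {"kubectl.kubernetes.io/restartedAt": stamp}
                }
            }
        }
    }


# Example: three consecutive failed probes of a model endpoint.
if should_restart([True, False, False, False]):
    patch = rollout_restart_patch()
```

Keeping the decision (should_restart) separate from the action (the patch) makes it easier to put an approval step between the two.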

AI SRE Architecture: Four Layers of Agentic Systems

A production Agentic DevOps system has four layers that separate concerns and enable safe autonomous operations.

Layer 1: The Observability Plane

Your existing stack (Prometheus, Grafana, Loki, Jaeger) provides structured data about system state. You don’t replace it; you expose it through MCP server resources. This layer answers the question: “What is happening right now?”
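As a sketch of what exposing this data can involve, the snippet below builds an instant-query URL for the Prometheus HTTP API (/api/v1/query) and flattens a vector response into lines an agent can read. The base URL, the function names, and the sample payload are made up; an MCP resource or tool would wrap something along these lines.

```python
from typing import List
from urllib.parse import urlencode


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for a Prometheus instant query."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def summarize_vector(payload: dict) -> List[str]:
    """Turn an instant-vector response into one '{labels} value' line per series."""
    if payload.get("status") != "success":
        raise ValueError(f"Prometheus query failed: {payload.get('error', 'unknown error')}")
    lines = []
    for sample in payload["data"]["result"]:
        labels = ",".join(f"{k}={v}" for k, v in sorted(sample["metric"].items()))
        _timestamp, value = sample["value"]  # value arrives as a string
        lines.append(f"{{{labels}}} {value}")
    return lines


# Hand-written sample in the shape of a Prometheus vector response.
sample_payload = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"pod": "api-1", "namespace": "prod"}, "value": [1700000000.0, "0.42"]}
        ],
    },
}
summary = summarize_vector(sample_payload)
```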

Layer 2: The MCP Server Layer

MCP servers translate your infrastructure APIs into protocol-compliant tools. You might have a Kubernetes MCP server for pod operations, an AWS MCP server for EC2 queries, and a PagerDuty MCP server for incident management. Each is independently deployable and scoped. This layer answers: “What can the agent do?”
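Scoping can be enforced inside the server itself, before any API call is made. A minimal sketch follows, assuming a hypothetical namespace allowlist and log-line cap that each tool checks first; neither value comes from the MCP SDK or from Kubernetes.

```python
ALLOWED_NAMESPACES = frozenset({"default", "staging"})  # hypothetical allowlist
MAX_TAIL_LINES = 500  # hypothetical cap on how much log text a call may return


class ScopeError(ValueError):
    """Raised when a tool call falls outside what this server is allowed to do."""


def check_scope(namespace: str, tail_lines: int) -> int:
    """Reject out-of-scope namespaces and clamp tail_lines to a safe range."""
    if namespace not in ALLOWED_NAMESPACES:
        raise ScopeError(f"namespace {namespace!r} is outside this server's scope")
    return max(1, min(tail_lines, MAX_TAIL_LINES))


# An oversized request is clamped rather than rejected.
effective_lines = check_scope("default", 10_000)
```

A check like this complements RBAC rather than replacing it: the ServiceAccount limits what the server can do, and the allowlist limits what the agent can ask for.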

Layer 3: The Agent Runtime

The orchestration layer manages the LLM, tool calling, memory, and planning. Options include Claude Code, LangGraph, CrewAI, or custom implementations using the OpenAI Agents SDK. A good runtime maintains context across turns. It handles failures gracefully and escalates intelligently when things go sideways. This layer answers: “How does the agent think and decide?”
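The failure-handling part of a runtime can be sketched in a few lines: retry a tool a bounded number of times, record each attempt as context for later turns, and escalate when the retries run out. The ToolRunner class and its retry limit are hypothetical and are not taken from any of the frameworks named above.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToolRunner:
    max_attempts: int = 3  # illustrative limit
    history: List[str] = field(default_factory=list)  # context kept across turns

    def run(self, name: str, tool: Callable[[], str]) -> str:
        """Call a tool with bounded retries, then escalate instead of looping."""
        for attempt in range(1, self.max_attempts + 1):
            try:
                result = tool()
            except Exception as exc:
                self.history.append(f"{name} attempt {attempt} failed: {exc}")
                continue
            self.history.append(f"{name} attempt {attempt} succeeded")
            return result
        self.history.append(f"{name} escalated to a human")
        return f"ESCALATE: {name} failed {self.max_attempts} times"


# A stand-in tool that times out twice before succeeding.
calls = {"n": 0}


def flaky_list_pods() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("API timeout")
    return "3 pods running"


runner = ToolRunner()
outcome = runner.run("list_pods", flaky_list_pods)
```

The history list is what gives the model something to reason over on the next turn, for example noticing that the API timed out twice before answering.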

Layer 4: The Governance Layer

Policies, audit logs, rate limits, and approval queues live here. Every tool invocation should be logged with full context: who triggered it, what the LLM reasoning was, and what the outcome was. This layer answers: “Is this safe and compliant?”
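One way to capture that context is a decorator that wraps each tool and appends a record for every call, whether it succeeds or fails. This is a sketch under assumptions: the audited decorator, the in-memory AUDIT_LOG list, and the actor and reasoning keyword arguments are all hypothetical, and a real deployment would write to a durable, append-only store.

```python
import functools
import json
import time
from typing import Any, Callable, Dict, List

AUDIT_LOG: List[Dict[str, Any]] = []  # stand-in for a durable audit store


def audited(tool: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a tool so every invocation is recorded with who, why, and outcome."""

    @functools.wraps(tool)
    def wrapper(*args: Any, actor: str, reasoning: str, **kwargs: Any) -> Any:
        record: Dict[str, Any] = {
            "ts": time.time(),
            "tool": tool.__name__,
            "actor": actor,
            "reasoning": reasoning,
            "arguments": {"args": list(args), "kwargs": kwargs},
        }
        try:
            result = tool(*args, **kwargs)
            record["outcome"] = "ok"
            return result
        except Exception as exc:
            record["outcome"] = f"error: {exc}"
            raise
        finally:
            # Runs on success and on failure, so no call goes unrecorded.
            AUDIT_LOG.append(record)

    return wrapper


@audited
def restart_deployment(name: str, namespace: str = "default") -> str:
    # Placeholder body: a real tool would call the Kubernetes API here.
    return f"restart requested for {namespace}/{name}"


restart_deployment(
    "checkout",
    namespace="prod",
    actor="oncall@example.com",
    reasoning="Readiness probe failing on all replicas for 10 minutes",
)
print(json.dumps(AUDIT_LOG[-1], indent=2))
```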

Building Your First MCP Server for Kubernetes Log Queries

Let’s build something real. The following Python MCP server exposes tools to query Kubernetes logs and list pods. It uses the official MCP Python SDK and the Kubernetes Python client.

Installation

# Create a new uv project and a virtual environment for the MCP server
uv init k8s-mcp-server
cd k8s-mcp-server
uv venv
source .venv/bin/activate

# Install the MCP SDK and Kubernetes client
# mcp[cli] provides the FastMCP framework for building servers quickly
# kubernetes is the official Python client for the K8s API
uv add "mcp[cli]" kubernetes

The Server Code

Create server.py:

# server.py. A Kubernetes log query MCP server for Agentic DevOps
# This server exposes safe, read-only tools that let AI agents inspect
# cluster state without requiring direct kubectl access or elevated permissions.

from typing import Optional

from kubernetes import client, config
from mcp.server.fastmcp import FastMCP

# Initialize the FastMCP server with a descriptive name.
# This name appears in the agent's tool discovery interface.
mcp = FastMCP("k8s-logs")

# Load kubeconfig once at startup to avoid repeated filesystem calls.
# For in-cluster deployment, replace with config.load_incluster_config()
config.load_kube_config()
v1 = client.CoreV1Api()


@mcp.tool()
async def get_pod_logs(
    pod_name: str,
    namespace: str = "default",
    tail_lines: int = 100,
    container: Optional[str] = None,
) -> str:
    """Get recent logs from a Kubernetes pod.

    This tool enables AI agents to diagnose failures by reading application
    logs without needing shell access to the cluster nodes.

    Args:
        pod_name: Name of the pod to query (e.g., "nginx-7c4b9c6f5-x2v1p")
        namespace: Kubernetes namespace where the pod runs (default: default)
        tail_lines: Number of log lines to return (default: 100)
        container: Container name (required only if pod has multiple containers)
    """
    try:
        # Build kwargs dynamically so we only pass 'container' when provided.
        # This avoids API errors from passing None values.
        kwargs = {
            "name": pod_name,
            "namespace": namespace,
            "tail_lines": tail_lines,
        }
        if container:
            kwargs["container"] = container

        # Call the Kubernetes API to fetch logs.
        # This is a read-only operation with no side effects on the cluster.
        logs = v1.read_namespaced_pod_log(**kwargs)
        return logs
    except client.exceptions.ApiException as e:
        # Return structured error messages so the agent can reason about failures.
        return f"Kubernetes API error: {e.status}: {e.reason}"
    except Exception as e:
        return f"Unexpected error fetching logs: {str(e)}"


@mcp.tool()
async def list_pods(namespace: str = "default") -> str:
    """List pods in a namespace with their status and restart counts.

    This tool gives AI agents a snapshot of workload health,
    enabling them to identify crashing or unstable pods quickly.

    Args:
        namespace: Kubernetes namespace to list (default: default)
    """
    try:
        # Query the K8s API for all pods in the specified namespace.
        pods = v1.list_namespaced_pod(namespace=namespace)

        # Format output as a table so LLMs can parse it easily.
        # Restart counts help agents spot recurring crash loops.
        lines = [f"{'POD':<40} {'STATUS':<12} {'RESTARTS':<10}"]
        lines.append("-" * 64)
        for pod in pods.items:
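            # Sum restart counts across all containers in the pod.
            restarts = sum(c.restart_count for c in (pod.status.container_statuses or []))
            # phase is optional in the API model; fall back to a label
            # instead of failing when it is None.
            phase = pod.status.phase or "Unknown"
            lines.append(
                f"{pod.metadata.name:<40} {phase:<12} {restarts:<10}"
            )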
        return "\n".join(lines)
    except client.exceptions.ApiException as e:
        return f"Kubernetes API error: {e.status}: {e.reason}"


if __name__ == "__main__":
    # Run the server with stdio transport for local testing.
    # In production, switch to HTTP/SSE or deploy as a sidecar.
    mcp.run(transport="stdio")

Connecting to Claude Desktop

Add this to your Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json on macOS or %APPDATA%\Claude\claude_desktop_config.json on Windows):

{
  "mcpServers": {
    "k8s-logs": {
      "command": "uv",
      "args": [
        "--directory",
        "/absolute/path/to/k8s-mcp-server",
        "run",
        "server.py"
      ]
    }
  }
}

Restart Claude Desktop. You can now ask: “Show me the last 50 lines of logs from the nginx pod.” Claude will discover the get_pod_logs tool, invoke it through the MCP server, and present the results.

💡 Tip: For in-cluster deployment, replace config.load_kube_config() with config.load_incluster_config() and run the server as a pod with a ServiceAccount that has read-only log permissions.
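Read-only log permissions usually come down to a namespaced Role that allows get and list on pods and get on pods/log. The sketch below builds such a Role as a Python dict and prints it as JSON, a format kubectl apply -f also accepts. The function name, role name, and namespace are placeholders.

```python
import json


def log_reader_role(name: str, namespace: str) -> dict:
    """Build a Role manifest granting read-only access to pods and their logs."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            # Core API group ("") holds pods; listing supports list_pods.
            {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list"]},
            # Logs are a subresource and need their own rule.
            {"apiGroups": [""], "resources": ["pods/log"], "verbs": ["get"]},
        ],
    }


role = log_reader_role("mcp-log-reader", "default")  # placeholder names
print(json.dumps(role, indent=2))
```

The Role takes effect only once a RoleBinding ties it to the ServiceAccount the server pod runs as; that binding is not shown here.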

FAQ

What is an MCP server in the context of Agentic DevOps?

An MCP server is a lightweight service that exposes infrastructure tools and data to AI agents through the Model Context Protocol. It acts as a secure translation layer between an LLM and your systems: instead of giving an agent raw kubectl access, you provide scoped tools like get_pod_logs or restart_deployment.

What are the three primitives an MCP server can provide?

An MCP server can provide Resources (read-only data like logs and metrics), Tools (functions the agent can invoke), and Prompts (pre-written templates that guide reasoning). Together, these primitives define what the agent can see, what it can do, and how it should behave.

How do MCP servers improve security compared to direct API access?

MCP servers make least-privilege access practical: you define narrow, scoped tools instead of giving the agent broad credentials, and every invocation passes through one place where it can be logged. Approval workflows and rate limits are not built into the protocol itself; you add them in the server or in the governance layer described above.

Can I run MCP servers inside my Kubernetes cluster?

Yes. MCP servers can be deployed as sidecar containers or standalone pods. For in-cluster operation, use config.load_incluster_config() and attach a ServiceAccount with scoped RBAC permissions. With a remote transport, this keeps the agent runtime outside your cluster while the MCP server operates inside it.
