AI agent observability: Here’s what you need to know

Jon Gitlin

Senior Content Marketing Manager

@Merge

As you begin building AI agents for your product or your internal workflows, you’ll need a comprehensive and robust approach to monitoring their activities.

Otherwise, you risk sensitive data leaks, persistent issues, and missed opportunities for improvement.

To help you manage any of your AI agents effectively, we’ll break down how, exactly, you can go about observing them over time.

But to start, let’s align on a definition for AI agent observability and the benefits it provides.

What is AI agent observability?

It’s the set of measures you put in place to monitor and manage your AI agents.

This can include several components:

Fully searchable logs to track the tool calls your AI agents make to specific MCP servers over time

Rules to block or redact sensitive data sent to or from your AI agents

Audit trails to see specific actions your teams take on AI agents, such as adjusting a rule

Custom alerts to notify your team in real time when an AI agent performs a suspicious or unauthorized activity

Access controls to enforce granular data permissions for the users making the underlying requests

Why AI agent observability is important

There are several benefits to observing your AI agents. Here are just a few worth calling out.

Prevents AI agents from accessing and using sensitive data

Using an observability platform for AI agents, you can set up a rule that blocks AI agents from receiving social security numbers, phone numbers, addresses, and other personally identifiable information (PII).

You can also establish rules that block AI agents from sharing data externally, or that at least redact certain parts of the data. For instance, you can implement a rule that lets AI agents share customers’ credit card information while redacting all but the last few digits of the card number itself.
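To make the redaction example concrete, here is a minimal sketch in Python. The pattern, the function name `redact_card_numbers`, and the default of keeping four digits are all assumptions chosen for illustration; they don't reflect how any particular observability product implements its rules.

```python
import re

# Hypothetical pattern: 13 to 19 digits, optionally separated by single
# spaces or hyphens. Real card detection usually adds checksum validation.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def redact_card_numbers(text: str, keep_last: int = 4) -> str:
    """Mask every card-like number in `text`, keeping the last `keep_last` digits."""

    def mask(match: re.Match) -> str:
        # Strip separators so only the digits are counted and masked.
        digits = re.sub(r"\D", "", match.group())
        visible = digits[len(digits) - keep_last:]
        return "*" * (len(digits) - len(visible)) + visible

    return CARD_PATTERN.sub(mask, text)


print(redact_card_numbers("Card 4111 1111 1111 1111 on file"))
```

A rule like this would typically run on tool responses before they reach the agent, and again on outbound payloads before the agent shares them.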

Enables your team to debug issues quickly

Since your AI agents are making decisions on the fly, a wide range of issues can come up.

An AI agent can call the wrong tool, pass invalid or malformed parameters to a tool, fail to handle tool errors or third-party API outages gracefully—and the list goes on.

An AI agent observability tool can help your engineers detect and diagnose these issues through its logs, enabling your team to then address the issue as quickly as possible.

Builds trust and credibility with prospects

More and more companies will offer AI agents in their products.

Case in point: The market size for AI agents is set to grow at a compound annual growth rate of 45% through 2034.

One of the best ways to make your AI agents stand out from competitors' is to demonstrate the observability measures you've implemented for them.

This assures prospects that regardless of their agentic use cases with your product, their data will remain secure and compliant with frameworks like GDPR.

Provides insights into areas for improvement

Even when your agents aren’t experiencing issues, your observability tooling can help surface optimization opportunities.

You can refine an AI agent’s logic so that it calls a more efficient tool for a given task, discover unnecessary tool calls, see which tools lead agents to experience slow response times—and potentially reroute the tool calls or space them out—and more.
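As a rough illustration of how logs can surface slow tools, the sketch below averages tool call durations per tool. The log entry shape (a `tool` name and a `duration_ms` value) and the function name `slowest_tools` are assumptions for this example, not a documented log format.

```python
from collections import defaultdict
from statistics import mean


def slowest_tools(log_entries, top_n=3):
    """Return the `top_n` tools with the highest average call duration."""
    durations = defaultdict(list)
    for entry in log_entries:
        # Assumed entry shape: {"tool": str, "duration_ms": number}
        durations[entry["tool"]].append(entry["duration_ms"])

    averages = {tool: mean(values) for tool, values in durations.items()}
    ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


sample_logs = [
    {"tool": "jira_create_ticket", "duration_ms": 1200},
    {"tool": "jira_create_ticket", "duration_ms": 800},
    {"tool": "slack_post_message", "duration_ms": 200},
]
print(slowest_tools(sample_logs))
```

The same grouping approach could count calls per tool to spot redundant or unnecessary ones.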

Helps you comply with key data privacy and protection frameworks

By maintaining records of what data was accessed, the actions performed on that data, which agent performed those actions and when they did so, you can easily demonstrate compliance with a wide range of data privacy and protection frameworks, like GDPR, SOC 2 Type II, and ISO 27001.

How to perform AI agent observability

Here are several ways that you can observe and manage your AI agents. For the best results, you’ll want to combine these approaches.

Set rules that keep sensitive information secure

If your AI agents are part of dynamic workflows, they can make on-the-fly decisions that put sensitive data at risk. For example, they can decide to add an employee's social security number to their profile in an HRIS solution—where it's visible to anyone at the organization.

To prevent harmful or unintended actions, you can configure rules for all AI agents, specific groups of agents, or individual ones to control how they receive and share sensitive information.

For example:

Global rules: All AI agents are prohibited from sharing credit card information with third-party applications, and credit card numbers are redacted before the agents can access them

Sales and marketing agents: These agents can’t share contract values with third-party applications, and bank account and routing numbers are redacted before being passed to them

Employee experience agent: This agent can’t share employee PII, and social security numbers are redacted before they get sent to the agent

Screenshot: the rules you can set up in Agent Handler to govern what data AI agents can receive and send. Merge Agent Handler will let you set rules globally or for specific agents. You’ll also be able to apply these rules to specific regions, use default rules, implement your own, and test any
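The three rule scopes described above (global, group, and individual agent) could be modeled along these lines. This is a sketch under stated assumptions: the `Rule` and `Agent` classes, the scope names, and the rule names are hypothetical and are not Agent Handler's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    scope: str        # assumed values: "global", "group", or "agent"
    target: str = ""  # group or agent name; unused for global rules


@dataclass(frozen=True)
class Agent:
    name: str
    group: str


def rules_for(agent, rules):
    """Return the rules that apply to `agent`, preserving their original order."""
    applicable = []
    for rule in rules:
        if rule.scope == "global":
            applicable.append(rule)
        elif rule.scope == "group" and rule.target == agent.group:
            applicable.append(rule)
        elif rule.scope == "agent" and rule.target == agent.name:
            applicable.append(rule)
    return applicable


RULES = [
    Rule("redact_credit_card_numbers", "global"),
    Rule("redact_bank_account_numbers", "group", "sales_marketing"),
    Rule("redact_social_security_numbers", "agent", "employee_experience"),
]

hr_agent = Agent(name="employee_experience", group="hr")
print([rule.name for rule in rules_for(hr_agent, RULES)])
```

Resolving rules per agent at request time keeps a single source of truth: a global rule added later automatically covers every agent without touching agent-specific configuration.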

Leverage logs to track every action your AI agents take

Logs let you review all of the tool calls that have been made, whether they were successful, and—if they were—how long the AI agent took to make them.

An AI agent observability solution, like Agent Handler, can even provide filters that let you drill down on specific tool calls.

To help your team debug issues and optimize the AI agents’ performance, you can filter the logs by:

Failed statuses over the past month to troubleshoot recent issues

A specific agent to isolate issues or compare behavior between agents

The name of a tool (e.g., “Jira Create Ticket”) to see usage patterns or errors for a particular integration

Screenshot: Agent Handler's searchable logs, showing the filters Agent Handler supports.
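The filters listed above can be sketched as a simple function over log entries. The entry fields (`status`, `agent`, `tool`, `timestamp`) and the function name `filter_logs` are assumptions made for this example; an actual observability tool would expose its own query interface.

```python
from datetime import datetime, timedelta


def filter_logs(entries, status=None, agent=None, tool=None, since=None):
    """Return the entries that match every filter that was provided."""
    matches = []
    for entry in entries:
        if status is not None and entry["status"] != status:
            continue
        if agent is not None and entry["agent"] != agent:
            continue
        if tool is not None and entry["tool"] != tool:
            continue
        if since is not None and entry["timestamp"] < since:
            continue
        matches.append(entry)
    return matches


now = datetime(2025, 6, 30, 12, 0)
sample_logs = [
    {"agent": "support_agent", "tool": "Jira Create Ticket",
     "status": "failed", "timestamp": now - timedelta(days=3)},
    {"agent": "support_agent", "tool": "Jira Create Ticket",
     "status": "success", "timestamp": now - timedelta(days=1)},
    {"agent": "sales_agent", "tool": "Jira Create Ticket",
     "status": "failed", "timestamp": now - timedelta(days=60)},
]

# Failed calls over the past 30 days, across all agents.
recent_failures = filter_logs(sample_logs, status="failed",
                              since=now - timedelta(days=30))
print(len(recent_failures))
```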

Implement alerts to detect security and performance issues in real time

If and when your AI agents violate any of the rules you set, your team needs to know as soon as possible so that they can work to remedy the issue before it impacts your business and your customers.

And to keep your AI agents performing smoothly, you’ll also need to build alerts for critical performance issues, such as tool failures.

To account for both scenarios, you can set up alerts for any rule violation and for certain performance issues (e.g., a tool call failing) across your AI agents.

You should also set up the rules that govern how a particular alert gets sent. This can include who on your team receives the alert, what the alert says, and where they receive it.

Screenshot: how Agent Handler lets you filter security alerts. Like logs, Agent Handler lets you review historical alerts and filter them down.
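One way to picture alert routing (who receives an alert, what it says, and where it lands) is the sketch below. The `AlertRoute` class, the event fields, and the channel names are hypothetical and only illustrate the idea; delivery itself is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRoute:
    event_type: str    # assumed values: "rule_violation" or "tool_failure"
    recipients: tuple  # who receives the alert
    channel: str       # where they receive it
    template: str      # what the alert says


def build_alerts(event, routes):
    """Build one alert per route whose event type matches the event."""
    alerts = []
    for route in routes:
        if route.event_type != event["type"]:
            continue
        alerts.append({
            "recipients": route.recipients,
            "channel": route.channel,
            # Placeholders in the template are filled from the event's fields.
            "message": route.template.format(**event),
        })
    return alerts


ROUTES = [
    AlertRoute("rule_violation", ("security-team",), "email",
               "Agent {agent} violated a rule: {detail}"),
    AlertRoute("tool_failure", ("on-call-engineer",), "chat",
               "Tool call failed for {agent}: {detail}"),
]

event = {"type": "tool_failure", "agent": "support_agent",
         "detail": "Jira Create Ticket returned an error"}
for alert in build_alerts(event, ROUTES):
    print(alert["channel"], alert["message"])
```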

Use an audit trail to help admins address potential issues in time

As your admins monitor AI agents through a third-party tool, they’ll likely give colleagues access to that tool so they can manage the agents relevant to them.

To check whether these additional users perform harmful activities, and to help the admin respond in time, the admin can use an audit trail that records who performed an action on an agent, what that action was, where they performed it from, and when.

Screenshot: how Agent Handler displays activities in its audit trail.
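The who, what, where, and when of an audit trail map naturally onto a small record type. The sketch below is an assumption-laden illustration: `AuditEvent`, `AuditTrail`, and their field names are hypothetical, and a production trail would persist events to tamper-resistant storage rather than an in-memory list.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    actor: str           # who performed the action
    action: str          # what they did, e.g. "updated_rule"
    target_agent: str    # which agent the action affected
    source_ip: str       # where the action was performed from
    timestamp: datetime  # when it happened


class AuditTrail:
    """Append-only, in-memory record of admin and user actions."""

    def __init__(self):
        self._events = []

    def record(self, actor, action, target_agent, source_ip, timestamp=None):
        event = AuditEvent(actor, action, target_agent, source_ip,
                           timestamp or datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def by_actor(self, actor):
        """Return every recorded event performed by `actor`, oldest first."""
        return [event for event in self._events if event.actor == actor]


trail = AuditTrail()
trail.record("dana", "updated_rule", "employee_experience", "203.0.113.7")
trail.record("lee", "disabled_alert", "support_agent", "198.51.100.4")
print([event.action for event in trail.by_actor("dana")])
```

Making each event immutable (`frozen=True`) reflects the intent that audit entries are never edited after the fact.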

AI agent observability FAQ

In case you have any more questions on AI agent observability, we’ve answered several below.

What are some common tools for performing AI agent observability?

You can use tools that are purpose-built for monitoring AI agents, like Agent Handler (powered by Merge), Langfuse, and Arize AI. You can also use more general-purpose platforms, like Datadog, New Relic, and Dynatrace.

Your best option will depend on factors like your specific observability requirements and how well a vendor meets them, your budget constraints, and the ease of integrating the tool into your existing tech stack and workflows.

What are the challenges associated with observing and managing AI agents?

There are several challenges. Here are just a few worth highlighting:

As you scale your AI agents, it becomes increasingly difficult to monitor each of their activities over time—let alone resolve issues quickly when they arise

Using multiple apps to observe AI agents often leads to excessive context switching, which is time-consuming and inefficient for engineering teams

Predicting AI agents’ future tool calls and preventing harmful, unintended ones can prove difficult

AI agent observability tools, like Agent Handler, help solve these challenges by providing a centralized platform that surfaces all the insights needed to monitor, debug, and optimize your AI agents.

What’s the difference between integration and AI agent observability?

Integration observability is the broader term. It encompasses AI agent observability and can also refer to monitoring other types of integrations, whether they’re driven by APIs, files, or custom scripts.

Jon Gitlin is the Managing Editor of Merge's blog. He has several years of experience in the integration and automation space; before Merge, he worked at Workato, an integration platform as a service (iPaaS) solution, where he also managed the company's blog. In his free time he loves to watch soccer matches, go on long runs in parks, and explore local restaurants.