MLflow is an open-source platform for managing and monitoring machine learning experiments. When used with the OpenAI Agents SDK, MLflow automatically:
- Logs all agent interactions and API calls
- Captures tool usage, input/output messages, and intermediate decisions
- Tracks runs for debugging, performance analysis, and reproducibility
This is especially useful when you're building multi-agent systems where different agents collaborate or call functions dynamically.
In this tutorial, we'll walk through two key examples: a simple handoff between agents, and the use of agent guardrails, all while tracing their behavior using MLflow.
Setting up the dependencies
Installing the libraries
pip install openai-agents mlflow pydantic python-dotenv
OpenAI API Key
To get an OpenAI API key, visit https://platform.openai.com/settings/organization/api-keys and generate a new key. If you're a new user, you may need to add billing details and make a minimum payment of $5 to activate API access.
Once the key is generated, create a .env file and enter the following:
Replace the placeholder value with the key you generated.
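Assuming the standard environment variable name that the OpenAI SDK reads, the .env file would contain a single line (keep the placeholder until you paste in your own key):

```
OPENAI_API_KEY=<your_api_key>
```

The load_dotenv() call in the scripts below loads this variable into the process environment so the SDK can pick it up.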
Multi-Agent System (multi_agent_demo.py)
In this script (multi_agent_demo.py), we build a simple multi-agent assistant using the OpenAI Agents SDK, designed to route user queries to either a coding expert or a cooking expert. We enable mlflow.openai.autolog(), which automatically traces and logs all agent interactions with the OpenAI API, including inputs, outputs, and agent handoffs, making it easy to monitor and debug the system. MLflow is configured to use a local file-based tracking URI (./mlruns) and logs all activity under the experiment name "Agent-Coding-Cooking".
import asyncio

import mlflow
from agents import Agent, Runner
from dotenv import load_dotenv

load_dotenv()  # Load OPENAI_API_KEY from .env

mlflow.openai.autolog()  # Auto-trace every OpenAI call
mlflow.set_tracking_uri("./mlruns")
mlflow.set_experiment("Agent-Coding-Cooking")

# Two specialist agents, plus a triage agent that routes between them
coding_agent = Agent(name="Coding agent",
                     instructions="You only answer coding questions.")

cooking_agent = Agent(name="Cooking agent",
                      instructions="You only answer cooking questions.")

triage_agent = Agent(
    name="Triage agent",
    instructions="If the request is about code, handoff to coding_agent; "
                 "if about cooking, handoff to cooking_agent.",
    handoffs=[coding_agent, cooking_agent],
)

async def main():
    res = await Runner.run(triage_agent,
                           input="How do I boil pasta al dente?")
    print(res.final_output)

if __name__ == "__main__":
    asyncio.run(main())
MLflow UI
To open the MLflow UI and view all the logged agent interactions, run the following command in a new terminal:
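Assuming you launch it from the same project directory (so it picks up the local ./mlruns store created by the script), the command is:

```
mlflow ui
```

If you start it from elsewhere, point it at the store explicitly with mlflow ui --backend-store-uri ./mlruns; the --port flag lets you pick a different port if 5000 is busy.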
This will start the MLflow tracking server and print the URL and port where the UI is accessible, usually http://localhost:5000 by default.
We can view the entire interaction flow in the Tracing section, from the user's initial input, to how the assistant routed the request to the appropriate agent, to the response that agent generated. This end-to-end trace provides valuable insight into decision-making, handoffs, and outputs, helping you debug and optimize your agent workflows.
Tracing Guardrails (guardrails.py)
In this example, we implement a guardrail-protected customer support agent using the OpenAI Agents SDK with MLflow tracing. The agent is designed to help users with general queries but is restricted from answering medical-related questions. A dedicated guardrail agent checks for such inputs and, if they are detected, blocks the request. MLflow captures the entire flow, including guardrail activation, reasoning, and agent response, providing full traceability and insight into the safety mechanism.
import asyncio

import mlflow
from pydantic import BaseModel
from agents import (
    Agent, Runner,
    GuardrailFunctionOutput, InputGuardrailTripwireTriggered,
    input_guardrail, RunContextWrapper)
from dotenv import load_dotenv

load_dotenv()

mlflow.openai.autolog()
mlflow.set_tracking_uri("./mlruns")
mlflow.set_experiment("Agent-Guardrails")

# Structured output for the guardrail classifier
class MedicalSymptoms(BaseModel):
    medical_symptoms: bool
    reasoning: str

guardrail_agent = Agent(
    name="Guardrail check",
    instructions="Check if the user is asking you about medical symptoms.",
    output_type=MedicalSymptoms,
)

@input_guardrail
async def medical_guardrail(
    ctx: RunContextWrapper[None], agent: Agent, input
) -> GuardrailFunctionOutput:
    result = await Runner.run(guardrail_agent, input, context=ctx.context)
    return GuardrailFunctionOutput(
        output_info=result.final_output,
        # Trip the guardrail whenever the classifier flags medical content
        tripwire_triggered=result.final_output.medical_symptoms,
    )

agent = Agent(
    name="Customer support agent",
    instructions="You are a customer support agent. You help customers with their questions.",
    input_guardrails=[medical_guardrail],
)

async def main():
    try:
        await Runner.run(agent, "Should I take aspirin if I'm having a headache?")
        print("Guardrail didn't trip - this is unexpected")
    except InputGuardrailTripwireTriggered:
        print("Medical guardrail tripped")

if __name__ == "__main__":
    asyncio.run(main())
This script defines a customer support agent with an input guardrail that detects medical-related questions. It uses a separate guardrail_agent to evaluate whether the user's input contains a request for medical advice. If such input is detected, the guardrail triggers and prevents the main agent from responding. The entire process, including guardrail checks and outcomes, is automatically logged and traced with MLflow.
MLflow UI
To open the MLflow UI and view the logged guardrail traces, again run mlflow ui in a new terminal.
In this example, we asked the agent, "Should I take aspirin if I'm having a headache?", which triggered the guardrail. In the MLflow UI, we can clearly see that the input was flagged, along with the reasoning the guardrail agent provided for why the request was blocked.