Prerequisites

Install dependencies:
pip install "quotientai>=0.4.6" "openai-agents>=0.1.0" "openinference-instrumentation-openai-agents==1.1.1"
Set environment variables:
export OPENAI_API_KEY=your-openai-api-key
export QUOTIENT_API_KEY=your-quotient-api-key

Sample Integration

quotient_trace_openai_agents.py
from openinference.instrumentation.openai_agents import OpenAIAgentsInstrumentor

from quotientai import QuotientAI

# Initialize Quotient tracing with the OpenAI Agents instrumentor
quotient = QuotientAI()
quotient.tracer.init(
    app_name="openai-agents-search-app",
    environment="dev",
    instruments=[OpenAIAgentsInstrumentor()],
)

# Import and use the Agents SDK after tracing is initialized
import asyncio
from agents import Agent, Runner

# The decorator opens a root span named 'haiku-agent' around the run
@quotient.trace('haiku-agent')
async def main() -> None:
    agent = Agent(
        name="haiku-assistant",
        instructions="You only respond in haikus.",
    )

    result = await Runner.run(agent, "Tell me about recursion in programming.")
    print(result.final_output)

if __name__ == "__main__":
    asyncio.run(main())

Sample Integration for Streaming

When an agent is wrapped in a backend API framework (e.g. FastAPI), async streaming responses work differently than in a standalone agent application: tokens are yielded from an async Python generator, whose body runs after the endpoint function has already returned. To keep every operation inside the generator on the same trace, use the use_span context manager. The example below shows the pattern with an OpenAI Agent in a FastAPI streaming application.
pip install "quotientai>=0.4.10" "openai-agents>=0.1.0" "openinference-instrumentation-openai-agents==1.1.1" "fastapi>=0.115.6" "uvicorn>=0.34.0"
import asyncio
import json

from agents import Agent, Runner
from openai.types.responses import ResponseTextDeltaEvent

from openinference.instrumentation.openai_agents import OpenAIAgentsInstrumentor
from opentelemetry.trace import get_tracer, use_span
from quotientai import QuotientAI

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

import uvicorn

# Initialize Quotient AI with OpenAI Agents instrumentation
quotient = QuotientAI()
quotient.tracer.init(
    app_name="joker-agent",
    environment="dev",
    instruments=[OpenAIAgentsInstrumentor()]
)

agent = Agent(
    name="Joker",
    instructions="You are a helpful assistant.",
)

app = FastAPI()

@app.post("/generate-jokes")
async def my_endpoint():
    # 1️⃣ start the root span
    root_span = quotient.tracer.start_span('joker-agent')

    # 2️⃣ make the span current for all operations within the context
    stream_ctx = use_span(root_span, end_on_exit=False)

    async def my_generator():
        # 3️⃣ capture all the returned events within the context
        with stream_ctx:
            # Your AI agent logic here, in this case we are using an OpenAI Agent
            # This could be calling OpenAI, LangChain, or any other AI service
            my_stream = Runner.run_streamed(agent, input="Please tell me 5 jokes.")
            async for event in my_stream.stream_events():
                if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
                    yield f"data: {json.dumps({'content': event.data.delta})}\n\n"
        
        # 4️⃣ end the span and flush the trace after the context is exited
        root_span.end()
        quotient.force_flush()

    return StreamingResponse(
        my_generator(),
        # SSE-framed output ("data: ...\n\n") is conventionally served as text/event-stream
        media_type="text/event-stream"
    )

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
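
To try the endpoint locally, run the file (saved here under a placeholder name) and stream the output with curl; -N disables curl's buffering so tokens print as they arrive:

python streaming_app.py
curl -N -X POST http://localhost:8000/generate-jokes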

Key Components:

  1. Root Span: quotient.tracer.start_span() creates the main span that will contain all child operations
  2. Context Management: use_span(root_span, end_on_exit=False) makes the span current for all operations within the context
  3. Streaming: The generator function handles the streaming response while maintaining trace context
  4. Cleanup: root_span.end() and quotient.force_flush() ensure traces are properly closed and sent
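
The same flow works for any async generator, not just FastAPI: open the span before the stream starts, keep it current while tokens are produced, then end and flush once the stream is drained. A minimal sketch, reusing the quotient, agent, and Runner objects defined above:

from opentelemetry.trace import use_span

async def traced_stream(prompt: str):
    # 1) Open the root span before any tokens are produced
    root_span = quotient.tracer.start_span("joker-agent")
    # 2) end_on_exit=False keeps the span open across generator suspensions
    with use_span(root_span, end_on_exit=False):
        result = Runner.run_streamed(agent, input=prompt)
        async for event in result.stream_events():
            yield event  # every event is recorded under root_span
    # 3) Close the span and flush buffered spans once the consumer is done
    root_span.end()
    quotient.force_flush()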

Notes

  • The instrumentor emits spans for tool calls, actions, and final responses so you can replay the agent’s reasoning.
  • Use asyncio.run (as shown) or await the traced coroutine from your existing event loop; see the sketch below.
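
For example, if your application already owns a running event loop (a larger service or a notebook), await the traced coroutine rather than calling asyncio.run again; a minimal sketch, assuming main() from the first example:

import asyncio

async def app_entrypoint():
    # Inside an already-running loop, await the traced coroutine directly;
    # main() is the traced function from the first example
    await main()

# Standalone scripts: asyncio.run creates and tears down the loop for you
if __name__ == "__main__":
    asyncio.run(app_entrypoint())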
