Monitoring

Effective monitoring is crucial for understanding user behavior and optimizing copilot performance in production environments. Continual offers a comprehensive set of tools and metrics to help you track, analyze, and improve your copilots.

The monitoring features are currently under active development. We welcome your feedback and suggestions for enhancing the monitoring capabilities.

Copilot

The copilot overview page provides a high-level dashboard displaying key performance indicators (KPIs) and usage metrics for your copilot over the last 7 days:

  • Sessions: The total number of user sessions with the copilot.
  • Unique Users: The count of distinct users who interacted with the copilot.
  • Daily Message Volume: The average number of messages exchanged between users and the copilot per day.
  • Median Response Time: The median time taken for the copilot to respond to a user's message.
  • Average Prompt Length: The mean character count of user prompts.
  • Average Response Length: The mean character count of copilot responses.
  • Total Feedback Received: The total number of feedback ratings submitted by users.
  • Average Feedback Score: The mean rating value provided by users.

The overview also includes a summary of the knowledge bases and tools integrated with your copilot, along with their usage frequency over the past week. Use this information to identify heavily used resources and potential optimization targets.
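
The dashboard computes these values for you, but they map onto straightforward aggregations over message logs. Here is a minimal sketch of the arithmetic, assuming a hypothetical record shape (the field names and schema below are illustrative, not Continual's actual data model):

```python
from collections import Counter
from statistics import mean, median

# Hypothetical message records; field names are assumptions for this sketch.
messages = [
    {
        "session_id": "s1",
        "user_id": "u1",
        "date": "2024-05-01",
        "prompt": "How do I rotate my API key?",
        "response": "You can rotate keys from the Settings page...",
        "response_time_ms": 820,
        "feedback": 1,              # optional rating; None if the user gave none
        "tools_used": ["docs_kb"],  # knowledge bases / tools invoked for this turn
    },
    # ... more records from the last 7 days ...
]

metrics = {
    "sessions": len({m["session_id"] for m in messages}),
    "unique_users": len({m["user_id"] for m in messages}),
    # Average over days that saw traffic; quiet days are not counted here.
    "daily_message_volume": mean(Counter(m["date"] for m in messages).values()),
    "median_response_time_ms": median(m["response_time_ms"] for m in messages),
    "avg_prompt_length": mean(len(m["prompt"]) for m in messages),
    "avg_response_length": mean(len(m["response"]) for m in messages),
}

ratings = [m["feedback"] for m in messages if m["feedback"] is not None]
metrics["total_feedback_received"] = len(ratings)
metrics["avg_feedback_score"] = mean(ratings) if ratings else None

# Usage frequency of knowledge bases and tools over the same window.
tool_usage = Counter(t for m in messages for t in m["tools_used"])
```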

We are continuously expanding our monitoring features. Stay tuned for updates and enhancements!

Threads

The Conversations tab displays a chronological list of all user-copilot interactions. Click on a conversation to view its details, including:

  • The full message exchange between the user and copilot
  • Conversation metrics such as duration, message count, and feedback rating
  • A link to access the conversation's trace for deeper analysis

Review the conversation history to identify common user queries, assess copilot response quality, and spot areas for improvement.
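
The per-conversation metrics are equally simple to reason about. As a rough sketch, duration and message count fall out of the thread's timestamps (again using an illustrative record shape, not Continual's actual schema):

```python
from datetime import datetime

# Hypothetical messages for one conversation; field names are illustrative.
thread_messages = [
    {"role": "user", "ts": "2024-05-01T10:00:00", "text": "Hi"},
    {"role": "assistant", "ts": "2024-05-01T10:00:02", "text": "Hello! How can I help?"},
    {"role": "user", "ts": "2024-05-01T10:01:30", "text": "What plans do you offer?"},
    {"role": "assistant", "ts": "2024-05-01T10:01:33", "text": "We offer three plans..."},
]

timestamps = [datetime.fromisoformat(m["ts"]) for m in thread_messages]
duration = max(timestamps) - min(timestamps)  # time from first to last message
message_count = len(thread_messages)
print(f"{message_count} messages over {duration}")
```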

Traces

The Trace view offers a granular, hierarchical breakdown of a copilot's processing flow, structured around three key concepts: Threads (conversations), Runs, and Steps.

Runs

Within each Thread, a Run encapsulates a single execution of the copilot in response to a user prompt. It represents the entire processing flow, from receiving the user's input to generating and delivering the copilot's response.

In the Trace view, Runs are displayed as child elements of their parent Thread. Expand a Run to view its Steps and understand the copilot's decision-making process.

Steps

Steps are the individual actions performed by the copilot during a Run. They can be categorized into two main types:

  1. Tool Call: These steps represent the copilot querying a knowledge base, calling an external API, or interacting with an integrated tool. The Trace view displays the input parameters and the received response for each tool invocation.

  2. Message Creation: In these steps, the copilot generates a message using an LLM. The Trace view shows the generated message content.
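
Continual's internal representation is not documented here, but the Thread → Run → Step hierarchy and the two step types can be pictured with a few illustrative dataclasses (the names and fields below are assumptions for the sketch, not the product's API):

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Union

@dataclass
class ToolCallStep:
    """A step that queries a knowledge base, external API, or integrated tool."""
    tool_name: str
    inputs: dict           # parameters passed to the tool
    output: str            # response received from the tool
    started_at: datetime
    ended_at: datetime

@dataclass
class MessageCreationStep:
    """A step in which the copilot generates a message with an LLM."""
    content: str           # the generated message
    started_at: datetime
    ended_at: datetime

Step = Union[ToolCallStep, MessageCreationStep]

@dataclass
class Run:
    """One execution of the copilot in response to a single user prompt."""
    prompt: str
    response: str
    steps: list[Step] = field(default_factory=list)

@dataclass
class Thread:
    """A full conversation; contains one Run per user prompt."""
    thread_id: str
    runs: list[Run] = field(default_factory=list)
```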

Steps are presented as a timestamped, sequential list within each Run. By analyzing the Steps, you can:

  • Verify the copilot is retrieving relevant information from knowledge bases and tools
  • Identify performance bottlenecks or errors in the processing flow
  • Evaluate the quality and coherence of the copilot's generated messages
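
For example, a quick way to surface bottlenecks is to rank a run's steps by elapsed time. The sketch below works over plain dictionaries with assumed field names rather than any exported trace format:

```python
# Illustrative trace data for one run; field names are assumptions.
steps = [
    {"type": "tool_call", "name": "search_kb", "started_ms": 0, "ended_ms": 430},
    {"type": "tool_call", "name": "crm_lookup", "started_ms": 430, "ended_ms": 2610},
    {"type": "message_creation", "name": "draft_reply", "started_ms": 2610, "ended_ms": 3390},
]

# Rank steps by elapsed time to find the slowest part of the run.
for step in sorted(steps, key=lambda s: s["ended_ms"] - s["started_ms"], reverse=True):
    elapsed = step["ended_ms"] - step["started_ms"]
    print(f'{step["type"]:>16}  {step["name"]:<12}  {elapsed} ms')
```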