Observability
The Observability section provides comprehensive monitoring and analysis of AI system performance, usage patterns, and operational health. It includes detailed dashboards and analytics to track runtime behavior, costs, and system efficiency.

1. AI Storyboard
AI Storyboard provides a visual, comprehensive overview of your AI operations across multiple dimensions. It serves as the central observability dashboard, displaying key metrics and insights in an organized, easy-to-understand format.
- Projects Overview - Displays the total count of AI projects in your organization. Shows project distribution by name, indicating which projects have the highest activity (e.g., Staging, Agent Creator, New Relic). Displays the breakdown of projects by lifecycle stage (Development vs. Production).
- Clients Analysis - Aggregated count of all requests, broken down into User Requests (direct user interactions) and Agent Requests (automated agent-driven requests). Lists the most active agents with their usage counts (e.g., Human in-loop RCA Agent, Lab Health Analysis Agent, Agent Evaluator). Shows user activity by email address and their associated roles (e.g., AIOPS, Lab Admin, RCA, Dynamic Widget builder, Agent creator).
- LLM Usage and Cost - Shows overall expenditure across all LLM models, total output tokens generated, and total input tokens consumed. Displays usage distribution across different models (e.g., claude-sonnet-4, gpt-4o, gpt-4.1, claude-sonnet-4.5, gpt-5). Shows cache token statistics indicating how caching reduces token consumption.
- MCP Tool Calling & Instructions - Aggregate count of all MCP (Model Context Protocol) tool invocations. Shows tool usage across different domains including common tools, context cache operations, visualization tools, AIOps automation, network automation, lab administration, and agent creator tools. Provides a detailed view of individual tool calls (e.g., common-list_prompt_templates_by_persona, common-get_conversation_history).
- Prompt Templates (Instructions) - Lists all prompt templates with their usage counts, organized by domain (e.g., lab_administration, create_toolset, incident_remediate_recommend_template). Helps identify which instruction sets are most commonly used across the platform.
Note
AI Storyboard enables operational monitoring to quickly identify which projects, agents, and users are most active. It supports cost management by tracking LLM usage and costs, performance analysis to understand tool call patterns, and resource planning for data-driven decisions about resource allocation and scaling.
2. Conversations
The Conversations section provides a comprehensive log and management interface for all AI interactions and conversations within the platform. It enables administrators to monitor, search, filter, and analyze conversation data for operational insights, compliance, and troubleshooting.
- Conversation Information - Displays all conversations with details including Timestamp (when the conversation occurred), Conversation Label (user-defined or system-generated conversation names), User ID (email address of the user who initiated the conversation), Persona (the AI persona/agent used, e.g., AI Observability, New Relic Insights Agent), and Token Count (total tokens consumed for the conversation). Each conversation entry provides access to additional operations and details.
- Time Range Filtering - Filter conversations by time periods such as "All Time", "Today", "Last 7 Days", "Last 30 Days", or custom date ranges. This helps focus analysis on specific timeframes for trend analysis or incident investigation.
- Advanced Search and Filters - Search capabilities to quickly find specific conversations by keywords, user IDs, personas, or conversation labels. Additional filters allow narrowing results by AI Project, LLM model, User, or Persona type.
- Conversation Details - Access detailed information for each conversation including the full conversation history, messages exchanged, tool calls made, token usage breakdown, and associated metadata. This detailed view helps understand the complete context of each interaction.

When drilling down into a specific conversation, a detailed dashboard provides comprehensive insights into the conversation's execution, performance, and resource usage.
2.1 Conversations Overview
- Conversation Flow - Displays the conversation in chronological order, showing user prompts, the tool calls made in response, and the final LLM response produced after all tool calls are completed. This provides a clear view of the entire conversation execution sequence.
- Tool Call Details - When expanding individual tool calls, displays the tool input (parameters and data sent to the tool) and tool output (response received from the tool). This detailed view helps understand what data was passed to each tool and what results were returned.
- Conversation Metadata - Displays conversation identification information including conversation label, user ID, persona used, timestamp, and project association.
- Token Usage Summary - Shows total token consumption for the conversation, broken down into input tokens, output tokens, and cache tokens. Provides cost information associated with the conversation.
- Execution Metrics - Displays performance metrics such as conversation duration, response times, and completion status.
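The cost figure in the Token Usage Summary follows directly from the three token counts. As a rough sketch (the per-1K-token rates below are hypothetical placeholders, not Fabrix.ai's actual pricing):

```python
# Hypothetical per-1K-token rates; real pricing varies by model and provider.
RATES = {"input": 0.003, "output": 0.015, "cache": 0.0003}

def conversation_cost(input_tokens: int, output_tokens: int, cache_tokens: int) -> float:
    """Estimate conversation cost; cache-served tokens bill at a reduced rate."""
    return round(
        input_tokens / 1000 * RATES["input"]
        + output_tokens / 1000 * RATES["output"]
        + cache_tokens / 1000 * RATES["cache"],
        6,
    )

print(conversation_cost(12_000, 2_500, 8_000))  # → 0.0759
```

Note how the cache-token line illustrates the point made in the AI Storyboard section: tokens served from cache are far cheaper than fresh input tokens.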
2.2 Message Sequence
Shows the complete flow of how messages are processed during a conversation. Displays how user messages are received and forwarded through the system, the communication between MCP (Model Context Protocol) client and server, the chronological order of tool invocations, and how the AI model interacts with various tools and services including conversation history retrieval, prompt template management, and context caching. Tracks the complete path from user input through tool calls and AI model processing to the final response delivery.
2.3 Canvas
Shows canvas widgets if available for that conversation. Provides visualization where data is generated in live interactive widgets, allowing users to explore and interact with conversation data through dynamic visual representations.
2.4 Output
The Output tab displays the final consolidated result generated at the end of a conversation with Fabaio or any Fabrix AI agent. This is the authoritative answer after all reasoning steps, tool calls, decision branches, and workflow executions have completed. It represents exactly what the user sees as the final response, but with richer structure and formatting for documentation, reporting, and auditability.
2.5 Tool Call Log
The Tool Call Log provides a complete, chronological record of every tool the Fabrix AI agent or Fabaio invoked during the conversation. This log is essential for transparency, debugging, auditing, and understanding how AI reaches its final conclusions.
Each entry shows:
- Tool Name - The name of the tool that was invoked.
- Input Arguments - The parameters and data passed to the tool.
- Returned Output - The response received from the tool.
- Execution Sequence Number - The order in which the tool was called.
- Optional Metadata - Additional information such as timestamps and content size.
This is the deepest level of visibility into how the agent thinks and operates during task execution.
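Conceptually, each log entry is a small structured record ordered by its sequence number. A minimal sketch; the field names below are illustrative, not the platform's actual schema:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class ToolCallEntry:
    sequence: int                 # execution order within the conversation
    tool_name: str                # e.g. "common-get_conversation_history"
    input_args: dict[str, Any]    # parameters passed to the tool
    output: str                   # response received from the tool
    metadata: dict[str, Any] = field(default_factory=dict)  # timestamps, content size, ...

log = [
    ToolCallEntry(1, "common-list_prompt_templates_by_persona", {"persona": "AIOPS"}, "..."),
    ToolCallEntry(2, "common-get_conversation_history", {"limit": 10}, "..."),
]
# Replaying the log in sequence order reconstructs how the agent worked.
for entry in sorted(log, key=lambda e: e.sequence):
    print(entry.sequence, entry.tool_name)
```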
2.6 Ratings
The Ratings tab provides a consolidated view of feedback submitted by users for a specific conversation or agent interaction. This feature allows organizations to measure user satisfaction, agent performance, accuracy and helpfulness of responses, and areas that require improvement. Ratings form the primary feedback loop that helps AI continuously learn, improve, and refine agent behavior.
2.7 Cache Documents
The Cache Documents tab shows all context documents that were created, updated, or referenced during the execution of a conversation or agent run. These documents represent the working memory of the agent - where it stores intermediate results, summaries, tool outputs, or configuration data that it needs to reason over. This tab allows users to inspect everything the agent saved to the scratchpad, giving complete transparency into how the agent analyzed the data.
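In behavior, the scratchpad is a keyed store of context documents that the agent can create, update, and read back. A minimal sketch, with class and method names invented for illustration:

```python
from typing import Optional

class Scratchpad:
    """Toy model of an agent's working memory (names are illustrative only)."""
    def __init__(self) -> None:
        self._docs: dict[str, str] = {}

    def save(self, name: str, content: str) -> None:
        self._docs[name] = content          # create or update a cache document

    def get(self, name: str) -> Optional[str]:
        return self._docs.get(name)         # read back an intermediate result

    def list_documents(self) -> list[str]:  # what a Cache Documents view would surface
        return sorted(self._docs)

pad = Scratchpad()
pad.save("incident_summary", "3 alerts correlated to node-7")
pad.save("tool_output/new_relic", "{...}")
print(pad.list_documents())  # ['incident_summary', 'tool_output/new_relic']
```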
2.8 Evaluation Details
The Evaluation Details tab displays a structured quality assessment report for the selected conversation or agent execution. This report helps users, admins, and AI engineers understand how well the agent performed, whether the response was complete and helpful, what gaps or issues were detected, what improvements can be made, and how the conversation aligns with quality dimensions such as clarity, accuracy, and relevance. Users can run evaluations on any conversation, and the results appear here.
Note
Conversations enable audit trail tracking for compliance and auditing requirements, troubleshooting by reviewing specific conversations to diagnose issues or understand user behavior, and usage analysis to understand conversation patterns, user engagement, token consumption, and identify optimization opportunities. The detailed conversation logs help organizations maintain governance and provide transparency into AI system interactions.
3. AI Spend Limits

The AI Spend Limits module provides a centralized way to define, monitor, and enforce cost boundaries for AI and LLM usage across your organization. This feature helps prevent unexpected AI bills, ensures fair usage across teams, and enforces governance-level controls around how much an agent, LLM, or user can spend.
- Spend Limit Rules - Displays all configured spend-control rules. Each row represents a spend-limit rule with configurable thresholds, duration, and enforcement actions.
- Rule Configuration - Define rules that apply to Users, User Groups, Projects, Personas, AI Models, or Request Types with specified duration periods (Day, Week, Month) and spend limits.
- Real-Time Monitoring - Tracks current spend against limits in real time, showing usage consumption and remaining budget for each rule.
- Enforcement Actions - Configure rules to either Notify (send alerts without blocking) or Block (immediately block usage when the threshold is exceeded), with automatic or manual block clearance.
- Filtering and Search - Filter rules by time period, LLM model, Request Type, User, Persona, or Project. Search functionality for Rule IDs and subjects.
- Rule Management - Create new rules, apply default governance templates, or edit/delete existing rules to manage organizational cost controls.
The main dashboard displays spend-limit rules with the following fields:
| Column | Meaning |
|---|---|
| Rule ID | Unique name of the spend-limit rule |
| Subject Type | What the rule applies to: User, User Group, Project, Persona, AI Model, Request Type |
| Subject ID | The specific entity being monitored (username, group name, model name, etc.) |
| Duration | The period the rule applies to: Day, Week, Month |
| Spend Limit ($) | The maximum allowed cost for the duration |
| Current Spend ($) | Real-time usage against the limit |
| Action | What happens when limit is reached: Block or Notify |
| Block Clearance | Determines how blocks are lifted: Automatic or Manual |
| Status | Whether the rule is currently Active, Paused, or Expired |
| Created At | When the rule was created |
| Updated At | Last time rule was modified |
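The enforcement behavior these fields imply can be sketched as follows. This is a simplified illustration of the Block/Notify semantics, not the platform's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class SpendRule:
    rule_id: str
    spend_limit: float     # maximum allowed cost ($) for the duration
    current_spend: float   # real-time usage against the limit
    action: str            # "Block" or "Notify"

def enforce(rule: SpendRule, request_cost: float) -> str:
    """Decide what happens to a new request under this rule."""
    if rule.current_spend + request_cost <= rule.spend_limit:
        return "allow"
    # Over the limit: Block rejects the request; Notify lets it
    # through but raises an alert.
    return "block" if rule.action == "Block" else "notify"

rule = SpendRule("team-a-monthly", spend_limit=500.0, current_spend=498.0, action="Block")
print(enforce(rule, 5.0))  # block
print(enforce(rule, 1.0))  # allow
```

Whether a subsequent block is lifted automatically (e.g., at the start of the next Day/Week/Month window) or requires manual clearance corresponds to the Block Clearance column above.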
4. Token Usage
The Token Usage dashboard provides a detailed, real-time view of how many LLM tokens are being consumed across your organization. This module is crucial for cost transparency, governance, debugging, and performance monitoring.
- Fabaio Token Usage - Token consumption for Fabaio (Copilot) conversations, showing usage per user, persona, project, and LLM model.
- Agent Token Usage - Token consumption for automated agent executions, tracking usage across different agents, projects, and models.
- Usage Breakdown - Detailed breakdown of token usage per user, persona, project, agent, and LLM model for comprehensive visibility.
- Input Tokens - Tokens consumed for prompts and context.
- Output Tokens - Tokens generated by LLM responses.
- Cache Tokens - Tokens served from cache, reducing costs and improving efficiency.
The Token Usage dashboard contains two tabs: Fabaio (token usage for Copilot conversations) and Agents (token usage for automated agent executions). Both views share a similar structure and provide detailed analytics on token consumption across all LLM interactions.
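The per-user, per-model, and per-project breakdowns are, in essence, group-by aggregations over usage records. A minimal sketch with made-up records (the record fields are illustrative, not the platform's data model):

```python
from collections import defaultdict

# Hypothetical usage records, one per LLM interaction.
records = [
    {"user": "a@x.com", "model": "gpt-4o",          "input": 1200, "output": 300},
    {"user": "b@x.com", "model": "claude-sonnet-4", "input": 800,  "output": 500},
    {"user": "a@x.com", "model": "gpt-4o",          "input": 400,  "output": 100},
]

def breakdown(records: list[dict], key: str) -> dict:
    """Total tokens (input + output) grouped by the given dimension."""
    totals: dict[str, int] = defaultdict(int)
    for r in records:
        totals[r[key]] += r["input"] + r["output"]
    return dict(totals)

print(breakdown(records, "model"))  # {'gpt-4o': 2000, 'claude-sonnet-4': 1300}
print(breakdown(records, "user"))   # {'a@x.com': 2000, 'b@x.com': 1300}
```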
Note
Token Usage analytics help optimize prompt engineering to reduce token consumption, identify high-cost operations, and plan capacity and budget allocation.
5. Reviews
The Reviews dashboard provides an organization-wide view of all ratings and feedback submitted by users across conversations, agents, and Fabaio interactions. This module aggregates user sentiment and turns it into powerful insights for improving agent performance, user experience, and operational quality.
- Total Ratings Overview - Shows the cumulative number of reviews submitted within the selected time range, with filtering options for All Time, AI Project, Persona, and User.
- Ratings by User - Displays how many ratings each user has submitted, helping identify which teams or users provide the most feedback.
- Ratings Breakdown by Project - Shows how many ratings belong to each project, identifying which projects use agents the most and where satisfaction levels vary across environments.
- User Ratings Summary - Provides quick insight into overall quality trends, showing highest rated, least rated, and average ratings across all conversations.
- Ratings Distribution - Detailed star-count summary showing the distribution of satisfaction levels (1-5 stars).
- Detailed Review Table - Searchable, filterable table of individual ratings with columns for Timestamp, Conversation Label, Project Name, Rating, and Review Details. Allows inspection of specific low-rated conversations, validation of improvements after updates, and tracking feedback across different agents.
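The summary figures follow directly from the star-count distribution. A small sketch (the numbers are invented for illustration):

```python
def rating_summary(distribution: dict[int, int]) -> dict:
    """Average rating and total review count from a 1-5 star distribution."""
    total = sum(distribution.values())
    avg = sum(stars * count for stars, count in distribution.items()) / total
    return {"total": total, "average": round(avg, 2)}

# e.g. 3 one-star, 5 three-star, and 12 five-star reviews
print(rating_summary({1: 3, 3: 5, 5: 12}))  # {'total': 20, 'average': 3.9}
```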
6. LLM Cost Analysis
The LLM Cost Analysis dashboard provides a complete financial and operational overview of all Large Language Model (LLM) usage across the Fabrix.ai platform. It helps administrators, FinOps teams, and platform owners understand where AI spend is happening, who is using the system, and which models or projects drive the most cost.
- Key Performance Indicators - Overview metrics including Total Cost, Total Requests, Total Tools Called, and Total Tokens (Input and Output).
- Cost Breakdown by Project - Spending distribution across all projects to identify high-spend business areas.
- Cost Breakdown by User - Individual consumption across all end users to identify heavy users and detect anomalies.
- Cost Breakdown by LLM - Cost across all models used in the system for cost-performance comparison.
- Cost Breakdown by Persona - Spending grouped by Persona (agent role) to understand which workflows are most AI-intensive.
- Request Volume Over Time - Total number of AI requests per day/hour showing usage patterns and peak activity periods.
- Total LLM Cost Over Time - How AI expenses evolve over time, showing seasonal patterns and cost anomalies.
- Request Type Distribution - Distribution between API Requests, Agent Requests, and Copilot Requests.
- LLM Provider Usage - Usage breakdown by provider (Anthropic, OpenAI, others) for reliability and cost-efficiency comparison.
- Token Usage Breakdown - Token consumption patterns including total input/output tokens, per-model and per-project usage.
- Success Rate - Successful vs. failed requests with failure reasons including provider timeouts, incorrect tool calls, invalid parameters, or rate limits.
- Recent Activity Metrics - Real-time requests and cost per LLM for the last 24 hours to track sudden changes.
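Most of these breakdowns reduce to grouping request records by one dimension and summing cost, and the success rate to a simple ratio. A minimal sketch with hypothetical records (the field names are illustrative, not the platform's data model):

```python
from collections import defaultdict

# Hypothetical per-request records with the dashboard's breakdown dimensions.
requests = [
    {"project": "Staging",    "model": "gpt-4o",          "cost": 0.12, "ok": True},
    {"project": "Staging",    "model": "claude-sonnet-4", "cost": 0.30, "ok": False},
    {"project": "Production", "model": "gpt-4o",          "cost": 0.45, "ok": True},
]

def cost_by(requests: list[dict], key: str) -> dict:
    """Total cost grouped by one dimension (project, model, user, ...)."""
    totals: dict[str, float] = defaultdict(float)
    for r in requests:
        totals[r[key]] += r["cost"]
    return dict(totals)

def success_rate(requests: list[dict]) -> float:
    """Fraction of requests that completed successfully."""
    return sum(r["ok"] for r in requests) / len(requests)

print(cost_by(requests, "project"))  # Staging ≈ 0.42, Production = 0.45
print(success_rate(requests))        # ≈ 0.667
```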
Note
LLM Cost Analysis supports budget planning and forecasting, model selection optimization, cost allocation and chargeback, and ROI analysis for AI initiatives.