I’m bullish on agents. And for analytics, agents change everything: the data we analyze, how we analyze it, and what we can do with the insights.
Think about software before the AI era. It was built with navigation, screens, and buttons, optimized for a human to do a job, like sending marketing emails or contacting sales prospects.
If we assume much of that work will be taken over by an agent, the interface doesn’t need to look the same anymore (if it exists at all). The workflows triggered will also change. The value of a product is no longer measured by how much useful work it helps a human do, but by how productive agents can be inside it. And the analysis process? Agents are there to help with that too.
And because agents work differently from humans, much of how we analyze them will differ from traditional analytics.
Let’s look at how this shift transforms the three core pillars of analytics: data engineering, generating insights, and taking action.
Data Engineering
The Data Looks Different
The most obvious thing about analytics data for agent products is that it will look very different.
Before, we had users, sessions, events, and properties.
Now we have traces and spans, with metadata about models, tokens, and tools.
For some agents, there will be a human user with an associated session. For others there won’t.
A span with metadata resembles an event with properties in some ways, but the critical difference is hierarchy: traditional analytics flattens data into a linear list of events, destroying the parent-child context that explains why an agent took a specific step.
Insights at the Moment of Tracking
An advantage of agents is that we can instruct them to annotate the data they log. I believe new tactical patterns will emerge around this.
For example, a support agent that can’t answer a question might be able to reflect on whether the information was missing or the question was unclear. This reflection can be sent along in the span for more informed inspection and analysis later.
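One way such an annotation might look in practice; the reason labels here are made up for illustration, not a standard taxonomy:

```python
def annotate_unanswered(span_metadata: dict, found_documents: int) -> dict:
    """Attach the agent's self-diagnosis to the span it is about to log.
    Hypothetical heuristic: if retrieval found nothing, the knowledge base
    was missing the information; otherwise the question was likely unclear."""
    if found_documents == 0:
        span_metadata["failure_reason"] = "information_missing"
    else:
        span_metadata["failure_reason"] = "question_unclear"
    return span_metadata

# Later, an analyst (or analysis agent) can group failures by reason
# instead of re-deriving the diagnosis from raw transcripts.
```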
Insights and Intelligence
Goodbye Pirates
The AARRR pirate metrics have been a reliable framework for thinking about KPIs and metrics for SaaS products. But they’re not a fit for agents.
Use cases will be more varied, and the metrics correspondingly more nuanced.
Evals will be a first-class consideration, and might end up directly replacing the concept of a KPI. But how will they be mapped to business outcomes? Which will we focus on and why? I believe the long-term frameworks for thinking about Evals are yet to be established.
Model cost becomes a concern of the product builder. Infinite scaling of SaaS is gone - now every session spends dollars. So we’ll get used to looking at token efficiency, and quality/cost trade-offs.
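To make that concrete, a toy per-session cost and token-efficiency calculation, with entirely made-up prices:

```python
# Hypothetical prices in dollars per 1K tokens; real prices vary by model.
PRICE_PER_1K = {"input": 0.005, "output": 0.015}

def session_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollars spent by one agent session."""
    return (input_tokens / 1000) * PRICE_PER_1K["input"] + \
           (output_tokens / 1000) * PRICE_PER_1K["output"]

def tokens_per_success(total_tokens: int, successful_runs: int) -> float:
    """One possible token-efficiency metric: spend per successful outcome."""
    return total_tokens / successful_runs
```

The quality/cost trade-off then becomes a two-column comparison: eval score per configuration against numbers like these.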
Finally I believe “time” will make a comeback. Session length sucked as a KPI: it was unreliable, hard to interpret, and easy to game. But with agents, we genuinely want to understand how fast they are - because time costs money.
Sidenote: do “Safety” metrics become the responsibility of the analyst too?
Goodbye Funnels
Since both the metrics and the data have changed, it makes sense that we’ll be armed with a completely new suite of favourite chart types.
Funnels were great for linear user journeys, but they’re a poor fit for branching agentic traces.
Waterfalls are becoming established as the UI for trace viewing.
Clustering tables will be helpful in diagnosing categories of failure.
Pareto fronts will be there when you need to examine the trade-off between eval scores and token cost.
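As an illustration of the Pareto-front idea, a small function that keeps only the configurations not dominated on both axes (higher eval score, lower token cost), run on invented data:

```python
def pareto_front(configs):
    """Each config is (name, eval_score, token_cost). Keep configs where no
    other config is at least as good on both axes and strictly better on one."""
    front = []
    for name, score, cost in configs:
        dominated = any(
            s >= score and c <= cost and (s > score or c < cost)
            for _, s, c in configs
        )
        if not dominated:
            front.append(name)
    return front

configs = [
    ("small",  0.72,  400),   # cheap, decent score
    ("medium", 0.81,  900),
    ("large",  0.83, 2500),   # best score, priciest
    ("tuned",  0.70, 1000),   # dominated by "small": worse score, higher cost
]
```

Everything on the front is a defensible choice; everything off it is strictly worse than some alternative.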
Hello Analysis Agent
Here’s the good news. You’re not going to be alone getting to grips with all these new metrics and charts.
Because the UI isn’t going to be a chart builder anymore. Now you get a prompt box.
Your personal analysis agent will build your chart for you. It’ll build your dashboard. It will even generate a first-pass interpretation of the results.
And that’ll leave you as the expert in your business - strategizing, feeding in domain knowledge, guiding the agent on what to look into, and figuring out how to action the findings (more on that in a minute).
Your analysis agent can work at night too - always checking the latest trends, and fuelling your analysis backlog.
Actionability and Experimentation
The biggest challenge in analytics has always been turning insights into actions. Changing your product based on what you learn from the data. For agents, I expect this to become easier.
Model Changes
Firstly, the AI models agents use are constantly being updated and improved. Every time an agent switches to a new model, that’s effectively an experiment that needs to be measured. You can’t avoid this decision: you’ll always need to choose which model to use.
A typical implementation of model choice is the “model router”: a step in the agent that decides whether to use a simpler, cheaper model or a more complex, expensive one. So we’ll always be analyzing whether the router configuration is getting us value for money, and adjusting it accordingly.
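A router of that kind can be as simple as a threshold on an estimated request complexity; the model names and threshold below are placeholders:

```python
def route_model(complexity: float, threshold: float = 0.5) -> str:
    """Send easy requests to a cheap model, hard ones to an expensive one.
    How `complexity` is estimated (heuristics, a classifier, ...) is the
    interesting part, and is out of scope here."""
    return "cheap-model" if complexity < threshold else "expensive-model"

# Logging the routing decision into the span's metadata is what makes
# the value-for-money analysis possible later:
span_metadata = {"router_threshold": 0.5, "model": route_model(0.3)}
```

Tuning the threshold against eval scores and cost data is exactly the kind of recurring analysis this section is about.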
Evals Offline
Evals also change the way we think about taking action, because they can be run offline on historical or synthetic data. This means for some agents, we will be able to change a prompt, and run an offline counterfactual analysis of how the eval would have changed, before deploying the updated prompt. This loop reduces the friction to action a change - we have a playground for trying out ideas.
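The loop can be sketched as a replay harness; `run_agent` and `score` here are stand-ins for your own candidate configuration and eval function:

```python
def offline_eval(run_agent, score, dataset):
    """Replay historical (or synthetic) cases through a candidate agent
    configuration and return the mean eval score, before any deploy.
    Each case is a dict with "input" and "expected" keys (an assumed shape)."""
    scores = [score(case["input"], run_agent(case["input"]), case["expected"])
              for case in dataset]
    return sum(scores) / len(scores)

# Comparing two prompts is then a counterfactual A/B on the same data:
# offline_eval(agent_with_new_prompt, score, history) vs. the old prompt.
```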
Self Improving Agents
The most exciting concept within actionable agent analytics is the idea that an agent could improve itself.
Conceptually this requires a loop between previous agent runs and future decisions.
This loop will probably include human or agentic analysis as a middle step before the learnings are fed back in, likely via a coding agent.
But there’s also an approach where an agent references previous data online and adapts accordingly when making decisions, cutting out some or all of the analysis steps.
Likely we’ll see a hybrid approach, but it’ll be interesting to see where the lines are drawn.
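The “reference previous data online” variant might look like picking the historically best-performing strategy at decision time, with no offline analysis step; the success-rate heuristic below is purely illustrative:

```python
from collections import defaultdict

def best_strategy(history):
    """history: list of (strategy_name, succeeded) pairs from past runs.
    Pick the strategy with the highest observed success rate."""
    stats = defaultdict(lambda: [0, 0])   # name -> [successes, attempts]
    for name, succeeded in history:
        stats[name][0] += int(succeeded)
        stats[name][1] += 1
    return max(stats, key=lambda n: stats[n][0] / stats[n][1])

history = [("retrieve_then_answer", True), ("retrieve_then_answer", False),
           ("answer_directly", True)]
# best_strategy(history) favours "answer_directly" (1/1 beats 1/2)
```

The hybrid version would keep a loop like this for fast adaptation while still routing hard cases through human or agentic analysis.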
What’s Next?
In this Substack, I’ll dig into these topics in detail, and I’ll share the best playbooks I come up with for analyzing the agent you’re building.
I’m Timothy Daniell, founder of TwoTail.AI, an analytics tool built for analyzing AI Agents. If you’re working on an agent, I’d love to talk to you. You can reach me here: https://www.linkedin.com/in/timothydaniell/