Data chats

A guide: What’s the ROI of AI tooling?


You’ve been asked what the ROI is of your AI tooling. 

Maybe it’s for an exec or a board meeting; maybe they want to know how the cost of AI compares to the benefit. Or maybe they want to see if the organization is getting as much as it can out of AI tooling.

Whatever the reason, you’re now trying to measure the ROI of your AI tooling, and your searches or a friend led you here. Welcome. I have good news and bad news: The bad news is that there’s no easy answer; the good news is that we’ll share some principles you can use to come up with a good answer in your context. 

And yes, we have a calculator at the end of this – but the principles are what matter most, so please don’t skip this upfront section!

To be clear: My goal is that this article supports your thinking about measuring AI ROI – but I’m not promising you a magic answer because there isn’t one.

Principles for measuring ROI

ROI means return on investment. It’s a measure of how much benefit you got from an initiative over and above the costs. Here’s the formula:

$$ ROI = \frac{\text{Returns}}{\text{Investments}} \times 100\% = \frac{\text{Benefits} - \text{Cost}}{\text{Cost}} \times 100\% $$

There are variations on this – for example, if you made an investment in one year and then had to wait more than a year to get returns on it – but the simple formula above is enough for calculating AI impact, since we’re certainly hoping to see benefits from AI in the same year we roll it out. Investopedia has a good introduction to ROI for those wanting more.
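As a quick sanity check of the formula, here’s what the calculation looks like in code – a minimal sketch with made-up numbers:

```python
# Minimal sketch of the ROI formula, with made-up numbers.
def roi(benefits: float, cost: float) -> float:
    """Return on investment, as a percentage."""
    return (benefits - cost) / cost * 100

# e.g. $150k of benefits from a $100k investment:
print(roi(150_000, 100_000))  # → 50.0
```

Note that an initiative that returns exactly what it cost has an ROI of 0%, not 100% – a common point of confusion.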

Whether you’ve done ROI calculations before or not, let’s start with some foundational principles so we’re all on the same page.

3 key principles for measuring ROI

  1. Your outputs are only as good as your assumptions.
  2. The investment should include money AND time costs.
  3. The returns should come from the investment.

Let’s go through each in detail:

1. Your outputs are only as good as your assumptions.

This is the business world’s equivalent of “garbage in, garbage out.” When we’re in the fuzzy realm of projections, there’s no right answer – so the best we can do is:

  • Make good assumptions
  • Clearly spell them out so others can critique them

If someone gives you an ROI number without telling you the assumptions it’s based on, it’s time to ask a LOT of questions – because who knows where that number came from or if it’s actually useful.

If you’re building your own financial model to calculate ROI, the key is to build it in a way that is transparent and makes it easy to change assumptions. 

Spell out your ROI assumptions


2. The investment should include money AND time costs

The financial costs of an initiative are usually pretty clear – but money isn’t the only kind of cost. You also need to account for the cost of your time.

You always have choices in how you spend your time – and as they say, it’s the one thing you can never get more of. Especially when you’re rolling out something new with a learning curve as steep as AI’s, that time investment is not trivial.

So we recommend that your AI ROI model should estimate not only the financial costs but also the time that went into making it happen. 

$$ \text{Investment} = \text{Time cost} + \text{Money cost} $$


3. The returns should come from the investment

You made the investment – the money and time you spent – because you were hoping to get value from it.

When we calculate the returns, they should be based on changes that came from the investment. So if we want the ROI of AI, we need to look at changes that happened because of AI.

When you’re looking at returns from buying shares, linking returns back to the investment is simple – but when you’re looking at organizational initiatives, this can get tricky because lots of things are always changing in your organization. Maybe alongside your AI tooling rollout there was also a major delivery deadline, so it looks like all your metrics improved massively because of AI, but actually you’re just seeing the impact of people on your team working a bunch of evenings and weekends to meet the delivery deadline. If you don't account for this, you risk crediting AI for improvements that were actually driven at least in part by your team putting in longer hours.

The only way to truly account for all confounding factors is to run a randomized controlled trial – but that’s impractical for most organizations. 

Fortunately, there are fallback methods that get us closer to measuring the impact of a specific initiative, like an AI rollout, even if the result won’t be 100% perfect. These include things like:

  • Focusing on changes for your high AI adopters. If low AI adopters (who aren’t using AI much) saw the same change, then we shouldn’t claim that the change came from AI. 
  • Using data for the same group of people before vs after an AI intervention – so at least we’re accounting for what the metrics looked like before they started using AI. Simply comparing the metrics for low vs high AI adopters can be misleading, because those groups likely didn’t start out with the same metrics. I’ve written more about how not controlling for initial group differences can lead to wrong conclusions.
  • If you want to get really sophisticated, you could even use metric changes for low AI adopters as a control (to account for changes over time while the AI intervention was happening) and net those out of the changes you see for high AI adopters pre-vs-post.
    • For example: Estimated impact = (High adopters: Post value − Pre value) − (Low adopters: Post value − Pre value)

This approach is inspired by an econometrics method called difference-in-differences.
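The nested bullet above can be sketched in a few lines of code. This is only an illustration – the metric and the numbers are hypothetical:

```python
# Difference-in-differences sketch: the change for high AI adopters,
# net of the change for low AI adopters over the same period.
# All numbers are hypothetical (e.g. deployments per week).
def did_estimate(high_pre: float, high_post: float,
                 low_pre: float, low_post: float) -> float:
    """(High adopters: post - pre) - (Low adopters: post - pre)."""
    return (high_post - high_pre) - (low_post - low_pre)

impact = did_estimate(high_pre=10, high_post=16, low_pre=9, low_post=11)
print(impact)  # → 4
```

Here the raw before/after jump for high adopters is 6, but 2 of that also showed up for low adopters – so we only credit AI with the remaining 4.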

$$ \text{Returns} = \text{The value of changes that happened BECAUSE of AI} $$

As a note on all the above – I’ve focused here on the ROI of rolling out AI tooling, but these principles work for the ROI calculation of any AI intervention – including human ones like starting regular AI demos or doing an AI hackathon. 

Measuring the ROI of AI

We’re clear on our foundational principles – so how might we apply them to measuring the ROI of AI? Let’s build out our measurement approach:

Investment

Remember our principle that we want to look at money and time costs →

  • The money cost is what you spent on AI tooling for your team.
  • The time cost comes from both your rollout efforts and the time spent to support people with learning and using the tooling. (There’s also the cost of the time that each individual developer needs to spend on the tools to get up to speed; because that can vary widely, I’ll instead focus on the cost of organizational support for that learning.) You can approximate this by estimating the number of people supporting the AI rollout, multiplying by the percent of their time they spend on supporting others with AI usage, and then multiplying by the full-time equivalent salary cost.

That gives us our formula:

$$ \text{AI Investment} = \text{AI tooling cost} + \text{Cost of time spent rolling it out} $$
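Putting that formula into numbers – every figure below is a hypothetical placeholder, not a benchmark:

```python
# Hypothetical AI investment estimate (all figures are placeholders).
tooling_cost = 30 * 12 * 50     # e.g. $30/user/month for 50 developers, per year
supporters = 2                  # people supporting the AI rollout
pct_time = 0.25                 # share of their time spent on AI support
fte_cost = 120_000              # fully loaded annual salary cost
rollout_time_cost = supporters * pct_time * fte_cost

ai_investment = tooling_cost + rollout_time_cost
print(ai_investment)  # → 78000.0
```

In this sketch the time cost ($60k) dwarfs the tooling cost ($18k) – which is exactly why the second principle says not to leave time out of the model.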


Returns

We want to look at the benefits we’ve created by rolling out AI tooling. Let’s first specify what kind of benefit we’re hoping for with AI – what’s the why? 

There are lots of things you’re probably hoping for – including a better developer experience, making it easier to deliver faster, or even helping you as a leader get more building done in your limited time. And in the “benefits” calculation, we also need to consider when the change means that things get worse – the big worry with AI is how it’s impacting quality, and if speed today means we’ll end up spending more time fixing customer bugs, maintaining our codebase, or responding to AI incidents later. 

To simplify things here, I’m going to focus specifically on benefits (or the downside) for the customer. Why? If someone's asking about ROI, they ultimately care about financial outcomes – and engineering drives those by delivering customer value and minimizing customer issues.

I’m also going to guess that you want to use existing metrics as much as possible, so you don’t have to spend a bunch of time gathering data or running calculations to answer the ROI question. That’s why I’ll lean on the DORA metrics as a way to approximate benefits. The DORA metrics come from a focus on customer impact and, as a result, are linked to better financial performance for companies.

How we can calculate the change in customer impact using DORA metrics:

  • Customer value created: If AI were to help your teams move faster (= a decrease in Change Lead Time) and ship more to customers (= an increase in Deployment Frequency), then that’s an increase in customer value created. To calculate this:
    • Determine the change in value delivered – Multiply the relative change in Change Lead Time by the relative change in Deployment Frequency to estimate the relative change in value delivered 
    • Get your typical value delivered – maybe last month’s MRR, or if you can get this specific, the revenue from features that your high AI adopters shipped last month
    • Customer value created – Multiply the change in value delivered by the dollar value of typical value delivered, and that’s your customer value created.
  • Customer issues reduced: Again focusing on DORA, for this one we’ll look at the cost of downtime. We recommend using the Change Failure Rate (CFR) and Mean Time to Recovery (MTTR) for just P1 incidents, because this one can quickly balloon out if you bring in all failures. Ideally you would also have the CFR and MTTR just for AI-related incidents, but if you don’t have that, you can use the overall CFR and MTTR metrics as long as you caveat them to say that other factors besides AI will have impacted them. Now, calculate:
    • How much downtime did we have before – Using the Change Failure Rate and Deployment Frequency from before the change, we can calculate the number of failures before the AI rollout. Then using MTTR from before the change, we can calculate the time spent in downtime (formula below). Just make sure the number isn’t so high that it goes above the number of hours in a day/week/month; this calculation assumes incidents happened sequentially but most likely happened in parallel.
    • How much downtime did we have after – Now using the same metrics but for after the AI rollout, we can calculate the amount of downtime post AI
    • Change in downtime – Look at the difference between downtime before and after. If downtime went up, this is a negative return. If downtime went down, it’s a benefit. 
    • Value from change in downtime – Now get the cost per hour of downtime and multiply it by the change in downtime.

$$ \text{Downtime} = \text{Change Failure Rate} \times \text{Deployment Frequency} \times \text{MTTR (hours)} = \text{hours spent in recovery per month} $$
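Here’s the downtime calculation in code, applied before and after a rollout. The CFR, deployment, and MTTR figures are invented for illustration:

```python
# Downtime cost before vs after an AI rollout (hypothetical figures).
def downtime_hours(cfr: float, deploys_per_month: float, mttr_hours: float) -> float:
    """Hours per month spent recovering from failed changes (P1s only)."""
    return cfr * deploys_per_month * mttr_hours

before = downtime_hours(cfr=0.10, deploys_per_month=40, mttr_hours=4)  # 16.0 hours
after = downtime_hours(cfr=0.08, deploys_per_month=50, mttr_hours=3)   # 12.0 hours
cost_per_hour = 5_000  # estimated cost of an hour of downtime

value_from_downtime_change = (before - after) * cost_per_hour
print(value_from_downtime_change)  # → 20000.0
```

Notice that downtime can fall even while Deployment Frequency rises, as long as CFR and MTTR improve enough – so shipping more often isn’t automatically a quality loss.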

$$ \text{AI Returns} = \text{Customer value created} - \text{Customer issues created} $$

If DORA isn’t what works for your audience, you can always substitute in the customer value metrics that you use – maybe you think about value delivered based on features or epics, and maybe you want to include all customer-reported bugs as the proxy for customer issues. 

And there you are. To summarize, the key formulas to use are:

ROI

$$ ROI = \frac{\text{Returns}}{\text{Investments}} \times 100\% = \frac{\text{Benefits from initiative} - \text{Cost of initiative}}{\text{Cost of initiative}} \times 100\% $$

Investment

$$ \text{AI Investment} = \text{AI tooling cost} + \text{Cost of time spent rolling it out} $$

Returns

$$ \text{AI Returns} = \text{Customer value created} - \text{Customer issues created} $$
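To see how the three formulas fit together, here’s an end-to-end sketch. Every number is a made-up placeholder, and the “change in value delivered” line is one possible reading of the DORA-based approach above:

```python
# End-to-end ROI sketch (every figure is a hypothetical placeholder).

# Investment: tooling cost + cost of rollout time
ai_investment = 18_000 + 60_000

# Returns, part 1: customer value created
lead_time_improvement = 10 / 8       # Change Lead Time fell from 10 to 8 days
deploy_freq_improvement = 50 / 40    # deploys rose from 40 to 50 per month
value_change = lead_time_improvement * deploy_freq_improvement - 1
customer_value_created = value_change * 200_000   # x last month's MRR

# Returns, part 2: customer issues created (negative = downtime got cheaper)
customer_issues_created = -20_000

ai_returns = customer_value_created - customer_issues_created
roi = (ai_returns - ai_investment) / ai_investment * 100
print(round(roi, 1))  # → 69.9
```

Because every input is an explicit, named assumption, anyone reviewing the model can challenge a line (say, the MRR figure) and instantly see how the ROI moves – which is the first principle in practice.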

If you want to see what this looks like in practice – and play around with some of your own numbers – try out our calculator here:

ROI calculator for AI tooling
Contributor

Lauren Peate
Founder, CEO, Multitudes