
Measuring and Improving Software Delivery with DORA Metrics

By James Walker · 02 Mar 2023 · 9 min read


DORA (DevOps Research and Assessment) metrics are key indicators of software delivery performance. DORA is a research organization founded to improve understanding of the factors that contribute to effective software delivery. The metrics were identified by analyzing how successful teams work and distilling the capabilities and methodologies they have in common.

Research showed that four main values are sufficient to convey overall software delivery performance:

  • Change lead time: the total amount of time required to produce a change and deliver it to production
  • Deployment frequency: the number of changes successfully deployed in a given time frame
  • Change failure rate: the proportion of deployed changes that result in errors in production
  • Mean time to recovery: the average time required to restore service to an acceptable level after a failure occurs

Tracking trends in these metrics allows you to analyze developer activity and assess whether your business targets are being met. This can help you spot bottlenecks and improve your delivery throughput and stability.

This article explains what DORA metrics are, why they're helpful, and how you can start monitoring them using DevStats—a powerful developer analytics platform.

Why Measure DevOps Performance with DORA Metrics?

Developers and team leaders frequently struggle to understand whether their workflows are effective. It's unclear how well developers are progressing toward goals midway through a project or why tasks are taking too long to complete. DORA metrics address this by defining standard performance benchmarks that are proven to be reliable indicators of success.

Tracking DORA metrics allows you to identify potential areas of improvement and then measure the effects of changes you make. They guide you toward optimizations that result in higher delivery throughput without compromising on quality or stability.

The DORA values are hard for individual developers to manipulate because they're based on end-of-process outcomes: deployments created, errors that occurred, and the time taken to respond. There's also a lag between when changes are made and when the result shows up in your data, which helps ensure the values reflect actual trends.

How to Improve Software Delivery Using DORA Metrics

DORA comprises four different metrics that fall into two main groups: throughput and stability. Whereas throughput represents the quantity of changes you deliver, stability indicates whether the changes achieve a consistent quality standard. Let's look more closely at these values and how they relate to your software delivery performance.

Throughput Metrics

Throughput concerns the velocity with which you're able to build changes and deploy them to customers. Successful teams should aim for high throughput so important features and improvements can be quickly and efficiently delivered to users.

Change Lead Time

Change lead time measures the time taken from committing a code change to that change entering production. It reveals how efficiently your delivery pipeline operates. Lower lead times mean higher throughput, but there are several reasons why an extended lead time could occur:

  • Delays in the development process
  • Extended code review sessions
  • Problems identified during review that require manual resolution
  • Toolchain inefficiencies, such as slow CI/CD pipelines and clunky test suites

Addressing these problems is important so you can deliver value within the required timeframes. Maintaining a low lead time means fixes and vulnerabilities reach customers more quickly, limiting your exposure to threats. It also reduces your time-to-market for bigger launches, providing a competitive advantage.
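
To make the calculation concrete, here's a minimal Python sketch that derives lead time from commit and production deployment timestamps. The records and timestamps are hypothetical; in practice they'd come from your version control system and deployment pipeline.

```python
from datetime import datetime
from statistics import median

# Hypothetical records pairing each change's commit time with its production deploy time.
changes = [
    {"committed": datetime(2023, 2, 1, 9, 30), "deployed": datetime(2023, 2, 2, 14, 0)},
    {"committed": datetime(2023, 2, 3, 11, 15), "deployed": datetime(2023, 2, 3, 16, 45)},
    {"committed": datetime(2023, 2, 6, 8, 0), "deployed": datetime(2023, 2, 8, 10, 30)},
]

# Lead time for each change is the interval between commit and production deployment.
lead_times = [c["deployed"] - c["committed"] for c in changes]

# The median is less sensitive than the mean to a handful of unusually slow changes.
print(f"Median change lead time: {median(lead_times)}")
```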

Deployment Frequency

Deployment frequency measures how often you deploy changes to production. It can indicate the responsiveness of your process as well as your toolchain's maturity. Shipping changes frequently implies you've invested in automation for key workflow tasks, whereas only deploying occasionally can point to internal roadblocks that impede your ability to deliver value.

Potential causes of low deployment frequency include the following:

  • Overly restrictive internal processes that require changes to pass multiple reviews
  • Missing tooling that prevents developers from releasing changes to production environments
  • Lack of focus on iterative development

It's important to realize that the ideal deployment frequency can vary significantly by team and project. For example, a small startup that's building a consumer-grade web app will typically deploy more frequently than an enterprise team developing regulated payment or healthcare systems. The specialist security and compliance requirements of the regulated system will unavoidably extend the development lifecycle. It's therefore particularly important to assess deployment frequency in terms of trends, consistency, and alignment with your internal targets—the absolute values you measure won't necessarily be comparable to benchmarks published by other organizations.
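
As a simple illustration, deployment frequency can be computed by grouping production deployment timestamps by period. The Python sketch below uses hypothetical timestamps and ISO calendar weeks; your own CI/CD tooling or deployment logs would supply the real events.

```python
from collections import Counter
from datetime import datetime

# Hypothetical production deployment timestamps.
deployments = [
    datetime(2023, 2, 1, 10, 0),
    datetime(2023, 2, 1, 15, 30),
    datetime(2023, 2, 3, 9, 45),
    datetime(2023, 2, 8, 11, 0),
    datetime(2023, 2, 10, 16, 20),
]

# Group deployments by ISO calendar week to see how often changes ship.
per_week = Counter(d.isocalendar()[:2] for d in deployments)

for (year, week), count in sorted(per_week.items()):
    print(f"{year}-W{week:02d}: {count} deployments")
```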

Stability Metrics

The stability of your software delivery lifecycle refers to how reliably you can deploy changes without introducing new failures. High stability minimizes the risk that you'll have to deal with costly incidents and ensures users experience a consistent quality standard.

Change Failure Rate

Change failure rate reports the percentage of deployments that cause failures in production. What counts as a failure can be subjective, but it generally includes any deployment that causes an incident, requires a rollback, or is immediately followed by a hotfix. Keeping your failure percentage low is crucial because this metric directly reflects the quality standard that users experience.
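
The calculation itself is straightforward once each deployment is flagged as a failure or not: divide failed deployments by total deployments over a window. The Python sketch below uses hypothetical deployment records; how you flag failures depends on your own incident, rollback, and hotfix data.

```python
# Hypothetical deployment records; "failed" marks deployments that caused an
# incident, required a rollback, or were immediately followed by a hotfix.
deployments = [
    {"id": "deploy-101", "failed": False},
    {"id": "deploy-102", "failed": True},
    {"id": "deploy-103", "failed": False},
    {"id": "deploy-104", "failed": False},
    {"id": "deploy-105", "failed": True},
]

failures = sum(1 for d in deployments if d["failed"])
change_failure_rate = failures / len(deployments)

print(f"Change failure rate: {change_failure_rate:.0%}")  # 40% for this sample data
```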

A high change failure rate can be caused by factors including the following:

  • Inadequate code testing prior to production deployment
  • Missing enforcement of code quality and security standards
  • Manual deployment processes that are prone to error
  • Improper coordination between teams, such as poor communication of a change's deployment prerequisites

A low change failure rate is a strong sign that your team has high DevOps maturity. Although some production issues are inevitable at scale, correctly implemented delivery systems should allow you to detect most problems before they reach the deployment stage.

Mean Time to Recovery (MTTR)

Mean time to recovery, sometimes referred to as failed deployment recovery time, is the average time taken to fully restore service after a failure. Because not all failures are avoidable, tracking this value is important so you can anticipate your resilience when faced with an incident. Well-equipped teams should have low recovery times, but many problems can cause delays:

  • Missing observability for production systems, preventing you from accurately tracking metrics and errors
  • Unclear or incomplete system logs, traces, and audit events
  • Lack of direct engineer access to monitoring data
  • No clearly defined plan that guides your response to failures

Calculating recovery time requires a record of when each incident began and the time at which it was confirmed resolved. The recovery time is simply the duration that elapses between these two timestamps. Because some incidents will be trickier than others, you shouldn't look at these values in isolation—it's the mean resolution time across multiple incidents that's important.
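
Here's a minimal Python sketch of that calculation, using hypothetical incident records with start and resolution timestamps:

```python
from datetime import datetime, timedelta

# Hypothetical incident records: when each failure began and when service was
# confirmed restored.
incidents = [
    {"started": datetime(2023, 2, 2, 14, 10), "resolved": datetime(2023, 2, 2, 15, 40)},
    {"started": datetime(2023, 2, 9, 3, 25), "resolved": datetime(2023, 2, 9, 8, 5)},
    {"started": datetime(2023, 2, 20, 11, 0), "resolved": datetime(2023, 2, 20, 11, 50)},
]

# Recovery time for each incident is the gap between the two timestamps.
recovery_times = [i["resolved"] - i["started"] for i in incidents]

# MTTR is the mean across all incidents, not the duration of any single one.
mttr = sum(recovery_times, timedelta()) / len(recovery_times)
print(f"Mean time to recovery: {mttr}")
```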

High-performing teams can struggle to capture this data if they encounter relatively few incidents. You can address this by regularly rehearsing incident responses using simulated failure techniques such as chaos testing, which lets you assess your recovery time and drive optimizations even when your service is running smoothly.

Balancing Throughput and Stability

Software delivery throughput and stability can seem contradictory: how can you continue shipping fast without introducing more failures? Teams that are new to DORA metrics often worry that optimizing one value will cause regressions in others.

DORA's research reveals that this trade-off doesn't actually occur. Making improvements to any of the metrics requires systemic change to your DevOps workflow that positively impacts the other metrics too. DORA affirms that "top performers do well across all four metrics, and low performers do poorly."

Put simply, achieving high throughput requires effective tooling, clear communication, and efficient iterative development—all factors that also contribute to stability. Instead of focusing on individual metrics, it's best to approach DORA more holistically to begin with. If your scores are low, implementing high-level workflow changes, such as increased use of automation, is the best way to improve your overall DevOps maturity.

The Benefits of Using DORA Metrics

As you've seen, DORA metrics provide meaningful data that enables better decision-making around developer performance. They drive a productive DevOps culture by providing a benchmark for evaluating and improving the effectiveness of your workflows.

Excellent DORA scores indicate your team is performing well; if organizational goals are being met, no further changes are required. But low scores or consistent decreases over time can indicate that your team's struggling to realize its full potential. This isn't always obvious without the DORA data, so by tracking your metrics, you can identify potential issues earlier and keep your team's productivity high.

Monitoring DORA Metrics with DevStats

DevStats is the simplest way to do developer analytics. The platform allows you to measure key engineering metrics—including DORA values—in minutes, letting you check how fast your team is shipping, spot bottlenecks, and predict future trends.

DevStats connects to your Git repositories to extract data that reveals your performance. You can then view your DORA metrics as a dedicated report with clear visual cards for your PR cycle time, daily deploy frequency, change failure rate, and MTTR:

Screenshot of the DevStats DORA metrics report

The filtering options at the top of the screen let you break your stats down by team or feature branch, enabling easy performance comparisons. You can see which teams are attaining the best results and how different types of development activity affect throughput and stability.

DevStats also helps you see exactly where your developer time is going. The investment profile screen displays a breakdown of activity by deliverable type, letting you track trends in the number of features, enhancements, bugs, and hotfixes you build. The report supplements the DORA metrics screen by providing a bigger picture of your DevOps outcomes, ready to communicate to stakeholders.

The DevStats investment profile screen

Seeing a large number of features with a comparatively small number of bugs indicates a strong return on your development activity. Conversely, increases in bugs or hotfixes suggest falling stability that could require more DevOps workflow optimization.

Conclusion: Using DORA Metrics to Measure and Improve Software Delivery

The four DORA metrics are useful indicators of how well DevOps teams are performing. Change lead time, deployment frequency, change failure rate, and mean time to recovery are clear values that collectively reveal your team's throughput and stability. Successful teams frequently deploy changes with a low failure rate, implying that they're meeting the required customer, product, and organizational outcomes.

Collecting the data needed to generate your metrics is the hardest part of DORA. DevStats simplifies this process by connecting to your repositories, extracting activity insights, and presenting your metrics on a clear visual dashboard that lets you easily track your progress. You can then use your data to inform future optimizations to your workflow, such as by supporting developers with better tooling or breaking tasks down into blocks that can be implemented as a series of smaller changes.