Reducing Deployment Time

Ideally, deploying to prod takes 5 minutes and just a few steps. There is high observability, alerts are useful, there is no manual QA, commits to master deploy straight to prod, and it's easy to roll back.

But at many companies it takes a whole lot longer, with a whole lot more steps: say, 20 steps and 1 hour. The question then is how to reduce deployment time. Let's define a few things before we get to that.

How Often Should You Deploy and Why?

One of the DORA (DevOps Research and Assessment) metrics is frequency of deployments.

Various sources all point to the following:

  • High-performing teams can deploy changes on demand, and often do so many times a day.
  • Lower-performing teams are often limited to deploying weekly or monthly.
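As a rough illustration of how deployment frequency could be measured, here is a minimal Python sketch. It assumes you can export a list of deployment timestamps from your tooling; the function names and the sample log are hypothetical and not taken from any DORA tooling.

```python
from collections import Counter
from datetime import datetime


def deployments_per_day(timestamps):
    """Count deployments per calendar day from ISO 8601 timestamp strings."""
    days = [datetime.fromisoformat(ts).date() for ts in timestamps]
    return Counter(days)


def average_daily_frequency(timestamps):
    """Average deployments per day over the span from first to last deployment."""
    if not timestamps:
        return 0.0
    days = [datetime.fromisoformat(ts).date() for ts in timestamps]
    span_days = (max(days) - min(days)).days + 1  # inclusive of both end days
    return len(days) / span_days


# Hypothetical deployment log: 4 deployments over 3 calendar days.
deploy_log = [
    "2024-03-04T09:15:00",
    "2024-03-04T14:40:00",
    "2024-03-05T11:05:00",
    "2024-03-06T16:30:00",
]

print(average_daily_frequency(deploy_log))
```

A count like this only describes what happened. It says nothing about how long each deployment took, which is the thing this post is actually concerned with.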

But Why?

Deploying frequently has many benefits. Let's list just a few here:

  • smaller changes mean less risk, allow incremental QA, and push teams to break work down into smaller pieces
  • fewer merge conflicts and other dependencies that come with larger waterfall releases

So the answer is: ideally, you should deploy many times a day, because small, frequent changes reduce risk and keep teams agile.

One Deployment per Day per Team - Reality or Just Theory?

The Industry

I asked people I know whether their teams deploy at least once per day, and the answers I got were:

  • people at large tech companies said yes, they deploy at least once a day per team. They have no manual QA and commits to master are deployed to prod automatically.
  • but most people said 1-2 deployments every 1-2 weeks.

I have personally been on both kinds of team.

How Not To Reduce Deployment Time - KPIs

Many companies set a KPI, such as deploying to prod every day, in the hope that it will drive deployment time down, especially if their deployments take a long time.

Good Intent - But Often Achieves the Opposite

This KPI has good intent. But because it is a KPI, it pushes people to prioritise meeting it today, tomorrow, and every day after that, over actually reducing deployment time. That is the core issue with such KPIs: they may achieve the opposite of what was intended.

Additionally, people may not just deploy daily but game the system to record more releases: deploying code that adds no value, favouring work on microservices (which tend to have more deployments than a monolith), adding people to the team so that there are more deployments per team, and so on.

So often what happens is that people do whatever it takes to meet the KPI, even if this includes wasting time. That wasted time could instead have been spent reducing deployment time. As for how to do that, see "How To Actually Reduce Deployment Time" below.

Positives and Negatives of the KPI

Imagine the KPI were abolished. Teams might then no longer care about deployment time. So there is a reason for the KPI: the daily pain of meeting it is meant to frustrate people to the point that they take action themselves. This, however, can lead to employee disengagement and burnout.

So it's a lose-lose situation. If you keep the KPI, teams waste time. If you abolish the KPI, teams may not reduce deployment time.

Perhaps what's needed is a different goal: simply to reduce deployment time.

How To Actually Reduce Deployment Time

Other Companies' Journeys

Journeys that companies have published online include:

  • HubSpot - they removed any manual steps
  • Monzo - they optimised engineering culture, tooling, and architecture

None of these journeys mention a KPI.

They mention metrics - but those are app metrics, not team metrics.

Quick Wins

A team can get some quick wins on its own, and these are the kind of improvements a KPI would drive:

  • automating messages to chat channels when a release happens or a PR is raised
  • having feature flag toggles that take effect instantly
  • optimising build pipelines, for example by running steps in parallel, caching dependencies and build outputs, and making tests run faster
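To get a feel for how much running pipeline steps in parallel could save, here is a minimal Python sketch. The pipeline, its step names, and the durations are all hypothetical. It also assumes that steps within a stage are independent and that there are enough build agents to run them at the same time.

```python
def serial_minutes(stages):
    """Total time if every step in every stage runs one after another."""
    return sum(sum(steps.values()) for steps in stages)


def parallel_minutes(stages):
    """Total time if steps within a stage run in parallel and stages run in order."""
    # Each stage takes as long as its slowest step.
    return sum(max(steps.values()) for steps in stages)


# Hypothetical pipeline: each dict is one stage, mapping step name to minutes.
pipeline = [
    {"install_dependencies": 4},
    {"lint": 2, "unit_tests": 8, "integration_tests": 12},
    {"build_image": 5},
]

print(serial_minutes(pipeline))    # 31
print(parallel_minutes(pipeline))  # 21
```

In this made-up example the slowest step in each stage sets the floor, so once steps run in parallel the next gain comes from speeding up or splitting the slowest step, here the integration tests.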

Longer Term Gains

A team alone can't achieve the larger cultural and process shifts needed for big reductions in deployment time. A KPI is unlikely to drive teams to them, unless perhaps a few individuals step up and take ownership on top of their normal work. These shifts include:

  • commits in master deploying to prod automatically
  • no manual QA
  • a DevOps culture - closer collaboration and a shared responsibility between development and operations for the products they create and maintain

If It's Not Working

If teams are trying to reduce deployment time but it's not working, ask:

  • are devs not automating tests, and if not, why?
  • how complex is the system?
  • are there any integration tests?
  • is the culture such that people think deployments need a "human touch", such as letting people know about a big release hours or days in advance? That can still happen with daily deployments, via the usual communication and collaboration.

Closing Notes

All of the above needs to be taken in context. Sometimes you don't deploy to prod for weeks, perhaps because you are designing something, and sometimes you deploy to prod several times a day. Ideally, though, things are at least set up for frequent deployments.