Measuring Engineering Teams:
actionable metrics for managers
Why measure engineering teams?
"You can't game what you don't measure," said Raffi Krikorian, former director of Uber's Advanced Technologies Center and former VP of Engineering at Twitter. People tend to change how they work to achieve better numbers when managers hold them accountable for metrics.
You may ask yourself: if people tend to manipulate the process so that the numbers improve, what's the point of measuring?
As stated, people tend to change their behavior when measured. This may sound dangerous, and it is if you don't measure the right thing. That's why metrics must align with business goals. You can then use the metrics as indicators of progress.
If you are still stuck with old productivity metrics, like lines of code (LOC) produced or story points or function points delivered, be careful. They push your team toward an unhealthy environment.
Instead, measuring what matters for the business can make a significant change in your team's mindset. As metrics make goals and objectives clearer, they work as an alignment tool. With that alignment, your team gains focus and avoids waste and rework.
Another crucial benefit of useful engineering metrics is that they let managers mitigate risks and take action before problems get out of hand. Managers can quickly tell whether a team needs help, is facing a bottleneck, or is even nearing burnout.
Advantages of measuring engineering teams
In summary, the advantages of measuring engineering teams include:
- Tracking team and individual progress
- Risk mitigation
- Finding bottlenecks and spotting process improvements
- Big picture of how your team is working
Types of Software Metrics
You can measure different things depending on which part of the software development workflow you want to understand. At SourceLevel, we group metrics into three types.
Agile Metrics
Agile metrics always focus on delivery. The value comes from end users' use of the delivered features and their experience with the product. It means that managers must measure not only velocity but also the quality of the deployed work.
Some known agile metrics are:
- Sprint Burndown
- Epic Burndown
- Team Velocity
- Lead Time
- Cumulative Flow Diagram
- ROI per feature
- Escaped bugs
Software or Application Performance Metrics
Performance metrics for software or applications are tightly coupled to the technology in use. They help developers and engineers understand the application's behavior, find bottlenecks, and spot problems.
Tools in the APM (Application Performance Monitoring) category are great providers of these metrics. Usually, a library automatically collects data from running applications and plots it in organized dashboards for consumption.
Examples of metrics extracted from APMs are very technical:
- Request Response Time
- Hard disk I/O
- Error rates
- Most time-consuming transactions
- Transactions Breakdown
- Time spent on Database calls
Engineering Metrics
Product teams have a direct impact on end-users, but this premise is not always true for engineering teams. Engineers usually focus their work on building the foundations for developers.
The business value of engineering is not straightforward. At first sight, there is no value in a fast and efficient search engine API by itself. It's common to think that the value only comes when there is an interface that allows end-users to interact with the search results. But that's only partially true.
Engineering teams unlock potential and make developers more productive by creating solutions that reduce bugs, accelerate development, increase security, lessen rework, and provide technical assets for business evolution. That's their value.
The challenge here is how to reconcile engineering work with business goals. The team's efforts should reinforce the company's objectives. Sooner or later, engineering teams have to interact with product teams. So managers need to measure not only the metrics relevant to business success but also collaboration with peers.
Here is an example: if a product team needs a performant and efficient search engine for end-users, it should work with an engineering team to create this service, considering the known business and technical constraints. Examples of engineering metrics include:
- Lead Time
- Lead Time Histogram
- Pull Request Control Chart
- Time to Code Review
- Time to Merge
- Commit to Deploy
- Weekly Pull Request Throughput
- Collaboration Matrix
Actionable Metrics for Engineering Managers
Actionable metrics are metrics used as indicators. They should show managers the big picture of their team's progress. By acting as early as possible, engineering managers keep issues from becoming big problems.
Nicole Forsgren, in her article "Measuring Tech Performance: You're Probably Doing It Wrong," says that although there is no silver bullet for engineering metrics, managers should follow these two principles:
- Use measures that focus on outcomes, not output
- Use measures that optimize for global or team outcomes, not local or individual outcomes
What you measure shapes your team's behavior, so measuring outcomes aligns individuals' and teams' efforts with the expected outcomes. Still, it raises the question: what should managers measure, then?
It's natural for newly promoted Engineering Managers not to know what to measure. Even experienced managers find it challenging. That's why you need to continually improve your metrics so that they better reflect, and adapt to, the current state of your team.
But, as a start, Engineering Managers benefit most from engineering metrics, as described above. Here are some of the relevant metrics and a brief explanation of each.
Throughput vs. Lead-time Percentiles
As an alternative to measuring your team's velocity with story points or function points, you can use throughput. The Weekly Pull Request Throughput is the number of pull requests closed in a given week. As a manager, you want this number to vary as little as possible, because a steady number means that the team is working at a healthy and sustainable pace.
If throughput is high in some weeks and almost zero in others, managers should investigate whether that is a characteristic of the business or a symptom of burnout, for instance.
This metric can also be used to understand the impact of adding more engineers to the team. Sometimes, adding more people reduces throughput because they add complexity to the system. Very few managers are aware of the hidden cost of coordination. That's why it's crucial to compare weekly throughput over time.
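As a rough illustration of the definition above, here is a minimal Python sketch that counts closed pull requests per week and summarizes how much the count varies. The function names and the sample dates are hypothetical, and the coefficient of variation is only one possible way to express "varies as little as possible"; the article does not prescribe it.

```python
from collections import Counter
from datetime import date, timedelta
from statistics import mean, pstdev


def weekly_throughput(closed_dates):
    """Count closed pull requests per week, keyed by that week's Monday.

    Weeks between the first and last observed week that have no closed
    pull requests are included with a count of 0, so quiet weeks stay visible.
    """
    counts = Counter(d - timedelta(days=d.weekday()) for d in closed_dates)
    if not counts:
        return {}
    week, last_week = min(counts), max(counts)
    result = {}
    while week <= last_week:
        result[week] = counts.get(week, 0)
        week += timedelta(days=7)
    return result


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (lower is steadier)."""
    average = mean(values)
    return pstdev(values) / average if average else 0.0


# Hypothetical close dates, made up for illustration only.
closed = [
    date(2024, 1, 1), date(2024, 1, 3), date(2024, 1, 5),
    date(2024, 1, 8), date(2024, 1, 9),
    date(2024, 1, 15), date(2024, 1, 17), date(2024, 1, 18), date(2024, 1, 19),
]

throughput = weekly_throughput(closed)
for week_start, count in throughput.items():
    print(week_start.isoformat(), count)
print(round(coefficient_of_variation(list(throughput.values())), 2))
```

Comparing this variation figure before and after a team change (for example, after new engineers join) is one way to make the coordination cost visible.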
In the Throughput vs. Lead-time Percentiles chart, each bar represents the throughput of one past week. The lines represent the 50th, 75th, and 95th percentiles of lead time.
As the lead time tells how many days each pull request remained open, the primary information in this chart is the relationship between the amount of work done and the time it took.
It's not about finding who ships faster, or whose performance could be an issue. It's about understanding the reasons for the variation over time, and educating your team to open smaller and faster-to-review pull requests.
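To make the percentile lines concrete, the sketch below computes the 50th, 75th, and 95th lead-time percentiles for one set of pull requests. It assumes the nearest-rank definition of a percentile, which is one of several common definitions and may differ from what a given tool uses; the lead times are made up for illustration.

```python
def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100.

    Returns the smallest value such that at least pct% of all values
    are less than or equal to it.
    """
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    # Ceiling division with integers avoids floating-point rounding surprises.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


# Hypothetical lead times, in days, of pull requests closed in one period.
lead_times_days = [1, 1, 2, 2, 3, 3, 4, 5, 8, 21]

summary = {pct: percentile(lead_times_days, pct) for pct in (50, 75, 95)}
print(summary)  # {50: 3, 75: 5, 95: 21}
```

In this sample, half of the pull requests closed within 3 days, while the 95th percentile is pulled up to 21 days by a single long-lived pull request, which is the kind of gap worth discussing with the team.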
Pull Request Control Chart
A control chart is a tool for visually monitoring process parameters and making sure their distribution stays under control. Usually, three lines are drawn: a lower control limit, a center line (typically the mean), and an upper control limit.
In the case of Pull Requests, the control chart plots the lead time for every open and closed pull request in a period. It's an excellent tool to visually understand how many days pull requests take to be closed or merged.
Alternatively, you can draw limits that cover, for instance, 50%, 75%, and 95% of all pull requests. With these limits, you can easily spot the number of days it takes to close or merge pull requests. Better than that, as control charts also show open pull request data, you can find and investigate outliers.
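The percentile-based limits described above can be sketched as follows. This is only one possible approach, under stated assumptions: limits are computed from closed pull requests using nearest-rank percentiles, open pull requests are measured by their age as of a chosen `today`, and anything above the 95% limit is flagged as an outlier. The pull request names and dates are hypothetical.

```python
from datetime import date


def lead_time_days(opened, closed, today):
    """Days a pull request has been open; open ones are measured up to today."""
    end = closed if closed is not None else today
    return (end - opened).days


def nearest_rank(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, -(-pct * len(ordered) // 100))  # integer ceiling division
    return ordered[rank - 1]


today = date(2024, 3, 1)

# (name, opened, closed) with closed=None for pull requests that are still open.
# All names and dates are made up for illustration.
pull_requests = [
    ("pr-1", date(2024, 2, 1), date(2024, 2, 2)),
    ("pr-2", date(2024, 2, 1), date(2024, 2, 3)),
    ("pr-3", date(2024, 2, 5), date(2024, 2, 7)),
    ("pr-4", date(2024, 2, 6), date(2024, 2, 9)),
    ("pr-5", date(2024, 2, 10), date(2024, 2, 15)),
    ("pr-6", date(2024, 2, 12), date(2024, 2, 21)),
    ("pr-7", date(2024, 2, 27), None),
    ("pr-8", date(2024, 2, 10), None),
]

# Limits come from closed pull requests only (an assumption of this sketch).
closed_lead_times = [
    lead_time_days(opened, closed, today)
    for _, opened, closed in pull_requests
    if closed is not None
]
limits = {pct: nearest_rank(closed_lead_times, pct) for pct in (50, 75, 95)}

# Every pull request, open or closed, becomes a point on the chart.
points = [
    (name, lead_time_days(opened, closed, today), closed is None)
    for name, opened, closed in pull_requests
]
outliers = [name for name, days, _ in points if days > limits[95]]

print(limits)
print(outliers)
```

Here the still-open `pr-8` has already been open longer than 95% of the closed pull requests took, so it is the one to investigate first.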
Collaboration Matrix
The Collaboration Matrix is fascinating, but misusing it may cause pressure and discomfort among team members. Think of a spreadsheet. In column A, each team member gets one row. In row 1, the same team members are listed, one per column. It's merely a matrix of team members by team members.
Let's say that row 6 belongs to Jessica, an engineer who opened a pull request. Every time Jack comments on, approves, or reacts to her pull request, the cell where Jessica's row meets Jack's column increases by 1. In the end, the matrix shows how your team members interact with one another.
A good manager would use the Collaboration Matrix to spot knowledge silos and act accordingly. It's a vital asset for engineering managers to inspect team dynamics without being present in daily activities.
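The spreadsheet described above can be sketched in a few lines of Python. The input format (a list of author and actor pairs), the function names, the third team member, and the choice to ignore interactions on one's own pull request are all assumptions made for this illustration.

```python
from collections import defaultdict
from itertools import combinations


def collaboration_matrix(interactions):
    """Build {author: {teammate: count}} from (author, actor) pairs.

    Each pair means: `actor` commented on, approved, or reacted to a
    pull request opened by `author`.
    """
    matrix = defaultdict(lambda: defaultdict(int))
    for author, actor in interactions:
        if author != actor:  # assumption: self-interactions are not counted
            matrix[author][actor] += 1
    return {author: dict(row) for author, row in matrix.items()}


def silent_pairs(matrix, members):
    """Pairs of teammates with no interaction in either direction."""
    pairs = []
    for a, b in combinations(members, 2):
        a_to_b = matrix.get(a, {}).get(b, 0)
        b_to_a = matrix.get(b, {}).get(a, 0)
        if a_to_b == 0 and b_to_a == 0:
            pairs.append((a, b))
    return pairs


# Hypothetical events; "Priya" is an invented third teammate.
events = [
    ("Jessica", "Jack"), ("Jessica", "Jack"), ("Jessica", "Jack"),
    ("Jessica", "Jessica"),
    ("Jack", "Jessica"),
    ("Priya", "Jack"), ("Priya", "Jack"),
]

matrix = collaboration_matrix(events)
print(matrix)
print(silent_pairs(matrix, ["Jack", "Jessica", "Priya"]))
```

Pairs that never interact are only a hint of a possible knowledge silo. They are a starting point for a conversation, not a verdict on anyone.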
Lead Time Histogram
A histogram represents the distribution of numerical data. In practice, the Lead Time Histogram is a bar chart that shows the number of closed pull requests (Y-axis) grouped by the number of days they stayed open (X-axis). The taller the bars on the left, the faster pull requests get merged.
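A small sketch of how such a histogram could be built: closed pull requests are bucketed by the whole number of days they stayed open, and each bucket is drawn as a text bar. One-day buckets and the sample lead times are assumptions for illustration.

```python
from collections import Counter


def lead_time_histogram(lead_times_days):
    """Return (days_open, pull_request_count) pairs, one per day bucket.

    Empty buckets are kept so gaps in the distribution stay visible.
    """
    if not lead_times_days:
        return []
    counts = Counter(lead_times_days)
    return [(day, counts.get(day, 0)) for day in range(max(lead_times_days) + 1)]


def render(histogram):
    """Draw the histogram sideways: one row per day, one '#' per pull request."""
    return "\n".join(f"{day:>3}d | {'#' * count}" for day, count in histogram)


# Hypothetical lead times (whole days) of closed pull requests.
lead_times = [0, 0, 1, 1, 1, 2, 2, 3, 5]

histogram = lead_time_histogram(lead_times)
print(render(histogram))
```

In this text rendering, the chart is rotated: short lead times appear at the top instead of on the left, so longer bars near the top correspond to faster merges.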
Lead Time Breakdown
Let's say your team's pull requests have a high lead time. It means they remain open for too long. With the Lead-Time Breakdown, you can drill down into this information. It shows the average number of days from when a pull request is opened to its first interaction (comment, review, or approval), and the average number of days from the last interaction to the merge.
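The two averages described above could be computed along these lines. The record layout (`opened_at`, `interactions`, `merged_at`) is a hypothetical structure chosen for this sketch, and skipping pull requests that are unmerged or have no interactions is an assumption, not something the breakdown necessarily does.

```python
from datetime import datetime
from statistics import mean

SECONDS_PER_DAY = 86400


def days_between(start, end):
    """Elapsed time between two datetimes, in fractional days."""
    return (end - start).total_seconds() / SECONDS_PER_DAY


def lead_time_breakdown(pull_requests):
    """Average days to first interaction and from last interaction to merge."""
    to_first, to_merge = [], []
    for pr in pull_requests:
        # Assumption: only merged pull requests with interactions are counted.
        if pr["merged_at"] is None or not pr["interactions"]:
            continue
        interactions = sorted(pr["interactions"])
        to_first.append(days_between(pr["opened_at"], interactions[0]))
        to_merge.append(days_between(interactions[-1], pr["merged_at"]))
    if not to_first:
        return None
    return {
        "open_to_first_interaction": mean(to_first),
        "last_interaction_to_merge": mean(to_merge),
    }


# Hypothetical pull requests, made up for illustration.
sample = [
    {
        "opened_at": datetime(2024, 1, 1, 9),
        "interactions": [datetime(2024, 1, 2, 9), datetime(2024, 1, 3, 9)],
        "merged_at": datetime(2024, 1, 3, 21),
    },
    {
        "opened_at": datetime(2024, 1, 4, 9),
        "interactions": [datetime(2024, 1, 7, 9)],
        "merged_at": datetime(2024, 1, 8, 9),
    },
    {
        "opened_at": datetime(2024, 1, 5, 9),
        "interactions": [],
        "merged_at": None,
    },
]

breakdown = lead_time_breakdown(sample)
print(breakdown)
```

If the first average dominates, pull requests are waiting for a reviewer; if the second dominates, approved work is waiting to be merged.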
Analytics for Engineering Teams
We provide some of the metrics listed above and are developing other features to help engineering managers better understand how their teams work.
Get to know more about SourceLevel Metrics!