In the realm of technology, leadership loves numbers. They’re comfortable. They’re understandable (hopefully). And, most alarmingly, they’re used as a soft, warm, cuddly security blanket that seemingly tells leaders exactly how well their engineers are performing at all times. I have a bad feeling about this…
In the search for ever-increasing sources of productivity data, savvy managers have started tapping into source control statistics from GitHub and other platforms as a way to condense a developer’s engineering contributions into something easy to chart on a productivity graph. The problem with this approach is quite simple, really, and it’s something that’s been addressed in many social and mathematical contexts far more elegantly than I could ever hope to describe.
The good folks Stephen Dubner and Steven Levitt were onto something big when investigating a host of unexpected social and economic outcomes for their “Freakonomics” books: experiments that seemed airtight on paper were rendered completely ineffective, or worse, by unexpected results fueled by human behavior interacting with incentives. Unfortunately, as software developers, we still aren’t immune to the unintended, messy effects of human psychology on evaluation by data. Those very interactions can have an incredibly profound impact on the quality and reliability of the software we create.
Let me give you some examples. Ever since entering my role as a software engineering coach, I’ve heard countless horror stories of developers being held hostage to a number of different metrics related to source control, including but not limited to lines of code (LOC), number of commits per day/week/month, and issues resolved (velocity). While each of these metrics provides a data point on how, and how often, a developer interacts with a source control platform, they fail to provide any indication of the developer’s productivity or overall contribution to the goals of the organization. Why? Let’s take a look at a few common source control metrics in detail and discuss exactly how they can undermine leadership’s intentions of improving developer contributions and product quality.
Lines Of Code
Clearly the engineer committing the most code on a product team is the most productive, right? Well…not quite. As any software engineer can tell you, great code is measured in quality, not quantity. Often, refactoring (or rewriting) sloppy or inefficient code can actually lead to a reduction in total lines of code contributed for a particular developer. More broadly, the metric in and of itself is unclear. What constitutes a “line of code”, anyway? Literal lines of code terminated by specific characters (Source)? The number of statements ignoring formatting and newline characters (Logical)? The number of machine code instructions produced by a particular section of code (Instructions)? This can get messy, VERY quickly.
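To make the ambiguity concrete, here is a minimal sketch (the file name and contents are hypothetical) showing two common ways of “counting lines” that disagree about the very same file:

```shell
# A toy file to illustrate: four physical lines, only two of which are code.
printf 'x = 1\n\n# a comment\ny = 2\n' > example.py

# "Source" LOC: every physical line, blanks and comments included.
wc -l < example.py                            # prints 4

# "Logical" LOC (roughly): skip blank and comment-only lines.
grep -Evc '^[[:space:]]*(#|$)' example.py     # prints 2
```

Neither number is wrong; they simply measure different things, which is exactly the problem with treating LOC as a single well-defined quantity.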
Implementing this metric as a measure of developer productivity is guaranteed to provoke aggressive changes in coding style. An engineer could be tempted to copy-paste entire sections of code instead of reusing them, add unnecessary fields or methods to classes (such as unneeded getters or setters), or deliberately choose implementation patterns that favor longer blocks of code regardless of whether their presence serves the wider purpose of the product. Leaders thinking “Oh, don’t worry about problems like that. The lead engineers will catch and eliminate those kinds of issues in code reviews!” should think again. Any skilled engineer who catches wind of evaluation based on a metric like this is probably already halfway out the door, as it telegraphs a blatant misunderstanding at the leadership level of how software is designed and implemented.
Number Of Commits
Intended as a way to measure how often each team member contributes to the wider project, this metric instead encourages developers to commit code constantly. A deluge of infinitesimally small, rapid-fire commits can make discovering the source of a problem during a high-impact triage much more difficult. Teams forced into this pattern may also run into “The Curious Case of the Moving Code,” in which functionality is refactored, refactored, and refactored again into different locations in order to pad commit counts. Any developer can give you 100 commits per day if you ask them to, but doing so certainly doesn’t improve the developer’s skill or the quality of your product’s code.
I should mention that of the source control metrics I’m discussing here as toxic to an organization’s developers and products, commit count is probably the least egregious. The reason is that software developers, particularly those early in their careers, don’t commit often enough. Either they’re waiting until a feature is complete before pushing it up, or they’re petrified of Git itself. Therefore, encouraging more commits isn’t necessarily a bad thing in and of itself. I always tell the teams I coach to “commit early and commit often,” and not to be afraid of making a mistake in Git. It’s called source control for a reason, and all but the most blatant Git mistakes can be easily rolled back with a few commands (or clicks). Still, expectations around commits should be handled on a team-by-team basis, and certainly not as an edict from leadership.
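To illustrate just how forgiving Git is, here is a minimal sketch in a throwaway repository (all names, paths, and commit messages are placeholders) that makes a bad commit and then walks it back with a single command:

```shell
# Create a throwaway repo with a local identity for the demo commits.
git init -q demo && cd demo
git config user.email dev@example.com
git config user.name "Demo Dev"

# One good commit, then one "oops" commit on top of it.
echo "v1" > notes.txt && git add notes.txt && git commit -qm "good commit"
echo "oops" >> notes.txt && git add notes.txt && git commit -qm "bad commit"

# Reverse the bad commit with a new commit; history stays intact.
git revert --no-edit HEAD

cat notes.txt   # prints v1
```

Because `git revert` records a new commit rather than rewriting history, it is safe even for commits that have already been pushed; `git reset` and `git reflog` cover most of the remaining “oops” scenarios for work that hasn’t left your machine.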
Issues Resolved (a.k.a. Velocity)
A velocity metric is, without a doubt, a cowboy coder’s best friend. Empowered by a metric based on raw delivery, this kind of ‘hero’ developer, who thrives on pushing features to production as fast as possible, can easily turn your most critical systems into a digital version of Dodge City. A culture that prizes delivery at the expense of all else encourages developers to exclude testing from the ‘definition of done’ and to keep piling on technical debt. An organization built around velocity will be able to deliver features quickly, but its products will inevitably become brittle and break down, especially under unexpected conditions. I’ve seen this happen first-hand, and it isn’t pretty. Depending on the circumstances, surprise failures in brittle products caused by rapid feature deployment can cost millions of dollars. YEEEEE-HAW indeed.
Measuring developer productivity is hard. I’m not saying it isn’t. What I am saying is: remember “Freakonomics.” If you’re a leader responsible for evaluating developer performance, be mindful of the data you plan to collect and how it may impact the behavior of your engineers in unexpected ways. Not doing so could be dangerous to your engineers, your software, and, perhaps most importantly, your bottom line.