Measuring Code Quality: 3 Suggestions On How To Start And Why They Are Important

June 17, 2019 in Code Review

If I ask you to tell me how much quality your code has, how would you answer it? There’s not a correct answer, and responses vary depending on seniority or past experiences of each one. As there is no consensus of what’s quality, here’s a list of aspects I think your code must have to be considered good (not necessarily in this order):

  • It must work. Whatever the feature is, it should only do what it was designed to do. No more, no less.
  • There must be automated tests to ensure it is working and to describe how it works.
  • Code is well documented (at least as much as it needs).
  • The code is clear and shows its intention by having well-chosen variables, methods, and class names.
  • It is easy to maintain, or in other words, it is easy to change.
  • It has no security breaches.
  • It is fast (at least fast enough for its usage).

Different features may be added depending on the kind of software you’re building. You may need your code to meet some legal issues if you work in a bank or an insurance firm, for instance.

OK, we got attributes of a good code. Why should I measure it?

In my opinion, measuring is the best way of looking at the situation as it is. It is essential to listen to the feelings of developers and your senior developers. They may have a very accurate sense of codebase health status. However, you can’t rely just on it.

Developers are humans, and as humans, they are influenced by emotions, by egos (have you ever heard lines like “my code is beautiful, I’m an artist”?) and also by their experience.

This last aspect, the experience is the most crucial one. Depending on how much messy code has a developer dealt with, its definition of quality changes. That’s why measuring is so important.

You get a standardized view of your code health.

Another great thing about metrics is looking at its history to make questions. For instance, if your number of code smells in September is 33 and October it goes down to 22, you could tell your team “what have we done? Let’s keep doing this because it improved the overall code quality”! Alternatively, in the worst scenario, asking your team, “why are code smells growing so much this month?” may raise important issues, like your team being pressured by the product team to deliver faster and faster without looking to quality.

These numbers allow you to ask questions, understand your process and the impacts of your changes.

OK, I agree that measuring is essential. What should I measure?

Many companies look to the number of bugs as an indicator of quality, but the fact is that it relates more to the quality of the process than to the health of your codebase. Of course, bad code leads to more bugs, however not every bug is due to bad code.

Well, for me, it is. Moreover, it tends to happen when there’s no one looking for quality in the team. We don’t usually think about quality before writing a new feature, nor we think about quality when done. We only think about quality during the development of a feature where we find a lot of… Let’s say, “what the fuck.”

That’s the reason the most common metric for software development is “WTF/minute,” as illustrated in the cartoon below by Thom Holwerda.

WTF/minutes metric

Despite being funny, it raises the question how should I measure quality then? Well, there isn’t a right answer to that. However, we’re going to present some ideas.

Here is a list of useful metrics that you may measure. They are simple to calculate and there are automated tools to calculate all of them (there are plenty more if you google for it).

Measuring code complexity

How easy it is to change is directly related to its quality. So, the first health metrics should be related to its complexity. Calculating complexity can be hard, but there are a lot of open-source tools available for this purpose.

It consists of calculating the number of possible lines executed in all possible arrangements for a given piece of code (or throughout your entire code base).

The formula is Cyclomatic Complexy = Edges – Nodes + (Nodes with exit point * 2). Let’s see it in practice. Given the following code, written in Ruby using ActiveModel API:

post_changes = post.changes || {}
if post_changes.present?
  post.save!
  ChangeHistory.create!(entity: 'Post', entity_id: post.id, changes: post_changes)
  if post_changes[:slug].present?
    Redirect.create!(
      path: post_changes[:slug].first,
      redirect_to: post_changes[:slug].last
    )
  end
end

The following image shows the code as nodes and edges of a graph that represents the path of its execution. Notice that it ignores the fact that bang methods (methods ended with an exclamation mark) may raise an exception and it would stop the execution.

In this case, the complexity would be:

Number of edges (E) = 9
Number of nodes (N) = 8
Nodes with exit points (P) = 3

Applying the formula: N - E + (P * 2)
Cyclomatic Complexity = 9 - 8 + (3 * 2)
Cyclomatic Complexity = 7

OK. Now, let’s try to reduce this number by refactoring the code. A good strategy is to reduce the number of nested ifs, and we could achieve this by the following piece of code:

post_changes = post.changes || {}
post.save!

if post_changes.present?
  ChangeHistory.create!(entity: 'Post', entity_id: post.id, changes: post_changes)
end
if post_changes[:slug].present?
  Redirect.create!(
    path: post_changes[:slug].first,
    redirect_to: post_changes[:slug].last
  )
end

Just by looking at this code, one could say it is easier to understand and more comfortable to change. Its intention is way more straightforward: save the post, create a ChangeHistory record if post attributes changed and then, if slug has changed, create a redirect rule. Here’s the graph:

Now let’s do the math:

Number of edges (E) = 8
Number of nodes (N) = 8
Nodes with exit points (P) = 2

Applying the formula: N - E + (P * 2)
Cyclomatic Complexity = 8 - 8 + (2 * 2)
Cyclomatic Complexity = 4

As we see, the second piece of code, although it does the same job, is less complicated than the first one. That’s how measurement can help your code to be easier to understand: by knowing where are the most complicated points. So you and your team can work on it until it is acceptable for your rules. Yeah, acceptable is the word, since there is not a magic number. To determine it, you need first to understand your context.

A good practice is: doesn’t matter which number, keep the job on lessening it.

Measuring code smells

Back in history, when we would go to the woods to collect food, and there was no expiration date impressed on it, we had to use our senses. If you didn’t want to get food poisoned, you would smell it first.

That’s the same analogy to the code. It doesn’t necessarily point to a bug or a rotten piece of code. It just raises the smell that something may not be good. It can be an anti-pattern, duplicated code, bad practices, anything that could be an alert for developers.

Static Analysis can find those smells in an automated way. There are open-source linters that do an excellent job looking for them. The metric here is the absolute number of them found in your code. Very easy to calculate.

Measuring security issues

Good code is secure. That’s why it is crucial running an automated tool that looks for SQL injection, data leaks, and other potential security issues that are common in the daily development routine.

Automated tools are not perfect, and some times, they trigger false-positive alerts. However, it is no excuse for not running them. If your codebase has too many false-positive alerts, you totally should start taking these alerts more seriously.

Static analysis software looks for known patterns (like using eval or not sanitizing your data before inserting in the database) and sums them up. That’s the metric. As simple as that.

Conclusion

The concept of quality may vary depending on the company, on the product stage or even by each personal view. To keep or improve the quality of your code, you need to define what is quality for your team and measure it. There are a lot of other metrics available. Keep your quality definition in mind and look for metrics that ensure your code matches it. There already are many tools designed to maximize and engage discussions about quality. Look for them!

In this blog post, I suggested three metrics, all of them (and much more) are present in our solution. SourceLevel runs automatically for every pull request open and keeps the history of the master branch so you can raise questions about your process looking to data, not opinions.

Sign up for free and enjoy our trial for 14 days!

What is a linter and why your team should use it?

May 31, 2019 in Code Review

More and more teams have adopted the usage of linters and other static tools in their development process. Some integrated them in the IDE of their preference, others automated by running them as an additional step in their CI. Also, there are those who run both ways.

What ‘s a linter, then?

According to Wikipedia, linter

is a tool that analyzes source code to flag programming errors, bugs, stylistic errors, and suspicious constructs.

The first linter was written by Stephen C. Johnson in 1978 while working in the Unix operating system at Bell Labs. After that, many other linters have been written for different purposes and languages, not only C.

First linters used to check the source code and find potential optimizations for compilers. But, over the years, many other checks and analysis would be included in the process of linting:

The usage of linters has also helped many developers to write better code for not compiled programming languages. As there is not compiling time errors, finding typos, syntax errors, uses of undeclared variables, calls to undefined or deprecated functions, for instance, helping developers to fix it faster and reduce bugs prior to execution.

Linters have evolved

Linters have evolved. They started with those simple checks, but nowadays they are getting more and more sophisticated. They perform Static Analysis, enforce configuration flags, check for compliance with a given style-guide or security rule, and a lot more.

Let’s explore some of these checks and how they can be useful for you.

Static Analysis

Static Analysis means that automated software will run through your code source without executing it. It statically checks for potential bugs, memory leaks, and any other check that may be useful.

If you’re a Python developer, you may already know Radon. It can count the source lines of code (SLOC), comment lines, a blank line, and other raw metrics, but also, it can calculate a “Maintainability Index”, which may be very important in some projects.

That’s just an example. There are plenty of other linters that perform Static Analysis checks.

Code standardizing

From https://xkcd.com/1285/

Standardizing your code is a great way to move the conversation to a more productive level. Having a guideline and running linters against the codebase will avoid aesthetical changes in your pull request, like replacing all tabs for spaces, indenting a variable assignment or even line breaks after a given number of characters.

Maximizing meaningful changes will take your discussion to topics that really matter, like architectural decisions, security issues, or potential bugs.

By the way, security issues and potential bugs also can be avoided by linters!

Security Issues

If you’re into Rails, you have probably heard about Brakeman. It’s a Static Analysis Security Tool. It’s very handy to find potential security issues. For instance, it runs checks looking for SQL Injections when using ActiveRecord #find_or_create_by and friends. It also adds checks for XSS, config options and much more.

Ruby is not the only language with this kind of engine. Ebert supports more than 40 engines for different languages. Brakeman included.

Potential bugs

JavaScript has some tricks. One of the most famous is the difference between == and ===. It’s a good practice, and it avoids a lot of debugging time, to always use ===. If you enable, for instance, ESLint to check for that, it can tell you what part of your code is using == and even replace it for you.

Performance improvements

Every experienced developer knows not only the importance of performing software but a lot of tricks that improve it. The problem is: what about newcomers? How can you pass this knowledge forward? Even senior programmers can miss a technique or two. So, why not let an automation software do it for you?

Did you know that in CSS the universal selector (*) may slow down a page loading time? Or that unqualified attribute selectors have the same performance characteristics as the universal selector? Avoiding them is good practice.

Many linters include performance check. They can add different kinds of performance improvements for experienced and newcomers developers. CSSLint is just an example.

And many other aspects

To infinity and beyond! There are lots and lots of linters for different programming languages, configuration files, and even for software integrations. Any check that matters and can be automated may turn into a linter.

If you work in a very specific scenario you may have to write your own, but that’s not too likely. Checks for HTML Accessibility features, Internalization potential errors, grammar errors, and many others are already there, open-sourced, waiting for you to download, configure and start using.

Benefits of linting

According to Ferit T., linting improves readability, removes silly errors before execution and code review. But, as mentioned, linting may do more complex jobs, like detecting code smells or performing static analysis of your codebase.

But, in practice, what are the advantages of linting?

It improves code review discussion level

If your Pull Request has no typos, nor unused variables, and is compliant with the style guide, the conversation tends to be focused in the architecture point of view. That’s it. Pull Request is a great place to point performance issues, securities vulnerabilities or suggesting better abstractions. You don’t need to get in that single or double quotes or tab vs spaces discussion.

This is a productivity gain for sure.

Makes code look like written by a single person

To the medium and long term having a solid code base that looks like written by the same person is good. Maintainability and evolution are easier because everyone tends to understand what’s written faster and clearer. It prevents bugs, makes the job more joyful for developers and accelerates the time to market of new features.

Gives visibility of your codebase health

Is your code healthy? You won’t know until you measure it. A good way of doing so is to adding a step in your CI/CD pipeline to measure the evolution of your code health status. Better than that, you can take actions as soon as possible when you see its health to decay. Such actions may imply on creating technical debt cards in your board or even raise the issue during your agile retrospective or architecture committee meeting.

Spreads awareness and ownership over code quality

Experienced developers can look into a couple of files and tell how easy is a software to change. They probably know where to look: are variables names descriptives? how many lines does a method take? Is there any superclass?

Well, not every employee will have this kind of look over the code. Especially newcomers. Having a linter to tell developers where code smells are is a good way to spread this knowledge and make the team responsible for the changes.

Conclusion

Linters will help you get more productive and save you time and money. They will drive your team to better decisions (those oriented by data) and share ownership over the quality.

SourceLevel allows you to configure almost 30 different linters to automatically run for each pull request opened by your team. It also provides you a nice dashboard with charts showing the results throughout your product’s lifecycle.

Interested? Check all supported linters and give it a try for 14 days!