3 important things I overlooked during code reviews

Code review is a popular practice in software development. By incorporating human reviews before merging code into the main branch, it facilitates knowledge sharing among team members and improves software quality.

As a reviewer, I have participated in many code reviews. Certain aspects have always stood out to me, such as whether the design considers edge cases and whether the code has adequate unit test coverage. However, there were other seemingly trivial aspects that I did not pay enough attention to until recently, when I began to realize their importance.

Here are three important things that I had previously overlooked.

1. Naming

A young girl, Chihiro, enters a bathhouse run by the witch Yubaba and designed for the gods. To work at the bathhouse, Chihiro signs a contract with Yubaba, but the contract is not the key point. Instead, it is a seemingly trivial action - Yubaba renames Chihiro from "千寻" to "千". Once Chihiro loses her original name, she also loses the ability to escape the bathhouse's otherworldly realm, and is condemned to eternal servitude under Yubaba.

-- The movie "Spirited Away"

Programmers' focus on "naming" seems to follow an "inverted U-shaped" curve. When they lack experience, their attention to naming is very low, resulting in code filled with inaccurate and imprecise names that fail to effectively describe various abstract concepts.

The following code snippet illustrates several naming problems:

def get_var(u):
    """Get the environment variables."""
    data1 = UserVarsManager.get(u)
    data2 = SiteVarsManager.get(u.site)
    return data1 + data2

With more experience, more attention is paid to naming conventions in projects. Names are becoming more descriptive, and vague names are gradually disappearing. At the very least, names should no longer be a barrier to understanding the code.

At this stage, the code evolves to look something like this:

def list_environment_vars(user):
    """Get the environment variables."""
    items_user = UserVarsManager.get(user)
    items_site = SiteVarsManager.get(user.site)
    return items_user + items_site

In the vast majority of reviews, this would definitely qualify as satisfactory code, unlikely to cause disputes over naming issues.

After this point, most programmers' focus on naming follows a "Inverted U-shaped" curve: they no longer pay as much attention to naming as before, satisfied as long as the names are somewhat descriptive and unambiguous. I was one of them.

However, we should not stay at this stage for too long. As code reviewers, we should continually hone our sensitivity to naming. For example, for the code mentioned earlier, the following review suggestions might be considered:

def list_environment_vars(user): # 1
    """Get the environment variables."""
    items_user = UserVarsManager.get(user) # 2
    items_site = SiteVarsManager.get(user.site)
    return items_user + items_site
  • Comment 1: The project uses a standard abbreviation for "environment variables" as env_variables / env_vars, and should be consistent here, using list_env_variables or list_env_vars.
  • Comment 2: The naming of UserVarsManager.get can be optimized, as Manager is a "catch-all" term that fits in various contexts, but sacrifices the precision of the name (equivalent to "responsibility"). Consider using a more precise name, such as UserEnvVariables.get(user); the same goes for `SiteVarsManager'.

Although these are just two minor improvements, many small improvements can add up to a significant effect.

Every code review inevitably involves many new names. However, not all names are created equal; not all names deserve our time and focus. It's best to focus on the most commonly used names that are closest to the user, such as resource names in URL paths, database models and field names, utility function (class method) names, and so on.

In addition, domain-specific vocabulary that is directly related to the business is extremely important. During reviews, each key domain term deserves careful consideration and repeated scrutiny. For example, if you are developing a movie review feature, what should be the names for "user rating," "media rating," and "average rating"? You certainly wouldn't want to see movie_score in one file and movie_rating in another.

Though naming may seem trivial, the larger the project and the longer its duration, the more likely minor differences in name quality can accumulate into a significantly impactful outcome.

2. Guide comments

“Charlotte had written the word RADIANT, and Wilbur really looked radiant as he stood in the golden sunlight. Ever since the spider had befriended him, he had done his best to live up to his reputation. When Charlotte's web said SOME PIG, Wilbur had tried hard to look like some pig. When Charlotte's web said TERRIFIC, Wilbur had tried to look terrific. And now that the web said RADIANT, he did everything possible to make himself glow.”

-- "Charlotte's Web" by E.B.White

Regarding comments, I have always believed Uncle Bob's view in Clean Code: "The proper use of comments is to compensate for our failure to express ourself in code." That is, good code should always convey its intent clearly, without the need for redundant comments. Comments should only be used to describe information outside the code, such as explaining "why".

Therefore, comments should always be used sparingly. If a piece of code is hard to understand, the first reaction should not be to add comments, but to try to rewrite it in a more understandable way.

Over time, however, I have come to realize that things are not so black and white. "Guide comments", or what some might criticize as "almost restating the intent of the code," also play an irreplaceable role.

The creator of Redis, antirez, is a strong supporter of guide comments. He once wrote an article detailing the role of guide comments in the Redis project. The following code snippet from the Redis source code contains several "guide comments":

    /* Call the node callback if any, and replace the node pointer
     * if the callback returns true. */
    if (it->node_cb && it->node_cb(&it->node))
        memcpy(cp,&it->node,sizeof(it->node));

    /* For "next" step, stop every time we find a key along the
     * way, since the key is lexicographically smaller compared to
     * what follows in the sub-children. */
    if (it->node->iskey) {
        it->data = raxGetData(it->node);

        return 1;
    }

In this snippet, the two comments add no new information beyond the code itself. So what's the benefit?

The most direct benefit is that these comments make the code easier to understand, significantly reducing the mental effort required to read the code. The same piece of code without directive comments might take 10 minutes to fully understand, but with comments, that time might be reduced to 5 minutes or even less.

When new developers join the project, these directive comments can help them get up to speed more quickly.

For this reason, when reviewing a piece of code, I often comment on a segment of complex logic: "Nit: Consider adding a brief directive comment to clarify the code's behavior."(Nit=nitpick, indicating a suggestion for improvement that is not a strong demand).

In addition, if a section of code has sparked some in-depth discussions during the review process, those discussions may be well suited to be reprocessed and added into the code as guide comments. Sometimes they work wonders for understanding the code.

However, in pursuing "directive comments," it's also important to avoid the following pitfalls:

  • Comment restate code: While guide comments serve as an aid to understanding the code, they should not simply restate the code. Simply put, consider the stylistic difference in the information conveyed: if the code is a dense, authoritative science textbook, then the guide comments are like a concise science primer aimed at middle school students.
  • Obsession with comment ratio: Do not set rigid targets for "code comment ratio". The quality of comments matters, and blindly chasing quantity can be counterproductive.
  • Outdated comment: Outdated comments can be more harmful than the code itself. Be sure to modify or delete outdated comments in a timely manner.

In summary, you can think of directional comments as targeted "teaching text" for the code. When reviewing code, if you find a piece of logic difficult to understand and there is little room for optimization in the code itself, do not hesitate to express your need for this "teaching text"!

3. Ways to communicate

“I suffered so much over Ruth and Sarah leaving us. And I suffered all the more because I believed I was alone."

"Really, Miss Kenton …" I picked up the tray on which I had gathered together the used crockery. "Naturally, one disapproved of the dismissals. One would have thought that quite self-evident.”

-- "The Remains of the Day" by Kazuo Ishiguro

To this day, many still view software development as a solitary endeavor. A programmer, armed only with a keyboard, can seemingly produce endless streams of code without the need for anyone else. However, the era of the lone programmer is long gone. Modern software development has evolved into a collaborative affair involving multiple participants. As such, a programmer's day is filled with various forms of communication, including participation in code reviews.

During code reviews, the reviewer's task might seem simple and straightforward: point out defects in other people's code. Sounds easy, right? I once thought the same-that a review could be neatly summarized by the following "123" steps:

  1. Identify all the points that can be optimized (Stick to the facts!)
  2. Wait for the submitter to make the changes, or confirm to keep the original after discussion (Discuss the matter at hand!)
  3. Merge the code ( Mission accomplished!)

Yet reality often falls short of the ideal. Code reviews rarely go as smoothly as described above. Because once you get into interpersonal communication, especially when one party is pointing out flaws in the work of another, how can things remain simple?

Humans are a fascinating, intelligent species; when reading a text, they can not only extract information, but also sense the emotions embedded in the lines. Sometimes these emotions can overshadow the information itself and influence their judgments. Therefore, when participating in reviews, remember this: stay humble, respect others, regardless of their experience or background. Excellent expression ensures that even in criticism, the other party feels respected.

Let me give you an example. A newcomer to the team submitted a PR using Python, a language he was not very familiar with. As the reviewer, you noticed a lengthy loop in the code and wrote the following comment:

The code is too verbose, use a list comprehension instead.

While your point is valid, the way it's expressed could be reconsidered. Here’s an alternative version of this comment:

This loop only involves filtering and transformation, which makes it a good candidate for a list comprehension to simplify the code. Here's an example:

items = [to_item(obj) for obj in objs if obj.is_valid()]

Ref: https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions

Compared to the first comment, the latter is less likely to provoke a defensive reaction from the newcomer and is more likely to be accepted. This reflects the old adage, "The way you express something is as important, if not more important, than what you express."

In addition to being humble and respectful, here are some other worthwhile communication tips for reviews:

  • An example is worth a thousand words: Instead of writing a long explanation, it's sometimes more effective to write a few lines of code that provide a practical example.
  • Tailor your language to your audience: You should (and can) treat developers who have been on the team for a month differently from those who have been there for a year; with inexperienced newcomers, be cautious with your language to avoid making them feel disrespected or overly frustrated; with seasoned colleagues, you can afford to be more straightforward and concise, without being overly verbose.

I believe that most people fundamentally agree that code reviews should focus on the code, not the person, and that criticism of the code should not be seen as a personal attack. However, advocating for better communication does not conflict with this. If the communication makes both parties feel satisfied, it's more likely to be effective. Therefore, improving the way we communicate can enhance work efficiency. Why not embrace it?

Summary

Code reviews, as a vital means of ensuring software quality, are an indispensable part of large-scale software development. This article summarizes my reflections as a review participant in the areas of naming, guide comments, and communication styles, with the following key points:

  • Programmers should continually improve their sensitivity to naming during code reviews.
  • Two techniques for checking names: maintaining consistency with similar nouns, and replacing "one-size-fits-all" names with more precise terms.
  • Treat names differently, paying extra attention to the most important ones.
  • For any project, domain-specific (business-related) names are critical and deserve careful consideration and scrutiny.
  • Guide comments, while not providing much information beyond the code, play an irreplaceable role.
  • Guide comments greatly reduce the mental effort required to understand the code.
  • Be aware of several pitfalls with guide comments: merely repeating the code, pursuing a "comment ratio," and neglecting outdated comments.
  • During review, be proactive in requesting additional guide comments for complex code logic.
  • The ideal review is straightforward and focused on the issue at hand, but because it involves interpersonal communication, the reality often differs from the ideal.
  • Text not only conveys information, it also carries emotions, which can affect the effectiveness of communication.
  • Always stay humble and respectful during reviews, regardless of the other person's experience or background.
  • Adopt different communication styles with reviewers who have different levels of experience.

Code review is a complex, multi-participant, nuanced task. High-quality reviews can significantly improve both quality and team atmosphere, while poor-quality reviews can degenerate into formalism or even cause discord and dissatisfaction within the team.

The factors that affect the quality of reviews are often hidden in seemingly insignificant details and minor issues. I hope my reflections on the "little things" may inspire your work.