A few notes on the code review process, and the steps I go through when reviewing.
I previously jotted down some principles for code reviews, but did not discuss what actually happens in a code review.
While there are plenty of other code review checklists (here's one from Fog Creek, for example), I prefer one that is prioritized, since that tends to reflect the thought process I go through as I review.
So here's what I look for in code reviews, in order. Note that I mostly review Python code, so there are no steps for checking compiler output etc :)
1. Requirements
The first thing I consider is the requirements level.
Clearly it is better to check that we actually want to build a particular feature before implementation starts, but that doesn't always happen. My first step, therefore, is to take a step back and think through:
- Is this implementation solving a problem which needs solving? Some "problems" are the classic feature-not-a-bug cases, where the existing behaviour prevents something or serves some other purpose. Most of software engineering involves making tradeoffs, so let's check that we actually want to make this particular tradeoff.
- Is this problem best solved through code? For example, is this implementation adding a lot of complexity to serve only a small number of users, when the problem could be addressed manually or through an external process? Alternatively, for code paths which rely on integrating with third parties, perhaps a simple phone call to the third party could solve the problem.
- Are there any missing requirements? Try to run through scenarios which the requirements writer may not have envisioned, perhaps an attack vector for a security hole, unmentioned error states, and so on.
2. System Design
Next, I look at how this implementation fits into the existing system design, and whether it is the best approach to solving the problem. I'm looking at:
- Is this being implemented with the correct approach? eg synchronous execution vs a job queue, a cron job, or a manual trigger
- Is this being implemented in the right place within the existing codebase? ie checking if there is a more suitable module or class for this code to be in.
- Are any existing classes / methods being reused intelligently? Adhering to the principle of "Don't Repeat Yourself" means reducing duplicated code.
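To make the "Don't Repeat Yourself" point concrete, here is a minimal, hypothetical sketch (the function and field names are my own invention, not from any real codebase): two report functions that would otherwise each contain their own copy of the formatting logic instead share a single helper.

```python
# Hypothetical example: both reports reuse one formatting helper rather
# than each duplicating the same string-building logic.
def format_order_summary(order):
    """Single shared implementation of the summary format."""
    return f"{order['id']}: {order['total']:.2f} ({order['status']})"

def daily_report(orders):
    # Reuses the helper instead of re-implementing the format.
    return [format_order_summary(o) for o in orders]

def email_report(orders):
    # Same helper again - a format change now happens in one place.
    return "\n".join(format_order_summary(o) for o in orders)
```

The payoff shows up in review: when the summary format changes, the diff touches one function, not every report.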
3. Implementation Details
Now we get to the heart of the code review, where I look closely at the code itself. Elements I'm looking for include:
- Does the implementation meet the requirements? At its most basic level, check that the code solves the problem identified in the requirements.
- Have all user experience scenarios been considered? eg error messages, empty "no data" states, logged-out users, etc.
- Is the user experience acceptable on both mobile and desktop? Note: I don't normally check all browsers.
- Are classes, methods, objects and variables logically named? I check this to keep code readability high, which helps with maintainability throughout the life of the codebase. In particular, I check that names are consistent with any existing usage, do not clash with language-specific names, and are distinct enough that searching for a particular keyword in the codebase does not return thousands of results.
- Are classes and methods sufficiently concise? I find that shorter classes and methods are less prone to error now and in the future, and I'm wary of any methods which extend past a few hundred lines. In these cases, my code review comments will be about decomposing the large class / method into smaller helpers or functions.
- Is there a logical hierarchy? Both at the package/class level and the class/method level, I'm looking for indicators that one class / method is either called first (entry points), is more foundational (base classes), or more "important" than another. This should be reflected in the placement (eg top-level modules vs subpackages in Python) and in the naming (eg use of "_" prefix to indicate a "private" method in Python).
- (extensibility) Are the correct values set as constants, parameters, or config values rather than hardcoded? ie build in some allowance for expected changes
- (readability) Are there sufficient comments and samples / fixtures for larger data structures?
- (cleanup) Are there any unused variables / constants / methods which can be deleted? Self-explanatory.
4. Testing, Monitoring, and Security
Assuming everything listed above is passable, I'm now mentally getting this code ready for production use.
- Are there sufficient unit tests? I'd love to say we always have 100% code coverage but realistically I aim for any critical, larger, or complex methods to be covered.
- Do the tests cover all reasonable scenarios / codepaths? A useful practice has been to think through the possible scenarios a user may hit when encountering this feature (ie with incomplete data, invalid data, etc) and look for tests covering them.
- Are there functional tests? Definitely useful for critical user flows.
- Is there sufficient logging? We send logging calls through a number of different channels and log levels, so I want to ensure that somebody gets an SMS or email only when it is sufficiently important, but that we still log all exceptions / error cases.
- Are we protecting against malicious user input? eg SQL injection, etc
- Are we exposing any filesystem vulnerabilities?
- Are any configuration values encrypted correctly? Same with environment variables, access tokens, client secrets, etc.
- Are there any additional steps for deployment? Most deployment steps are already automated, but perhaps we need to notify other business users or partners, update documentation, and so on.
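The scenario-based testing habit above can be sketched with a small, hypothetical helper: pick one function and check the happy path, the empty "no data" state, and invalid data. (The `parse_age` function and its rules are invented for illustration; real checks would live in a test suite such as pytest.)

```python
# Hypothetical example: covering the reasonable scenarios for one function.
def parse_age(raw):
    """Return an int age, or None for missing / invalid input."""
    if raw is None or raw == "":
        return None          # empty "no data" state
    try:
        age = int(raw)
    except ValueError:
        return None          # invalid data
    return age if 0 <= age <= 150 else None  # out-of-range data

# Scenario-style checks, one per codepath:
assert parse_age("42") == 42      # happy path
assert parse_age("") is None      # incomplete data
assert parse_age("abc") is None   # invalid data
assert parse_age("-5") is None    # out of range
```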
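On protecting against user input: the standard defence against SQL injection is parameterized queries, which keep user-supplied values out of the SQL text entirely. A minimal sketch using Python's built-in sqlite3 module (the table and function names are invented for illustration):

```python
import sqlite3

def find_user(conn, username):
    # The "?" placeholder lets the driver bind the value safely;
    # never build the query via string formatting on user input.
    cur = conn.execute("SELECT id FROM users WHERE name = ?", (username,))
    return cur.fetchone()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

# A classic injection payload is treated as a literal string, not as SQL:
assert find_user(conn, "alice") == (1,)
assert find_user(conn, "alice' OR '1'='1") is None
```

In review, any query assembled with f-strings or `%` formatting on user input is an immediate flag, even if it "works".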
5. Future Considerations
- Are there any scenarios where the implementation falls short? Highlight them or create tickets.
- Does this implementation imply any future opportunities or problems? ie for product requirements, user experience, business processes, etc