Reviewpad: Living above Bitbucket, GitHub, and GitLab
There are many pains associated with code reviews, from GitHub’s web interface that only displays a textual diff between files, to the inadequate processes that cause you to review highly complex changes.
Reviewpad is a new code review tool with a single purpose: to help developers by simplifying the process. This must happen regardless of the code host, the number of changes, the maturity of the codebase, or the expertise of the reviewer. We wholeheartedly believe in this mission because code reviews are too important to be neglected.
In this post, we will describe Reviewpad’s key design decisions, its architecture, and some lessons learned while coexisting with GitHub, GitLab, and Bitbucket.
Key Design Features
Reviewpad is a web code review tool for git repositories.
It provides seamless integration with code hosts such as Bitbucket, GitHub, and GitLab (including their self-hosted versions) with a dedicated interface for code reviews featuring new code visualisation methods and static code analysis technology.
Reviewpad was developed with the following key design features:
- On-premises deployments. You can set up Reviewpad in a single developer environment, on a server for dozens of developers to use, or even scale it with a Kubernetes cluster for hundreds of developers. You can monitor all of Reviewpad's requests for maximum security.
- Zero configuration required. You don't need to configure your team members or review settings. Reviewpad mirrors the permission system of your code hosts, so users can only access the information they could already access there. The same applies to actions: you can't merge a pull request on Reviewpad unless you can do it on the code host.
- Smooth team adoption. Reviewpad integrates seamlessly with code hosts. That means that in a team of 10 developers, you don't need all 10 developers to use Reviewpad in their review process. Because all the actions in Reviewpad are propagated back and forth between code hosts, a subset of the team can safely use it for reviews without breaking the existing review process.
- Security first. Reviewpad has been designed to ease the security concerns that come with any tool that handles sensitive and proprietary information such as code, comments, and personal information. A running instance of Reviewpad does not send any information to Reviewpad.com servers or to any undisclosed third party.
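The permission mirroring described in the list above can be pictured with a minimal sketch. Everything here is hypothetical and only illustrates the idea: `FakeCodeHost`, `allowed_actions`, and `can_perform` are invented names, and a real deployment would ask the code host over its API rather than read an in-memory table.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCodeHost:
    """Hypothetical stand-in for a code host (no network calls)."""
    # Maps (user, repository) to the set of actions the host allows.
    permissions: dict = field(default_factory=dict)

    def allowed_actions(self, user: str, repo: str) -> set:
        # Unknown users or repositories get no permissions at all.
        return self.permissions.get((user, repo), set())


def can_perform(host: FakeCodeHost, user: str, repo: str, action: str) -> bool:
    """Allow an action only if the code host itself would allow it."""
    return action in host.allowed_actions(user, repo)


host = FakeCodeHost({
    ("alice", "org/app"): {"read", "comment", "merge"},
    ("bob", "org/app"): {"read", "comment"},
})
```

With this table, `can_perform(host, "alice", "org/app", "merge")` is true while the same call for `"bob"` is false, because the decision is delegated entirely to what the host reports.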
Complete abstraction from code hosts
So what is Reviewpad? Reviewpad can be seen as an abstraction layer on top of existing git-based code hosts.
Figure: Reviewpad Conceptual Overview
The integration with a code host is bidirectional. Reviewpad users connect to a code host through personal access tokens or via OAuth apps. Actions on Reviewpad (e.g. submitting a user review) are then sent to the code host using its REST or GraphQL API. Actions on the code host are propagated back to Reviewpad through webhooks.
To achieve this in a modular and scalable way, we needed to overcome two major technical challenges:
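As a rough illustration of this two-way flow, the sketch below models one outbound path (a local review pushed to the host) and one inbound path (a webhook event applied locally). All names (`FakeHostClient`, `ReviewStore`, `on_local_review`, `on_webhook`) and the event shape are assumptions made for the example; they are not Reviewpad's code or any code host's actual payload format.

```python
from dataclasses import dataclass, field


@dataclass
class FakeHostClient:
    """Records outgoing calls instead of performing real API requests."""
    sent: list = field(default_factory=list)

    def submit_review(self, pr_id: int, body: str) -> None:
        self.sent.append(("submit_review", pr_id, body))


@dataclass
class ReviewStore:
    """Local copy of the review state, keyed by pull request id."""
    reviews: dict = field(default_factory=dict)

    def add_review(self, pr_id: int, body: str) -> None:
        self.reviews.setdefault(pr_id, []).append(body)


def on_local_review(store: ReviewStore, client: FakeHostClient,
                    pr_id: int, body: str) -> None:
    """Outbound path: store the review locally, then push it to the host."""
    store.add_review(pr_id, body)
    client.submit_review(pr_id, body)


def on_webhook(store: ReviewStore, event: dict) -> bool:
    """Inbound path: apply a (hypothetical) webhook event to the local copy."""
    if event.get("type") == "review_submitted":
        store.add_review(event["pr_id"], event["body"])
        return True
    return False  # Event types this sketch does not model are ignored.
```

A real integration would probably also have to recognise the webhook that the host sends back for a review the tool itself submitted, so that it is not stored twice; the sketch leaves that out.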
- Converting data from git repositories and code reviews from code hosts into a single internal data representation.
- Unifying the code hosts’ APIs so that Reviewpad core services can be code host agnostic.
The first challenge was much harder than we anticipated. We weren’t expecting so many differences between code hosts.
Here’s an example: when it comes to commenting features, GitHub and Bitbucket Cloud behave very differently. The concept of a user review, where you aggregate a set of comments into a single review, doesn’t exist on Bitbucket; instead, a notification is sent to the PR author for each review comment. Another surprise was the API differences between the SaaS and the enterprise versions of a code host.
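To make the first challenge concrete, here is a minimal sketch of how grouped reviews and standalone comments could be mapped onto one internal model. The `Review` and `Comment` types, the two converter functions, and the input dictionaries are all invented for illustration; they do not reflect Reviewpad's internal representation or the real payloads of any code host.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comment:
    author: str
    path: str
    line: int
    body: str


@dataclass(frozen=True)
class Review:
    author: str
    comments: tuple  # tuple of Comment


def from_grouped_host(raw_review: dict) -> list:
    """Host with native reviews: one raw review becomes one Review."""
    author = raw_review["author"]
    comments = tuple(
        Comment(author, c["path"], c["line"], c["body"])
        for c in raw_review["comments"]
    )
    return [Review(author, comments)]


def from_ungrouped_host(raw_comments: list) -> list:
    """Host without reviews: each comment becomes a single-comment Review."""
    return [
        Review(c["author"],
               (Comment(c["author"], c["path"], c["line"], c["body"]),))
        for c in raw_comments
    ]
```

Both converters return a list of `Review` objects, so code further down the pipeline never needs to know which kind of host the data came from.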
To overcome our second challenge, we drew on our experience designing program analysers. Much as LLVM IR (low-level virtual machine intermediate representation) serves as a common target when translating high-level programming languages, we designed an intermediate API layer that unifies the code hosts’ APIs. When you design an IR for multiple source languages, the first main decision is whether to take the intersection or the union of the language constructs. In our case, we chose a fine-tuned intersection in which we support the best of multiple code hosts.
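The idea of an intermediate API layer can be sketched as a small adapter interface. This is only an illustration under stated assumptions: `CodeHostAdapter`, `GroupedReviewAdapter`, `PerCommentAdapter`, and `publish` are hypothetical names, and the "requests" are plain dictionaries rather than calls to any real code host API.

```python
from abc import ABC, abstractmethod


class CodeHostAdapter(ABC):
    """Hypothetical host-agnostic interface that core services call."""

    @abstractmethod
    def submit_review(self, pr_id: int, comments: list) -> list:
        """Return the host-level requests needed, as plain dictionaries."""


class GroupedReviewAdapter(CodeHostAdapter):
    """For a host with native reviews: one request carries all comments."""

    def submit_review(self, pr_id: int, comments: list) -> list:
        return [{"op": "create_review", "pr": pr_id,
                 "comments": list(comments)}]


class PerCommentAdapter(CodeHostAdapter):
    """For a host without reviews: one request per comment."""

    def submit_review(self, pr_id: int, comments: list) -> list:
        return [{"op": "create_comment", "pr": pr_id, "comment": c}
                for c in comments]


def publish(adapter: CodeHostAdapter, pr_id: int, comments: list) -> int:
    """Core-service code: depends on the interface only, not on the host."""
    requests = adapter.submit_review(pr_id, comments)
    return len(requests)
```

Here `publish` is written once against the interface; submitting three comments yields a single request through `GroupedReviewAdapter` and three through `PerCommentAdapter`, which is where the host differences are absorbed.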
Reviewpad is composed of two main components:
- Front-end service: React application that implements the review interface. This is the main service that users interact with in their browser. At the network level, the front-end service communicates only with the back-end service.
- Back-end service: Microservices application that exposes a REST API.
The following diagram represents an overview of Reviewpad's architecture.
Figure: Reviewpad Architecture Overview
As we are effectively importing the metadata from the code hosts and also performing analysis over the code, we use two permanent storage mechanisms:
- PostgreSQL database that stores code host metadata, analysis results, and internal information.
- Docker volume to store git objects.
Reviewpad also comes with a notification system that sends a Slack message or an email, similar to the integrations you can already set up with existing code hosts.
Not depicted in the overview are the services that perform the semantic analysis of the code and the relational analysis of the diff belonging to a pull request.
We have reached a stage where the foundations of Reviewpad allow us to easily implement new features. In the past, we have discussed the need to make code reviews independent of pull requests. We now feel we are in a position to provide the same review experience without them, even when the changes under review span multiple repositories across multiple code hosts.
We will be looking to expand on the possibilities of Reviewpad in the next few posts, so please stay tuned! Next week we will be looking at what happens when a repository is imported.
In the meantime, you should try the beta now.