Migration to Google Cloud Platform — gRPC & grpc-gateway

Our guest post today comes from Miguel Mendez of Yik Yak.

This post was originally a part of the Yik Yak Engineering Blog which focused on sharing the lessons learned as we evolved Yik Yak from early-stage startup code running in Amazon Web Services to an eventual incremental rewrite, re-architecture, and live-migration to Google Cloud Platform.

In our previous blog post we gave an overview of our migration to Google Cloud Platform from Amazon Web Services. In this post we will drill down into the role that gRPC and grpc-gateway played in that migration and share some lessons which we picked up along the way.


Building gRPC services with bazel and rules_protobuf

gRPC makes it easier to build high-performance microservices by providing generated service entrypoints in a variety of different languages. Bazel complements these efforts with a capable and fast polyglot build environment.

rules_protobuf extends bazel and makes it easier develop gRPC services.


gRPC at VSCO

Our guest post today comes from Robert Sayre and Melinda Lu of VSCO.

Founded in 2011, VSCO is a community for expression—empowering people to create, discover and connect through images and words. VSCO is in the process of migrating their stack to gRPC.


Why we have decided to move our APIs to gRPC

Our guest post today comes from Dale Hopkins, CTO of Vendasta.

Vendasta started out 8 years ago as a point solution provider of products for small business. From the beginning we partnered with media companies and agencies who have armies of salespeople and existing relationships with those businesses to sell our software. It is estimated that over 30 million small businesses exist in the United States alone, so scalability of our SaaS solution was considered one of our top concerns from the beginning and it was the reason we started with Google App Engine and Datastore. This solution worked really well for us as our system scaled from hundreds to hundreds of thousands of end users. We also scaled our offering from a point solution to an entire platform with multiple products and the tools for partners to manage their sales of those products during this time.

All throughout this journey Python GAE served our needs well. We exposed a number of APIs via HTTP + JSON for our partners to automate tasks and integrate their other systems with our products and platform. However, in 2016 we introduced the Vendasta Marketplace. This marked a major change to our offering, which depended heavily on having 3rd party vendors use our APIs to deliver their own products in our platform. This was a major change because our public APIs provide an upper-bound on 3rd-party applications, and made us realize that we really needed to make APIs that were amazing, not just good.

The first optimization that we started with was to use the Go programming language to build endpoints that handled higher throughput with lower latency than we could get with Python. On some APIs this made an incredible difference: we saw 50th percentile response times to drop from 1200 ms to 4 ms, and even more spectacularly 99th percentile response times drop from 30,000 ms to 12 ms! On other APIs we saw a much smaller, but still significant difference.

The second optimization we used was to replicate large portions of our Datastore data into ElasticSearch. ElasticSearch is a fundamentally different storage technology to Datastore, and is not a managed service, so it was a big leap for us. But this change allowed us to migrate almost all of our overnight batch-processing APIs to real-time APIs. We had tried BigQuery, but it’s query processing times meant that we couldn’t display things in real time. We had tried cloudSQL, but there was too much data for it to easily scale. We had tried the appengine Search API, but it has limitations with result sets over 10,000. We instead scaled up our ElasticSearch cluster using Google Container Engine and with it’s powerful aggregations and facet processing our needs were easily met. So with these first two solutions in place, we had made meaningful changes to the performance of our APIs.

The last optimization we made was to move our APIs to gRPC. This change was much more extensive than the others as it affected our clients. Like ElasticSearch, it represents a fundamentally different model with differing performance characteristics, but unlike ElasticSearch we found it to be a true superset: all of our usage scenarios were impacted positively by it.

The first benefit we saw from gRPC was the ability to move from publishing APIs and asking developers to integrate with them, to releasing SDKs and asking developers to copy-paste example code written in their language. This represents a really big benefit for people looking to integrate with our products, while not requiring us to hand-roll entire SDKs in the 5+ languages our partners and vendors use. It is important to note that we still write light wrappers over the generated gRPC SDKs to make them package-manager friendly, and to provide wrappers over the generated protobuf structures.

The second benefit we saw from gRPC was the ability to break free from the call-and-response architecture necessitated by HTTP + JSON. gRPC is built on top of HTTP/2, which allows for client-side and/or server-side streaming. In our use cases, this means we can lower the time to first display by streaming results as they become ready on the server (server-side streaming). We have also been investigating the potential to offer very flexible create endpoints that easily support bulk ingestion with bi-directional streaming, this would mean we would allow the client to asynchronously stream results, while the server would stream back statuses allowing for easy checkpoint operations while not slowing upload speeds to wait for confirmations. We feel that we are just starting to see the benefits from this feature as it opens up a totally new model for client-server interactions that just wasn’t possible with HTTP.

The third benefit was the switch from JSON to protocol buffers, which works very well with gRPC. This improves serialization and deserialization times; which is very significant to some of our APIs, but appreciated on all of them. The more important benefit comes from the explicit format specification of proto, meaning that clients receive typed objects rather than free-form JSON. Because of this, our clients can reap the benefits of auto-completion in their IDEs, type-safety if their language supports it, and enforced compatibility between clients and servers with differing versions.

The final benefit of gRPC was our ability to quickly spec endpoints. The proto format for both data and service definition greatly simplifies defining new endpoints and finally allows the succinct definition of endpoint contracts. This means we are much better able to communicate endpoint specifications between our development teams. gRPC means that for the first time at our company we are able to simultaneously develop the client and the server side of our APIs! This means our latency to produce new APIs with the accompanying SDKs has dropped dramatically. Combined with code generation, it allows us to truly develop clients and servers in parallel.

Our experience with gRPC has been positive, even though it does not eliminate the difficulty of providing endpoints to partners and vendors, and address all of our performance issues. However, it does make improvements to our endpoint performance, integration with those endpoints, and even in delivery of SDKs.


gRPC Project is now 1.0 and ready for production deployments

Today, the gRPC project has reached a significant milestone with its 1.0 release. Languages moving to 1.0 include C++, Java, Go, Node, Ruby, Python and C# across Linux, Windows, and Mac. Objective-C and Android Java support on iOS and Android is also moving to 1.0. The 1.0 release means that the core protocol and API surface are now stable with measured performance, stress tested and developers can rely on these APIs and deploy in production, they will follow semantic versioning from here.

We are very excited about the progress we have made so far and would like to thank all our users and contributors. First announced in March 2015 with Square, gRPC is already being used in many open source projects like etcd from CoreOS, containerd from Docker, cockroachdb from Cockroach Labs, and by many other companies like Vendasta, Netflix, YikYak and Carbon 3d. Outside of microservices, telecom giants like Cisco, Juniper, Arista, and Ciena, are building support for streaming telemetry and network configuration from their network devices using gRPC, as part of OpenConfig effort.

From the beta release, we have made significant strides in the areas of usability, interoperability, and performance measurement on the road to 1.0. In most of the languages, the installation of the gRPC runtime as well as setup of a development environment is a single command. Beyond installation, we have set up automated tests for gRPC across languages and RPC types in order to stress test our APIs and ensure interoperability. There is now a performance dashboard available in the open to see latency and throughput for unary and streaming ping pong for various languages. Other measurements have shown significant gains from using gRPC/Protobuf instead of HTTP/JSON such as in CoreOS blogpost and in Google Cloud PubSub testing. In the coming months, we will invest a lot more in performance tuning.

Even within Google, we have seen Google cloud APIs like BigTable, PubSub, Speech, launch of a gRPC-based API surface leading to ease of use and performance benefits. Products like Tensorflow have effectively used gRPC for inter-process communication as well. Beyond usage, we are keen to see the contributor community grow with gRPC. We are already starting to see contributions around gRPC in meaningful ways in the grpc-ecosystem organization. We are very happy to see projects like grpc-gateway to enable users to serve REST clients with gRPC based services, Polyglot to have a CLI for gRPC, Prometheus monitoring of gRPC Services and work with OpenTracing. You can suggest and contribute projects to this organization here. We look forward to working with the community to take the gRPC project to new heights.