IO's Arc Reactor is Envoy

Author: Agent IO

Here’s why IO builds on Envoy.

How we got here

Work on IO started after a thorough evaluation of Google’s Service Infrastructure. The evaluation showed that although Google’s API management product built on Service Infrastructure (Cloud Endpoints) seemed powerful and promising, it had a weak link: its Envoy-based proxy. Developed in 2019, Google’s ESPv2 predated some key Envoy innovations and required a custom build of Envoy. A fresh look seemed appropriate, focusing in particular on how a local controller could take advantage of Envoy’s ext_proc API and its many new filters.

What were the alternatives?

Since IO was started to better control Envoy, it’s hard to say that any other proxy was considered as seriously as Envoy. NGINX and Caddy deserve mention, but neither offered the full suite of advantages of Envoy that are listed below. A new proxy could have been written from scratch, but that would have been a big distraction from the developer-centric features that IO emphasizes.

The Advantages of Building on Envoy are Big

Envoy has a great feature set

Envoy supports a huge number of networking features, only a few of which have been used to build IO. As IO matures, Envoy’s catalog of current and future features provides rich opportunities to add capabilities to IO.

Envoy’s gRPC APIs give it great configurability

Envoy’s APIs have a steep learning curve, but once you’ve climbed it, they are clearly the best way to configure Envoy. Strongly-typed configuration data structures eliminate the guesswork of Envoy’s YAML-based configuration which, to be honest, is pretty terrible for usability.

Going further, Envoy’s gRPC APIs are fast, and have proven to be great for communicating with IO.
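
To illustrate the point about typed configuration, here is a minimal Python sketch. The `SocketAddress` and `Listener` dataclasses are hypothetical, simplified stand-ins for Envoy's generated Protocol Buffer messages, not the real API. They only show how a typed structure rejects a misspelled field name at construction time, where an untyped YAML-style mapping carries the typo along until something tries to load it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SocketAddress:
    """Hypothetical, simplified stand-in for an Envoy socket address message."""
    address: str
    port_value: int


@dataclass(frozen=True)
class Listener:
    """Hypothetical, simplified stand-in for an Envoy listener message."""
    name: str
    address: SocketAddress


def typo_is_rejected() -> bool:
    """Return True if building a SocketAddress with a misspelled field fails."""
    try:
        SocketAddress(address="0.0.0.0", port_vaule=8080)  # deliberate typo
    except TypeError:
        return True
    return False


# An untyped mapping accepts the same typo without complaint.
untyped = {"name": "ingress", "address": {"address": "0.0.0.0", "port_vaule": 8080}}

# The typed version has to be spelled correctly to be constructed at all.
typed = Listener(name="ingress", address=SocketAddress("0.0.0.0", 8080))
```

This sketch checks only field names, which is enough to show where the guesswork goes away; it makes no claim about how IO itself builds its configuration messages.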

Envoy is great as an out-of-process component

Envoy is really just a binary with only a few runtime requirements. It needs a “bootstrap” YAML file when it starts (IO generates this), and that bootstrap configuration tells Envoy to open a gRPC connection to IO for control and operating information. If Envoy exits (which, for me, has never happened because of a problem in Envoy itself), IO can easily restart it. And since all of the network traffic that IO manages goes through Envoy first, this process separation provides nice protection from accidentally or intentionally harmful traffic.
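
The bootstrap-and-restart arrangement described above can be sketched in Python. This is a minimal sketch under stated assumptions, not IO's actual bootstrap or supervisor: the node id, the `xds_cluster` name, the controller address, and the `build_bootstrap` and `supervise` helpers are all hypothetical. It emits JSON (which is also valid YAML) using Envoy's v3 bootstrap fields to point Envoy at a gRPC control server.

```python
import json
import subprocess
import time


def build_bootstrap(node_id: str, xds_host: str, xds_port: int) -> dict:
    """Build a minimal v3 bootstrap that fetches listeners and clusters over gRPC."""
    ads_source = {"ads": {}, "resource_api_version": "V3"}
    http2_options = {
        "envoy.extensions.upstreams.http.v3.HttpProtocolOptions": {
            "@type": "type.googleapis.com/"
                     "envoy.extensions.upstreams.http.v3.HttpProtocolOptions",
            "explicit_http_config": {"http2_protocol_options": {}},
        }
    }
    return {
        "node": {"id": node_id, "cluster": node_id},
        "dynamic_resources": {
            "ads_config": {
                "api_type": "GRPC",
                "transport_api_version": "V3",
                "grpc_services": [{"envoy_grpc": {"cluster_name": "xds_cluster"}}],
            },
            "cds_config": ads_source,
            "lds_config": ads_source,
        },
        "static_resources": {
            "clusters": [{
                # Assumed name for the cluster that reaches the controller.
                "name": "xds_cluster",
                "type": "STRICT_DNS",
                "connect_timeout": "1s",
                # gRPC needs HTTP/2 on the upstream connection.
                "typed_extension_protocol_options": http2_options,
                "load_assignment": {
                    "cluster_name": "xds_cluster",
                    "endpoints": [{"lb_endpoints": [{"endpoint": {"address": {
                        "socket_address": {"address": xds_host, "port_value": xds_port}
                    }}}]}],
                },
            }]
        },
    }


def supervise(command, max_restarts=3, delay=1.0,
              run=subprocess.call, sleep=time.sleep):
    """Run command, restarting it after each exit, up to max_restarts times.

    Returns the list of exit codes that were observed.
    """
    exit_codes = []
    for attempt in range(max_restarts + 1):
        exit_codes.append(run(command))
        if attempt < max_restarts:
            sleep(delay)  # brief pause before the next restart
    return exit_codes
```

Writing `json.dumps(build_bootstrap("io", "127.0.0.1", 9000))` to a file such as `bootstrap.json` and calling `supervise(["envoy", "-c", "bootstrap.json"])` would start Envoy and restart it after each exit, up to the limit. The `run` and `sleep` parameters exist only so the loop can be exercised without launching a real process.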

Envoy has been seriously hardened and built for scale

A lot of organizations use Envoy, and they’ve put a lot of engineering effort into solidifying and improving it. In 2023, the CNCF commissioned and released a frankly stunning documentary about Envoy.

The Risks of Building on Envoy are Small

Envoy is complex

Envoy is big and its complexity can be scary. You really need to be familiar with gRPC and Protocol Buffers, and in my experience, any serious effort to write a controller should work up from the Protocol Buffer descriptions of the Envoy APIs. You should also be prepared to search Envoy issues for bugs and workarounds, and get things working incrementally, one small step at a time. But all that said, if you come prepared to do it, you can do it.

Envoy is driven by someone else’s priorities

A possibly scarier risk of Envoy is that we’re not driving it. A first principle of IO is “I will not build a custom Envoy,” and so far, there’s always been a way to use off-the-shelf Envoy to build what’s been needed. It’s possible that something that IO depends on will be deprecated, but as long as we use features that lots of other Envoy users are using, our investment seems safe.

Envoy development could introduce bugs

A final risk worth noting is that bugs or security problems could appear in Envoy. It seems impossible, but then it happens. But seriously? This shouldn’t be too worrisome. IO users can control which version of Envoy they use (as a separate executable, Envoy is an easy-to-upgrade dependency) and Envoy’s enormous user base gives us hope that serious problems will be caught and fixed quickly.

Yes, Envoy upgrades really are this easy!
