Reference Architecture: Event-Driven Microservices with Apache Kafka
Last updated 22 May 2020
This architecture shows how to coordinate a set of decoupled, fungible, and independent services on Heroku by using Apache Kafka on Heroku as a scalable, highly available, and fault-tolerant asynchronous communication backbone. It's a good fit when:
- You have a large number of microservices that need to communicate asynchronously.
- You want your microservices to be decoupled, fungible, and independently maintained.
- You have one or more services that produce events that need to be processed by many services.
- You want to use a microservices communication pattern that is more decoupled than the typical HTTPS approach.
This reference architecture uses Apache Kafka on Heroku to coordinate asynchronous communication between microservices. Here, services publish events to Kafka while downstream services react to those events instead of being called directly. In this fashion, event-producing services are decoupled from event-consuming services. The result is an architecture with services that are scalable, independent of each other, and fungible.
Using Kafka for asynchronous communication between microservices can help you avoid bottlenecks that monolithic architectures with relational databases would likely run into. Because Kafka is highly available, outages are less of a concern and failures are handled gracefully with minimal service interruption. Because Kafka retains data for a configured amount of time, you can rewind and replay events as required.
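For example, because retained events keep their offsets, a consumer can seek back to the start of a topic and reprocess everything. A minimal sketch using the kafka-python library (one of several Kafka clients; the broker address and function name here are illustrative):

```python
def replay_from_beginning(topic: str, brokers: str = "localhost:9092"):
    """Yield every retained message on `topic`, oldest first, by rewinding
    the consumer's offsets to the start of each partition."""
    # kafka-python is one client option; imported lazily so this sketch
    # can be loaded without a broker or the library present.
    from kafka import KafkaConsumer, TopicPartition

    consumer = KafkaConsumer(bootstrap_servers=brokers, enable_auto_commit=False)
    partitions = [TopicPartition(topic, p)
                  for p in consumer.partitions_for_topic(topic)]
    consumer.assign(partitions)              # manual assignment, outside any group
    consumer.seek_to_beginning(*partitions)  # rewind to the oldest retained offset
    for message in consumer:
        yield message
```

How far back you can rewind depends on the retention period configured for the topic.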
Apache Kafka for scalable, decoupled coordination of a set of microservices
This architecture consists of:
- An Apache Kafka on Heroku instance, which acts as the message broker
- A set of individual services configured to consume messages from Kafka
- A (potentially overlapping) set of services configured to publish events to Kafka
Setting up the microservices
- Isolate Kafka consumers and producers into their own Heroku apps and scale them as needed.
- If you are transforming a monolith into microservices, see this blog post, which documents one approach to moving to such a system.
- Add one of the various client libraries to your apps so they can communicate with Kafka.
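As a sketch of what a producing service looks like, here is a minimal publisher using the kafka-python client (one library option among many; the topic name, event shape, and broker address are illustrative, not part of this architecture):

```python
import json


def serialize_event(event: dict) -> bytes:
    """Encode an event dict as UTF-8 JSON for the Kafka message value."""
    return json.dumps(event).encode("utf-8")


def main():
    # kafka-python is one of several client libraries; any Kafka client works.
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",  # replace with your broker URLs
        value_serializer=serialize_event,
    )
    # Publish an order event; downstream consumers react to it independently,
    # so this service never calls them directly.
    producer.send("orders", {"event": "order_placed", "order_id": 42})
    producer.flush()


if __name__ == "__main__":
    main()
```

A consuming service is the mirror image: it subscribes to the same topic with a `KafkaConsumer` and handles each event as it arrives.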
Asynchronous messaging & Kafka setup
- Provision the Apache Kafka on Heroku add-on.
- Be sure to share the same Kafka instance across all of the apps that represent your producers and consumers.
- Don’t be afraid to take a hybrid approach to microservices communication; sometimes it makes sense to use both HTTPS and Kafka messages.
- For more information on configuring Kafka, see the Apache Kafka on Heroku category.
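The add-on exposes connection details to attached apps as config vars; `KAFKA_URL` holds a comma-separated list of broker URLs, with TLS material in `KAFKA_TRUSTED_CERT`, `KAFKA_CLIENT_CERT`, and `KAFKA_CLIENT_CERT_KEY`. A small sketch of turning `KAFKA_URL` into the bootstrap-server list that client libraries expect (the helper name is ours):

```python
import os
from urllib.parse import urlparse


def bootstrap_servers(kafka_url: str) -> list[str]:
    """Split a comma-separated KAFKA_URL into host:port pairs that
    Kafka client libraries accept as bootstrap servers."""
    hosts = []
    for url in kafka_url.split(","):
        parsed = urlparse(url.strip())
        hosts.append(f"{parsed.hostname}:{parsed.port}")
    return hosts


# On Heroku, KAFKA_URL looks like
# "kafka+ssl://host-1:9096,kafka+ssl://host-2:9096".
servers = bootstrap_servers(os.environ.get("KAFKA_URL", "kafka+ssl://localhost:9096"))
```

Because every attached app reads the same config vars, sharing one Kafka instance across producers and consumers is just a matter of attaching the add-on to each app.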
The following architecture diagram depicts a simple event-driven microservice architecture, which you can deploy using this Terraform script.
This particular example is a hybrid system that uses both asynchronous messaging and HTTPS. Events originate from a mock e-commerce application (edm-ui) and are sent over HTTPS to edm-relay, which then writes them to their respective Kafka topics. Those messages are consumed by two different apps, edm-stream and edm-stats, which belong to different Kafka consumer groups so that all events are processed by each service. Each service has a different business purpose driven by events:
- edm-stream streams events directly to edm-dashboard.
- edm-stats records statistical information about events, saves that data in a Heroku Postgres database, and provides a simple API for that data.
edm-dashboard is a data visualization UI that initially requests historical statistical data from edm-stats over HTTPS and receives streaming events from edm-stream via Socket.io.
Because of the decoupled nature of this architecture, adding additional services to consume events is easy: just create another service that is part of a new consumer group, and it will now be subscribed to the topics of your choice.
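For instance, a new service only needs to subscribe with a fresh `group_id` to receive its own copy of every event, without affecting existing consumers. A sketch with kafka-python (the group name, topic, broker address, and handler are illustrative):

```python
def run_service(group_id: str, topics: list[str], handle) -> None:
    """Consume the given topics as an independent service: a new consumer
    group receives every event published to those topics."""
    from kafka import KafkaConsumer  # one of several client libraries

    consumer = KafkaConsumer(
        *topics,
        group_id=group_id,                   # a fresh group gets its own copy of each event
        bootstrap_servers="localhost:9092",  # replace with your broker URLs
        auto_offset_reset="earliest",        # new groups start from the oldest retained event
    )
    for message in consumer:
        handle(message.value)                # the new service's business logic


# e.g. run_service("email-notifications", ["orders"], handle=print)
```

Existing producers and consumers need no changes; the broker handles the fan-out.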
Pros
- Services are decoupled, fungible, and independent.
- Messages are buffered and consumed when resources are available.
- Services scale easily for high-volume event processing.
- Kafka is highly available and fault-tolerant.
- Consumers and producers can be implemented in many different languages.
Cons
- This architecture introduces some operational complexity and requires Kafka knowledge.
- Handling partial failures can be challenging.
More resources
- Deconstructing Monolithic Applications into Services
- Managing Real-time Event Streams and SQL Analytics with Apache Kafka on Heroku, Amazon Redshift, and Metabase