Talk: Nakadi at Zalando’s first Kafka meetup

Back in November, a few colleagues and I (actually, our entire team) organised our first Kafka meetup. In this meetup we wanted to bring together engineers and devops who run software around Kafka, or maintain Kafka in production, to exchange knowledge and discuss our experiences. We wanted to talk about successes as well as failures and challenges. No sales pitches, just the truth of what we have struggled with.

For this first edition, all speakers were engineers at Zalando, as we didn’t know how much interest there would be from outside. We had short talks (10 minutes each, plus another 5 minutes for questions), and we had seven of them (yes, seven).

After an introduction by our team’s engineering lead, Himanshu Gahlaut, I talked about Nakadi for a bit. My colleague Ricardo de Cillo then talked about operating Kafka on AWS. He covered choosing the right EC2 instances, the size of the cluster, and the amount of disk space to use; failures, and how to recover from them; and configuring Kafka to run smoothly on virtual machines that could get terminated at any moment.

Dmitry Sorokin then spoke about Bubuku, our open source supervisor for running Kafka on AWS. Besides controlling individual brokers, it can trigger rolling restarts of an entire cluster, calculate a fair distribution of partitions among brokers and trigger the appropriate rebalance operations, and much more.

After the break, Andrey Dyachkov discussed how we upgrade Kafka brokers without losing their data, and how the same mechanism can deal with a broker that gets terminated in the middle of the night. Long story short, it comes back with the same storage, without manual intervention.

The next speaker was Max Schulze, from a team we work with very closely. He works on Zalando’s data lake, and talked about some aspects of how it was built.

Daniel Truemper, from yet another team, talked about how they operate Kafka for communication between microservices.

All the talks were recorded. As far as I know, the video is not yet available online, but hopefully that will change soon. Unfortunately, something was wrong with the projector, and the slides were displayed with a very strong shade of alien green. I wish I had taken pictures. You’ll see it in the video. It’s really green.

The meetup turned out to be very successful: the room was full, and the feedback we got during and after the event was very positive (we’ll try to have more, and colder, beer next time). So we decided to have another one. We’ll have slightly fewer talks, to leave more time for discussions, but we will keep the 10-15 minutes per talk format. Watch out for the announcement; it should be out around the end of February!