Last Month in Nakadi: March 2018

This is the second instalment in the series of blog posts, “Last Month in Nakadi”, where I try to give some more details on new features and bug fixes that were released over the previous month.

March saw an important dependency update, as well as a new feature. The former is thanks to our colleague Peter Liske, who has been working on the issue for quite some time.

JSON-schema validation library now uses RE2/J for regex pattern matching

Peter alerted us to the problem, and fixed it upstream. It turns out that a well-crafted regular expression in a schema could become a regex bomb when used to evaluate even a simple string. Peter demonstrated how easy it would be to “kill” one instance of Nakadi for several minutes with a single message – and to kill a whole cluster by sending a sufficient number of messages.

The issue is with the default regex engine in Java (java.util.regex), which, like PCRE, relies on backtracking. Peter swapped it for the RE2/J library in the dependency we use for JSON-schema validation, and now Nakadi can survive evaluating even the nastiest of regular expressions.
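To give an idea of the class of problem, here is a small Java sketch (the pattern below is a textbook example, not the one from the actual report). A nested quantifier such as (a+)+$, applied to an input that almost matches, forces a backtracking engine to explore an exponential number of match paths, while RE2/J, which exposes a near-identical API under com.google.re2j, guarantees linear-time matching:

import java.util.regex.Pattern;    // default backtracking engine
// import com.google.re2j.Pattern; // RE2/J: same API, linear-time matching

public class RegexBombDemo {
    public static void main(String[] args) {
        // Classic "evil" pattern: nested quantifiers plus an input that almost
        // matches make a backtracking engine try roughly 2^n match paths.
        Pattern evil = Pattern.compile("(a+)+$");
        String input = "a".repeat(30) + "!";  // String.repeat needs Java 11+

        long start = System.nanoTime();
        boolean matches = evil.matcher(input).matches();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        // With java.util.regex this takes seconds, and roughly doubles with every
        // extra 'a'; with com.google.re2j.Pattern the same call returns almost
        // immediately, at the cost of unsupported features such as backreferences.
        System.out.println("matches=" + matches + " in " + elapsedMs + " ms");
    }
}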

New feature: select which partitions to read from in a subscription

In February, we made the decision to deprecate the low-level API in Nakadi. It is currently still supported, but will be removed in the future. The subscription API did not cover one common use case that the low-level API provided: the ability for a consumer to choose which partitions to consume events from. For some users, it is important to make sure that all events in a given partition will be consumed by the same consumer. Perhaps they do some form of de-duplication, aggregation, or re-ordering, and such a feature makes their job a lot easier.

When we announced the deprecation of the low-level API, we promised to implement that feature in the subscriptions API, to allow users to migrate without issues. This is now done, and users can check out the relevant part of the documentation.

Here is a simple example of how it works. The “usual” way to consume from the subscription API is by creating a stream to get events from a subscription. Nakadi will automatically balance partitions between the consumers connected to the subscription, so that each partition is always connected to exactly one consumer. Given a subscription with ID 1234, it works like this:

GET {nakadi}/subscriptions/1234/events

Pretty simple. Now, if you want to choose which partitions to consume from, you need to send a “GET with body” (so, a POST) request, and specify the partitions you want in the body. For example, if you want to get partitions 0 and 1 from event type my-event-type, you would do something like this:

POST {nakadi}/subscriptions/1234/events -d '{"partitions": [{"event-type": "my-event-type", "partition": "0"},{"event-type": "my-event-type", "partition": "1"}]}'

Simple. And of course, you can have both types of consumers simultaneously consuming from the same subscription. In this case, the auto-balanced consumers will share the partitions that have not been specifically requested.
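For completeness, here is a rough sketch of what such a consumer could look like in Java, using the JDK 11 HTTP client. The base URL and subscription ID are placeholders, authentication is left out, and the request body simply mirrors the one shown above:

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class PartitionedSubscriptionConsumer {
    public static void main(String[] args) throws Exception {
        // Placeholders: point these at your Nakadi deployment and subscription,
        // and add whatever authentication your setup requires.
        String nakadi = "https://nakadi.example.org";
        String subscriptionId = "1234";

        // Same body as the request above: read only partitions 0 and 1.
        String body = "{\"partitions\": ["
                + "{\"event-type\": \"my-event-type\", \"partition\": \"0\"},"
                + "{\"event-type\": \"my-event-type\", \"partition\": \"1\"}]}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(nakadi + "/subscriptions/" + subscriptionId + "/events"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<InputStream> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofInputStream());

        // The response is a long-lived stream of batches, one JSON object per
        // line, each containing a cursor and (possibly) some events.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(response.body(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}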

And that’s it for March. If you would like to contribute to Nakadi, please feel free to browse the issues on GitHub, especially those marked with the “help wanted” tag. If you would like to implement a large new feature, please open an issue first to discuss it, so we can all agree on what it should look like. We very much welcome all sorts of contributions: not just code, but also documentation, help with the website, etc.

Last Month in Nakadi: February 2018

I’m experimenting with a new series of posts, called “Last Month in Nakadi”. In the Nakadi project, we maintain a changelog that we update on each release. Each entry in the file is a one-line summary of a change that was implemented, but that alone is not always sufficient to understand what happened. There is still a fair amount of discussion and context that stays hidden inside Zalando, but we are working on changing that too.

Therefore, I will try, once a month, to provide some context on the changes that we released the month before. I hope that users of Nakadi, and people interested in deploying their own Nakadi-based service, will find this summary useful. Let’s start, then, with what we released last month: February 2018.

2.5.7

Released on the 15th of February, this version includes one bug fix, and one performance improvement.

Fix: Problem JSON for authorization issues

A user of Nakadi reported that it did not return a correct Problem JSON body when authorization failed; this release fixes that.
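For reference, a Problem JSON body (RFC 7807, served as application/problem+json) for a failed authorization looks roughly like the following; the field values here are purely illustrative, not the exact payload Nakadi produces:

{
  "title": "Forbidden",
  "status": 403,
  "detail": "Access to this resource is denied for the provided credentials"
}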

Improvement: subscription rebalance

We found that, when rebalancing a subscription, Nakadi made several calls to Zookeeper, which is costly. This change reduces the number of Zookeeper calls needed for a rebalance, making rebalances faster.

2.5.8

Released on the 22nd of February, this version brings a new feature: the ability to allow a set of applications to get read access to all event types, overriding individual event types’ authorization policies, for archival purposes.

At Zalando, we maintain a data lake, where data is stored and made available to authorized users for analysis. One of the preferred ways to get data into the data lake is to push it to our deployment of Nakadi. Events are then consumed by the data lake ingestion applications, and saved there. Over time, we have noticed that event type owners, when setting or updating their event types’ authorization policies, would on occasion forget to whitelist the data lake applications, causing delays in data ingestion. Another issue we noticed is that, should the data lake team use a different application to ingest data (they actually use several applications, working together), they would have to contact the owners of all event types from which data is ingested – that’s a lot of people, and a huge burden.

So, we decided to allow these applications to bypass the event types’ authorization policies, such that event type owners would not accidentally block the data lake’s read access. In a future release, we could add a way for the event type owner to indicate that they do not want their data ingested into the data lake.

We also added an optional warning header, sent when an event type is created or updated. We use it to remind our users that their data may be archived, even if the archiving application is not whitelisted for their event type. You can choose the message you want – or no message at all.
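As an illustration, and assuming the standard HTTP Warning header format (the agent and message below are made up; you configure your own text, or none at all), such a header could look like this:

Warning: 299 nakadi "Events published to this event type may still be read by the archiving applications, regardless of the authorization policy of the event type."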

And that’s it for February. If you would like to contribute to Nakadi, please feel free to browse the issues on GitHub, especially those marked with the “help wanted” tag. If you would like to implement a large new feature, please open an issue first to discuss it, so we can all agree on what it should look like. We very much welcome all sorts of contributions: not just code, but also documentation, help with the website, etc.
