Ops-Talks #05 - Kafka

Kafka Listeners

When developing in a Docker environment, it's important to have a good understanding of Kafka listeners and advertised listeners. The notion is explained really well in this article: Kafka Listeners Explained

You need to set advertised.listeners (or KAFKA_ADVERTISED_LISTENERS if you're using Docker images) to the external address (host/IP) so that clients can correctly connect to it. Otherwise, they'll try to connect to the internal host address, and if that's not reachable, problems ensue.
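As a sketch, here is what a dual-listener, single-broker setup could look like with the confluentinc/cp-kafka image. The listener names, hostnames, ports, and network alias are all assumptions for illustration, and other required settings (e.g. ZooKeeper/KRaft configuration) are omitted for brevity:

```shell
# Hypothetical single-broker example: other containers on the same Docker
# network reach the broker as "kafka:29092" (the INTERNAL listener), while
# clients on the host connect via "localhost:9092" (the EXTERNAL listener).
docker run -d --name kafka --network kafka-net -p 9092:9092 \
  -e KAFKA_LISTENERS="INTERNAL://0.0.0.0:29092,EXTERNAL://0.0.0.0:9092" \
  -e KAFKA_ADVERTISED_LISTENERS="INTERNAL://kafka:29092,EXTERNAL://localhost:9092" \
  -e KAFKA_LISTENER_SECURITY_PROTOCOL_MAP="INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT" \
  -e KAFKA_INTER_BROKER_LISTENER_NAME="INTERNAL" \
  confluentinc/cp-kafka
```

The key point is that KAFKA_ADVERTISED_LISTENERS contains the addresses clients will be told to connect back to, so each one must be resolvable from where that class of client runs.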

You can also use this GitHub repository to play around with this notion:

Kafka Partitioning

Here is a common rule of thumb for partition sizing in a Kafka cluster:

Every use case deserves its own reflection… Do not apply these settings blindly

Workload       Partition sizing
Common         8 - 16
Big topics     120 - 200
You're wrong   > 200
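One common heuristic behind numbers like these is throughput-based: pick at least max(target throughput / per-partition producer throughput, target throughput / per-partition consumer throughput). A back-of-the-envelope sketch in shell, where all the MB/s figures are made-up examples rather than measured values:

```shell
# Made-up figures: 100 MB/s target, a single partition sustains
# 10 MB/s on the produce side and 20 MB/s on the consume side.
target=100
producer_tput=10
consumer_tput=20

# ceil(target / tput) for each side, then keep the larger of the two
p=$(( (target + producer_tput - 1) / producer_tput ))
c=$(( (target + consumer_tput - 1) / consumer_tput ))
partitions=$p
[ "$c" -gt "$partitions" ] && partitions=$c

echo "$partitions"   # prints 10 with these numbers
```

With these assumptions the produce side dominates (100/10 = 10 partitions), which lands in the "Common" row above; always re-run the arithmetic with your own measured per-partition throughput.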

Kafka Tips & Tricks

https://github.com/birdayz/kaf is a Kafka CLI client that is amazingly easy to install and use

If you feel too lazy to install it, run it in Docker instead:

alias kaf='docker run --rm --entrypoint "" -v ~/.kaf:/root/.kaf -it lowess/kaf bash'

Here are a couple of tips and tricks you can use with this CLI:

  • Consume the content of <TOPIC> and copy it to a file
kaf consume <TOPIC> --offset latest 2>/dev/null | tee /tmp/kafka-stream.log
  • Consume the content of <TOPIC> and reformat the payload to keep only the url and uuid fields
kaf consume <TOPIC> 2>/dev/null | jq '{url: .url, uuid: .uuid}'
  • Send each line of <FILE> as an individual record to <TOPIC>
cat <FILE> | while read -r line; do echo "$line" | kaf produce <TOPIC>; done
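The jq filter from the second tip can be tried locally without a running Kafka cluster; the sample record below is entirely made up:

```shell
# Feed a fake record through the same jq filter kaf's output would go through;
# -c prints compact (single-line) JSON, and the "extra" field is dropped.
echo '{"url":"https://example.com","uuid":"abc-123","extra":"dropped"}' \
  | jq -c '{url: .url, uuid: .uuid}'
# → {"url":"https://example.com","uuid":"abc-123"}
```

This is a handy way to iterate on a filter before pointing it at a live topic.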