Docker Kafka Connect Container for an AWS MSK Cluster
Are you looking for a seamless way to integrate your Apache Kafka cluster on Amazon Managed Streaming for Apache Kafka (Amazon MSK) with other data sources and sinks? Look no further! In this article, we'll walk through setting up a Docker Kafka Connect container on macOS that talks to your AWS MSK cluster.
Create a Docker Compose File
We'll create a Docker Compose file that defines our Kafka Connect container.
kafka-connect-docker-compose.yml
version: '3.9'
services:
  kafka-connect:
    # Apple M1 Chip
    # platform: linux/amd64
    image: confluentinc/cp-kafka-connect:7.4.0
    container_name: kafka-connect
    hostname: kafka-connect
    restart: always
    environment:
      CONNECT_BOOTSTRAP_SERVERS: b-1.msk-cluster.laur9p.c4.kafka.ap-southeast-1.amazonaws.com:9092,b-2.msk-cluster.laur9p.c4.kafka.ap-southeast-1.amazonaws.com:9092,b-3.msk-cluster.laur9p.c4.kafka.ap-southeast-1.amazonaws.com:9092
      CONNECT_SASL_MECHANISM: SCRAM-SHA-512
      CONNECT_SECURITY_PROTOCOL: SASL_PLAINTEXT
      CONNECT_SASL_JAAS_CONFIG: "org.apache.kafka.common.security.scram.ScramLoginModule required \
        username=\"kafka_admin\" \
        password=\"UXyJ6B9THUhATZWB\";"
      CONNECT_PRODUCER_SASL_MECHANISM: SCRAM-SHA-512
      CONNECT_PRODUCER_SECURITY_PROTOCOL: SASL_PLAINTEXT
      CONNECT_PRODUCER_SASL_JAAS_CONFIG: "org.apache.kafka.common.security.scram.ScramLoginModule required \
        username=\"kafka_admin\" \
        password=\"UXyJ6B9THUhATZWB\";"
      CONNECT_CONSUMER_SASL_MECHANISM: SCRAM-SHA-512
      CONNECT_CONSUMER_SECURITY_PROTOCOL: SASL_PLAINTEXT
      CONNECT_CONSUMER_SASL_JAAS_CONFIG: "org.apache.kafka.common.security.scram.ScramLoginModule required \
        username=\"kafka_admin\" \
        password=\"UXyJ6B9THUhATZWB\";"
      CONNECT_PRODUCER_CONFLUENT_MONITORING_INTERCEPTOR_SASL_MECHANISM: SCRAM-SHA-512
      CONNECT_PRODUCER_CONFLUENT_MONITORING_INTERCEPTOR_SECURITY_PROTOCOL: SASL_PLAINTEXT
      CONNECT_PRODUCER_CONFLUENT_MONITORING_INTERCEPTOR_SASL_JAAS_CONFIG: "org.apache.kafka.common.security.scram.ScramLoginModule required \
        username=\"kafka_admin\" \
        password=\"UXyJ6B9THUhATZWB\";"
      CONNECT_CONSUMER_CONFLUENT_MONITORING_INTERCEPTOR_SASL_MECHANISM: SCRAM-SHA-512
      CONNECT_CONSUMER_CONFLUENT_MONITORING_INTERCEPTOR_SECURITY_PROTOCOL: SASL_PLAINTEXT
      CONNECT_CONSUMER_CONFLUENT_MONITORING_INTERCEPTOR_SASL_JAAS_CONFIG: "org.apache.kafka.common.security.scram.ScramLoginModule required \
        username=\"kafka_admin\" \
        password=\"UXyJ6B9THUhATZWB\";"
      CONNECT_REST_ADVERTISED_HOST_NAME: kafka-connect
      CONNECT_GROUP_ID: kafka-connect
      CONNECT_CONFIG_STORAGE_TOPIC: __connect-config
      CONNECT_OFFSET_STORAGE_TOPIC: __connect-offsets
      CONNECT_STATUS_STORAGE_TOPIC: __connect-status
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_OFFSET_FLUSH_INTERVAL_MS: 10000
      CONNECT_KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: https://schema-registry.k8s.int.org
      CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: https://schema-registry.k8s.int.org
      # CLASSPATH required due to CC-2422
      CLASSPATH: /usr/share/java/monitoring-interceptors/monitoring-interceptors-7.4.0.jar
      CONNECT_PRODUCER_INTERCEPTOR_CLASSES: io.confluent.monitoring.clients.interceptor.MonitoringProducerInterceptor
      CONNECT_CONSUMER_INTERCEPTOR_CLASSES: io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor
      CONNECT_PLUGIN_PATH: /usr/share/java,/usr/share/confluent-hub-components
      CONNECT_LOG4J_LOGGERS: org.apache.zookeeper=ERROR,org.I0Itec.zkclient=ERROR,org.reflections=ERROR
    command:
      - bash
      - -c
      - |
        confluent-hub install --no-prompt confluentinc/kafka-connect-datagen:0.6.0
        confluent-hub install --no-prompt confluentinc/kafka-connect-s3:10.5.0
        confluent-hub install --no-prompt confluentinc/connect-transforms:1.4.3
        /etc/confluent/docker/run &
        sleep infinity
    ports:
      - 8083:8083
    networks:
      - connect-network
networks:
  connect-network:
    driver: bridge
We're using Confluent's official Kafka Connect Docker image, version 7.4.0. If you're on an Apple Silicon Mac, uncomment the platform: linux/amd64 line so the amd64 image runs under emulation.
Authentication and Authorization
We're using SCRAM-SHA-512 as the SASL mechanism over the SASL_PLAINTEXT security protocol, and configuring the ScramLoginModule with a username and password.
CONNECT_BOOTSTRAP_SERVERS: b-1.msk-cluster.laur9p.c4.kafka.ap-southeast-1.amazonaws.com:9092,b-2.msk-cluster.laur9p.c4.kafka.ap-southeast-1.amazonaws.com:9092,b-3.msk-cluster.laur9p.c4.kafka.ap-southeast-1.amazonaws.com:9092
CONNECT_SASL_MECHANISM: SCRAM-SHA-512
CONNECT_SECURITY_PROTOCOL: SASL_PLAINTEXT
CONNECT_SASL_JAAS_CONFIG: "org.apache.kafka.common.security.scram.ScramLoginModule required \
  username=\"kafka_admin\" \
  password=\"UXyJ6B9THUhATZWB\";"
Modify the bootstrap servers, Schema Registry URL, and SASL username/password to match your environment.
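You can sanity-check the credentials against the brokers before bringing Connect up. Using the standard Kafka CLI tools (which ship in the same Confluent image), a minimal client properties file would look like the sketch below; the username and password mirror the placeholder values above and should be replaced with your own:

```properties
security.protocol=SASL_PLAINTEXT
sasl.mechanism=SCRAM-SHA-512
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
  username="kafka_admin" \
  password="UXyJ6B9THUhATZWB";
```

Saved as client.properties, a command such as kafka-topics --bootstrap-server b-1.msk-cluster.laur9p.c4.kafka.ap-southeast-1.amazonaws.com:9092 --command-config client.properties --list should list the cluster's topics if authentication succeeds.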
Confluent Hub Components
To integrate various data sources and sinks, the container's startup command installs three Confluent Hub components:
- kafka-connect-datagen: a component for generating test data
- kafka-connect-s3: a component for integrating with Amazon S3
- connect-transforms: a set of transformers for processing and transforming messages
Add any other source/sink connectors your use case requires.
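As an illustration of what these plugins enable once the worker is running, here is a sketch of a datagen source connector config; the connector name and topic are made-up placeholders, and quickstart: orders selects one of the sample schemas bundled with kafka-connect-datagen:

```json
{
  "name": "datagen-orders",
  "config": {
    "connector.class": "io.confluent.kafka.connect.datagen.DatagenConnector",
    "kafka.topic": "orders",
    "quickstart": "orders",
    "max.interval": "1000",
    "iterations": "10000000",
    "tasks.max": "1"
  }
}
```

A config like this is registered with a POST to the worker's REST API on port 8083, and gives you a steady stream of test records to exercise downstream sinks.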
Finally, we can start our Kafka Connect container using the following command:
docker-compose -f kafka-connect-docker-compose.yml up -d
This starts the Kafka Connect container in detached mode (i.e., it runs in the background). We can then test the setup by deploying source/sink connector configs through the Kafka Connect REST API.
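For example, to verify the pipeline end to end you might register an S3 sink connector like the sketch below; the connector name, topic, and bucket are illustrative placeholders, and the container's AWS credentials must allow writes to that bucket:

```json
{
  "name": "s3-sink-orders",
  "config": {
    "connector.class": "io.confluent.connect.s3.S3SinkConnector",
    "topics": "orders",
    "s3.bucket.name": "my-example-bucket",
    "s3.region": "ap-southeast-1",
    "storage.class": "io.confluent.connect.s3.storage.S3Storage",
    "format.class": "io.confluent.connect.s3.format.avro.AvroFormat",
    "flush.size": "1000",
    "tasks.max": "1"
  }
}
```

Save it as s3-sink.json and deploy it with curl -X POST -H "Content-Type: application/json" --data @s3-sink.json http://localhost:8083/connectors. A follow-up curl http://localhost:8083/connectors/s3-sink-orders/status reports whether the connector and its tasks are RUNNING.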