Change Data Capture from MSSQL Server to Kafka

Updated on Jul 02, 2024

As data becomes increasingly critical to businesses, the need to capture and process changes in real time has never been more important. In this article, we'll explore how to read changed data from an MSSQL Server database and write it to a Kafka topic as event messages using Debezium's SQL Server CDC Source Connector.

Setting Up Environment

Before we dive into the code, make sure you have a Kafka ecosystem set up on your macOS machine using Docker. You can refer to our previous article on setting up a Kafka container using Docker for more information.

Installing the Debezium Connector

To use the Debezium SQL Server CDC Source Connector, install it from Confluent Hub. Add the entry below to the kafka-connect service in docker-compose.yml, under its command block.

confluent-hub install --no-prompt debezium/debezium-connector-sqlserver:2.5.4
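
Because this install step runs as part of the container's startup command, recreate or restart the kafka-connect service so that it executes. As a quick sanity check (a sketch assuming the service is named kafka-connect and the Connect REST API listens on localhost:8083), you can then list the installed connector plugins:

# Restart Kafka Connect so the startup command installs the new plugin
docker compose restart kafka-connect

# The SqlServerConnector class should appear in the list of installed plugins
curl -s http://localhost:8083/connector-plugins | grep -i sqlserver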

Simulating MSSQL Source Database

Add a Docker container service to the docker-compose.yml file to simulate an MSSQL source database.

# MSSQL Source Database
  mssql_src:
    # Apple M1 Chip
    # platform: linux/amd64
    image: mcr.microsoft.com/mssql/server:2022-latest
    container_name: mssql_src
    restart: always
    environment:
      ACCEPT_EULA: "Y"
      MSSQL_SA_PASSWORD: Password1234
      MSSQL_PID: Developer
      MSSQL_AGENT_ENABLED: "true"
    ports:
      - 1435:1433
    volumes:     
      - mssql_datadir:/var/opt/mssql
    networks:
      - kafka-network

Please also remember to declare mssql_datadir under the top-level volumes block of docker-compose.yml.

Note: The Debezium connector is based on the change data capture feature that is available in SQL Server. The environment variable MSSQL_AGENT_ENABLED: "true" ensures that SQL Server Agent is enabled; the CDC capture and cleanup jobs run as SQL Server Agent jobs.
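
As a quick check (assuming the service name mssql_src from the compose file above), start the container and watch its logs until SQL Server reports that it is ready:

# Start only the MSSQL source container defined above
docker compose up -d mssql_src

# Wait for the line "SQL Server is now ready for client connections"
docker logs -f mssql_src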

Create Source Table

Create a source table with some data in the SQL Server database for our demo. We also need to enable change data capture for the database and the table:

CREATE DATABASE sales;

USE sales;

CREATE TABLE consultants(
    id INT IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    email VARCHAR(50),
    rate DECIMAL(8,2),
    status VARCHAR(20),
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

INSERT INTO consultants(first_name, last_name, email, rate, status) VALUES 
    ('John', 'Doe', 'john.doe@gmail.com', 3000.00, 'perm'),
    ('Tom', 'Hanks', 'tom.hanks@yahoo.com', 3500.75, 'contract'),
    ('Jane', 'Doe', 'jane.doe@moneybank.com', 3500.75, 'perm'),
    ('Duke', 'Johnson', 'duke@hello.com', 4500.25, 'contract'),
    ('Peter', 'Parker', 'peter@gmail.com', 4500.25, 'contract'),
    ('Rick', 'Nice', 'rick@gmail.com', 4900, 'contract'),
    ('Tommy', 'Hill', 'tommy@gmail.com', 4100, 'perm'),
    ('Jill', 'Stone', 'jill@gmail.com', 4250.50, 'contract'),
    ('Honey', 'Bee', 'honey@gmail.com', 3200, 'perm'),
    ('Bell', 'Doe', 'bell@gmail.com', 34000, 'contract')
;

EXEC sys.sp_cdc_enable_db;
EXEC sys.sp_cdc_enable_table @source_schema = 'dbo', @source_name = 'consultants', 
	@role_name = NULL, @supports_net_changes = 0;

SELECT * FROM consultants;
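
One way to run these statements is with the sqlcmd utility bundled in the container image. The sketch below assumes the tools live under /opt/mssql-tools18 (older images use /opt/mssql-tools and do not need the -C flag that trusts the self-signed certificate):

# Open an interactive sqlcmd session inside the container (end each batch with GO)
docker exec -it mssql_src /opt/mssql-tools18/bin/sqlcmd -C -S localhost -U sa -P Password1234

# Verify that CDC is enabled for the database and the table
docker exec -it mssql_src /opt/mssql-tools18/bin/sqlcmd -C -S localhost -U sa -P Password1234 \
  -Q "SELECT name, is_cdc_enabled FROM sys.databases WHERE name = 'sales'; USE sales; SELECT name, is_tracked_by_cdc FROM sys.tables WHERE name = 'consultants';"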

Kafka Connect Configuration

Create a configuration file for the Debezium SQL Server CDC Source Connector:

{
    "name": "src-mssql-debezium-cdc-consultants",
    "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
    "database.hostname": "host.docker.internal",
    "database.port": "1435",
    "database.user": "sa",
    "database.password": "Password1234",
    "database.names": "sales",
    "database.encrypt": "false",
    "table.include.list": "dbo.consultants",
    "include.schema.changes": "false",
    "decimal.handling.mode": "string",
    "time.precision.mode": "connect",
    "topic.prefix": "mssql_src",
    "schema.history.internal.kafka.topic": "dbhistory.mssql-sales",
    "schema.history.internal.kafka.bootstrap.servers": "broker:29092",
    "tasks.max": "1"
}
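
With the configuration saved to a file (say src-mssql-debezium-cdc-consultants.json; the file name and the Connect REST endpoint on localhost:8083 are assumptions based on our setup), the connector can be registered and checked through the Kafka Connect REST API:

# Create or update the connector from the config file above
curl -s -X PUT -H "Content-Type: application/json" \
  --data @src-mssql-debezium-cdc-consultants.json \
  http://localhost:8083/connectors/src-mssql-debezium-cdc-consultants/config

# Confirm that the connector and its task are RUNNING
curl -s http://localhost:8083/connectors/src-mssql-debezium-cdc-consultants/status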

Once the Kafka connector is up and running, let us look at the Kafka topic.
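
Debezium names the topic <topic.prefix>.<database>.<schema>.<table>, so with the configuration above the initial snapshot and all subsequent change events land in mssql_src.sales.dbo.consultants. A minimal way to inspect it (assuming the broker container is named broker, matching the bootstrap servers above):

# Read the snapshot and change events from the beginning of the topic
docker exec -it broker kafka-console-consumer \
  --bootstrap-server broker:29092 \
  --topic mssql_src.sales.dbo.consultants \
  --from-beginning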

Capture Changed Data from Database

Modify some data in the MSSQL database to observe the change data capture flow from the database to the Kafka topic.

INSERT INTO consultants(first_name, last_name, email, rate, status) VALUES ('Saurav', 'Mitra', 'saurav.karate@gmail.com', 5000.00, 'perm');

UPDATE consultants SET rate=6500.00 WHERE email='saurav.karate@gmail.com';

INSERT INTO consultants(first_name, last_name, email, rate, status) VALUES ('Tim', 'Smith', 'tim.smith@freelance.com', 3500.70, 'contract');

DELETE FROM consultants WHERE email='tim.smith@freelance.com';

INSERT INTO consultants(first_name, last_name, email, rate, status) VALUES 
    ('Shane', 'Wilson', 'shane.wilson@freelance.com', 5000.00, 'perm'),
    ('John', 'Sinha', 'john.sinha@freelance.com', 9000.00, 'contract')
;

We can now see that the database events have been streamed in near real time to the Kafka topic.

tombstones.on.delete: true is the default setting.
After a source record is deleted, the delete operation is represented by a delete event followed by a tombstone event in the Kafka topic. The tombstone event allows Kafka to completely delete all events that pertain to the key of the deleted row if log compaction is enabled for the topic.
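
To see the delete event followed by its tombstone, re-run the consumer with record keys printed; the tombstone appears as the deleted row's key with a null value (same assumption about the broker container as above):

# Print keys alongside values; the tombstone is a key followed by a null value
docker exec -it broker kafka-console-consumer \
  --bootstrap-server broker:29092 \
  --topic mssql_src.sales.dbo.consultants \
  --from-beginning \
  --property print.key=true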

In this article, we've demonstrated how to capture change data from an MSSQL database and write it to a Kafka topic using Debezium's SQL Server CDC Source Connector. With this setup, you can now process changes in real time and integrate them with your existing data pipelines.
