Kafka Simulator for Testing

Rakesh singhania
4 min readApr 29, 2024

Regression testing is an important practice in software development to ensure that new changes or updates to the codebase do not break existing functionality. In the context of systems that interact with Apache Kafka, a Kafka simulator can be extremely useful for performing effective regression testing.

While testing Kafka applications against a live cluster is ideal, it’s not always practical.

Sensitive data, potential disruptions, and resource limitations can make live testing challenging.

This is where Kafka simulators shine.

Photo by ThisisEngineering on Unsplash

A Kafka simulator acts as a mock Kafka cluster, replicating its behavior in a controlled environment.

This allows developers to test their Kafka applications without the drawbacks of live testing. Let’s dive into the benefits of using Kafka simulators for streamlined testing.

Here’s why you might need it:

  1. Testing Kafka Consumer Logic: As a developer, you want to ensure that your Kafka consumer logic works as expected. By spooling data from a Kafka topic to disk and then reading it back using Kafka consumer logic, you can verify that the consumer is able to process messages correctly.
  2. Offline Testing: Sometimes, you may not have access to the actual Kafka cluster or the topic contains sensitive or production data that you cannot use for testing. In such cases, spooling data to disk allows you to perform offline testing using a sample dataset.
  3. Reproducibility: By spooling a specific number of records from the Kafka topic to disk, you can create a reproducible test scenario. This allows you to rerun the test with the same dataset multiple times, which is helpful for debugging and troubleshooting issues.
  4. Isolation: Spooling data to disk allows you to isolate your Kafka consumer logic from external dependencies such as network latency or Kafka cluster availability. This ensures that any issues encountered during testing are related to the consumer logic itself and not external factors.

Below program allows you to specify the number of records you want to bundle from the Kafka topic to disk. You can run the program with the desired number of records as a command line argument. If no argument is provided, it will use the default value of 100 records.

Overall, this program demonstrates a simplified simulation of the following steps:

  1. Consuming data from a Kafka topic (simulated in bundleDataFromKafkaToDisk).
  2. Writing the consumed data to a file.
  3. Reading the data from the file (simulated processing in readDataFromFile).
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.serialization.StringDeserializer;
import java.io.*;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class KafkaConsumerTest {

private static final String TOPIC_NAME = "your-kafka-topic";
private static final String FILE_NAME = "kafka_data.txt";

public static void main(String[] args) {
int numRecords = 100; // Default number of records to spool

// Check if number of records is provided as a command line argument
if (args.length > 0) {
try {
numRecords = Integer.parseInt(args[0]);
} catch (NumberFormatException e) {
System.err.println("Invalid number of records provided. Using default value.");

// Spool data to disk

// Read data from file using Kafka consumer reader logic

Write Kafka Consumer record to file.

private static void bundleDataFromKafkaToDisk(int numRecords) {
// Kafka consumer properties
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.GROUP_ID_CONFIG, "kafka-consumer-test");
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

// Create Kafka consumer
KafkaConsumer<String, Record> consumer = new KafkaConsumer<>(props);

// Subscribe to Kafka topic

// Spool data to disk
try (BufferedWriter writer = new BufferedWriter(new FileWriter(FILE_NAME))) {
int recordsCount = 0;
while (recordsCount < numRecords) {
ConsumerRecords<String, Record> records = consumer.poll(Duration.ofMillis(100));
for (ConsumerRecord<String, Record> record : records) {
Record data = record.value();
// write data
if (recordsCount >= numRecords) {
System.out.println("Data written successfully to " + FILE_NAME);
} catch (IOException e) {
System.err.println("Error writing data to file: " + e.getMessage());
} finally {

Create a RecordClass to capture kafka message properties.

public static class Record {
private String message;
private String[] headers;

public Record(String message, String[] headers) {
this.message = message;
this.headers = headers;

public String getMessage() {
return message;

public String[] getHeaders() {
return headers;

Read data which you have written.

private static void readDataFromFile() {
try (BufferedReader reader = new BufferedReader(new FileReader(FILE_NAME))) {
String line;
System.out.println("Reading data from " + FILE_NAME + ":");
while ((line = reader.readLine()) != null) {
// Process each line of data as a Kafka record
System.out.println("Consumed record: " + line);
// You can add your Kafka consumer logic here
// For example, process the record or send it to another Kafka topic
} catch (IOException e) {
System.err.println("Error reading data from file: " + e.getMessage());

This testing approach enables developers to identify and address issues proactively, ensuring the robustness and effectiveness of their Kafka consumer logic.



Rakesh singhania

As a student of technology, each day I take a single step forward on the path of learning.