Kafka producer to read data files

24,354

Solution 1

You can read the data file with cat and pipe it to kafka-console-producer.sh.

cat ${datafile} | ${kafka_home}/bin/kafka-console-producer.sh --broker-list ${brokerlist} --topic test 
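Since the question asks about loading the file in a loop, the pipe above can be wrapped in a small shell function. This is only a sketch: the function name replay_file and the PRODUCER_CMD variable are made up for illustration, and the broker address and topic are the ones used in the question.

```shell
# PRODUCER_CMD is an illustrative variable, not a Kafka setting. Override it to
# point at your own installation or to pass different flags.
PRODUCER_CMD=${PRODUCER_CMD:-"bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test"}

# Hypothetical helper: send every line of file $1 to the producer $2 times
# (default: once), using a single producer process for all rounds.
replay_file() {
  datafile=$1
  rounds=${2:-1}
  i=0
  while [ "$i" -lt "$rounds" ]; do
    cat "$datafile"
    i=$((i + 1))
  done | $PRODUCER_CMD   # unquoted on purpose so the flags split into words
}

# Example: replay_file messages.txt 3
```

Piping the whole loop into one producer avoids starting a new JVM for every pass over the file.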

Solution 2

If there is always a single file, you can use the tail command and pipe its output to the Kafka console producer.

But if new files are created when certain conditions are met, you may need to use the org.apache.commons.io.monitor package (Apache Commons IO) to detect each newly created file, then repeat the step above for it.
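For the new-file case, a plain shell polling loop is one possible alternative to the Commons IO monitor; note that this swaps in a different technique from the Java library named above. A minimal sketch, assuming new files land in a single directory and are already complete when they appear. The names send_new_files and PRODUCER_CMD, and the list file of already-sent paths, are hypothetical:

```shell
# PRODUCER_CMD is an illustrative variable, not a Kafka setting.
PRODUCER_CMD=${PRODUCER_CMD:-"bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test"}

# Send each regular file in directory $1 that is not yet recorded in list file $2.
# Keep the list file outside the watched directory.
send_new_files() {
  watch_dir=$1
  seen_list=$2
  touch "$seen_list"
  for f in "$watch_dir"/*; do
    [ -f "$f" ] || continue                      # skip subdirectories and an empty glob
    grep -qxF -- "$f" "$seen_list" && continue   # already sent earlier
    # Record the file only if the producer exited successfully.
    $PRODUCER_CMD < "$f" && printf '%s\n' "$f" >> "$seen_list"
  done
}

# Example: poll every 5 seconds until interrupted
# while true; do send_new_files /data/incoming /tmp/sent-files.list; sleep 5; done
```

Polling is cruder than a file-system monitor, but it needs nothing beyond the shell and the console producer.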

Solution 3

Kafka has a built-in File Stream Connector for piping the content of a file to a topic (file source) or writing a topic's content to a file (file sink).

bin/connect-standalone.sh runs the connector. The file source is configured in config/connect-file-source.properties and the worker in config/connect-standalone.properties.

So the command will be:

bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties
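As a rough illustration of what the source config contains, the sketch below writes one with a heredoc. The keys mirror the connect-file-source.properties sample shipped with Kafka; the helper name write_file_source_config and the output file name are hypothetical, and the data file path and topic are placeholders. The worker file, config/connect-standalone.properties, mainly sets bootstrap.servers, the key and value converters, and offset.storage.file.filename. On newer Kafka releases the file connectors may also have to be made available through plugin.path in the worker config, so check the documentation for your version.

```shell
# Hypothetical helper: write a file-source connector config to $1 that reads
# lines from data file $2 and publishes them to topic $3.
write_file_source_config() {
  out=$1
  datafile=$2
  topic=$3
  cat > "$out" <<EOF
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=$datafile
topic=$topic
EOF
}

# Example:
# write_file_source_config my-file-source.properties /data/messages.txt test
# bin/connect-standalone.sh config/connect-standalone.properties my-file-source.properties
```

Writing to a separate file leaves the sample config that ships with Kafka untouched.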

Solution 4

If you are using Linux or macOS, the easiest way is:

kafka-console-producer --broker-list localhost:9092 --topic test < messages.txt

Reference: https://github.com/Landoop/kafka-cheat-sheet

Author by

Admin

Updated on September 29, 2020

Comments

  • Admin
    Admin over 3 years

    I am trying to load a data file in a loop (to check stats) instead of using standard input in Kafka. After downloading Kafka, I performed the following steps:

    Started zookeeper:

    bin/zookeeper-server-start.sh config/zookeeper.properties
    

    Started Server:

    bin/kafka-server-start.sh config/server.properties
    

    Created a topic named "test":

    bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
    

    Ran the Producer:

    bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test 
    Test1
    Test2
    

    Listened by the Consumer:

    bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
    Test1
    Test2
    

    Instead of standard input, I want to pass a data file to the producer so that its contents can be seen directly by the consumer. Or is there a Kafka producer other than the console producer that I can use to read data files? Any help would really be appreciated. Thanks!

  • Marko Bonaci
    Marko Bonaci about 8 years
    Or, if you want to read the whole file and then continue tailing for subsequently appended lines, you'd use tail -f -n +1 file_path, instead of cat.
  • WesternGun
    WesternGun about 6 years
    Kafka has a built-in file source connector, which is made for this type of task: reading a single file into a producer so that consumers can pull the data. See my answer below.
  • awadhesh14
    awadhesh14 almost 5 years
    Can you give an example of the contents of config/connect-file-source.properties and config/connect-standalone.properties?
  • NickyPatel
    NickyPatel over 3 years
    I tried this answer but got the error "no files found". I then gave the actual path, C:\data\messages.txt, and got the same error. Next I tried ..\ in the path (the parent folder), got confused, and used Tab completion to see the files there. Hooray, it worked! The file could not be found because the path was being resolved relative to the current location: given c:\data\message.txt, it looked for c inside the current location. So I had to move up with the parent-folder notation, ..\