Kafka Producer TimeOutException

The error indicates that some records are put into the queue at a faster rate than they can be sent from the client.

When your producer sends messages, they are stored in a buffer (before being sent to the target broker), and the records are grouped into batches to increase throughput. When a new record is added to a batch, it must be sent within a configurable time window, controlled by request.timeout.ms (the default is 30 seconds). If a batch stays in the queue for longer than that, a TimeoutException is thrown, and the batch's records are removed from the queue and won't be delivered to the broker.
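
To make the expiry path concrete, below is a minimal sketch, assuming a plain Java Kafka client rather than the Samza producer: the send() callback is where an expired batch surfaces as a TimeoutException. The broker address and topic name are placeholders.

    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.errors.TimeoutException;

    public class ExpiringBatchDemo {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092"); // placeholder broker address
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("request.timeout.ms", "30000"); // the default 30-second window

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("some-topic", "key", "value"),
                        (metadata, exception) -> {
                            // If the batch sat in the accumulator longer than the timeout,
                            // the callback fires with a TimeoutException and its records
                            // are dropped instead of being delivered to the broker.
                            if (exception instanceof TimeoutException) {
                                System.err.println("Batch expired: " + exception.getMessage());
                            }
                        });
                producer.flush();
            }
        }
    }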

Increasing the value of request.timeout.ms should do the trick for you.

In case this does not work, you can also try decreasing batch.size so that batches are sent more often (but will include fewer messages each), and make sure that linger.ms is set to 0 (which is the default value).
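
For reference, here is a minimal sketch of those three settings together, with illustrative values rather than recommendations; in a Samza job these keys are normally supplied through the job's Kafka producer configuration instead of being set directly in code.

    import java.util.Properties;

    public class ProducerTuning {
        // Illustrative values only; pick numbers that fit your own workload.
        public static Properties tunedProps() {
            Properties props = new Properties();
            props.put("request.timeout.ms", "120000"); // allow queued batches more time before they expire
            props.put("batch.size", "8192");           // smaller batches fill up and are sent more often
            props.put("linger.ms", "0");               // dispatch a batch as soon as it is ready (the default)
            return props;
        }
    }

These properties would simply be merged into the rest of the producer configuration before constructing the producer.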

Note that these are producer (client-side) settings, so you need to restart your producer application after changing any of them.

If you still get the error, I assume that something is wrong with your network. Have you enabled SSL?


Comments

  • Anuj jain almost 2 years

    I am running a Samza stream job that is writing data to a Kafka topic. Kafka is running as a 3-node cluster. The Samza job is deployed on YARN. We are seeing a lot of these exceptions in the container logs:

     INFO [2018-10-16 11:14:19,410] [U:2,151,F:455,T:2,606,M:2,658] samza.container.ContainerHeartbeatMonitor:[ContainerHeartbeatMonitor:stop:61] - [main] - Stopping ContainerHeartbeatMonitor
    ERROR [2018-10-16 11:14:19,410] [U:2,151,F:455,T:2,606,M:2,658] samza.runtime.LocalContainerRunner:[LocalContainerRunner:run:107] - [main] - Container stopped with Exception. Exiting process now.
    org.apache.samza.SamzaException: org.apache.samza.SamzaException: Unable to send message from TaskName-Partition 15 to system kafka.
            at org.apache.samza.task.AsyncRunLoop.run(AsyncRunLoop.java:147)
            at org.apache.samza.container.SamzaContainer.run(SamzaContainer.scala:694)
            at org.apache.samza.runtime.LocalContainerRunner.run(LocalContainerRunner.java:104)
            at org.apache.samza.runtime.LocalContainerRunner.main(LocalContainerRunner.java:149)
    Caused by: org.apache.samza.SamzaException: Unable to send message from TaskName-Partition 15 to system kafka.
            at org.apache.samza.system.kafka.KafkaSystemProducer$$anon$1.onCompletion(KafkaSystemProducer.scala:181)
            at org.apache.kafka.clients.producer.internals.RecordBatch.done(RecordBatch.java:109)
            at org.apache.kafka.clients.producer.internals.RecordBatch.maybeExpire(RecordBatch.java:160)
            at org.apache.kafka.clients.producer.internals.RecordAccumulator.abortExpiredBatches(RecordAccumulator.java:245)
            at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:212)
            at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:135)
            at java.lang.Thread.run(Thread.java:748)
    Caused by: org.apache.kafka.common.errors.TimeoutException: Expiring 5 record(s) for Topic3-16 due to 30332 ms has passed since last attempt plus backoff time
    

    These three types of exceptions appear frequently:

    59088 org.apache.kafka.common.errors.TimeoutException: Expiring 115 record(s) for Topic3-1 due to 30028 ms has passed since last attempt plus backoff time
    
    61015 org.apache.kafka.common.errors.TimeoutException: Expiring 60 record(s) for Topic3-1 due to 74949 ms has passed since batch creation plus linger time
    
    62275 org.apache.kafka.common.errors.TimeoutException: Expiring 176 record(s) for Topic3-4 due to 74917 ms has passed since last append
    

    Please help me understand what the issue is here. Whenever it happens, the Samza container gets restarted.

  • Anuj jain over 5 years
    Thanks, Giorgos, I will try this property. I am also getting a lot of errors like org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms. Is this also due to request.timeout.ms?
  • Giorgos Myrianthous over 5 years
    @Anujjain First of all, can you confirm that all of your Kafka brokers are up and running? Secondly, how many retries have you set?
  • Anuj jain over 5 years
    Yes, all the Kafka brokers are running. Below is the configuration that we have:

        batch.size = 16384
        buffer.memory = 33554432
        linger.ms = 0
        max.block.ms = 60000
        max.in.flight.requests.per.connection = 1
        max.request.size = 14000000
        metadata.fetch.timeout.ms = 60000
        metadata.max.age.ms = 300000
        metrics.sample.window.ms = 30000
        receive.buffer.bytes = 32768
        reconnect.backoff.ms = 50
        request.timeout.ms = 30000
        retries = 20
        retry.backoff.ms = 1000
  • Giorgos Myrianthous over 5 years
    @Anujjain Did you try to increase request.timeout.ms?
  • Anuj jain over 5 years
    Yes, I increased the timeout and did not see the TimeoutException after that. How do I decide which value is good for request.timeout.ms? Also, will increasing retry.backoff.ms help, since the producer gets more time to send messages? I also want to understand this error; can you explain a bit about it: Failed to update metadata after 60000 ms
  • Giorgos Myrianthous over 5 years
    Since this question has been answered and you are currently facing another error, I would suggest asking a new question and including the new errors and configuration in it.
  • Sachin Mhetre over 4 years
    @GiorgosMyrianthous, does retries work in the case of a TimeoutException?