Difference between heartbeat.interval.ms and session.timeout.ms in Kafka consumer config

Solution 1

The heartbeat.interval.ms specifies how often the consumer sends a heartbeat signal. So if this is 3000 ms (default), the consumer sends a heartbeat to the broker every 3 seconds.

The session.timeout.ms specifies the window within which the broker must receive at least one heartbeat signal from the consumer; otherwise it marks the consumer as dead. With the default value of 10000 ms (10 seconds; the default differs between Kafka versions) and a 3000 ms heartbeat interval, three heartbeats are due within that window, so the broker marks the consumer as dead only after all three have gone missing.

In a network setup under heavy load, it is normal to miss a few heartbeat signals. So it is recommended to leave room for about three heartbeats before marking the consumer as dead. That is the reason for the 1/3 recommendation.
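The arithmetic behind the 1/3 rule can be sketched in a few lines of Python. The helper name heartbeat_budget is hypothetical and not part of any Kafka client; it only counts how many heartbeats are due within one session timeout window.

```python
def heartbeat_budget(session_timeout_ms: int, heartbeat_interval_ms: int) -> int:
    """Count the heartbeats that are due within one session timeout window.

    Hypothetical helper for illustration only; not part of any Kafka client.
    """
    if session_timeout_ms <= 0 or heartbeat_interval_ms <= 0:
        raise ValueError("both values must be positive")
    return session_timeout_ms // heartbeat_interval_ms


# Values quoted in the answer: 3 s heartbeat interval, 10 s session timeout.
print(heartbeat_budget(10_000, 3_000))   # 3
# Setting both to the same value leaves exactly one chance per window.
print(heartbeat_budget(10_000, 10_000))  # 1
```

With the 1/3 ratio, a single delayed or dropped heartbeat does not end the session, because more heartbeats are still due before the window closes.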

Solution 2

The code enforces a hard limit: you cannot set heartbeat.interval.ms greater than or equal to session.timeout.ms, otherwise Kafka complains "Heartbeat must be set lower than the session timeout".
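A rough Python sketch of such a check, going by the wording of the error message (which names the session timeout). The function check_heartbeat_config is hypothetical and only mimics the behavior described here; it is not Kafka's actual code.

```python
def check_heartbeat_config(config: dict) -> None:
    """Reject configs whose heartbeat interval is not below the session timeout.

    Hypothetical stand-in for the client-side check; only the error text
    is taken from the answer above.
    """
    heartbeat_ms = int(config["heartbeat.interval.ms"])
    session_ms = int(config["session.timeout.ms"])
    if heartbeat_ms >= session_ms:
        raise ValueError("Heartbeat must be set lower than the session timeout")


# A 1/3 ratio passes silently.
check_heartbeat_config({"heartbeat.interval.ms": 3000, "session.timeout.ms": 10000})

# Equal values are rejected.
try:
    check_heartbeat_config({"heartbeat.interval.ms": 10000, "session.timeout.ms": 10000})
except ValueError as err:
    print(err)  # Heartbeat must be set lower than the session timeout
```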

If you really could set these two configs to the same value, the network client might never get to send a heartbeat, because the session timeout would nearly always expire before the next heartbeat is due.
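One way to picture this is a toy timeline in Python. The model is an assumption of this sketch, not a description of Kafka's client internals: the consumer joins at t = 0, sends a heartbeat every interval, each heartbeat arrives after a fixed network delay, and the session expires if the gap since the last received heartbeat exceeds the session timeout. The function name session_survives is hypothetical.

```python
def session_survives(session_timeout_ms: int,
                     heartbeat_interval_ms: int,
                     network_delay_ms: int,
                     rounds: int = 5) -> bool:
    """Toy model: True if every heartbeat arrives before the session expires."""
    last_seen = 0  # the consumer joined the group at t = 0
    for n in range(1, rounds + 1):
        arrival = n * heartbeat_interval_ms + network_delay_ms
        if arrival - last_seen > session_timeout_ms:
            return False  # the session expired before this heartbeat arrived
        last_seen = arrival
    return True


# Equal values: even a 50 ms delay makes the first heartbeat arrive too late.
print(session_survives(10_000, 10_000, 50))  # False
# 1/3 ratio: the same delay is harmless.
print(session_survives(10_000, 3_000, 50))   # True
```

In this model, equal values survive only when the delay is exactly zero, which is consistent with the session timeout "nearly always" winning the race.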

As for the 1/3, I prefer to think of it as a heuristic value.

Author by

novon

Updated on September 15, 2022

Comments

  • novon
    novon over 1 year

    I'm currently running kafka 0.10.0.1 and the corresponding docs for the two values in question are as follows:

    heartbeat.interval.ms - The expected time between heartbeats to the consumer coordinator when using Kafka's group management facilities. Heartbeats are used to ensure that the consumer's session stays active and to facilitate rebalancing when new consumers join or leave the group. The value must be set lower than session.timeout.ms, but typically should be set no higher than 1/3 of that value. It can be adjusted even lower to control the expected time for normal rebalances.

    session.timeout.ms - The timeout used to detect failures when using Kafka's group management facilities. When a consumer's heartbeat is not received within the session timeout, the broker will mark the consumer as failed and rebalance the group. Since heartbeats are sent only when poll() is invoked, a higher session timeout allows more time for message processing in the consumer's poll loop at the cost of a longer time to detect hard failures. See also max.poll.records for another option to control the processing time in the poll loop.

    It isn't clear to me why the docs recommend setting heartbeat.interval.ms to 1/3 of session.timeout.ms. Does it not make sense to have these values be the same since the heartbeat is only sent when poll() is invoked, and thus when processing of the current records is done?

  • novon
    novon almost 7 years
    This covers the relationship between heartbeat.interval.ms and request.timeout.ms but I was asking about the relation between heartbeat.interval.ms and session.timeout.ms.
  • amethystic
    amethystic almost 7 years
    No, it is not related to request.timeout.ms here. Separately, request.timeout.ms must be greater than session.timeout.ms.