In worker, retry forever with an exponential back off when Redis interactions time out #22

nathansobo · 2019-11-07T01:01:59Z

Currently, when we time out talking to Redis, we reconnect and retry the operation. For a fail-over scenario where the Redis server has moved to a new host, this behavior works. For scenarios in which Redis is still available but overwhelmed with load, repeatedly reconnecting and retrying operations can make the situation worse.

In this PR, I introduce Worker#with_exponential_backoff and use it in the Worker instead of with_retries.

When retrying, exponentially back off by powers of 2, up to a maximum of 60 seconds, with 5 seconds of random jitter.
Continue retrying forever until the worker is explicitly shut down. This prevents a scenario where the worker process dies after N attempts only to be restarted by Resqued. This ensures that we continue to retry at a reduced frequency until Redis service health recovers. Restarting the process would cause us to start retrying at a faster rate.
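
To make the schedule above concrete, here is a minimal sketch of how such a helper could look. It is not the code from this PR: the names `backoff_delay`, `shutdown`, and `sleeper` are hypothetical, the first delay is assumed to be 2 seconds, the jitter is assumed to be added on top of the capped delay, and re-raising on shutdown is just one plausible way to leave the loop.

```ruby
# Illustrative sketch only; names and structure are assumptions, not the PR's code.
MAX_BACKOFF_SECONDS = 60
MAX_JITTER_SECONDS = 5

# Delay before retry number `attempt` (1-based): 2, 4, 8, ... capped at 60 seconds,
# plus up to 5 seconds of random jitter (assumed to be added after the cap).
def backoff_delay(attempt, rng: Random)
  exponent = [attempt, 6].min # 2**6 already exceeds the cap, so avoid huge powers
  base = [2**exponent, MAX_BACKOFF_SECONDS].min
  base + rng.rand * MAX_JITTER_SECONDS
end

# Runs the block, retrying forever on the given errors until it succeeds.
# `shutdown` is a callable returning true once the worker has been asked to stop;
# in that case the last error is re-raised instead of retrying.
# `sleeper` is injectable so tests do not have to wait for real delays.
def with_exponential_backoff(shutdown:, sleeper: ->(seconds) { sleep(seconds) }, errors: [StandardError])
  attempt = 0
  begin
    yield
  rescue *errors
    raise if shutdown.call
    attempt += 1
    sleeper.call(backoff_delay(attempt))
    retry
  end
end
```

Because the attempt counter lives inside the running process, a restart would reset it to zero, which is why retrying forever keeps the retry rate low while Redis is unhealthy.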

I limit these changes to the worker because backing off and retrying forever in Unicorn processes when enqueuing jobs could cause request timeouts.

I also change the behavior of with_retries slightly so that each attempt to reconnect also counts as a retry attempt. The existing logic can end up trying to reconnect up to 9 times in certain scenarios.
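
One way to read "reconnects count as attempts" is a single counter shared by the operation and the reconnect, so a failed reconnect uses up budget just like a failed operation. The sketch below illustrates that reading only. The signature (`max_attempts:`, `reconnect:`, `errors:`) is hypothetical and does not reflect the actual with_retries method in this repository.

```ruby
# Illustrative sketch only; parameter names are assumptions, not the real API.
# Each pass through the begin block is one attempt. After a failure, the next
# pass reconnects before running the operation, so a failed reconnect consumes
# an attempt and the total number of reconnects is bounded by max_attempts - 1.
def with_retries(max_attempts:, reconnect:, errors: [StandardError])
  attempts = 0
  reconnect_needed = false
  begin
    attempts += 1
    if reconnect_needed
      reconnect.call
      reconnect_needed = false
    end
    yield
  rescue *errors
    raise if attempts >= max_attempts # budget exhausted: surface the last error
    reconnect_needed = true
    retry
  end
end
```

With `max_attempts: 3` and a Redis that never comes back, this runs the operation once and tries to reconnect twice before raising, rather than multiplying reconnect attempts by operation attempts.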



dbussink · 2019-11-07T13:28:17Z

Sorry, I missed this PR when opening #23, and after @nronas approved it, I had already merged it before seeing this change.

Feel free to incorporate some of the further changes here though, #23 was aiming at the most minimal fix I could come up with.



nathansobo and others added 4 commits Nov 6, 2019

Honor retries parameter when interacting with Redis

ce89808

Only sleep if we actually want to reconnect

0bb79a3

Retry even if we fail to reconnect

0e2abe2

This allows for multiple reconnect attempts before raising, and each one counts as an attempt.

Rename retries parameter to be more explicit

fc772ae

Exponentially back-off when retrying Redis operations

e6cfc05

Co-Authored-By: Nathan Witmer <nathan@zerowidth.com>

nathansobo force-pushed the fix-retries branch from ee6c29a to e6cfc05 Nov 7, 2019

nathansobo added 4 commits Nov 7, 2019

Merge remote-tracking branch 'origin/github' into fix-retries

e15027e

Avoid delaying test with exponential back-off

754517e

Only in worker: enable infinite retries with exponential back-off

4df2639

Make variable name more precise

845cda9

nathansobo changed the title ~~Avoid infinite loop in retry logic when exceptions occur talking to Redis~~ In worker, retry forever with an exponential back off when Redis interactions time out Nov 7, 2019

nathansobo marked this pull request as ready for review Nov 7, 2019

nathansobo added 4 commits Nov 7, 2019

Test shutdown during a retry/backoff loop

4e16da1

Fix comment

b8c5164

Make with_retries simpler again

d294c44

Use with_exponential_backoff explicitly in Worker

8416d86
