fix(csharp/src/Drivers/Databricks): Fix HTTP handler chain ordering to enable retry before exception #3578
Merged
CurtHagenlocher merged 1 commit on Oct 16, 2025
…ry before exception

## Summary

Reordered HTTP delegating handlers in DatabricksConnection to ensure RetryHttpHandler processes responses before ThriftErrorMessageHandler throws exceptions. This fixes a bug where 503 Service Unavailable responses with Retry-After headers (e.g., during cluster auto-start) were not being retried.

## Changes

- Moved ThriftErrorMessageHandler to be OUTSIDE (farther from network) RetryHttpHandler
- Added comprehensive documentation explaining handler chain execution order
- Added cross-references in unit tests pointing to the production code documentation

## Why This Order Matters

HTTP delegating handlers execute from outermost to innermost on requests, and innermost to outermost on responses. With the corrected order:

1. RetryHttpHandler (inner) processes 503 responses first and retries them
2. Only after all retries are exhausted does ThriftErrorMessageHandler (outer) throw exceptions with Thrift error messages

Previous incorrect order had ThriftErrorMessageHandler as the innermost handler, causing it to throw exceptions immediately without allowing retries.

## Test Plan

- All existing unit tests pass (11/11 for ThriftErrorMessageHandler, 14/14 for RetryHttpHandler)
- The fix will be validated in E2E tests when connecting to Databricks clusters that need auto-start

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
CurtHagenlocher approved these changes on Oct 16, 2025
Reviewed diff excerpt:
baseAuthHandler = new ThriftErrorMessageHandler(baseAuthHandler);
// Add tracing handler to propagate W3C trace context if enabled
// IMPORTANT: Handler Order Matters!
Summary
Reordered HTTP delegating handlers in DatabricksConnection to ensure RetryHttpHandler processes responses before ThriftErrorMessageHandler throws exceptions. This fixes a bug where 503 Service Unavailable responses with Retry-After headers (e.g., during cluster auto-start) were not being retried.
Problem
Previously, the handler chain had ThriftErrorMessageHandler as the innermost handler, closest to the network.
This caused ThriftErrorMessageHandler to process error responses first and throw exceptions immediately, preventing RetryHttpHandler from retrying 503 responses during cluster auto-start scenarios.
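The ordering rule behind this bug can be illustrated with a small, self-contained sketch. It does not use the driver's real handlers: `TraceHandler`, `StubNetworkHandler`, and `OrderDemo` are hypothetical names introduced here, and the stub stands in for the network so nothing is actually sent. It only shows that the innermost `DelegatingHandler` is the first to see a response.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical handler (not part of the driver): records when it sees
// the request on the way in and the response on the way out.
sealed class TraceHandler : DelegatingHandler
{
    private readonly string _name;
    private readonly List<string> _log;

    public TraceHandler(string name, List<string> log, HttpMessageHandler inner)
        : base(inner)
    {
        _name = name;
        _log = log;
    }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        _log.Add($"{_name}: request");
        HttpResponseMessage response = await base.SendAsync(request, cancellationToken);
        _log.Add($"{_name}: response");
        return response;
    }
}

// Hypothetical stand-in for the network: always answers 200 OK locally.
sealed class StubNetworkHandler : HttpMessageHandler
{
    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
        => Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK));
}

static class OrderDemo
{
    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();

        // Each wrapper becomes the new outermost handler.
        HttpMessageHandler chain = new StubNetworkHandler();
        chain = new TraceHandler("inner", log, chain);
        chain = new TraceHandler("outer", log, chain);

        using var client = new HttpClient(chain);
        await client.GetAsync("http://localhost/");
        return log;
    }

    static async Task Main()
    {
        foreach (string line in await RunAsync())
        {
            Console.WriteLine(line);
        }
    }
}
```

The log reads `outer: request`, `inner: request`, `inner: response`, `outer: response`. A handler that throws on error responses while sitting in the "inner" position therefore ends the call before any handler outside it can inspect the response.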
Solution
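Reordered the chain so RetryHttpHandler is inside ThriftErrorMessageHandler (closer to the network).
Now responses flow: Network → RetryHttpHandler → ThriftErrorMessageHandler
With this order:
1. RetryHttpHandler (inner) processes 503 responses first and retries them
2. Only after all retries are exhausted does ThriftErrorMessageHandler (outer) throw exceptions with Thrift error messages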
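A minimal sketch of the two orderings, under stated assumptions: `SketchRetryHandler` and `SketchErrorHandler` are simplified, hypothetical stand-ins for RetryHttpHandler and ThriftErrorMessageHandler (the real handlers' retry policy and error parsing are not reproduced here), and `WarmingUpClusterStub` fakes a cluster that answers 503 with a Retry-After header twice before returning 200.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical network stub: two 503 responses with Retry-After, then 200.
sealed class WarmingUpClusterStub : HttpMessageHandler
{
    public int Calls { get; private set; }

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        Calls++;
        bool warmingUp = Calls <= 2;
        var response = new HttpResponseMessage(
            warmingUp ? HttpStatusCode.ServiceUnavailable : HttpStatusCode.OK);
        if (warmingUp)
        {
            // Zero delay keeps the demo fast; a real server would send seconds.
            response.Headers.RetryAfter = new RetryConditionHeaderValue(TimeSpan.Zero);
        }
        return Task.FromResult(response);
    }
}

// Simplified stand-in for a retry handler (assumed policy: retry 503 only
// when Retry-After carries a delay, up to MaxAttempts).
sealed class SketchRetryHandler : DelegatingHandler
{
    private const int MaxAttempts = 5;

    public SketchRetryHandler(HttpMessageHandler inner) : base(inner) { }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        for (int attempt = 1; ; attempt++)
        {
            HttpResponseMessage response = await base.SendAsync(request, cancellationToken);
            TimeSpan? delay = response.Headers.RetryAfter?.Delta;
            bool retryable = response.StatusCode == HttpStatusCode.ServiceUnavailable
                             && delay.HasValue;
            if (!retryable || attempt >= MaxAttempts)
            {
                return response;
            }
            response.Dispose();
            await Task.Delay(delay.GetValueOrDefault(), cancellationToken);
        }
    }
}

// Simplified stand-in for an error-translating handler: throws on any
// non-success status it sees.
sealed class SketchErrorHandler : DelegatingHandler
{
    public SketchErrorHandler(HttpMessageHandler inner) : base(inner) { }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        HttpResponseMessage response = await base.SendAsync(request, cancellationToken);
        if (!response.IsSuccessStatusCode)
        {
            throw new HttpRequestException($"Request failed with status {(int)response.StatusCode}");
        }
        return response;
    }
}

static class ChainDemo
{
    // Returns the number of network calls made, and whether the call succeeded.
    public static async Task<(int Calls, bool Succeeded)> RunAsync(bool retryInside)
    {
        var network = new WarmingUpClusterStub();
        HttpMessageHandler chain = retryInside
            ? new SketchErrorHandler(new SketchRetryHandler(network))   // corrected order
            : new SketchRetryHandler(new SketchErrorHandler(network));  // previous order
        using var client = new HttpClient(chain);
        try
        {
            using HttpResponseMessage response = await client.GetAsync("http://localhost/");
            return (network.Calls, response.IsSuccessStatusCode);
        }
        catch (HttpRequestException)
        {
            return (network.Calls, false);
        }
    }

    static async Task Main()
    {
        Console.WriteLine(await RunAsync(retryInside: true));
        Console.WriteLine(await RunAsync(retryInside: false));
    }
}
```

In this sketch the corrected order succeeds on the third network call, while the previous order fails after a single call because the error handler throws before the retry handler ever sees the 503.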
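Changes
- Moved ThriftErrorMessageHandler outside RetryHttpHandler in DatabricksConnection.CreateHttpHandler()
- Added documentation explaining handler chain execution order
- Added cross-references in RetryHttpHandlerTest and ThriftErrorMessageHandlerTest pointing to the production code
Test Plan
- ThriftErrorMessageHandlerTest: 11/11 tests pass
- RetryHttpHandlerTest: 14/14 tests pass
Related Issues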
Fixes cluster auto-start retry issues where 503 responses with Retry-After headers were not being retried.