fix(csharp/src/Drivers/Databricks): Correct DatabricksCompositeReader and StatusPoller to Stop/Dispose Appropriately by toddmeng-db · Pull Request #3217 · apache/arrow-adbc

toddmeng-db · 2025-07-29T17:00:04Z

Motivation

The following cases are not properly stopping or disposing the status poller:

When the DatabricksCompositeReader is explicitly disposed by the user
When CloudFetchReader is done returning results
On edge-case terminal operation statuses (timedout_state, unknown_state)

In addition:

When DatabricksOperationStatusPoller.Dispose() is called, it may cancel the GetOperationStatusRequest in the client. If the input buffer has data when cancellation is triggered, the TCLI client is left with unconsumed/unsent data in the buffer, which breaks subsequent requests (fixed in this PR)

Fixes

The DatabricksOperationStatusPoller logic is now managed by DatabricksCompositeReader (moved out of BaseDatabricksReader), so that it handles all cases where null results (indicating completion) are returned.

Disposing DatabricksCompositeReader appropriately disposes the activeReader and statusPoller
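
The ownership described above can be sketched roughly as follows. This is a minimal illustration, not the driver's actual code: `CompositeReaderSketch`, `OnResultsExhausted`, and the two `IDisposable` fields are hypothetical stand-ins for DatabricksCompositeReader, its active reader, and its status poller.

```csharp
#nullable enable
using System;

// Hypothetical stand-in for a composite reader that owns a reader and a poller.
public sealed class CompositeReaderSketch : IDisposable
{
    private IDisposable? _activeReader;
    private IDisposable? _statusPoller;
    private bool _disposed;

    public CompositeReaderSketch(IDisposable activeReader, IDisposable statusPoller)
    {
        _activeReader = activeReader;
        _statusPoller = statusPoller;
    }

    // Assumed hook: called when the active reader returns null (no more results).
    public void OnResultsExhausted() => StopPoller();

    // Safe to call more than once; the poller is disposed only the first time.
    private void StopPoller()
    {
        _statusPoller?.Dispose();
        _statusPoller = null;
    }

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;

        // Stop polling first so nothing polls while the reader is torn down.
        StopPoller();
        _activeReader?.Dispose();
        _activeReader = null;
    }
}
```

In this sketch the poller stops both when results run out and when the user disposes the reader explicitly, and a second Dispose() call is a no-op.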

TODO

Follow-up PR: when the statement is disposed, it should also dispose the reader (the poller is currently stopped when the operation handle is set to null, but this should also happen explicitly)

Need to add some unit tests (follow-up PR: #3243)

jadewang-db · 2025-07-29T17:53:18Z

Can you confirm that, even without this fix, polling stops after the statement is disposed? If not, we need to fix that as well.

toddmeng-db · 2025-07-30T04:04:55Z

@@ -247,6 +247,10 @@ private async Task FetchResultsAsync(CancellationToken cancellationToken)
                _downloadQueue.Add(EndOfResultsGuard.Instance, cancellationToken);
                _isCompleted = true;


From testing:
Small nit, but I think we need to avoid this here: it's possible that the DownloadQueue is full, in which case the exception handling would get stuck. Should I modify the exception handling below, or was there a reason why it is like this? (line 262) @jadewang-db

catch (Exception ex) { try { _downloadQueue.Add(EndOfResultsGuard.Instance, CancellationToken.None); } }

Alternatively, we can create a new CancellationToken with Timeout for this attempt

CancellationToken GetOperationStatusTimeoutToken = ApacheUtility.GetCancellationToken(_requestTimeoutSeconds, ApacheUtility.TimeUnit.Seconds);

thanks, a cancellation token looks good

Oh just saw this comment, let me implement

Actually looks like TryAdd is better suited here
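
A rough sketch of why TryAdd fits this case, assuming `_downloadQueue` is a bounded `BlockingCollection<T>` (the thread does not show its declaration). `QueueSentinelSketch`, `TrySignalEnd`, and `EndMarker` are hypothetical names; `EndMarker` stands in for EndOfResultsGuard.Instance.

```csharp
using System;
using System.Collections.Concurrent;

public static class QueueSentinelSketch
{
    // Hypothetical stand-in for EndOfResultsGuard.Instance.
    public const string EndMarker = "<end-of-results>";

    // Tries to enqueue the end marker without blocking indefinitely.
    // Returns false if the queue stayed full for the whole timeout,
    // or if it can no longer accept items.
    public static bool TrySignalEnd(BlockingCollection<string> queue, TimeSpan timeout)
    {
        try
        {
            // Unlike Add(item, CancellationToken.None), this gives up after the timeout.
            return queue.TryAdd(EndMarker, timeout);
        }
        catch (InvalidOperationException)
        {
            // The collection was marked complete for adding (or disposed).
            return false;
        }
    }
}
```

With a full queue and no consumer, `Add` with `CancellationToken.None` would wait forever inside the exception handler, whereas `TrySignalEnd` returns false after the timeout and lets the handler continue.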

toddmeng-db · 2025-07-30T22:51:37Z

@@ -69,13 +72,18 @@ private async Task PollOperationStatus(CancellationToken cancellationToken)
                    var operationHandle = _statement.OperationHandle;
                    if (operationHandle == null) break;



We need to use a timeout token here instead of cancelling when the cancellation token is triggered. If an interrupt is triggered prematurely, the TCLI client may still have unsent/unconsumed results in its buffers, which affects subsequent calls with that client (that is, any future call in the same Session)
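
One way to picture this approach is the sketch below. It is an assumption-laden illustration: `TimeoutTokenSketch` and `CallWithTimeoutAsync` are hypothetical names, and a plain `CancellationTokenSource` with a timeout is used in place of the driver's ApacheUtility.GetCancellationToken helper. The caller's token is only checked before the request starts; the request itself runs under an independent timeout token.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimeoutTokenSketch
{
    // Runs one RPC so that caller cancellation is observed only between requests,
    // never in the middle of one.
    public static async Task<T> CallWithTimeoutAsync<T>(
        Func<CancellationToken, Task<T>> rpc,
        TimeSpan timeout,
        CancellationToken callerToken)
    {
        // Honor cancellation before anything is written to the transport.
        callerToken.ThrowIfCancellationRequested();

        // The in-flight request is bounded by a fresh timeout token, not by callerToken,
        // so a Dispose() or user cancel cannot abandon it halfway through.
        using var timeoutSource = new CancellationTokenSource(timeout);
        return await rpc(timeoutSource.Token).ConfigureAwait(false);
    }
}
```

The trade-off is that a cancelled caller may wait up to one request timeout before the loop exits, in exchange for leaving the shared client in a usable state.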

are you able to repro this? should we do this to all the thrift rpc calls in the driver?

I think it is because in THTTPTransport (used by SparkHttpConnection -> DatabricksHttpconnection), a new Stream is created when the request is flushed. If cancellation happens before this, that stream doesn't get discarded:
https://github.com/apache/thrift/blob/master/lib/netstd/Thrift/Transport/Client/THttpTransport.cs#L281

Yes, during testing, got some errors. In the proxy logs, I remember seeing requests sent out with both GetOperationStatus and CloseOperationStatus (in the same request) while testing another PR

I think we are safe in HiveServer2Statement, but we might need to adjust CancellationToken in DatabricksReader, CloudFetchResultFetcher, and DatabricksCompositeReader

Actually, I think this depends a bit on how CancellationToken could be used by PBI, too
@CurtHagenlocher will mashup ever trigger cancellationTokens passed into IArrowStreamReader.ReadNextBatchAsync? Do we need to ensure that the connection still remains usable for subsequent statements?

At least for now, I think we can operate this way:

If the user cancels the token passed in to ReadNextBatchAsync, we should not break the client

Dispose() should not break the client either

@CurtHagenlocher will mashup ever trigger cancellationTokens passed into IArrowStreamReader.ReadNextBatchAsync? Do we need to ensure that the connection still remains usable for subsequent statements?

This is currently unimplemented but we'll need to implement it before GA for parity with the ODBC implementation. What is probably most important for cancellation is query execution, and unless we manage to push forward the proposed ADBC 1.1 API, currently the only way to cancel a running query is to call AdbcStatement.Cancel. There is currently no implementation of this method for any of the C#-implemented drivers :(.

From a Power BI perspective, the most important use of cancellation is for Direct Query because users can generate a lot of queries simply by clicking around in a visual and in-progress queries will need to be cancelled if their output is no longer needed. DQ output tends to be relatively small, so being able to cancel in the middle of reading the output is arguably less important than being able to cancel before the results start coming back.

jackyhu-db · 2025-08-07T01:12:20Z

                request.StartRowOffset = offset;

+                // Cancelling mid-request breaks the client; Dispose() should not break the underlying client
+                CancellationToken expiringToken = ApacheUtility.GetCancellationToken(DatabricksConstants.DefaultCloudFetchRequestTimeoutSeconds, ApacheUtility.TimeUnit.Seconds);


should you respect the connection parameter DatabricksParameters.CloudFetchTimeoutMinutes instead of the default value?

Do you mean I shouldn't create a new constant here?

no, what I meant is if you should check the value of the connection parameter CloudFetchTimeoutMinutes (adbc.databricks.cloudfetch.timeout_minutes) which can be set by the client and customer.

Oh got it, that makes sense, it should be a configurable parameter. To be consistent with the rest of HiveServer2Statement, I'm just using the QueryTimeout parameter (which is what the other FetchResultsRequest calls use)

I have some changes in a follow-up PR that will make this change easier to do for DatabricksReader, will leave this as a TODO
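
As a rough illustration of the reviewer's point about honoring a client-supplied timeout rather than a hard-coded default, the sketch below reads a minutes value from a property bag and falls back when it is missing or invalid. `TimeoutOptionSketch` and `ResolveTimeout` are hypothetical; only the key adbc.databricks.cloudfetch.timeout_minutes comes from the discussion above, and the driver's real parsing may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class TimeoutOptionSketch
{
    // Returns the configured timeout in minutes if present and positive,
    // otherwise the supplied fallback.
    public static TimeSpan ResolveTimeout(
        IReadOnlyDictionary<string, string> properties,
        string key,
        TimeSpan fallback)
    {
        if (properties.TryGetValue(key, out var raw)
            && int.TryParse(raw, NumberStyles.Integer, CultureInfo.InvariantCulture, out int minutes)
            && minutes > 0)
        {
            return TimeSpan.FromMinutes(minutes);
        }

        // Missing, non-numeric, or non-positive values fall back to the default.
        return fallback;
    }
}
```

A caller would pass the connection properties, the parameter key, and whatever default the driver already uses.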

CurtHagenlocher

Thanks! The linter error needs to be fixed and I made a few small low-priority suggestions.

toddmeng-db changed the title ~~Error handling for operation status poller~~ fix(csharp/src/Drivers/Databricks): Error handling for operation status poller Jul 29, 2025

toddmeng-db force-pushed the toddmeng-db/operation-status-poller-error-handling branch 2 times, most recently from ec41720 to 004a5a7 Compare July 29, 2025 17:50

toddmeng-db commented Jul 29, 2025

Comment thread csharp/src/Drivers/Apache/Hive2/HiveServer2Statement.cs

toddmeng-db changed the title ~~fix(csharp/src/Drivers/Databricks): Error handling for operation status poller~~ fix(csharp/src/Drivers/Databricks): Tighten OperationStatusPoller Disposal Jul 29, 2025

toddmeng-db commented Jul 30, 2025

Comment thread csharp/src/Drivers/Apache/Hive2/HiveServer2Statement.cs

toddmeng-db commented Jul 30, 2025

toddmeng-db force-pushed the toddmeng-db/operation-status-poller-error-handling branch from 9caf8db to 9fd9fea Compare July 30, 2025 04:32

toddmeng-db changed the title ~~fix(csharp/src/Drivers/Databricks): Tighten OperationStatusPoller Disposal~~ fix(csharp/src/Drivers/Databricks): Tighten Statement Disposal Jul 30, 2025

toddmeng-db changed the title ~~fix(csharp/src/Drivers/Databricks): Tighten Statement Disposal~~ fix(csharp/src/Drivers/Databricks): Tighten Statement, Reader, Poller Disposal Jul 30, 2025

toddmeng-db force-pushed the toddmeng-db/operation-status-poller-error-handling branch 9 times, most recently from d55808c to 74c6ee8 Compare July 30, 2025 22:49

toddmeng-db commented Jul 30, 2025

toddmeng-db force-pushed the toddmeng-db/operation-status-poller-error-handling branch 4 times, most recently from ecb0771 to 3263cef Compare July 31, 2025 04:52

toddmeng-db changed the title ~~fix(csharp/src/Drivers/Databricks): Tighten Statement, Reader, Poller Disposal~~ fix(csharp/src/Drivers/Databricks): Correct StatusPoller to Stop/Dispose Appropriately Aug 1, 2025

toddmeng-db force-pushed the toddmeng-db/operation-status-poller-error-handling branch 4 times, most recently from 8b88019 to 8e54490 Compare August 1, 2025 16:44

toddmeng-db force-pushed the toddmeng-db/operation-status-poller-error-handling branch from 579e26d to be06c48 Compare August 2, 2025 01:30

toddmeng-db requested a review from jadewang-db August 4, 2025 17:03

alexguo-db reviewed Aug 4, 2025

Comment thread csharp/src/Drivers/Databricks/DatabricksCompositeReader.cs

Comment thread csharp/src/Drivers/Databricks/DatabricksOperationStatusPoller.cs Outdated

nit fix

2b5c4f3

CurtHagenlocher changed the title ~~fix(csharp/src/Drivers/Databricks): Correct DatabricksCompositeReader and StatusPoller to Stop/Dispose Appropriately~~ fix(csharp/src/Drivers/Databricks): Correct DatabricksCompositeReader and StatusPoller to Stop/Dispose Appropriately Aug 5, 2025

toddmeng-db added 2 commits August 6, 2025 10:24

nit fix

fca8ce0

stop polling unit test

e5c962c

jackyhu-db approved these changes Aug 6, 2025

toddmeng-db requested review from CurtHagenlocher and alexguo-db August 6, 2025 21:34

toddmeng-db marked this pull request as ready for review August 6, 2025 21:35

github-actions Bot added this to the ADBC Libraries 20 milestone Aug 6, 2025

jackyhu-db reviewed Aug 7, 2025

Comment thread csharp/src/Drivers/Databricks/DatabricksCompositeReader.cs Outdated

nit fixes

a39f702

toddmeng-db force-pushed the toddmeng-db/operation-status-poller-error-handling branch 2 times, most recently from 9242fd2 to efecc82 Compare August 7, 2025 19:40

use querytimeout parameter for cloudfetchresultfetcher timeout

65f9d0d

toddmeng-db force-pushed the toddmeng-db/operation-status-poller-error-handling branch from efecc82 to 65f9d0d Compare August 7, 2025 19:41

CurtHagenlocher requested changes Aug 8, 2025

toddmeng-db force-pushed the toddmeng-db/operation-status-poller-error-handling branch 4 times, most recently from f559692 to 5a48ef2 Compare August 8, 2025 19:54

nit fixes

4130c83

toddmeng-db force-pushed the toddmeng-db/operation-status-poller-error-handling branch from 5a48ef2 to 4130c83 Compare August 8, 2025 19:55

toddmeng-db requested a review from CurtHagenlocher August 8, 2025 20:00

CurtHagenlocher approved these changes Aug 8, 2025

CurtHagenlocher merged commit f0f36da into apache:main Aug 8, 2025
7 checks passed

serramatutu mentioned this pull request Aug 12, 2025

Sync upstream dbt-labs/arrow-adbc#52

Closed
