[DRAFT] feat: added client side metric instrumentation to read_rows and mutate_rows#16758
Conversation
There was a problem hiding this comment.
Code Review
This pull request integrates client-side metrics tracking into Bigtable data operations, including ReadRows and MutateRows, for both asynchronous and synchronous implementations. It introduces tracked_retry to capture operation metrics and refactors read_row to utilize operation classes directly, supported by extensive new system and unit tests. Feedback identifies opportunities to improve the efficiency of read_row by using next() or __anext__ on generators instead of converting them to lists, and suggests adding a type ignore comment in the synchronous code for consistency.
| sleep_generator, | ||
| operation_timeout, | ||
| exception_factory=_retry_exception_factory, | ||
| self._operation = lambda: tracked_retry( |
There was a problem hiding this comment.
where did sleep_generator and exception_factory go? will this still wrap timeout exceptions correctly into Deadline Exceeded excetpion?
There was a problem hiding this comment.
| raise close_exception | ||
| except Exception as generic_exception: | ||
| # handle exceptions in retry wrapper | ||
| raise generic_exception |
There was a problem hiding this comment.
Should we try extract the status code from Exception and call end_with_status in this block also? For example, if the stream timed out mid stream, will it be captured in the metrics?
There was a problem hiding this comment.
In this case, the exceptions are being captured by the tracked_retry object. I tried to cover the various tracking strategies here: go/bigtable-csm-python
| bt_exceptions.FailedMutationEntryError(idx, entry, cause_exc) | ||
| ) | ||
| if all_errors: | ||
| raise bt_exceptions.MutationsExceptionGroup( |
There was a problem hiding this comment.
will this affect extracting the grpc status code for the metric since we wrap it in a different exception?
There was a problem hiding this comment.
We have logic for converting exceptions to statuses is here. When we encounter an exception group, we use the last failure as the status
| bt_exceptions.FailedMutationEntryError(idx, entry, cause_exc) | ||
| ) | ||
| if all_errors: | ||
| raise bt_exceptions.MutationsExceptionGroup( |
There was a problem hiding this comment.
Same question here, will this error code get extracted correctly for metrics?
Migration of googleapis/python-bigtable#1256 to the monorepo
Builds off of #16712 to add instrumentation to read_rows and mutate_rows, along with the mutation batcher