[Python] Honor disableCounterMetrics, disableStringSetMetrics, and disableBoundedTrieMetrics experiments by Anuragp22 · Pull Request #38749 · apache/beam

Anuragp22 · 2026-05-30T06:27:04Z

Adds Python SDK support for the disableCounterMetrics, disableStringSetMetrics, and disableBoundedTrieMetrics pipeline experiments. High-throughput Python pipelines previously had no way to opt out of user metric kinds that add pressure to metric backends; Java already honored these three experiments, this PR brings parity to Python.

What changes

sdks/python/apache_beam/metrics/metric.py: new internal MetricsFlag class with idempotent set_default_pipeline_options. DelegatingCounter.inc, DelegatingStringSet.add, and DelegatingBoundedTrie.add now consult the flag and early-return when disabled, so no MetricCell is touched and no monitoring info is emitted.
sdks/python/apache_beam/pipeline.py: calls MetricsFlag.set_default_pipeline_options(self._options) in Pipeline.__init__, right after FileSystems.set_options, so DirectRunner and other in-process runners honor the experiments.
sdks/python/apache_beam/runners/worker/sdk_worker_main.py: calls the same init during harness startup so worker subprocesses (which do not inherit Python class state across fork/spawn) re-read their own options.
sdks/python/apache_beam/metrics/metric_test.py: tests for the flag values and per-kind no-op behavior (asserts no MetricCell is created in the container when the kind is disabled).
CHANGES.md: entry under 2.75.0 New Features / Improvements.

Note on Lineage

disableBoundedTrieMetrics will also suppress Lineage data in Python, because Python's Lineage uses DelegatingBoundedTrie unconditionally. The Java SDK avoids this by routing Lineage through StringSetMetricsLineage unless enableLineageRollup is set, and a Lineage-specific backend split is out of scope for this issue.

Fixes #38746.

R: @Abacn

Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
Update CHANGES.md with noteworthy changes.
If this contribution is large, please file an Apache Individual Contributor License Agreement.

…bleBoundedTrieMetrics experiments Mirrors the Java SDK Metrics.MetricsFlag behavior so high-throughput Python pipelines can opt out of user metric kinds that add pressure to metric backends. Adds a process-wide MetricsFlag singleton initialized once at worker harness startup from pipeline experiments. When a flag is set, the corresponding Delegating* class short-circuits its update path so no MetricCell is touched and no monitoring info is emitted. For apache#38746.

Ports Java MetricsTest.testMetricsFlag and adds three smoke tests confirming the disabled Counter / StringSet / BoundedTrie short-circuit without raising when called on a no-current-tracker path. For apache#38746.

For apache#38746.

gemini-code-assist · 2026-05-30T06:30:32Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds support for three specific pipeline experiments—disableCounterMetrics, disableStringSetMetrics, and disableBoundedTrieMetrics—to the Python SDK. By enabling these flags, users can prevent the emission of specific metric kinds, which is particularly useful for optimizing performance in high-throughput pipelines. The changes ensure that the Python SDK mirrors the existing behavior found in the Java SDK, providing consistent control over metric backends.

Highlights

Metrics Control: Introduced a new MetricsFlag class to manage and honor pipeline experiments for disabling specific metric types (Counter, StringSet, and BoundedTrie).
SDK Parity: Aligned Python SDK behavior with the Java SDK by allowing high-throughput pipelines to opt out of metric emission, reducing backend pressure.
Initialization: Updated the worker harness initialization process to read and apply these flags from pipeline options before metrics are processed.

New Features

🧠 You can now enable Memory (public preview) to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces MetricsFlag to the Python SDK, allowing it to honor pipeline experiments (disableCounterMetrics, disableStringSetMetrics, and disableBoundedTrieMetrics) to opt out of emitting specific user metric kinds, matching the Java SDK behavior. The feedback suggests optimizing performance on the hot path of metric increments by exposing the disabled flags as public class attributes directly on MetricsFlag instead of using @classmethod getter methods. This avoids method call overhead and aligns with PEP 8 recommendations. The suggestions cover refactoring the MetricsFlag class, updating the delegating metric classes, and adjusting the corresponding unit tests.

gemini-code-assist · 2026-05-30T06:31:44Z

+  _counter_disabled = False
+  _string_set_disabled = False
+  _bounded_trie_disabled = False
+  _initialized = False
+
+  @classmethod
+  def set_default_pipeline_options(cls, options: 'PipelineOptions') -> None:
+    """Initialize the flags from pipeline experiments. Idempotent."""
+    if cls._initialized:
+      return
+    debug_options = options.view_as(DebugOptions)
+    if debug_options.lookup_experiment('disableCounterMetrics'):
+      cls._counter_disabled = True
+      _LOGGER.info('Counter metrics are disabled.')
+    if debug_options.lookup_experiment('disableStringSetMetrics'):
+      cls._string_set_disabled = True
+      _LOGGER.info('StringSet metrics are disabled.')
+    if debug_options.lookup_experiment('disableBoundedTrieMetrics'):
+      cls._bounded_trie_disabled = True
+      _LOGGER.info('BoundedTrie metrics are disabled.')
+    cls._initialized = True
+
+  @classmethod
+  def reset(cls) -> None:
+    """Reset flags. Test-only."""
+    cls._counter_disabled = False
+    cls._string_set_disabled = False
+    cls._bounded_trie_disabled = False
+    cls._initialized = False
+
+  @classmethod
+  def counter_disabled(cls) -> bool:
+    return cls._counter_disabled
+
+  @classmethod
+  def string_set_disabled(cls) -> bool:
+    return cls._string_set_disabled
+
+  @classmethod
+  def bounded_trie_disabled(cls) -> bool:
+    return cls._bounded_trie_disabled


To avoid the overhead of @classmethod descriptor lookup and function calls on the hot path of metric increments, we can simplify MetricsFlag by exposing the disabled flags as public class attributes directly. This aligns with PEP 8 recommendations to avoid redundant getter/setter methods in Python and improves performance.

counter_disabled = False string_set_disabled = False bounded_trie_disabled = False _initialized = False @classmethod def set_default_pipeline_options(cls, options: 'PipelineOptions') -> None: """Initialize the flags from pipeline experiments. Idempotent.""" if cls._initialized: return debug_options = options.view_as(DebugOptions) if debug_options.lookup_experiment('disableCounterMetrics'): cls.counter_disabled = True _LOGGER.info('Counter metrics are disabled.') if debug_options.lookup_experiment('disableStringSetMetrics'): cls.string_set_disabled = True _LOGGER.info('StringSet metrics are disabled.') if debug_options.lookup_experiment('disableBoundedTrieMetrics'): cls.bounded_trie_disabled = True _LOGGER.info('BoundedTrie metrics are disabled.') cls._initialized = True @classmethod def reset(cls) -> None: """Reset flags. Test-only.""" cls.counter_disabled = False cls.string_set_disabled = False cls.bounded_trie_disabled = False cls._initialized = False

References

PEP 8 recommends avoiding accessor/getter/setter methods for simple attribute access in Python, preferring direct attribute access instead. ^(link)

gemini-code-assist · 2026-05-30T06:31:44Z

+    def inc(self, n: int = 1) -> None:
+      if MetricsFlag.counter_disabled():
+        return
+      self._updater(n)


Access the counter_disabled class attribute directly instead of calling a classmethod getter to avoid method call overhead on the hot path.

Suggested change

def inc(self, n: int = 1) -> None:

if MetricsFlag.counter_disabled():

return

self._updater(n)

def inc(self, n: int = 1) -> None:

if MetricsFlag.counter_disabled:

return

self._updater(n)

gemini-code-assist · 2026-05-30T06:31:44Z

+    def add(self, value: str) -> None:
+      if MetricsFlag.string_set_disabled():
+        return
+      self._updater(value)

  class DelegatingBoundedTrie(BoundedTrie):
-    """Metrics StringSet that Delegates functionality to MetricsEnvironment."""
+    """Metrics BoundedTrie that Delegates functionality to MetricsEnvironment."""
    def __init__(self, metric_name: MetricName) -> None:
      super().__init__(metric_name)
-      self.add = MetricUpdater(cells.BoundedTrieCell, metric_name)  # type: ignore[method-assign]
+      self._updater = MetricUpdater(cells.BoundedTrieCell, metric_name)
+
+    def add(self, value) -> None:
+      if MetricsFlag.bounded_trie_disabled():
+        return
+      self._updater(value)


Access the string_set_disabled and bounded_trie_disabled class attributes directly instead of calling classmethod getters to avoid method call overhead on the hot path.

Suggested change

def add(self, value: str) -> None:

if MetricsFlag.string_set_disabled():

return

self._updater(value)

class DelegatingBoundedTrie(BoundedTrie):

"""Metrics StringSet that Delegates functionality to MetricsEnvironment."""

"""Metrics BoundedTrie that Delegates functionality to MetricsEnvironment."""

def __init__(self, metric_name: MetricName) -> None:

super().__init__(metric_name)

self.add = MetricUpdater(cells.BoundedTrieCell, metric_name) # type: ignore[method-assign]

self._updater = MetricUpdater(cells.BoundedTrieCell, metric_name)

def add(self, value) -> None:

if MetricsFlag.bounded_trie_disabled():

return

self._updater(value)

def add(self, value: str) -> None:

if MetricsFlag.string_set_disabled:

return

self._updater(value)

class DelegatingBoundedTrie(BoundedTrie):

"""Metrics BoundedTrie that Delegates functionality to MetricsEnvironment."""

def __init__(self, metric_name: MetricName) -> None:

super().__init__(metric_name)

self._updater = MetricUpdater(cells.BoundedTrieCell, metric_name)

def add(self, value) -> None:

if MetricsFlag.bounded_trie_disabled:

return

self._updater(value)

gemini-code-assist · 2026-05-30T06:31:44Z

+  def test_metrics_flag(self):
+    """Mirrors Java MetricsTest.testMetricsFlag for the three disable* experiments."""
+    MetricsFlag.reset()
+    self.assertFalse(MetricsFlag.counter_disabled())
+    self.assertFalse(MetricsFlag.string_set_disabled())
+    self.assertFalse(MetricsFlag.bounded_trie_disabled())
+
+    options = PipelineOptions(['--experiments=disableCounterMetrics'])
+    MetricsFlag.set_default_pipeline_options(options)
+    self.assertTrue(MetricsFlag.counter_disabled())
+    self.assertFalse(MetricsFlag.string_set_disabled())
+    self.assertFalse(MetricsFlag.bounded_trie_disabled())
+
+    MetricsFlag.reset()
+    options = PipelineOptions(['--experiments=disableStringSetMetrics'])
+    MetricsFlag.set_default_pipeline_options(options)
+    self.assertFalse(MetricsFlag.counter_disabled())
+    self.assertTrue(MetricsFlag.string_set_disabled())
+    self.assertFalse(MetricsFlag.bounded_trie_disabled())
+
+    MetricsFlag.reset()
+    options = PipelineOptions(['--experiments=disableBoundedTrieMetrics'])
+    MetricsFlag.set_default_pipeline_options(options)
+    self.assertFalse(MetricsFlag.counter_disabled())
+    self.assertFalse(MetricsFlag.string_set_disabled())
+    self.assertTrue(MetricsFlag.bounded_trie_disabled())
+
+    MetricsFlag.reset()
+    options = PipelineOptions([
+        '--experiments=disableCounterMetrics',
+        '--experiments=disableStringSetMetrics',
+        '--experiments=disableBoundedTrieMetrics',
+    ])
+    MetricsFlag.set_default_pipeline_options(options)
+    self.assertTrue(MetricsFlag.counter_disabled())
+    self.assertTrue(MetricsFlag.string_set_disabled())
+    self.assertTrue(MetricsFlag.bounded_trie_disabled())
+
+    MetricsFlag.reset()


Update the tests to assert on the public class attributes directly instead of calling the removed getter methods.

def test_metrics_flag(self): """Mirrors Java MetricsTest.testMetricsFlag for the three disable* experiments.""" MetricsFlag.reset() self.assertFalse(MetricsFlag.counter_disabled) self.assertFalse(MetricsFlag.string_set_disabled) self.assertFalse(MetricsFlag.bounded_trie_disabled) options = PipelineOptions(['--experiments=disableCounterMetrics']) MetricsFlag.set_default_pipeline_options(options) self.assertTrue(MetricsFlag.counter_disabled) self.assertFalse(MetricsFlag.string_set_disabled) self.assertFalse(MetricsFlag.bounded_trie_disabled) MetricsFlag.reset() options = PipelineOptions(['--experiments=disableStringSetMetrics']) MetricsFlag.set_default_pipeline_options(options) self.assertFalse(MetricsFlag.counter_disabled) self.assertTrue(MetricsFlag.string_set_disabled) self.assertFalse(MetricsFlag.bounded_trie_disabled) MetricsFlag.reset() options = PipelineOptions(['--experiments=disableBoundedTrieMetrics']) MetricsFlag.set_default_pipeline_options(options) self.assertFalse(MetricsFlag.counter_disabled) self.assertFalse(MetricsFlag.string_set_disabled) self.assertTrue(MetricsFlag.bounded_trie_disabled) MetricsFlag.reset() options = PipelineOptions([ '--experiments=disableCounterMetrics', '--experiments=disableStringSetMetrics', '--experiments=disableBoundedTrieMetrics', ]) MetricsFlag.set_default_pipeline_options(options) self.assertTrue(MetricsFlag.counter_disabled) self.assertTrue(MetricsFlag.string_set_disabled) self.assertTrue(MetricsFlag.bounded_trie_disabled) MetricsFlag.reset()

Drops the classmethod getter wrapper around each disabled flag and exposes counter_disabled / string_set_disabled / bounded_trie_disabled directly as public class attributes. The gate runs on every metric update, so swapping a descriptor lookup + function call for a single attribute load matters in exactly the high-throughput pipelines these experiments are designed for. Idiomatic Python; java parity is about behavior, not internal API shape. Addresses review feedback. For apache#38746.

github-actions · 2026-05-30T08:25:13Z

Assigning reviewers:

R: @jrmccluskey for label python.

Note: If you would like to opt out of this review, comment assign to next reviewer.

Available commands:

stop reviewer notifications - opt out of the automated review tooling
remind me after tests pass - tag the comment author after tests pass
waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

Abacn · 2026-05-30T18:48:19Z

+class MetricsFlag(object):
+  """Process-wide flags controlling which user metric kinds are emitted.
+
+  Mirrors the Java SDK ``Metrics.MetricsFlag`` behavior. The flags are read


We don't need to mention "Mirrors the Java SDK..." in Python pydoc, as we don't assume Beam python user have Java knowledge (or even aware of Java SDK). In general please simplify pydocs as MetricsFlag is internal API

Done in 7839e72. Dropped the Java reference and trimmed method docstrings.

Abacn · 2026-05-30T18:53:36Z

+    self.assertTrue(MetricsFlag.string_set_disabled)
+    self.assertTrue(MetricsFlag.bounded_trie_disabled)
+
+    MetricsFlag.reset()


Consider wrap the test into a try block and then

finally: MetricsFlag.reset()

as below

Done in 7839e72. Wrapped the body in try and moved reset into finally.

…inally

Abacn · 2026-05-30T19:15:36Z

+    finally:
+      MetricsFlag.reset()
+
+  def test_disabled_bounded_trie_is_noop(self):


These ..._is_noop tests aren't checking anything.

Done in 859715d. Each test now sets up a StateSampler with a MetricsContainer, runs a baseline inc/add to confirm the container grows to 1, then enables the disable experiment and runs another inc/add and asserts the count is still 1, proving no MetricCell was created.

…onor disable* experiments The previous init only ran in the gRPC SDK harness (sdk_worker_main), so DirectRunner and other in-process runners silently ignored the disableCounterMetrics / disableStringSetMetrics / disableBoundedTrieMetrics experiments. Initializing in Pipeline.__init__ matches the pattern of FileSystems.set_options at the same call site and mirrors the Java init point (PipelineRunner.run).

Anuragp22 · 2026-05-30T19:46:58Z

859715d: real assertions on the disabled-noop tests per the review comment.
80c13c3: also initialize MetricsFlag from Pipeline.init. Previously the init only ran in sdk_worker_main, so DirectRunner and other in-process runners silently ignored the experiments. Mirrors the Java init point in PipelineRunner.run and sits next to the existing FileSystems.set_options call.

Also updated the PR description with a note that disableBoundedTrieMetrics will suppress Lineage data in Python because Python's Lineage uses DelegatingBoundedTrie unconditionally. In Java that path is gated by lineageRollupEnabled, but porting the dual-backend split is out of scope for this issue.

Anuragp22 added 3 commits May 30, 2026 11:45

[Python] Add unit tests for MetricsFlag

93c99a9

Ports Java MetricsTest.testMetricsFlag and adds three smoke tests confirming the disabled Counter / StringSet / BoundedTrie short-circuit without raising when called on a no-current-tracker path. For apache#38746.

[Python] Note disable*Metrics support in CHANGES.md

f860d86

For apache#38746.

github-actions Bot added python runners labels May 30, 2026

gemini-code-assist Bot reviewed May 30, 2026

View reviewed changes

github-actions Bot added the Next Action: Reviewers label May 30, 2026

Abacn reviewed May 30, 2026

View reviewed changes

[Python] Address PR review: simplify MetricsFlag pydoc and reset in f…

7839e72

…inally

Abacn reviewed May 30, 2026

View reviewed changes

Anuragp22 added 2 commits May 31, 2026 01:12

[Python] Address PR review: assert no MetricCell created when disabled

859715d

Conversation

Anuragp22 commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What changes

Note on Lineage

Uh oh!

gemini-code-assist Bot commented May 30, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented May 30, 2026

Uh oh!

Abacn May 30, 2026

Choose a reason for hiding this comment

Uh oh!

Anuragp22 May 30, 2026

Choose a reason for hiding this comment

Uh oh!

Abacn May 30, 2026

Choose a reason for hiding this comment

Uh oh!

Anuragp22 May 30, 2026

Choose a reason for hiding this comment

Uh oh!

Abacn May 30, 2026

Choose a reason for hiding this comment

Uh oh!

Anuragp22 May 30, 2026

Choose a reason for hiding this comment

Uh oh!

Anuragp22 commented May 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Anuragp22 commented May 30, 2026 •

edited

Loading