
libdatadog update to c79d783f #3983

Open
dd-octo-sts[bot] wants to merge 2 commits into master from bot/libdatadog-latest


Conversation

dd-octo-sts Bot commented Jun 13, 2026

Contributor

Summary

Automated update of the libdatadog submodule to the latest HEAD.

SHA
Previous 6760faaeeda1cfcf634410105f93cf7149265592
New c79d783f79f4a2d1e637906f3323600c6e2b5b17

Full CI result: ❌ 606 job(s) failed
CI pipeline: https://gitlab.ddbuild.io/DataDog/apm-reliability/dd-trace-php/-/pipelines/118651636


libdatadog Integration Report

libdatadog SHA: c79d783f79f4a2d1e637906f3323600c6e2b5b17
Analysis date: 2026-06-14

Overall status

⚠️ Adapted (API changes fixed)

A single libdatadog FFI signature change is responsible for the entire failure set.
One new parameter (retry_interval_milliseconds) was added to
ddog_sidecar_session_set_config. The committed C header and the C caller were
adapted to the new signature.

Build & test summary

  • Nothing failed to compile. There are no Rust compiler errors (error[E…]),
    no cannot find, no mismatched types, no unresolved imports in any trace.
    Both the Rust workspace and the C extensions built successfully — test jobs ran,
    which means the prebuilt .so/packages they install were produced by passing
    build jobs.
  • The failures are all runtime crashes, not compile or logic failures.
    31 of the captured traces show Segmentation fault / exit code 139 /
    core dumped. No Rust panics appear anywhere (no panicked, no
    RUST_BACKTRACE, no thread '…'). A segfault with no panic is the signature of
    memory corruption at the C↔Rust boundary, not a logic bug inside Rust.
  • The crash only affects code paths that initialize the sidecar / send traces:
    • x-profiling phpt tests on alpine — all PHP versions 7.1–8.5 segfault.
    • test_distributed_tracing — segfaults with both sidecar_trace_sender=0 and =1.
    • appsec integration tests (incl. helper-rust) — composer install itself
      core-dumps (/tmp/cores/core.php.*), failing ~194 tests per job.
    • system_tests (appsec_api_security, runtime_activation, crossed_tracing) —
      segfault during install_ddtrace.sh.
    • appsec extension client_init_sidecar.phpt — the one extension test that
      activates the sidecar — fails.
  • Tests that do NOT exercise the sidecar pass. The appsec extension suite
    passes 345/346 (only client_init_sidecar fails). If this were a broad ABI/
    header break, those extension tests would crash en masse; they do not. This
    localizes the regression to the sidecar configuration path.

The remaining ~600 web/integration/language test failures are downstream of the
same crash: every traced PHP process connects to the sidecar at startup via
dd_sidecar_post_connect(), which calls the corrupted function, so the process
crashes before tests can pass.

Non-trivial changes made

Root cause

libdatadog commit aceec12b1, feat(sidecar): add retry interval configuration
(#2106), added a new parameter to the exported FFI function
ddog_sidecar_session_set_config (in libdatadog/datadog-sidecar-ffi/src/lib.rs):

flush_interval_milliseconds: u32,
retry_interval_milliseconds: u32,          // <-- NEW, inserted here
remote_config_poll_interval_millis: u32,
...

The dd-trace-php side ships a committed, hand-checked-in cbindgen header
(components-rs/sidecar.h) that is not regenerated during the build (cbindgen
only runs via the manual make cbindgen target). The header therefore still
declared the old parameter list, and ext/sidecar.c called the function with the
old argument list.

Because a C compiler checks each call only against the prototype it was given, and
the linker resolves the symbol by name alone, this compiles and links cleanly. At
runtime, however, every argument after flush_interval_milliseconds is passed in the
wrong register/stack slot. In particular the pointer arguments
(remote_config_products, remote_config_capabilities, the Endpoint/Vec<Tag>
references) receive shifted, garbage values, which the sidecar then dereferences,
producing the SIGSEGV seen across every trace-sending job.
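
The shift can be illustrated with a deliberately simplified model. This is a sketch, not the real calling convention: each argument is treated as one 64-bit slot, only three of the real parameters are represented, and all function and type names below are hypothetical.

```rust
/// Simplified model of argument passing: each argument occupies one slot.
/// Real ABIs (registers, stack) are more involved; this only shows the shift.

/// Caller built against the STALE header: no retry interval is passed.
fn caller_with_old_header(flush_ms: u64, rc_poll_ms: u64, products_ptr: u64) -> Vec<u64> {
    vec![flush_ms, rc_poll_ms, products_ptr]
}

/// What a callee built with the NEW signature believes it received.
#[derive(Debug, PartialEq)]
struct CalleeView {
    flush_ms: u64,
    retry_ms: u64,
    rc_poll_ms: u64,
    products_ptr: u64,
}

/// The callee reads four slots in the new order. The caller never wrote a
/// fourth slot, so it is modeled here as an arbitrary leftover value.
fn callee_with_new_signature(slots: &[u64], leftover: u64) -> CalleeView {
    let read = |i: usize| slots.get(i).copied().unwrap_or(leftover);
    CalleeView {
        flush_ms: read(0),
        retry_ms: read(1),     // actually the caller's poll interval
        rc_poll_ms: read(2),   // actually the caller's pointer
        products_ptr: read(3), // never written by the caller
    }
}
```

In this model the flush interval arrives intact, the poll interval is read as the retry interval, the pointer is read as an integer, and the slot later treated as a pointer holds whatever was left there, which is the pattern the report describes.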

Fix

Adapted both the header declaration and the single call site to the new signature,
passing 100 for the retry interval, which matches libdatadog's prior hardcoded
default (RetryStrategy::default() uses Duration::from_millis(100) in
libdatadog/libdd-trace-utils/src/send_with_retry/retry_strategy.rs), so runtime
behavior is preserved.

  • components-rs/sidecar.h — added
    uint32_t retry_interval_milliseconds, between flush_interval_milliseconds
    and remote_config_poll_interval_millis in the
    ddog_sidecar_session_set_config declaration.

  • ext/sidecar.c — added the 100 argument (trace-send retry interval, in
    ms) at the corresponding position in the ddog_sidecar_session_set_config(...)
    call inside dd_sidecar_post_connect(). The full call now aligns
    parameter-for-parameter with the new Rust signature.

ext/sidecar.c is the only caller in the repository, and components-rs/sidecar.h
is the only declaration; no other copies exist.
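
As a rough illustration of the adapted argument order, the sketch below models the three adjacent interval parameters in Rust rather than in the C of ext/sidecar.c. The function, struct, and constant names are hypothetical stand-ins, and the real ddog_sidecar_session_set_config takes many more parameters that are omitted here.

```rust
/// Retry interval passed at the call site, in milliseconds. The value 100
/// comes from the report above; the constant name is hypothetical.
const RETRY_INTERVAL_MS: u32 = 100;

/// What the reduced stand-in function received, field by field.
#[repr(C)]
#[derive(Debug, PartialEq, Clone, Copy)]
pub struct IntervalsSketch {
    pub flush_ms: u32,
    pub retry_ms: u32,
    pub rc_poll_ms: u32,
}

/// Hypothetical, heavily reduced stand-in for the updated FFI function.
/// Only the three adjacent interval parameters are modeled, in the new order.
pub extern "C" fn session_set_config_sketch(
    flush_interval_milliseconds: u32,
    retry_interval_milliseconds: u32,
    remote_config_poll_interval_millis: u32,
) -> IntervalsSketch {
    IntervalsSketch {
        flush_ms: flush_interval_milliseconds,
        retry_ms: retry_interval_milliseconds,
        rc_poll_ms: remote_config_poll_interval_millis,
    }
}

/// Call site mirroring the adapted caller: the retry interval sits between
/// the flush interval and the remote-config poll interval.
pub fn configure_session_sketch(flush_ms: u32, rc_poll_ms: u32) -> IntervalsSketch {
    session_set_config_sketch(flush_ms, RETRY_INTERVAL_MS, rc_poll_ms)
}
```

With declaration and call in agreement, each value lands in the parameter it was meant for.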

Audit of other headers (no changes needed)

The remaining committed FFI headers were compared against the new libdatadog
source. No ABI-affecting changes were found:

  • common.h / telemetry.h (libdd-common-ffi, libdd-telemetry-ffi): ddog_Error,
    ddog_Vec_U8, CharSlice, Slice, Option_*, Timespec, Method,
    MetricType, etc. all match. The #2029 "align tracer FFI error/response types"
    change affects libdd-data-pipeline-ffi, which is not consumed via these headers.
  • crashtracker.h (libdd-crashtracker-ffi): Config, ReceiverConfig,
    Metadata, RuntimeStackFrame (constructed by value in
    ext/crashtracking_frames.c / ext/signals.c) all match. The "flatten threads"
    / "frame count" changes are in the internal crash-report serialization, not the
    C FFI surface.
  • library-config.h, live-debugger.h: all #[repr(C)] structs/enums and
    signatures match. ddog_Probe (copied by value in tracer/live_debugger.c) is
    unchanged.
  • datadog.h / rest of sidecar.h: no other signature mismatches found.

Identified libdatadog issues

None identified. The breaking change (#2106) is a legitimate, intentional API
evolution; the failure was on the dd-trace-php side because the committed cbindgen
header had not been regenerated. Note for maintainers: after merging, regenerate
all headers with make cbindgen to catch any further drift, since the build does
not do this automatically.
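
One way such drift could be caught automatically is to compare the committed header with a freshly generated one in CI. The following is only a sketch of the comparison step: the helper names are hypothetical, and how the regenerated text is produced (for example via make cbindgen) is outside its scope.

```rust
/// Collapse all runs of whitespace so formatting-only differences are ignored.
fn normalize(header: &str) -> String {
    header.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Returns true when the committed header and a freshly generated one
/// contain the same text, modulo whitespace.
fn headers_match(committed: &str, regenerated: &str) -> bool {
    normalize(committed) == normalize(regenerated)
}
```

A check like this would flag an inserted parameter while tolerating re-wrapped lines; it does not understand C syntax, so any textual change (including comments) counts as drift.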

Flaky / ignored failures

None clearly attributable to flakiness. All 606 failures are consistent with the
single sidecar-config ABI break (crash on sidecar connect, which every traced
process performs). Once the signature is corrected the crash is removed; any
residual failures after a re-run would need separate triage, but no independent
failure cause was identified in the captured traces.


/cc @bwoebi

@dd-octo-sts dd-octo-sts Bot requested review from a team as code owners June 13, 2026 05:00
@dd-octo-sts dd-octo-sts Bot requested review from leoromanovsky and sameerank and removed request for a team June 13, 2026 05:00

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f826cc728b


Comment thread libdatadog
@@ -1 +1 @@
-Subproject commit 6760faaeeda1cfcf634410105f93cf7149265592
+Subproject commit c79d783f79f4a2d1e637906f3323600c6e2b5b17


P1 Badge Regenerate the sidecar FFI signature

This libdatadog bump changes ddog_sidecar_session_set_config by inserting retry_interval_milliseconds after flush_interval_milliseconds, but the checked-in components-rs/sidecar.h and the call in ext/sidecar.c still use the old argument list. Because C compiles against the stale header, this will not be caught at compile time; at runtime the new Rust FFI function will read every argument after the flush interval in the wrong slot, so normal sidecar startup can misconfigure intervals/sizes and eventually interpret non-pointer values as strings or callbacks. Please regenerate/update the header and pass the new retry interval at the call site as part of this bump.


Comment thread components-rs/ffe.rs Outdated
AssignmentReason::Static => REASON_STATIC,
AssignmentReason::TargetingMatch => REASON_TARGETING_MATCH,
AssignmentReason::Split => REASON_SPLIT,
AssignmentReason::Default => REASON_DEFAULT,


P2 Badge Handle invalid flag configs as defaults

The new libdatadog revision also adds EvaluationError::FlagConfigurationInvalid, which get_assignment can return for a requested flag whose per-flag config is invalid/unsupported; upstream FFE FFI maps that case to DEFAULT with no error so callers get their supplied default. This wrapper only added the new assignment reason, while the error match below still sends the new error through _ => (ERROR_GENERAL, REASON_ERROR), so those flags will now surface as evaluation errors instead of default evaluations. Please add an explicit arm for the new error variant.
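
A minimal sketch of what such an explicit arm might look like is shown below. Everything here is an assumption made for illustration: the enum is a local stand-in rather than libdatadog's EvaluationError, ERROR_NONE is a hypothetical name, and the numeric values are placeholders, not the constants defined in components-rs/ffe.rs.

```rust
// Placeholder codes. The real values live in components-rs/ffe.rs and are
// NOT reproduced here; ERROR_NONE is a hypothetical name.
const ERROR_NONE: i32 = 0;
const ERROR_GENERAL: i32 = 1;
const REASON_DEFAULT: i32 = 10;
const REASON_ERROR: i32 = 11;

/// Local stand-in for the upstream error type; only the variant named in
/// the review comment is modeled, plus a catch-all for everything else.
#[derive(Debug, Clone, Copy)]
enum EvaluationErrorSketch {
    FlagConfigurationInvalid,
    Other,
}

/// Maps an evaluation error to an (error code, reason) pair.
fn map_error(err: EvaluationErrorSketch) -> (i32, i32) {
    match err {
        // Explicit arm: an invalid per-flag config yields the caller's
        // default with no error, instead of falling through to the wildcard.
        EvaluationErrorSketch::FlagConfigurationInvalid => (ERROR_NONE, REASON_DEFAULT),
        // Existing behaviour for all remaining errors.
        _ => (ERROR_GENERAL, REASON_ERROR),
    }
}
```

Placing the specific arm before the wildcard is what keeps the new variant from being reported as a general error.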



datadog-prod-us1-3 Bot commented Jun 13, 2026


Pipelines  Tests


⚠️ Warnings

🚦 57 Pipeline jobs failed

DataDog/apm-reliability/dd-trace-php | ASAN test_c: [8.5, arm64]

🧪 1 Test failed

tmp/build_extension/tests/ext/ffe/native_bridge_evaluate.phpt (FFE native bridge evaluates through libdatadog) from php.tmp.build_extension.tests.ext.ffe
--
     empty_targeting_key={"valueJson":"\"empty-targeting-key\"","variant":"empty-target","allocationKey":"alloc-empty-targeting-key","reason":3,"errorCode":0,"doLog":true,"providerState":[],"errorMessage":null,"hasConfig":null,"configVersion":null}
     missing={"valueJson":"null","variant":null,"allocationKey":null,"reason":5,"errorCode":3,"doLog":false,"providerState":[],"errorMessage":null,"hasConfig":null,"configVersion":null}
     type_mismatch={"valueJson":"null","variant":null,"allocationKey":null,"reason":5,"errorCode":1,"doLog":false,"providerState":[],"errorMessage":null,"hasConfig":null,"configVersion":null}
015- parse_error={"valueJson":"null","variant":null,"allocationKey":null,"reason":5,"errorCode":2,"doLog":false,"providerState":[],"errorMessage":null,"hasConfig":null,"configVersion":null}
015+ parse_error={"valueJson":"null","variant":null,"allocationKey":null,"reason":5,"errorCode":7,"doLog":false,"providerState":[],"errorMessage":null,"hasConfig":null,"configVersion":null}

DataDog/apm-reliability/dd-trace-php | ASAN test_c: [8.0, arm64]

🧪 1 Test failed

All test failures are known flaky.

❄️ Known flaky: tmp/build_extension/tests/ext/ffe/native_bridge_evaluate.phpt (FFE native bridge evaluates through libdatadog) from PHP.tmp.build_extension.tests.ext.ffe
--
     empty_targeting_key={"valueJson":"\"empty-targeting-key\"","variant":"empty-target","allocationKey":"alloc-empty-targeting-key","reason":3,"errorCode":0,"doLog":true,"providerState":[],"errorMessage":null,"hasConfig":null,"configVersion":null}
     missing={"valueJson":"null","variant":null,"allocationKey":null,"reason":5,"errorCode":3,"doLog":false,"providerState":[],"errorMessage":null,"hasConfig":null,"configVersion":null}
     type_mismatch={"valueJson":"null","variant":null,"allocationKey":null,"reason":5,"errorCode":1,"doLog":false,"providerState":[],"errorMessage":null,"hasConfig":null,"configVersion":null}
015+ parse_error={"valueJson":"null","variant":null,"allocationKey":null,"reason":5,"errorCode":7,"doLog":false,"providerState":[],"errorMessage":null,"hasConfig":null,"configVersion":null}
015- parse_error={"valueJson":"null","variant":null,"allocationKey":null,"reason":5,"errorCode":2,"doLog":false,"providerState":[],"errorMessage":null,"hasConfig":null,"configVersion":null}

Not introduced in this PR.

DataDog/apm-reliability/dd-trace-php | ASAN test_c with multiple observers: [8.0]

View all 57 failed jobs.

ℹ️ Info

No other issues found (see more)

❄️ No new flaky tests detected

🔄 Datadog auto-retried 3 jobs - 0 passed on retry View in Datadog

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 54.08% (-0.04%)


This comment will be updated automatically if new data arrives.
🔗 Commit SHA: d3d1bae

@dd-octo-sts dd-octo-sts Bot requested a review from a team as a code owner June 13, 2026 10:13

pr-commenter Bot commented Jun 13, 2026


Benchmarks [ tracer ]

Benchmark execution time: 2026-06-14 09:38:52

Comparing candidate commit d3d1bae in PR branch bot/libdatadog-latest with baseline commit 46641ee in branch master.

Found 1 performance improvement and 1 performance regression. Performance is the same for 191 metrics; 1 metric is unstable.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because most changes are small:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'
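
The significance rule can be sketched as a small function. This is an interpretation rather than the benchmarking platform's actual code: it assumes the threshold defines a symmetric band around 0%, and the type and function names are hypothetical. The 1% threshold used in the checks below is taken from the diagram above; the value actually configured for this repository is not stated.

```rust
#[derive(Debug, PartialEq)]
enum Verdict {
    SignificantIncrease,
    SignificantDecrease,
    NotSignificant,
}

/// `lower` and `upper` are the CI bounds of the relative difference of
/// means, in percent. `threshold` is the significant-impact threshold, in
/// percent, assumed positive and applied symmetrically around 0%.
fn classify(lower: f64, upper: f64, threshold: f64) -> Verdict {
    if lower > threshold {
        // Whole interval lies above +threshold.
        Verdict::SignificantIncrease
    } else if upper < -threshold {
        // Whole interval lies below -threshold.
        Verdict::SignificantDecrease
    } else {
        // Interval overlaps the band [-threshold, +threshold].
        Verdict::NotSignificant
    }
}
```

For an execution-time metric an increase is a regression, so the second diagram's interval [1.3%, 3.1%] classifies as a significant increase under a 1% threshold, while the first diagram's [-0.6%, +1.2%] does not count as significant.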

scenario:MessagePackSerializationBench/benchMessagePackSerialization

  • 🟥 execution_time [+3.492µs; +5.028µs] or [+3.128%; +4.503%]

scenario:SamplingRuleMatchingBench/benchRegexMatching4-opcache

  • 🟩 execution_time [-72.182ns; -35.818ns] or [-4.087%; -2.028%]

@dd-octo-sts dd-octo-sts Bot force-pushed the bot/libdatadog-latest branch from 122e2ec to 3a44de2 Compare June 14, 2026 04:59