Launch Manager loads the new configuration file by paulquiring · Pull Request #248 · eclipse-score/lifecycle

paulquiring · 2026-06-15T14:18:09Z

Map user-facing json config to the new flatbuffer config
Load new launch manager config
Introduce adapters to make the rest of the code work with the new configuration content
Remove the legacy configuration files

Closes: #209

github-actions · 2026-06-15T14:19:10Z

License Check Results

🚀 The license check job ran with the Bazel command:

bazel run --lockfile_mode=error //:license-check

Status: ⚠️ Needs Review

Click to expand output

[License Check Output]
Extracting Bazel installation...
Starting local Bazel server (8.4.2) and connecting to it...
INFO: Invocation ID: 87349e67-6baf-4dc4-8451-12ef92285d38
Computing main repo mapping: 
Computing main repo mapping: 
Loading: 
Loading: 0 packages loaded
Loading: 0 packages loaded
Loading: 0 packages loaded
    currently loading: 
Loading: 0 packages loaded
    currently loading: 
Loading: 0 packages loaded
    currently loading: 
Analyzing: target //:license-check (1 packages loaded, 0 targets configured)
Analyzing: target //:license-check (1 packages loaded, 0 targets configured)

Analyzing: target //:license-check (30 packages loaded, 10 targets configured)

Analyzing: target //:license-check (82 packages loaded, 10 targets configured)

Analyzing: target //:license-check (86 packages loaded, 10 targets configured)

Analyzing: target //:license-check (138 packages loaded, 2721 targets configured)

Analyzing: target //:license-check (150 packages loaded, 5404 targets configured)

Analyzing: target //:license-check (155 packages loaded, 5453 targets configured)

Analyzing: target //:license-check (155 packages loaded, 5453 targets configured)

Analyzing: target //:license-check (155 packages loaded, 5453 targets configured)

Analyzing: target //:license-check (158 packages loaded, 7340 targets configured)

Analyzing: target //:license-check (160 packages loaded, 9062 targets configured)

Analyzing: target //:license-check (162 packages loaded, 10142 targets configured)

Analyzing: target //:license-check (162 packages loaded, 10142 targets configured)

Analyzing: target //:license-check (162 packages loaded, 10142 targets configured)

Analyzing: target //:license-check (162 packages loaded, 10142 targets configured)

Analyzing: target //:license-check (162 packages loaded, 10142 targets configured)

INFO: Analyzed target //:license-check (163 packages loaded, 10268 targets configured).
[13 / 16] JavaToolchainCompileClasses external/rules_java+/toolchains/platformclasspath_classes; 0s disk-cache, processwrapper-sandbox
[14 / 16] JavaToolchainCompileBootClasspath external/rules_java+/toolchains/platformclasspath.jar; 0s disk-cache, processwrapper-sandbox
INFO: Found 1 target...
Target //:license.check.license_check up-to-date:
  bazel-bin/license.check.license_check
  bazel-bin/license.check.license_check.jar
INFO: Elapsed time: 31.286s, Critical Path: 2.50s
INFO: 16 processes: 12 internal, 3 processwrapper-sandbox, 1 worker.
INFO: Build completed successfully, 16 total actions
INFO: Running command line: bazel-bin/license.check.license_check ./formatted.txt <args omitted>
usage: org.eclipse.dash.licenses.cli.Main [-batch <int>] [-cd <url>]
       [-confidence <int>] [-ef <url>] [-excludeSources <sources>] [-help] [-lic
       <url>] [-project <shortname>] [-repo <url>] [-review] [-summary <file>]
       [-timeout <seconds>] [-token <token>]

S-MOHAMD · 2026-06-17T16:47:59Z

Please lnk corresponding issue/bug.
see: https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue

github-actions · 2026-06-18T08:33:36Z

The created documentation from the pull request is available at: docu-html

NicolasFussberger · 2026-06-22T08:11:05Z

+    // strdup() returns nullptr on OOM. On this embedded target, OOM during daemon
+    // startup is unrecoverable — the OS will terminate the process.
+    size_t arg_index = 0U;
+    startup.argv_[arg_index++] = strdup(startup.executable_path_.c_str());


Du we need to strdup the string here or should we just point to the c_str()?

We could use comp.deployment_config.executable_path.c_str(), but this would create a mixed ownership between the configuration object in main and the adapter. The Adapter copies the configuration so that it owns it. Yes this costs a bit of resources but is the simplest and clearest way currently to bridge the old to new configuration.

NicolasFussberger · 2026-06-23T06:12:52Z

+*************
+
+- The ``launch_manager`` binary built and available on your target.
+- Python 3 with the ``jsonschema`` package installed (required for config validation).


I think the user does not really need to be aware of these things as this is all handled in the background. python and jsonschema dependency is automatically handled via bazel. I don't think anyone (besides LaunchManager developers) should try to invoke lifecycle_config.py themselves.

All the user needs to be aware of is:

to use the provided bazel abstraction

How to write a config file

start launch_manager with "-c" argument

load("//:defs.bzl", "launch_manager_config") launch_manager_config( name = "examples_test_config", config = ":lifecycle_demo_test.json", flatbuffer_out_dir = "etc", )

done here https://github.com/eclipse-score/lifecycle/pull/248/changes#diff-cf0cb0e9ec62a01afbed5ad12df6cc113d0beb22a8415bcd5ef51b79e7ba3781

NicolasFussberger · 2026-06-23T06:14:42Z

+Step 1 — Write the JSON config
+==============================
+
+Create a file ``my_config.json``:


I wonder if we should just link to an example config (or include the example config) from examples/demo_verification/lifecycle_demo_test.json so that we do not duplicate a config here.

For example:

.. dropdown:: example_launch_manager_config.json .. literalinclude:: /../../../examples/demo_verification/lifecycle_demo_test.json :language: json

This worked as suggested and makes the getting stared much shorter!

NicolasFussberger · 2026-06-23T06:19:47Z

-    /// @param [in] f_recoveryClient_r       Interface to the launch manager for recovery
-    /// @param [in] f_processStateReader_r   Process state reader object for PHM daemon
-    /// @param [in] f_bufferConfig_r           Configuration settings for constructing workers
-    /// @return                              Construction is successful (true), otherwise failure (false)


Maybe we can bring back some of the doxygen as that seems still valid and to some extent helpful

NicolasFussberger · 2026-06-23T06:23:49Z

+    try
    {
-        try
+        supervised_components_.clear();


should we move the setup of supervised_components into the init method?
Other methods also assume this vector being already filled which creates a non-obvious dependency that createProcessStates must always be called first.

NicolasFussberger · 2026-06-23T06:36:38Z

-            // Accessing the same index (vector <-> config vector) is mapping the same configuration.
-            // coverity[cert_exp34_c_violation] f_interfaceIpcs_r is created from PhmMonitorInterface
-            // coverity[dereference] f_interfaceIpcs_r is created from PhmMonitorInterface same size assured
-            auto refProcessIndex{flatBuffer_p->hmMonitorInterface()->Get(index)->refProcessIndex()};


Previously this was getting the processIndex via the config. Now we rely on the assumption that for every component in supervised_components vector, a CheckpointIpcServer and ProcessState have been created with same ordering as in supervised_components vector. This assumption should be fine now, as there is a 1:1 mapping between Process and Alive supervision (this was not true before, as there could be multiple independent checkpoint reporters per process).

I wonder if we should make this more obvious by using the same index for those vectors.

E.g.

// before uint32_t index{0U}; f_interfaces_r.reserve(f_interfaceIpcs_r.size()); for (auto& interfaceIpc : f_interfaceIpcs_r) { f_interfaces_r.emplace_back(interfaceIpc, interfaceIpc.getPath().data()); f_processStates_r.at(static_cast<size_t>(index)).attachObserver(f_interfaces_r.back()); logger_r.LogDebug() << "Successfully created MonitorInterface:" << f_interfaces_r.back().getInterfaceName(); index++; } // suggestion f_interfaces_r.reserve(supervised_components_.size()); for(std::size_t compIndex = index; compIndex < supervised_components_.size(); compIndex++) { auto& interfaceIpc = f_interfaceIpcs_r.at(compIndex); f_interfaces_r.emplace_back(interfaceIpc, interfaceIpc.getPath().data()); f_processStates_r.at(static_cast<size_t>(compIndex)).attachObserver(f_interfaces_r.back()); logger_r.LogDebug() << "Successfully created MonitorInterface:" << f_interfaces_r.back().getInterfaceName(); }

Yes this is much better, it's pushed.

NicolasFussberger · 2026-06-23T06:41:28Z

-                                << std::string_view{f_exception_r.what()};
+            const auto* comp = supervised_components_[idx];
+            const std::string checkpointCfgName = comp->name + "_checkpoint";
+            const uint32_t checkpointId = 1U;


Not sure if this is worth it, but the checkpointId has to match this

lifecycle/score/launch_manager/src/alive/src/alive.cpp

Line 21 in 58d6d1b

static constexpr std::uint32_t kDefaultCheckpointId{1U};

I wonder if we should make this a static constant in score/launch_manager/src/daemon/src/alive_monitor/details/factory/StaticConfig.hpp which is then used in both places.

The background is, when this code was still supporting Deadline and Logical supervisions, then different kinds of checkpoints were required.
But now only Alive supervision support is left, which only needs sending "Alive notificiations" - for which he just use checkpointId 1.

We would need to reorganize the code a bit
Looking at:

score/launch_manager/src/daemon/src/alive_monitor/details/factory/StaticConfig.hpp
score/launch_manager/src/alive/src/alive.cpp

we might want to move StaticConfig.hpp into score/launch_manager/src/common or a similar location. We might want to create a ticket out if this comment to work in it later?

NicolasFussberger · 2026-06-23T06:43:42Z

-
-                logger_r.LogDebug() << kLogPrefix << "Successfully created alive supervision worker object:"
-                                    << f_alive_r.back().getConfigName();
+                continue;


I think this case should not really happen. If the component is supervised then alive supervision had to be specified.
Maybe we should have an assert here?

NicolasFussberger · 2026-06-23T06:47:42Z

+            aliveSupCfg.checkpointBufferSize = bufferConfig_r.bufferSizeAliveSupervision;
+            aliveSupCfg.recoveryClient = f_recoveryClient_r;
+
+            aliveSupCfg.processIdentifier = score::lcm::IdentifierHash{comp->name};


I some other place the function getProcessId was used to calculate the process Id. Maybe we should use the same way, to avoid accidentally using a different string as a base.

Maybe change getProcessId to this, then the decision which string is hashed is in a single place:

score::lcm::IdentifierHash getProcessId(const score::mw::launch_manager::configuration::ComponentConfig* config) { return score::lcm::IdentifierHash{comp->name}; }

NicolasFussberger · 2026-06-23T06:53:06Z

    void logConfiguration() noexcept(true);

    /// @brief Configured watchdog devices
-    /// By default, no watchdog device is configured


Maybe we should bring back some of this doxygen, thats seems rather useful

NicolasFussberger · 2026-06-23T06:56:14Z

-        logConfiguration();
-        return true;
+    const auto& alive_sup = config.aliveSupervision();
+    if (alive_sup.evaluation_cycle_ms != 0U)


alive_sup.evaluation_cycle_ms == 0 should never happen. Maybe we should make this an assert?
Maybe SCORE_LANGUAGE_FUTURECPP_PRECONDITION_PRD_MESSAGE

Going for SCORE_LANGUAGE_FUTURECPP_PRECONDITION_PRD_MESSAGE seems to be a good idea. I chose assert to be consistent with the rest of the code base. We should change this in the code base with a separate PR & Issue.

paulquiring · 2026-06-22T12:08:48Z

+    // strdup() returns nullptr on OOM. On this embedded target, OOM during daemon
+    // startup is unrecoverable — the OS will terminate the process.
+    size_t arg_index = 0U;
+    startup.argv_[arg_index++] = strdup(startup.executable_path_.c_str());


We could use comp.deployment_config.executable_path.c_str(), but this would create a mixed ownership between the configuration object in main and the adapter. The Adapter copies the configuration so that it owns it. Yes this costs a bit of resources but is the simplest and clearest way currently to bridge the old to new configuration.

NicolasFussberger · 2026-06-23T06:58:16Z

+        const auto& wd = *wd_opt;
+        watchdog::DeviceConfig wdConfig{};
+        wdConfig.fileName = wd.device_file_path;
+        assert(wd.max_timeout_ms <= std::numeric_limits<std::uint16_t>::max());


Should we make this a validation in score/launch_manager/src/daemon/src/configuration/details/flatbuffer_type_converters.cpp ?

NicolasFussberger · 2026-06-23T07:01:56Z

 class MachineConfigFactory : public watchdog::IDeviceConfigFactory
 {
 public:
    /// @brief Holds different buffer sizes that may be configured in the HM Machine Config


The comment is no longer valid.
All these buffer sizes are currently not configurable anymore.

This seems like some fine tuning we may enable with e.g. CLI flags in the future. For now the default values should be sufficient

changed to "Holds buffer sizes for supervision objects (currently using compile-time defaults)"

NicolasFussberger · 2026-06-23T07:11:39Z

+        }
+        dep.target_process_id_ = IdentifierHash{dep_name};
+        os_process.dependencies_.push_back(dep);
+    }


I think here is a bug. We need to lookup the ReadyCondition process_state for the referenced component.

I think it would be good to extend the UTs with these different state combinations to catch this

NicolasFussberger · 2026-06-23T07:19:38Z

+        for (const auto& comp_name : fallback.depends_on) {
+            auto it = component_to_process_index.find(comp_name);
+            if (it != component_to_process_index.end()) {
+                fallback_state.process_indexes_.push_back(it->second);


I think here we also need to resolve the dependencies recursively.

E.g. fallback_run_target depends_on [CompA]
and CompA depends_on [CompB] then

fallback_run_target = [CompA, CompB]

I think it would be good to extend the UT with this case

paulquiring requested a deployment to workflow-approval June 15, 2026 14:18 — with GitHub Actions Waiting

paulquiring requested a deployment to workflow-approval June 16, 2026 06:42 — with GitHub Actions Waiting

paulquiring commented Jun 16, 2026

View reviewed changes

Comment thread scripts/config_mapping/lifecycle_config.py

paulquiring marked this pull request as ready for review June 16, 2026 08:02

paulquiring requested a review from NicolasFussberger June 16, 2026 08:02