feat: add language variant support by userlaurin · Pull Request #286 · spcl/serverless-benchmarks

userlaurin · 2026-02-27T10:06:43Z

Add --language-variant CLI flag to local start and experiment invoke
Select variant-tagged Docker images (run.local.lang.variant.ver)
Apply per-variant patch files (benchmarks/bm/lang/variant/patch.diff) during code packaging
Extend cache keys to include variant; store variant in FunctionConfig
Add build_docker_images.py support for variant image tags
Added requirement for language variant patches "patch-ng" that can apply a patch for a variant to the default

Summary by CodeRabbit

New Features
- Added language variants support (Bun, LLRT, PyPy) with --language-variant CLI option
- Enhanced caching and container management for variant-aware deployments
Documentation
- Added comprehensive Language Variants deployment and integration guide

- Add --language-variant CLI flag to local start and experiment invoke - Select variant-tagged Docker images (run.local.<lang>.<variant>.<ver>) - Apply per-variant patch files (benchmarks/<bm>/<lang>/<variant>/patch.diff) during code packaging - Extend cache keys to include variant; store variant in FunctionConfig - Add build_docker_images.py support for variant image tags

coderabbitai · 2026-02-27T10:06:50Z

Warning

Rate limit exceeded

@userlaurin has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 19 minutes and 55 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 400b02e1-447d-4726-a5f3-9ae7a936c8e7

📥 Commits

Reviewing files that changed from the base of the PR and between e66a5c6 and 3d64355.

📒 Files selected for processing (5)

config/nodejs.json
config/storage.json
docs/build.md
sebs/cpp_dependencies.py
sebs/faas/function.py

📝 Walkthrough

Walkthrough

Introduces language variant support enabling multiple runtime implementations (Bun, LLRT, PyPy) per language. Adds variant configuration, deserialization, caching, and deployment logic across the benchmarking framework. Removes deprecated C++ dependencies module and storage configuration.

Changes

Cohort / File(s)	Summary
Variant Core Infrastructure `sebs/benchmark.py`, `sebs/faas/function.py`, `sebs/config.py`	New `LanguageSpec` class models languages with variants. `BenchmarkConfig` replaced `languages` with `language_specs` and added `supported_variants()` and `supports()` methods. `Benchmark` extended with `language_variant` and `cache_language_key` properties; `hash_directory()` now accepts variant parameter. New `Variant` and `Architecture` classes in `function.py` with per-language enum namespaces. `Runtime` dataclass updated to carry variant field with language-specific defaults.
Cache and Configuration Updates `sebs/cache.py`, `sebs/config.py`	Cache layer updated to use `cache_language_key` instead of `language_name` in code package operations. `SeBSConfig.supported_variants()` method added to retrieve variant lists per deployment and language.
CLI and Dependencies `sebs.py`, `requirements.txt`	Added `--language-variant` CLI parameter and propagated through config as `experiments.runtime.variant`. Added `patch-ng` dependency for variant patch application.
Provider Implementations `sebs/aws/aws.py`, `sebs/local/local.py`	Minor formatting in AWS provider. Local provider updated to extend container names with variant suffix when non-default.
Documentation `docs/build.md`	New Language Variants section covering patch/overlay variant strategies, registration in systems.json, Dockerfile placement, image tagging, cache isolation, and variant addition checklist with examples.
Removals `sebs/cpp_dependencies.py`, `config/storage.json`	Deleted entire C++ dependencies module with `CppDependencyConfig`, `CppDependencies` enum, and Docker/CMake generation utilities. Removed storage configuration (minio and scylladb sections).

Sequence Diagram(s)

sequenceDiagram
    participant CLI as CLI (sebs.py)
    participant BenchConfig as BenchmarkConfig
    participant Benchmark as Benchmark
    participant Runtime as Runtime
    participant FunctionConfig as FunctionConfig
    participant Variant as Variant Resolution

    CLI->>BenchConfig: parse language_variant param
    CLI->>BenchConfig: deserialize language_specs
    BenchConfig->>Variant: validate supports(language, variant)
    Variant-->>BenchConfig: variant supported
    BenchConfig->>Benchmark: initialize with config
    Benchmark->>Benchmark: set language_variant from config.runtime.variant
    Benchmark->>Benchmark: compute cache_language_key (language+variant)
    Benchmark->>Benchmark: hash_directory(dir, deploy, language, variant)
    Benchmark->>Benchmark: apply patch or overlay based on variant
    FunctionConfig->>Runtime: construct Runtime with variant
    Runtime->>Variant: for_language(language, variant_value)
    Variant-->>Runtime: assign language-specific enum
    Runtime-->>FunctionConfig: Runtime with variant field

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Add support for C++ benchmarks. #99 — Deletes the same sebs/cpp_dependencies.py module being removed in this PR.
Improve SeBS documentation #244 — Modifies core infrastructure files (sebs/benchmark.py, sebs/faas/function.py) with type annotations and documentation in the same areas where variant support is being added.
Container support for AWS #205 — Modifies shared core modules (sebs/benchmark.py, sebs/cache.py, sebs.py) affecting benchmark configuration and caching logic.

Suggested reviewers

mcopik

Poem

🐰 Variants hop in, patch by patch,
Each language finds its perfect match!
PyPy, Bun, and LLRT flow,
Cache keys jingle, fast they go!
A rabbit's blessing on this quest—
Runtime choices, truly blessed! 🌟

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'feat: add language variant support' accurately and concisely summarizes the main change—introducing language variant support across the codebase.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

…requirements.txt

sebs/faas/function.py

sebs/benchmark.py

tools/build_docker_images.py

- Replace flat Variant enum with nested namespace class (Variant.Python, Variant.NodeJS) keyed by Language in Variant._LANG_MAP; new languages add one nested Enum + one map entry - Fix Runtime.__post_init__ to skip variant resolution when language is None (used as sentinel in ExperimentConfig) - Fix FunctionConfig._from_benchmark to carry variant through to Runtime - Remove FIXME comments and unnecessary forward-reference quotes now that benchmark.py has from __future__ import annotations - Document patch-based vs overlay-based variant formats in benchmark.py - Simplify build_variant_language to delegate to build() so SeBS version suffix and future push support are handled consistently

coderabbitai

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tools/build_docker_images.py (1)
47-55: ⚠️ Potential issue | 🔴 Critical

Critical: Image tag mismatch with local.py will cause container startup failures.

The build script appends the SeBS version suffix to image tags (line 55), producing tags like:
run.local.nodejs.bun.18-1.2.0
However, sebs/local/local.py lines 149-160 construct container names without this suffix:
# For variant != "default"
container_name = "{}:run.local.{}.{}.{}".format(
    self._system_config.docker_repository(),
    code_package.language_name,
    variant,
    code_package.language_version,
)
This produces: run.local.nodejs.bun.18

Docker will fail to find the images when starting containers because the tags don't match. Either the build script must stop appending the SeBS version suffix to tags, or local.py must be updated to include the suffix when referencing built images.

🧹 Nitpick comments (2)

tools/build_docker_images.py (1)
102-115: Consider extracting shared config iteration logic.

The config iteration logic (lines 104-112) duplicates build_language (lines 84-92). While acceptable for a small script, extracting this could reduce maintenance burden.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/build_docker_images.py` around lines 102 - 115, The loop that builds
the version/base_image list is duplicated between build_variant_language and
build_language; extract that logic into a helper (e.g., get_language_configs or
iterate_language_configs) that accepts language_config and args.language_version
and returns the list of (version, base_image) pairs (handling the "base_images"
-> "x64" branch and the fallback [None, None]); then replace the duplicated
loops in build_variant_language and build_language to call this helper and
iterate its result before calling build("run", ...) so behavior remains
identical.
docs/build.md (1)
117-119: Add language specifiers to fenced code blocks.

Static analysis flagged several code blocks showing directory structures without a language specifier. Consider using text or plaintext for consistency:

Lines 117-119

Lines 128-135

Lines 147-154

Lines 195-197

Lines 201-205

Lines 209-211
Example fix
-```
+```text
 benchmarks/<id>/<language>/<variant>/
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against the current code and only fix it if needed.

In @docs/build.md around lines 117 - 119, Several fenced code blocks (e.g., the
block containing the literal "benchmarks////") are
missing language specifiers; update each of the listed fenced code blocks to use
a language token such as text or plaintext (for example change totext)
so static analysis stops flagging them; locate the exact fence instances by
searching for the literal snippets shown in the diff (e.g.,
"benchmarks////") and apply the same fix to the other
blocks mentioned in the comment.
</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🤖 Prompt for all review comments with AI agents</summary>
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @docs/build.md:

Around line 170-178: sebs/config.py currently uses the wrong key when reading
language variants: update the SeBSConfig.supported_variants() (or the method
that returns supported variants) to read "variant_images" from the languages
entry instead of "supported_variants" so it matches docs and
tools/build_docker_images.py; change the get call that currently does
languages.get(language_name, {}).get("supported_variants", ["default"]) to use
"variant_images" with the same default (["default"]) so configured
variant_images are respected throughout the codebase.

In @sebs/aws/aws.py:

Around line 135-136: The comparison and dict lookup are using language.value
(a string) against Language enum members and CONFIG_FILES keys (which are enum
members), causing mismatches and KeyError; change the condition to compare the
enum directly (e.g., use language == Language.PYTHON or language in
(Language.PYTHON, Language.NODEJS)) and use the enum language (not
language.value) when indexing CONFIG_FILES to retrieve package_config;
alternatively, if CONFIG_FILES must use strings, normalize keys consistently but
prefer the simpler fix of comparing and indexing with the Language enum
(symbols: language, Language, CONFIG_FILES, package_config).

Line 155: The comparison uses language.value (a string) against an enum member
(Language.CPP), causing mismatches; update the check in the same block (the elif
that currently reads language.value == Language.CPP) to compare like-for-like —
either compare the enum instance (language == Language.CPP) or compare the
string value to the enum's value (language.value == Language.CPP.value) so the
condition correctly detects the C++ branch.

Around line 177-182: The comparisons are incorrectly using language.value (a
string) against Language enum members; update the conditional checks to compare
the enum itself (e.g., change checks to language == Language.NODEJS, language ==
Language.CPP, and language in [Language.PYTHON]) and keep using language.value
when building return strings (e.g., f"{language.value}{language_version}" or
f"{language.value}{language_version}.x" as appropriate) so the logic in aws.py
matches enum semantics.

In @sebs/benchmark.py:

Around line 264-266: The hash calculation is passing a string
(self.language_name) into Benchmark.hash_directory which expects a Language enum
to index FILES and WRAPPERS, causing a KeyError; change the call to pass the
Language enum (e.g., self.language or convert the name to the enum via
Language[self.language_name] or Language(self.language_name) depending on how
Language is defined) so Benchmark.hash_directory receives a Language enum rather
than a string; ensure any other callers also pass the enum and update the call
site that currently uses self.language_name accordingly.

Around line 53-61: LanguageSpec.deserialize currently imports Language from
sebs.faas.function (wrong enum), causing missing CPP support; remove the local
import in deserialize and use the module-level Language (sebs.types.Language)
already imported at module scope so LanguageSpec.deserialize(...) resolves CPP
and other language variants correctly, and ensure both branches (str and dict)
call Language.deserialize from the module-level enum.

In @sebs/config.py:

Around line 54-63: The supported_variants method in sebs/config.py is reading
the wrong key from the systems config; update the lookup inside
supported_variants (the languages variable usage) to read "variant_images"
instead of "supported_variants" so it matches tools/build_docker_images.py and
docs, while preserving the fallback of ["default"]; adjust the get call on
languages.get(language_name).get(...) to use "variant_images" and leave behavior
otherwise unchanged.

In @sebs/faas/function.py:

Around line 335-339: The error message in deserialize(val: str) for
Architecture is misleading because it logs the loop variable member instead of
the invalid input; update the exception to include the original val (e.g., raise
Exception(f"Unknown architecture type {val}")) so the error reports the actual
bad config value; locate the deserialize function and replace the exception
message that currently references member with one referencing val.

Around line 271-324: Remove the locally defined Language enum so the module
uses the imported sebs.types.Language (avoids enum identity mismatches), update
Variant._LANG_MAP to include a mapping for Language.CPP -> Variant., and ensure Variant.for_language continues
to use the mapped enum class for lookups; also fix Architecture.deserialize to
reference the input parameter (val) in its error message instead of the
undefined variable member.

In @sebs/local/local.py:

Around line 147-160: The container image tag is missing the SeBS version
suffix; retrieve the SeBS version from system config (e.g., sebs_version =
self._system_config.sebs_version()) and include it in the formatted
container_name strings so the tag matches build_docker_images.py output. Update
both branches that set container_name (the two format calls that use
self._system_config.docker_repository(), code_package.language_name, variant,
and code_package.language_version) to insert the sebs_version token (e.g., as
.{sebs_version} before the language_version) so both default and variant paths
produce identical SeBS-versioned tags.

Nitpick comments:
In @docs/build.md:

Around line 117-119: Several fenced code blocks (e.g., the block containing
the literal "benchmarks////") are missing language
specifiers; update each of the listed fenced code blocks to use a language token
such as text or plaintext (for example change totext) so static analysis
stops flagging them; locate the exact fence instances by searching for the
literal snippets shown in the diff (e.g.,
"benchmarks////") and apply the same fix to the other
blocks mentioned in the comment.

In @tools/build_docker_images.py:

Around line 102-115: The loop that builds the version/base_image list is
duplicated between build_variant_language and build_language; extract that logic
into a helper (e.g., get_language_configs or iterate_language_configs) that
accepts language_config and args.language_version and returns the list of
(version, base_image) pairs (handling the "base_images" -> "x64" branch and the
fallback [None, None]); then replace the duplicated loops in
build_variant_language and build_language to call this helper and iterate its
result before calling build("run", ...) so behavior remains identical.
</details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: defaults

**Review profile**: CHILL

**Plan**: Pro

**Run ID**: `89084e4e-de4a-4e55-a9fb-905420aca9aa`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between ce1dacbbbad71f3fce877f7adcd15f131b6928e9 and 2aba9eacf9d7af2ab1863e769a87ddb1bb9d3488.

</details>

<details>
<summary>📒 Files selected for processing (10)</summary>

* `docs/build.md`
* `requirements.txt`
* `sebs.py`
* `sebs/aws/aws.py`
* `sebs/benchmark.py`
* `sebs/cache.py`
* `sebs/config.py`
* `sebs/faas/function.py`
* `sebs/local/local.py`
* `tools/build_docker_images.py`

</details>

</details>

docs/build.md

sebs/aws/aws.py

sebs/benchmark.py

sebs/config.py

sebs/faas/function.py

sebs/local/local.py

Language was defined twice - once in sebs/types.py (str, Enum) and again in sebs/faas/function.py (plain Enum, missing CPP). This caused identity comparisons like `language == Language.NODEJS` to silently fail, requiring the workaround of comparing `.value` instead which was removed again in this commit. Remove the duplicate definition and update all import sites in benchmark.py to use the canonical Language from sebs/types.py.

…) to get back to the current handling of the master repo

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (3)

sebs/faas/function.py (2)

323-327: ⚠️ Potential issue | 🟡 Minor

Use val in the invalid-architecture error message.

Line 327 reports {member} instead of the bad input, so failures show the wrong value.

Proposed fix

-        raise Exception(f"Unknown architecture type {member}")
+        raise Exception(f"Unknown architecture type {val}")

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@sebs/faas/function.py` around lines 323 - 327, In deserialize(val: str) in
function.py the raised Exception reports the last loop variable member instead
of the invalid input; change the error to include the provided val (e.g., raise
Exception(f"Unknown architecture type {val}")) so the message shows the actual
bad input when Architecture parsing fails; keep function and Architecture
references unchanged.

309-312: ⚠️ Potential issue | 🔴 Critical

Add C++ default variant mapping to prevent runtime breakage.

Line 309-312 omits Language.CPP, so Variant.for_language(...) fails for C++ in both Runtime.deserialize (Line 355) and FunctionConfig._from_benchmark (Line 374).

Proposed fix

 class Variant:
@@
     class NodeJS(Enum):
         DEFAULT = "default"
         BUN = "bun"
         LLRT = "llrt"
+
+    class CPP(Enum):
+        DEFAULT = "default"
@@
 Variant._LANG_MAP: Dict[Language, Type[Enum]] = {
     Language.PYTHON: Variant.Python,
     Language.NODEJS: Variant.NodeJS,
+    Language.CPP: Variant.CPP,
 }

Also applies to: 351-356, 371-375

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@sebs/faas/function.py` around lines 309 - 312, The mapping Variant._LANG_MAP
is missing an entry for C++ which causes Variant.for_language(...) to fail in
Runtime.deserialize and FunctionConfig._from_benchmark; add Language.CPP:
Variant.Cpp (or the actual C++ Variant class name used) to Variant._LANG_MAP
alongside Language.PYTHON and Language.NODEJS so Variant.for_language can
resolve C++ variants correctly, and update any other similar maps around
Runtime.deserialize and FunctionConfig._from_benchmark if duplicated.

sebs/benchmark.py (1)

262-264: ⚠️ Potential issue | 🔴 Critical

Normalize language key type usage (Enum vs string) in hashing and overlay copy paths.

Line 263 passes self.language_name (string), while Line 355 and Line 376 index enum-keyed maps; Line 475 also switches to string-key indexing. This inconsistency can break variant hashing/copy behavior.

Proposed fix

-        self._hash_value = Benchmark.hash_directory(
-            path, self._deployment_name, self.language_name, self._language_variant
-        )
+        self._hash_value = Benchmark.hash_directory(
+            path, self._deployment_name, self.language, self._language_variant
+        )
@@
-    def hash_directory(directory: str, deployment: str, language: str, variant: str = "default"):
+    def hash_directory(
+        directory: str, deployment: str, language: Language, variant: str = "default"
+    ):
@@
             wrappers = project_absolute_path(
-                "benchmarks", "wrappers", deployment, language, wrapper
+                "benchmarks", "wrappers", deployment, language.value, wrapper
             )
@@
-                for file_type in FILES[self.language_name]:
+                for file_type in FILES[self.language]:

Also applies to: 341-355, 377-379, 475-476

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@sebs/benchmark.py` around lines 262 - 264, The code mixes enum keys and
string keys when building hashes and accessing language-indexed maps: normalize
to a single key type by converting language_name to the Enum (or vice versa)
before use; update Benchmark.hash_directory calls and any places that index
language-keyed dicts (e.g., places that use self.language_name, overlay copy
logic, and any map lookups referenced around hash computation and copy paths) to
consistently use the Enum member (or consistently use its .name/.value) across
Benchmark.hash_directory, the overlay copy routines, and all map accesses so the
same key type is used everywhere.

🧹 Nitpick comments (1)

sebs/benchmark.py (1)

50-51: Return a copy from LanguageSpec.variants to protect internal state.

Exposing the backing list allows callers to mutate benchmark config state implicitly.

Proposed refactor

     `@property`
     def variants(self) -> List[str]:
-        return self._variants
+        return list(self._variants)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@sebs/benchmark.py` around lines 50 - 51, LanguageSpec.variants currently
returns the internal list self._variants directly, exposing mutable internal
state; change the method to return a shallow copy (e.g., return
list(self._variants) or self._variants.copy()) so callers cannot mutate the
LanguageSpec._variants backing list, and ensure the return type remains
List[str] to preserve the signature.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@sebs/faas/function.py`:
- Line 315: A new local enum class named Architecture (class Architecture(Enum))
conflicts with an imported Architecture symbol (imported at the top of the
file), causing an F811 redefinition; fix by either removing the imported
Architecture from the imports and updating any external references accordingly,
or rename the local enum (e.g., LocalArchitecture or BackendArchitecture) and
update all local references in this file (constructors, type hints, comparisons,
and any usages of Architecture in sebs/faas/function.py) to the new name so
there is no name collision.

---

Duplicate comments:
In `@sebs/benchmark.py`:
- Around line 262-264: The code mixes enum keys and string keys when building
hashes and accessing language-indexed maps: normalize to a single key type by
converting language_name to the Enum (or vice versa) before use; update
Benchmark.hash_directory calls and any places that index language-keyed dicts
(e.g., places that use self.language_name, overlay copy logic, and any map
lookups referenced around hash computation and copy paths) to consistently use
the Enum member (or consistently use its .name/.value) across
Benchmark.hash_directory, the overlay copy routines, and all map accesses so the
same key type is used everywhere.

In `@sebs/faas/function.py`:
- Around line 323-327: In deserialize(val: str) in function.py the raised
Exception reports the last loop variable member instead of the invalid input;
change the error to include the provided val (e.g., raise Exception(f"Unknown
architecture type {val}")) so the message shows the actual bad input when
Architecture parsing fails; keep function and Architecture references unchanged.
- Around line 309-312: The mapping Variant._LANG_MAP is missing an entry for C++
which causes Variant.for_language(...) to fail in Runtime.deserialize and
FunctionConfig._from_benchmark; add Language.CPP: Variant.Cpp (or the actual C++
Variant class name used) to Variant._LANG_MAP alongside Language.PYTHON and
Language.NODEJS so Variant.for_language can resolve C++ variants correctly, and
update any other similar maps around Runtime.deserialize and
FunctionConfig._from_benchmark if duplicated.

---

Nitpick comments:
In `@sebs/benchmark.py`:
- Around line 50-51: LanguageSpec.variants currently returns the internal list
self._variants directly, exposing mutable internal state; change the method to
return a shallow copy (e.g., return list(self._variants) or
self._variants.copy()) so callers cannot mutate the LanguageSpec._variants
backing list, and ensure the return type remains List[str] to preserve the
signature.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 399467d7-c174-4ceb-b76c-b7fdbd23409b

📥 Commits

Reviewing files that changed from the base of the PR and between 2aba9ea and 44b54b4.

📒 Files selected for processing (3)

sebs/aws/aws.py
sebs/benchmark.py
sebs/faas/function.py

🚧 Files skipped from review as they are similar to previous changes (1)

sebs/aws/aws.py

sebs/faas/function.py

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

sebs/benchmark.py (1)

633-635: ⚠️ Potential issue | 🟠 Major

Variant cache hash misses versioned Node package files.

package.json.<version> is used in overlay variants (Line [789]-Line [793]), but the hash input patterns (Line [634], Line [659]) exclude it. Cache entries can stay “valid” after variant dependency changes.

Proposed fix

         FILES = {
             Language.PYTHON: ["*.py", "requirements.txt*"],
-            Language.NODEJS: ["*.js", "package.json"],
+            Language.NODEJS: ["*.js", "package.json*"],
             Language.CPP: ["*.cpp", "*.hpp", "dependencies.json"],
         }

Also applies to: 649-662, 789-793

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@sebs/benchmark.py` around lines 633 - 635, The NodeJS hash input patterns
omit versioned package files (e.g., package.json.<version>) so variant cache
hashes miss changes; update the Language.NODEJS pattern lists used for hashing
to include versioned package files by adding "package.json.*" (or
"package.json*") alongside "package.json" wherever the hash input patterns are
defined (references: Language.NODEJS and the lists around the existing ["*.js",
"package.json"] entries and the later overlay/variant handling that uses
package.json.<version>); apply the same change to each occurrence of the NodeJS
pattern lists so variant dependency changes invalidate the cache correctly.

♻️ Duplicate comments (3)

sebs/faas/function.py (3)

567-570: ⚠️ Potential issue | 🔴 Critical

C++ flows break because Variant has no Language.CPP mapping.

Line [607] and Line [678] both rely on Variant.for_language(...); with the current map, any C++ benchmark can fail with ValueError("No variants defined for language ...").

Proposed fix

 class Variant:
@@
     class NodeJS(Enum):
         DEFAULT = "default"
         BUN = "bun"
         LLRT = "llrt"
+
+    class Cpp(Enum):
+        DEFAULT = "default"
@@
 Variant._LANG_MAP: Dict[Language, Type[Enum]] = {
     Language.PYTHON: Variant.Python,
     Language.NODEJS: Variant.NodeJS,
+    Language.CPP: Variant.Cpp,
 }

Also applies to: 605-607, 675-679

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@sebs/faas/function.py` around lines 567 - 570, The _LANG_MAP in Variant is
missing an entry for Language.CPP causing Variant.for_language(...) to raise
ValueError for C++ benchmarks; add a mapping for Language.CPP to the appropriate
enum class (e.g., Variant.Cpp or the correct C++ variant Enum) inside
Variant._LANG_MAP so that Variant.for_language(Language.CPP) succeeds and C++
flows on lines referencing Variant.for_language no longer fail.

581-585: ⚠️ Potential issue | 🟡 Minor

Use the invalid input value in architecture deserialization errors.

Line [585] currently reports member instead of the actual invalid val, which makes debugging harder.

Proposed fix

-        raise Exception(f"Unknown architecture type {member}")
+        raise Exception(f"Unknown architecture type {val}")

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@sebs/faas/function.py` around lines 581 - 585, The deserialize function in
sebs/faas/function.py returns an error that references the loop variable member
instead of the invalid input; update the exception in deserialize(val: str) to
include the actual invalid input val (e.g., raise Exception(f"Unknown
architecture type {val}")) so errors correctly report the value that failed to
deserialize; ensure the change is made in the deserialize function and preserves
the existing loop over Architecture members.

26-26: ⚠️ Potential issue | 🟠 Major

Resolve Architecture redefinition (Ruff F811).

Line [26] imports Architecture, then Line [573] redefines it in the same module. This is an avoidable redefinition error and makes the active type ambiguous.

Proposed fix

-from sebs.types import Language, Architecture
+from sebs.types import Language

Also applies to: 573-573

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@sebs/faas/function.py` at line 26, The module currently imports Architecture
from sebs.types and also defines a local Architecture (redefinition causing Ruff
F811); pick one source and remove the other: if the file should use the shared
type, delete the local Architecture definition (the one around the later
redefinition) and update any local references to use the imported Architecture;
if the local definition is intentional and distinct, rename that local symbol
(e.g., LocalArchitecture or FunctionArchitecture) and update all references in
this file to that new name so there is no name collision with the imported
Architecture.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/build.md`:
- Around line 118-120: Several fenced code blocks in docs/build.md are unlabeled
(MD040); update each triple-backtick block that currently contains examples like
"benchmarks/<id>/<language>/<variant>/", the longer benchmark trees (e.g.,
"benchmarks/100.webapps/110.dynamic-html/..."), the dockerfile paths
("dockerfiles/<system>/<language>/<variant>/Dockerfile.run",
"dockerfiles/local/nodejs/bun/Dockerfile.run", etc.), and the image tag example
("<docker_repository>:run.<system>.<language>.<variant>.<version>-<SeBS_version>")
to include an explicit language after the opening ``` (use text for plain
paths/trees, bash for shell examples, json where appropriate) so each fenced
block is labeled and satisfies markdownlint MD040; locate the blocks by their
content strings and add the appropriate language identifiers.

In `@sebs/benchmark.py`:
- Around line 785-787: FILES is keyed by the Language enum but the loop uses
self.language_name (a string), causing a KeyError for overlay variants; change
the lookup to use the enum value (e.g., self.language or convert the string to
Language) when iterating FILES in the block that copies files from variant_dir
to output_dir so the key types match (ensure you reference FILES,
self.language_name, and Language enum when making the fix).

---

Outside diff comments:
In `@sebs/benchmark.py`:
- Around line 633-635: The NodeJS hash input patterns omit versioned package
files (e.g., package.json.<version>) so variant cache hashes miss changes;
update the Language.NODEJS pattern lists used for hashing to include versioned
package files by adding "package.json.*" (or "package.json*") alongside
"package.json" wherever the hash input patterns are defined (references:
Language.NODEJS and the lists around the existing ["*.js", "package.json"]
entries and the later overlay/variant handling that uses
package.json.<version>); apply the same change to each occurrence of the NodeJS
pattern lists so variant dependency changes invalidate the cache correctly.

---

Duplicate comments:
In `@sebs/faas/function.py`:
- Around line 567-570: The _LANG_MAP in Variant is missing an entry for
Language.CPP causing Variant.for_language(...) to raise ValueError for C++
benchmarks; add a mapping for Language.CPP to the appropriate enum class (e.g.,
Variant.Cpp or the correct C++ variant Enum) inside Variant._LANG_MAP so that
Variant.for_language(Language.CPP) succeeds and C++ flows on lines referencing
Variant.for_language no longer fail.
- Around line 581-585: The deserialize function in sebs/faas/function.py returns
an error that references the loop variable member instead of the invalid input;
update the exception in deserialize(val: str) to include the actual invalid
input val (e.g., raise Exception(f"Unknown architecture type {val}")) so errors
correctly report the value that failed to deserialize; ensure the change is made
in the deserialize function and preserves the existing loop over Architecture
members.
- Line 26: The module currently imports Architecture from sebs.types and also
defines a local Architecture (redefinition causing Ruff F811); pick one source
and remove the other: if the file should use the shared type, delete the local
Architecture definition (the one around the later redefinition) and update any
local references to use the imported Architecture; if the local definition is
intentional and distinct, rename that local symbol (e.g., LocalArchitecture or
FunctionArchitecture) and update all references in this file to that new name so
there is no name collision with the imported Architecture.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8e024faa-2760-43e6-8f3b-83acb18a2858

📥 Commits

Reviewing files that changed from the base of the PR and between 44b54b4 and 762ba1d.

📒 Files selected for processing (8)

docs/build.md
requirements.txt
sebs/aws/aws.py
sebs/benchmark.py
sebs/cache.py
sebs/config.py
sebs/faas/function.py
sebs/local/local.py

✅ Files skipped from review due to trivial changes (1)

requirements.txt

🚧 Files skipped from review as they are similar to previous changes (3)

sebs/aws/aws.py
sebs/config.py
sebs/local/local.py

docs/build.md

sebs/benchmark.py

docs/build.md

sebs/faas/function.py

tools/build_docker_images.py

…) to get back to the current handling of the master repo

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

sebs/benchmark.py (1)

799-801: ⚠️ Potential issue | 🔴 Critical

Critical: FILES dict keyed by enum, but indexed with string.

FILES is keyed by Language enum members (Lines 745-750), but Line 799 uses self.language_name (a string like "python") to index it. This will raise KeyError when applying overlay variants.

🐛 Proposed fix

             else:
                 # Overlay-based variant...
-                for file_type in FILES[self.language_name]:
+                for file_type in FILES[self.language]:
                     for f in glob.glob(os.path.join(variant_dir, file_type)):
                         shutil.copy2(f, output_dir)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@sebs/benchmark.py` around lines 799 - 801, The loop indexes FILES with a
string (self.language_name) but FILES is keyed by Language enum members; change
the lookup to convert the string to the enum first (e.g. resolve lang_enum =
Language(self.language_name) or Language[self.language_name.upper()] depending
on how Language is defined) and then use FILES[lang_enum] when iterating files
in variant_dir and copying to output_dir so the dict is accessed by the correct
enum key.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@sebs/benchmark.py`:
- Around line 644-649: The code incorrectly constructs file paths by doing path
= os.path.join(directory, f) even though glob.glob(os.path.join(directory,
file_type)) returns full paths, causing FileNotFoundError; update the loop to
use the glob result directly (use f as the path when opening and hashing) —
i.e., replace the os.path.join(directory, f) usage so that selected_files
iteration uses the full path from glob (refer to selected_files, FILES,
NON_LANG_FILES, path, and hash_sum.update).

---

Duplicate comments:
In `@sebs/benchmark.py`:
- Around line 799-801: The loop indexes FILES with a string (self.language_name)
but FILES is keyed by Language enum members; change the lookup to convert the
string to the enum first (e.g. resolve lang_enum = Language(self.language_name)
or Language[self.language_name.upper()] depending on how Language is defined)
and then use FILES[lang_enum] when iterating files in variant_dir and copying to
output_dir so the dict is accessed by the correct enum key.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fc68df97-13dd-4998-809e-abc067f76b2e

📥 Commits

Reviewing files that changed from the base of the PR and between 762ba1d and e66a5c6.

📒 Files selected for processing (9)

config/storage.json
requirements.txt
sebs.py
sebs/aws/aws.py
sebs/benchmark.py
sebs/cache.py
sebs/config.py
sebs/cpp_dependencies.py
sebs/local/local.py

💤 Files with no reviewable changes (2)

config/storage.json
sebs/cpp_dependencies.py

🚧 Files skipped from review as they are similar to previous changes (3)

requirements.txt
sebs.py
sebs/local/local.py

coderabbitai · 2026-03-10T12:08:42Z

sebs/benchmark.py

        selected_files = FILES[language] + NON_LANG_FILES
        for file_type in selected_files:
-            for f in glob.glob(os.path.join(directory, "**", file_type), recursive=True):
-                if os.path.isfile(f):
-                    path = os.path.join(directory, f)
-                    with open(path, "rb") as opened_file:
-                        hash_sum.update(opened_file.read())
+            for f in glob.glob(os.path.join(directory, file_type)):
+                path = os.path.join(directory, f)
+                with open(path, "rb") as opened_file:
+                    hash_sum.update(opened_file.read())


⚠️ Potential issue | 🔴 Critical

Critical: Incorrect path construction will cause FileNotFoundError.

The glob result f already contains the full path (e.g., "benchmarks/001/python/handler.py"), so joining it again with directory creates an invalid path like "benchmarks/001/python/benchmarks/001/python/handler.py".

🐛 Proposed fix

for file_type in selected_files: for f in glob.glob(os.path.join(directory, file_type)): - path = os.path.join(directory, f) - with open(path, "rb") as opened_file: + with open(f, "rb") as opened_file: hash_sum.update(opened_file.read())

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@sebs/benchmark.py` around lines 644 - 649, The code incorrectly constructs file paths by doing path = os.path.join(directory, f) even though glob.glob(os.path.join(directory, file_type)) returns full paths, causing FileNotFoundError; update the loop to use the glob result directly (use f as the path when opening and hashing) — i.e., replace the os.path.join(directory, f) usage so that selected_files iteration uses the full path from glob (refer to selected_files, FILES, NON_LANG_FILES, path, and hash_sum.update).

fix: requirement for diff library patch-ng used for patches added to …

a9b6658

…requirements.txt

mcopik requested changes Feb 27, 2026

View reviewed changes

laurin and others added 3 commits February 27, 2026 15:26

docs: document language variant support (patch and overlay strategies)

e4b92e9

Merge branch 'spcl:master' into feature/language-variants

63a46a8

userlaurin marked this pull request as ready for review March 2, 2026 14:45

Merge branch 'master' into feature/language-variants

2aba9ea

coderabbitai bot reviewed Mar 4, 2026

View reviewed changes

laurin added 2 commits March 4, 2026 14:54

refactor: small change relating to language handling (class vs string…

89b8ab9

…) to get back to the current handling of the master repo

coderabbitai bot reviewed Mar 4, 2026

View reviewed changes

sebs/faas/function.py Outdated Show resolved Hide resolved

Merge branch 'master' into feature/language-variants

762ba1d

userlaurin requested a review from mcopik March 4, 2026 20:49

coderabbitai bot reviewed Mar 4, 2026

View reviewed changes

docs/build.md Show resolved Hide resolved

sebs/benchmark.py Show resolved Hide resolved

mcopik reviewed Mar 9, 2026

View reviewed changes

docs/build.md Outdated Show resolved Hide resolved

mcopik reviewed Mar 9, 2026

View reviewed changes

sebs/faas/function.py Outdated Show resolved Hide resolved

mcopik reviewed Mar 9, 2026

View reviewed changes

tools/build_docker_images.py Outdated Show resolved Hide resolved

mcopik approved these changes Mar 9, 2026

View reviewed changes

laurin added 3 commits March 10, 2026 13:02

refactor: small change relating to language handling (class vs string…

e66a5c6

…) to get back to the current handling of the master repo

refactor: removed local build script from build.md docs

3834569

refactor: removed Architecture merge artefact

0933091

coderabbitai bot reviewed Mar 10, 2026

View reviewed changes

refactor: resolve merge missing files

3d64355

Conversation

userlaurin commented Feb 27, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

userlaurin commented Feb 27, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 27, 2026 •

edited

Loading