Simd revisited #4742

Open
shai-almog wants to merge 8 commits into master from simd-revisite

@shai-almog
Collaborator

No description provided.

shai-almog changed the title from "Simd revisite" to "Simd revisited" Apr 13, 2026
@github-actions
Contributor

github-actions bot commented Apr 13, 2026

✅ Continuous Quality Report

Test & Coverage

Static Analysis

Generated automatically by the PR CI workflow.

@shai-almog
Collaborator Author

shai-almog commented Apr 13, 2026

Compared 37 screenshots: 37 matched.
✅ Native iOS screenshot tests passed.

Benchmark Results

  • VM Translation Time: 0 seconds
  • Compilation Time: 75 seconds

Detailed Performance Metrics

| Metric | Duration |
| --- | --- |
| Simulator Boot | 0 ms |
| Simulator Boot (Run) | 1000 ms |
| App Install | 1000 ms |
| App Launch | 3000 ms |
| Test Execution | 158000 ms |
| Base64 payload size | 8192 bytes |
| Base64 benchmark iterations | 6000 |
| Base64 native encode | 1561.000 ms |
| Base64 CN1 encode | 1873.000 ms |
| Base64 encode ratio (CN1/native) | 1.200x (20.0% slower) |
| Base64 native decode | 1052.000 ms |
| Base64 CN1 decode | 1059.000 ms |
| Base64 decode ratio (CN1/native) | 1.007x (0.7% slower) |
| Base64 SIMD encode | 3721.000 ms |
| Base64 encode ratio (SIMD/native) | 2.384x (138.4% slower) |
| Base64 encode ratio (SIMD/CN1) | 1.987x (98.7% slower) |
| Base64 SIMD decode | 1882.000 ms |
| Base64 decode ratio (SIMD/native) | 1.789x (78.9% slower) |
| Base64 decode ratio (SIMD/CN1) | 1.777x (77.7% slower) |
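
For readers checking the ratio rows, the figures appear to be the candidate time divided by the baseline time, with the percentage being the distance from 1.0. The sketch below reproduces that arithmetic; the class and method names are hypothetical and are not taken from the CI workflow.

```java
import java.util.Locale;

// Hypothetical helper that mirrors the "ratio (X/Y)" rows of the report.
final class RatioSketch {
    // Formats candidate/baseline as "<ratio>x (<percent>% slower|faster)".
    static String ratioLabel(double candidateMs, double baselineMs) {
        double ratio = candidateMs / baselineMs;
        double percent = Math.abs(ratio - 1.0) * 100.0;
        String direction = ratio > 1.0 ? "slower" : "faster";
        return String.format(Locale.ROOT, "%.3fx (%.1f%% %s)", ratio, percent, direction);
    }

    public static void main(String[] args) {
        // iOS SIMD encode vs. native encode, using the timings reported in this comment.
        System.out.println(ratioLabel(3721.0, 1561.0));
    }
}
```

A ratio below 1.0 is reported as "faster", which is how the Android rows later in this conversation read.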

@github-actions
Contributor

github-actions bot commented Apr 13, 2026

✅ ByteCodeTranslator Quality Report

Test & Coverage

  • Tests: 381 total, 0 failed, 2 skipped

Benchmark Results

  • Execution Time: 10551 ms

  • Hotspots (Top 20 sampled methods):

    • 21.85% java.lang.String.indexOf (407 samples)
    • 19.43% com.codename1.tools.translator.Parser.isMethodUsed (362 samples)
    • 14.92% com.codename1.tools.translator.Parser.addToConstantPool (278 samples)
    • 11.16% java.util.ArrayList.indexOf (208 samples)
    • 4.46% java.lang.Object.hashCode (83 samples)
    • 2.15% java.lang.System.identityHashCode (40 samples)
    • 1.72% com.codename1.tools.translator.ByteCodeClass.fillVirtualMethodTable (32 samples)
    • 1.56% com.codename1.tools.translator.ByteCodeClass.markDependent (29 samples)
    • 1.45% com.codename1.tools.translator.ByteCodeClass.calcUsedByNative (27 samples)
    • 1.40% java.lang.StringBuilder.append (26 samples)
    • 0.97% com.codename1.tools.translator.BytecodeMethod.optimize (18 samples)
    • 0.97% com.codename1.tools.translator.BytecodeMethod.equals (18 samples)
    • 0.91% java.util.ArrayList$Itr.hasNext (17 samples)
    • 0.91% com.codename1.tools.translator.Parser.generateClassAndMethodIndexHeader (17 samples)
    • 0.81% com.codename1.tools.translator.Parser.cullMethods (15 samples)
    • 0.75% com.codename1.tools.translator.BytecodeMethod.isMethodUsedByNative (14 samples)
    • 0.59% java.lang.StringCoding.encode (11 samples)
    • 0.54% com.codename1.tools.translator.Parser.getClassByName (10 samples)
    • 0.48% com.codename1.tools.translator.ByteCodeField.equals (9 samples)
    • 0.48% org.objectweb.asm.ClassReader.readCode (9 samples)
  • ⚠️ Coverage report not generated.

Static Analysis

  • ✅ SpotBugs: no findings (report was not generated by the build).
  • ⚠️ PMD report not generated.
  • ⚠️ Checkstyle report not generated.

Generated automatically by the PR CI workflow.

@shai-almog
Collaborator Author

shai-almog commented Apr 13, 2026

Compared 37 screenshots: 37 matched.

Native Android coverage

  • 📊 Line coverage: 7.83% (4127/52738 lines covered) [HTML preview] (artifact android-coverage-report, jacocoAndroidReport/html/index.html)
    • Other counters: instruction 6.15% (20437/332320), branch 2.98% (956/32062), complexity 3.65% (1121/30680), method 6.41% (917/14314), class 10.63% (202/1900)
    • Lowest covered classes
      • kotlin.collections.kotlin.collections.ArraysKt___ArraysKt – 0.00% (0/6327 lines covered)
      • kotlin.collections.unsigned.kotlin.collections.unsigned.UArraysKt___UArraysKt – 0.00% (0/2384 lines covered)
      • org.jacoco.agent.rt.internal_b6258fc.asm.org.jacoco.agent.rt.internal_b6258fc.asm.ClassReader – 0.00% (0/1519 lines covered)
      • kotlin.collections.kotlin.collections.CollectionsKt___CollectionsKt – 0.00% (0/1148 lines covered)
      • org.jacoco.agent.rt.internal_b6258fc.asm.org.jacoco.agent.rt.internal_b6258fc.asm.MethodWriter – 0.00% (0/923 lines covered)
      • kotlin.sequences.kotlin.sequences.SequencesKt___SequencesKt – 0.00% (0/730 lines covered)
      • kotlin.text.kotlin.text.StringsKt___StringsKt – 0.00% (0/623 lines covered)
      • org.jacoco.agent.rt.internal_b6258fc.asm.org.jacoco.agent.rt.internal_b6258fc.asm.Frame – 0.00% (0/564 lines covered)
      • kotlin.collections.kotlin.collections.ArraysKt___ArraysJvmKt – 0.00% (0/495 lines covered)
      • kotlinx.coroutines.kotlinx.coroutines.JobSupport – 0.00% (0/423 lines covered)

✅ Native Android screenshot tests passed.

Benchmark Results

Detailed Performance Metrics

| Metric | Duration |
| --- | --- |
| Base64 payload size | 8192 bytes |
| Base64 benchmark iterations | 6000 |
| Base64 native encode | 983.000 ms |
| Base64 CN1 encode | 314.000 ms |
| Base64 encode ratio (CN1/native) | 0.319x (68.1% faster) |
| Base64 native decode | 954.000 ms |
| Base64 CN1 decode | 318.000 ms |
| Base64 decode ratio (CN1/native) | 0.333x (66.7% faster) |

shai-almog and others added 5 commits April 13, 2026 07:39
…e64 SIMD in Java (#4745)

* Add NEON-accelerated base64Encode/base64Decode to Simd API and wire into Base64 SIMD methods

Agent-Logs-Url: https://github.com/codenameone/CodenameOne/sessions/c218992e-943a-4ce5-8d63-f82c0792416f

Co-authored-by: shai-almog <67850168+shai-almog@users.noreply.github.com>

* Address code review: clarify loop conditions and comment in NEON base64

Agent-Logs-Url: https://github.com/codenameone/CodenameOne/sessions/c218992e-943a-4ce5-8d63-f82c0792416f

Co-authored-by: shai-almog <67850168+shai-almog@users.noreply.github.com>

* Add new SIMD generic primitive declarations and remove base64 methods from IOSSimd

- Added shl, shrLogical, addWrapping, subWrapping for byte arrays
- Added offset-based overloads for unpackUnsignedByteToInt, add, cmpEq, cmpLt, select
- Removed base64Encode and base64Decode declarations

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: shai-almog <67850168+shai-almog@users.noreply.github.com>
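
To make the byte primitives named in this commit concrete, here is a scalar Java sketch of what shl, shrLogical, addWrapping and subWrapping would plausibly compute per element. The class name, signatures and parameter order are assumptions for illustration only; the actual Simd declarations are not shown in this conversation.

```java
// Hypothetical scalar reference for the byte primitives; not the real Simd API.
final class ByteOpsSketch {
    // Shift each byte left by `bits`, keeping only the low 8 bits of the result.
    static void shl(byte[] src, int bits, byte[] dst, int offset, int length) {
        for (int i = offset; i < offset + length; i++) {
            dst[i] = (byte) ((src[i] & 0xFF) << bits);
        }
    }

    // Logical right shift: mask to an unsigned value first so the sign bit is not smeared.
    static void shrLogical(byte[] src, int bits, byte[] dst, int offset, int length) {
        for (int i = offset; i < offset + length; i++) {
            dst[i] = (byte) ((src[i] & 0xFF) >>> bits);
        }
    }

    // Modulo-256 addition: the narrowing cast discards any carry out of bit 7.
    static void addWrapping(byte[] a, byte[] b, byte[] dst, int offset, int length) {
        for (int i = offset; i < offset + length; i++) {
            dst[i] = (byte) (a[i] + b[i]);
        }
    }

    // Modulo-256 subtraction, wrapping in the same way as addWrapping.
    static void subWrapping(byte[] a, byte[] b, byte[] dst, int offset, int length) {
        for (int i = offset; i < offset + length; i++) {
            dst[i] = (byte) (a[i] - b[i]);
        }
    }
}
```

The masking with 0xFF matters because Java bytes are signed: without it, a right shift of a negative byte would fill with ones instead of zeros.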

* Replace base64 overrides with generic SIMD primitive validation wrappers in JavaSESimd

Remove base64Encode and base64Decode overrides. Add validation wrapper
overrides for new generic Simd primitives: shl, shrLogical, addWrapping,
subWrapping, unpackUnsignedByteToInt, add (int[]), cmpEq, cmpLt, and
select.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: shai-almog <67850168+shai-almog@users.noreply.github.com>

* Replace base64 NEON section with generic SIMD primitives in IOSSimd.m

Remove NEON-accelerated Base64 encode/decode implementations and add
NEON implementations for new generic Simd primitives: shl, shrLogical,
addWrapping, subWrapping, unpackUnsignedByteToInt, add (int), cmpEq,
cmpLt, and select (with offset parameters).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: shai-almog <67850168+shai-almog@users.noreply.github.com>

* Remove base64 methods from Simd; add generic byte/int primitives; rewrite Base64 SIMD in Java

- Remove base64Encode/base64Decode from Simd, IOSSimd, JavaSESimd, IOSSimd.m
- Add generic byte shift primitives: shl(byte[]), shrLogical(byte[])
- Add wrapping byte arithmetic: addWrapping(byte[]), subWrapping(byte[])
- Add offset-based int overloads: unpackUnsignedByteToInt, add, cmpLt, cmpEq, select
- Add NEON implementations for all new primitives in IOSSimd.m
- Add validation wrappers in JavaSESimd.java
- Rewrite encodeNoNewlineSimd in Java using int-domain SIMD compare/select
- Rewrite decodeNoWhitespaceSimd in Java using int-domain SIMD shift/or
- Update SimdTest with tests for new primitives
- All 2345 tests pass

Agent-Logs-Url: https://github.com/codenameone/CodenameOne/sessions/5fc6976c-f0ec-4638-b22b-2cbc9c9ca5dd

Co-authored-by: shai-almog <67850168+shai-almog@users.noreply.github.com>
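
The "int-domain SIMD compare/select" encode mentioned above can be illustrated with a well-known branch-free mapping from 6-bit values to Base64 ASCII: each range of the alphabet gets a constant offset, chosen by lane-wise compares and blends. The sketch below models the lanes with plain int arrays. The helper names echo cmpLt, cmpEq and select from the commit, but their signatures here are assumptions and this is not the PR's encodeNoNewlineSimd.

```java
import java.util.Arrays;

// Scalar model of lane-wise compare/select; helper signatures are hypothetical.
final class Base64SelectSketch {
    // Returns -1 (all bits set) in each lane where a[i] < bound, else 0.
    static int[] cmpLt(int[] a, int bound) {
        int[] mask = new int[a.length];
        for (int i = 0; i < a.length; i++) mask[i] = a[i] < bound ? -1 : 0;
        return mask;
    }

    // Returns -1 in each lane where a[i] == value, else 0.
    static int[] cmpEq(int[] a, int value) {
        int[] mask = new int[a.length];
        for (int i = 0; i < a.length; i++) mask[i] = a[i] == value ? -1 : 0;
        return mask;
    }

    // Branch-free blend: whenSet where the mask is all ones, the old lane otherwise.
    static int[] select(int[] mask, int whenSet, int[] otherwise) {
        int[] r = new int[mask.length];
        for (int i = 0; i < mask.length; i++) {
            r[i] = (whenSet & mask[i]) | (otherwise[i] & ~mask[i]);
        }
        return r;
    }

    // Maps 6-bit values (0..63) to Base64 ASCII codes by adding a per-range offset.
    static int[] sextetsToAscii(int[] v) {
        int[] delta = new int[v.length];
        Arrays.fill(delta, -16);                   // 63 -> '/'
        delta = select(cmpEq(v, 62), -19, delta);  // 62 -> '+'
        delta = select(cmpLt(v, 62), -4, delta);   // 52..61 -> '0'..'9'
        delta = select(cmpLt(v, 52), 71, delta);   // 26..51 -> 'a'..'z'
        delta = select(cmpLt(v, 26), 65, delta);   // 0..25 -> 'A'..'Z'
        int[] out = new int[v.length];
        for (int i = 0; i < v.length; i++) out[i] = v[i] + delta[i];
        return out;
    }
}
```

Later selects override earlier ones, so the narrowest range is applied last. On real hardware each helper would be one vector instruction over several lanes rather than a loop.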

* Optimize SIMD Base64: replace slow scatter/gather + per-element SIMD calls with fast inline scalar Java

The previous approach called 15+ individual Simd operations per 48-byte chunk
(each a virtual dispatch + JNI transition on iOS), plus scalar scatter/gather
loops for byte↔int conversion. This added ~2500 JNI transitions per encode
of 8KB, making it 64-109% slower than the already-fast scalar code.

Replace with the same 4x-unrolled table-lookup approach used by
encodeNoNewline(), now with offset support. This matches the scalar
CN1 encode/decode performance while maintaining the same API contract.

Agent-Logs-Url: https://github.com/codenameone/CodenameOne/sessions/90b8c36e-8f20-47da-9fb4-56344f18a336

Co-authored-by: shai-almog <67850168+shai-almog@users.noreply.github.com>
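
The transition estimate in this commit message is consistent with simple arithmetic: 8192 bytes / 48 bytes per chunk is about 170 chunks, and 170 chunks x 15 calls is about 2550 calls per encode. For comparison, a table-lookup encoder with offset support needs no per-chunk calls at all. The following is a minimal sketch of that style, written without the 4x unrolling for brevity; the class and method names are hypothetical and this is not the code from the reverted commit.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical table-lookup Base64 encoder with offset support (no line breaks).
final class Base64TableSketch {
    private static final byte[] TABLE =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
            .getBytes(StandardCharsets.US_ASCII);

    // Encodes in[inOff .. inOff+len) into out starting at outOff; returns bytes written.
    static int encode(byte[] in, int inOff, int len, byte[] out, int outOff) {
        int i = inOff;
        int o = outOff;
        int rem = len % 3;
        int end = inOff + len - rem;
        // Main loop: pack 3 input bytes into 24 bits, emit 4 table lookups.
        while (i < end) {
            int b = ((in[i] & 0xFF) << 16) | ((in[i + 1] & 0xFF) << 8) | (in[i + 2] & 0xFF);
            out[o] = TABLE[b >>> 18];
            out[o + 1] = TABLE[(b >>> 12) & 63];
            out[o + 2] = TABLE[(b >>> 6) & 63];
            out[o + 3] = TABLE[b & 63];
            i += 3;
            o += 4;
        }
        // Tail: one or two leftover bytes, padded with '='.
        if (rem == 1) {
            int b = (in[i] & 0xFF) << 16;
            out[o++] = TABLE[b >>> 18];
            out[o++] = TABLE[(b >>> 12) & 63];
            out[o++] = '=';
            out[o++] = '=';
        } else if (rem == 2) {
            int b = ((in[i] & 0xFF) << 16) | ((in[i + 1] & 0xFF) << 8);
            out[o++] = TABLE[b >>> 18];
            out[o++] = TABLE[(b >>> 12) & 63];
            out[o++] = TABLE[(b >>> 6) & 63];
            out[o++] = '=';
        }
        return o - outOff;
    }
}
```

The caller is expected to size `out` to at least `outOff + 4 * ((len + 2) / 3)` bytes.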

* Revert "Optimize SIMD Base64: replace slow scatter/gather + per-element SIMD calls with fast inline scalar Java"

This reverts commit 00e5103.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: shai-almog <67850168+shai-almog@users.noreply.github.com>