Chart performance: lift loop throttle and de-pandas render path #424

Merged: brickbots merged 3 commits into main from chart_perf on May 20, 2026

@brickbots brickbots commented May 19, 2026

Summary

Three changes that together take the chart from the user-reported ~5 fps on a Pi to an estimated ~30 fps, plus a related 2× win on the main-menu screens. The PR is intentionally three commits so each piece can be reverted independently.

1. b501a62d — Use sleep_for_framerate in the main loop

main.py:560 was time.sleep(0.1) in the queue.Empty fallback, pinning every UI module at ~10 Hz max. Replaced with state_utils.sleep_for_framerate(shared_state) (the same throttle other UI modules already use directly): ~30 Hz when awake, ~2 Hz when asleep.

Also dropped the redundant time.sleep(0.2) from PowerManager.update() — the new throttle already sleeps longer when asleep, so the explicit sleep was unnecessary and slightly less power-saving than the new behavior.
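The throttle swap can be sketched as follows. This is a minimal illustration, not the PiFinder code: `frame_sleep_seconds` and `run_loop` are hypothetical stand-ins for `state_utils.sleep_for_framerate` and the main loop, and the constants simply restate the ~30 Hz awake and 0.5 s asleep figures quoted above.

```python
import queue
import time

AWAKE_HZ = 30   # assumed awake target, from the ~30 Hz figure above
ASLEEP_S = 0.5  # assumed asleep sleep, from the 0.5 s figure above


def frame_sleep_seconds(power_state: int) -> float:
    """Return how long one throttle call sleeps (hypothetical helper)."""
    # power_state == 0 is treated as "asleep", matching the commit message.
    return 1 / AWAKE_HZ if power_state > 0 else ASLEEP_S


def run_loop(events: queue.Queue, power_state: int, iterations: int,
             sleep=time.sleep) -> float:
    """Drain events; throttle only when the queue is empty.

    Returns the total time spent in the throttle, in seconds.
    """
    total = 0.0
    for _ in range(iterations):
        try:
            events.get(block=False)
        except queue.Empty:
            # Previously a fixed time.sleep(0.1), i.e. a ~10 Hz ceiling.
            delay = frame_sleep_seconds(power_state)
            sleep(delay)
            total += delay
    return total
```

Passing `sleep=lambda s: None` lets the loop be exercised without actually waiting, which is how the figures can be checked: 30 awake iterations account for about one second of throttle time, as do 2 asleep iterations.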

2. 9c9ec779 — Remove redundant per-module framerate sleeps

Now that the main loop throttles, the per-module sleep_for_framerate / time.sleep(1/30) calls at the top of various UI update() methods are a second 33 ms sleep per iteration — capping those screens at ~13–15 Hz instead of 30 Hz.

Removed from equipment.py, gpsstatus.py, sqm.py, software.py, status.py. Other time.sleep(...) calls in ui/ (camera settling in calibration, marking-menu flash, exposure-sweep delays, etc.) are intentional state-waits and left in place.
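The double-sleep arithmetic is easy to check. The helper below is a hypothetical back-of-envelope model (loop period = number of sleeps × sleep length + work), not code from the repository. The 8 ms work figure is an illustrative assumption chosen to land near the reported ~13 Hz, not a measurement.

```python
def loop_rate_hz(sleeps_per_iteration: int,
                 work_seconds: float = 0.0,
                 sleep_seconds: float = 1 / 30) -> float:
    """Upper bound on loop rate given fixed sleeps plus per-iteration work."""
    period = sleeps_per_iteration * sleep_seconds + work_seconds
    return 1.0 / period


two_sleeps_no_work = loop_rate_hz(2)             # ceiling with the redundant sleep
two_sleeps_some_work = loop_rate_hz(2, 0.008)    # assumed 8 ms of render work
one_sleep_no_work = loop_rate_hz(1)              # ceiling after this commit
```

With two 1/30 s sleeps the ceiling is 15 Hz before any work is done, and a few milliseconds of rendering pulls it down into the ~13–15 Hz range described above. With a single sleep the ceiling returns to 30 Hz.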

3. 229fa49e — De-pandas the chart hot path

plot.py:plot_starfield / plot_markers / radec_to_xy were doing several DataFrame.assign() chains per frame over the star catalog and the constellation edges. Each .assign() builds a brand-new DataFrame and each column access dispatches through pandas — cProfile showed ~138 pandas.Series.__init__ calls per frame, dominating the per-frame cost.

Per-frame projection / rotation / screen-space / visibility math now runs on numpy arrays cached on the Starfield instance. Pandas DataFrames are kept only at the boundaries where they buy us something:

  • Star.from_dataframe() is the documented skyfield API for building Stars in bulk; plot_markers and radec_to_xy still build a tiny DataFrame just for that call.
  • render_starfield_pil still returns visible_stars as a DataFrame (sliced from self.stars once at the end) because ui/align.py treats it as a pandas object (.iloc, .sort_values, .assign, plus ra_degrees / dec_degrees column access).
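The general pattern (numpy for the per-frame math, pandas only at the boundary) might look like the sketch below. It is illustrative only: `StarfieldSketch`, its column names (`x`, `y`, `magnitude`), and the simplified rotation and pixel mapping are assumptions for the example and do not reproduce the real `Starfield` class or skyfield's projection.

```python
import numpy as np
import pandas as pd


class StarfieldSketch:
    """Hypothetical stand-in for a chart renderer; not the PiFinder class."""

    def __init__(self, stars: pd.DataFrame, resolution: int = 128):
        self.stars = stars
        self.resolution = resolution
        # Pull the columns out of pandas once, at construction time.
        self._x = stars["x"].to_numpy(dtype=float)
        self._y = stars["y"].to_numpy(dtype=float)
        self._mag = stars["magnitude"].to_numpy(dtype=float)

    def visible(self, roll_deg: float, mag_limit: float) -> pd.DataFrame:
        # Rotate the already-projected plane coordinates about the origin.
        theta = np.radians(roll_deg)
        cos_t, sin_t = np.cos(theta), np.sin(theta)
        xr = self._x * cos_t - self._y * sin_t
        yr = self._x * sin_t + self._y * cos_t

        # Map plane coordinates in [-1, 1] to pixel coordinates.
        half = self.resolution / 2
        px = xr * half + half
        py = yr * half + half

        # Visibility is a boolean mask; no intermediate DataFrames are built.
        mask = (
            (self._mag <= mag_limit)
            & (px >= 0) & (px < self.resolution)
            & (py >= 0) & (py < self.resolution)
        )
        idx = np.flatnonzero(mask)

        # Single pandas slice at the end, plus two column writes.
        out = self.stars.iloc[idx].copy()
        out["x_pos"] = px[idx]
        out["y_pos"] = py[idx]
        return out
```

The design choice is that pandas is touched once in `__init__` and once when the result is sliced, so callers that expect a DataFrame with the catalog columns still get one while the inner math never allocates a Series.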

Measured effect (Mac, 128×128 chart at FOV 10.2°)

| Path | Before (release) | After commits 1–2 | After all 3 | Speedup |
| --- | --- | --- | --- | --- |
| Main-loop iteration cap (no work) | 10 Hz | 30 Hz | 30 Hz | |
| Menu-screen rate (trivial render) | ~13 Hz | ~27 Hz | ~27 Hz | |
| Chart plot_starfield total | 3,870 µs | 3,870 µs | 541 µs | ~7.2× |
| Chart plot_markers (3) | 1,625 µs | 1,625 µs | 355 µs | ~4.6× |
| Chart full per-frame | 5,750 µs | 5,750 µs | 1,084 µs | ~5.3× |

cProfile after this PR: per-frame pandas writes drop from 138 to 2 (the two visible_stars["x_pos"/"y_pos"] = ... assigns at the end of render_starfield_pil). The remaining dominant per-frame costs are skyfield.projections.project() and PIL's ImageChops. Both are C code, so they are at the floor for pure-Python optimization.
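A call count like the one above can be gathered with the standard-library profiler. The helper below is a hedged sketch of one way to do it. `count_calls` is a hypothetical name, and this is not necessarily how the numbers in this PR were produced.

```python
import cProfile
import pstats


def count_calls(func, name_contains: str, file_contains: str = "") -> int:
    """Profile one call of func and total the calls whose function name
    and source filename contain the given substrings."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()

    # pstats keys are (filename, lineno, funcname); values start with
    # (primitive calls, total calls, ...).
    stats = pstats.Stats(profiler).stats
    total = 0
    for (filename, _lineno, funcname), (_cc, ncalls, *_rest) in stats.items():
        if name_contains in funcname and file_contains in filename:
            total += ncalls
    return total
```

Against a real frame one might call something like `count_calls(render_one_frame, "__init__", "series")`, where `render_one_frame` is a placeholder for whatever drives a single chart update and the filename filter is an assumption about where pandas keeps its Series class on disk.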

Pi extrapolation

The user observed ~5 fps chart on the Pi before any of this work, implying ~167 ms/frame on Pi vs ~5.7 ms on Mac (~30× scaling). Applying that factor to the new 1,084 µs Mac figure suggests ~30 ms/frame on Pi, which fits within the 33 ms sleep_for_framerate budget — chart should now sustain close to 30 Hz instead of 5 Hz.
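The extrapolation is a single scale factor applied to the new Mac timing. The sketch below reruns it with the figures stated above taken as given (167 ms and 5.75 ms before, 1.084 ms after). The function name is hypothetical and the linear-scaling assumption is the same one the paragraph makes.

```python
def extrapolate_pi_frame_ms(mac_before_ms: float,
                            pi_before_ms: float,
                            mac_after_ms: float) -> tuple[float, float]:
    """Return (Mac-to-Pi scale factor, projected Pi frame time in ms),
    assuming the Pi slows every part of the frame by the same factor."""
    scale = pi_before_ms / mac_before_ms
    return scale, scale * mac_after_ms


scale, pi_after_ms = extrapolate_pi_frame_ms(5.75, 167.0, 1.084)
budget_ms = 1000 / 30  # one frame at 30 Hz
fits_budget = pi_after_ms < budget_ms
```

With these inputs the scale factor comes out just over 29 and the projected Pi frame lands between 31 and 32 ms, so the margin against the 33 ms budget is narrow. That is one reason the on-device smoke test in the test plan matters.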

Behavioral diffs to be aware of

| | Awake | Asleep |
| --- | --- | --- |
| Before | ~10 Hz (0.1 s sleep) | ~3.3 Hz (0.1 + 0.2 s sleeps) |
| After | ~30 Hz (1/30 s sleep) | 2 Hz (0.5 s sleep) |

Wake-from-sleep latency goes from ~300 ms to ~500 ms (one asleep-loop iteration). If that ever feels too slow in practice, the lever is state_utils.sleep_for_framerate's asleep value — but it affects all UI modules, so I left it alone.

Test plan

  • nox -s lint clean
  • nox -s format clean
  • nox -s smoke_tests 2/2 pass
  • nox -s unit_tests 98/98 pass
  • Loop-rate harness confirms the expected 10→22 Hz chart and 13→27 Hz menu rates on the dev machine
  • Smoke test on a Pi to confirm the predicted ~6× chart improvement and that the new ~30 Hz awake cadence is comfortable from a CPU/power standpoint

🤖 Generated with Claude Code

brickbots and others added 3 commits May 19, 2026 16:46
The main event loop's queue.Empty fallback was time.sleep(0.1), pinning
every UI module (including the chart) to ~10 Hz max regardless of how
fast its update() completed. Swapping to state_utils.sleep_for_framerate
takes the awake cap to ~30 Hz, matching what other UI modules already
use directly, and triples chart-update headroom (measured 8.9 -> 22.3 Hz
effective with the current chart render cost on dev hardware).

Also removes the now-redundant 0.2s sleep in PowerManager.update for the
asleep state: sleep_for_framerate already sleeps 0.5s when power_state
is 0, which is strictly more power-saving than the old 0.3s combined
(0.1s loop sleep + 0.2s PowerManager sleep). Net asleep-state change is
~3.3 Hz -> 2 Hz.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Now that the main event loop calls sleep_for_framerate itself, the
per-module sleeps at the top of each update() are a second 1/30 s
sleep per iteration -- pinning the affected screens to ~13-15 Hz
instead of the intended ~30 Hz.

Audited PiFinder/ui/ for sleep_for_framerate and time.sleep(1/30)
calls inside update() methods. Removed:
  - equipment.py:  sleep_for_framerate(self.shared_state)
  - gpsstatus.py:  state_utils.sleep_for_framerate(self.shared_state)
  - sqm.py:        sleep_for_framerate(self.shared_state)
  - software.py:   time.sleep(1 / 30)
  - status.py:     time.sleep(1 / 30)

Also dropped the now-unused state_utils / sleep_for_framerate / time
imports from those files.

Other time.sleep() calls in ui/ are intentional state-waits (camera
settling, exposure sweeps, marking-menu flashes, calibration steps)
and were left in place.

Measured menu-screen rate: ~13.4 Hz -> ~26.7 Hz (matches the user-
reported ~15 fps menu observation; ~2x faster after the fix).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…o_xy)

The chart's per-frame work was dominated by pandas: every frame ran
several DataFrame.assign() chains over the star catalog and the
constellation edges, with each .assign() building a brand-new
DataFrame and each column access dispatching through pandas. cProfile
on release showed ~138 pandas.Series.__init__ calls and 6+ .assign()s
per frame; the math underneath was negligible.

This commit moves the per-frame projection / rotation / screen-space /
visibility blocks to numpy arrays cached on the Starfield instance.
Pandas DataFrames are kept only at the boundaries where they buy us
something:

  - Star.from_dataframe() is the documented skyfield API for building
    a Star object from N rows, so plot_markers / radec_to_xy still
    build a tiny DataFrame just for that call.
  - render_starfield_pil still returns visible_stars as a DataFrame
    (sliced from self.stars at the very end) because align.py treats
    it as a pandas object (.iloc, .sort_values, .assign, ra_degrees /
    dec_degrees column access).

Changes per function:

  Starfield.__init__:
    - Cache self._star_magnitudes (numpy) for the per-frame mag filter.
    - Drop self.const_edges_df; constellation x/y arrays live as four
      separate numpy attributes refreshed each frame.

  plot_starfield:
    - Project stars + constellation endpoints into numpy arrays
      (self._stars_x/y, self._const_sx/sy/ex/ey) instead of writing
      columns into the DataFrames.

  render_starfield_pil:
    - All rotate/screen-space/visibility math runs on numpy arrays.
    - Iterate visible edges/stars with np.flatnonzero, not pandas zip.
    - Rebuild visible_stars DataFrame at the end via self.stars.iloc[
      visible_idx].copy() so align.py keeps its catalog columns.

  plot_markers:
    - Skyfield call still uses a tiny DataFrame; after .observe(), all
      rotation / screen-space / visibility runs in numpy.
    - Preserve the pre-existing tautological off-screen-pointer
      condition verbatim (separate concern; flagged in a comment).

  radec_to_xy:
    - Same shape as plot_markers but for a single point; the body is
      now scalar numpy/python after the skyfield observe.

Measured on Mac (release vs after this commit):
  plot_starfield total:  3870 us -> 541 us  (~7.2x)
  plot_markers (3):      1625 us -> 355 us  (~4.6x)
  full chart frame:      5750 us -> 1084 us (~5.3x)

cProfile per-frame pandas writes: 138 -> 2 (just the two visible_stars
column assigns at the end). The new dominant per-frame cost is
skyfield's projections.py:project() and PIL's ImageChops -- both C
code, at the floor for pure-Python optimization.

Extrapolating with the user-reported ~30x Mac->Pi slowdown that
produced 5 fps charts before, this should land the Pi chart frame at
~30 ms -- within the 33 ms sleep_for_framerate budget.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@brickbots brickbots changed the title from "Use sleep_for_framerate in main loop to lift the 10 Hz UI cap" to "Chart performance: lift loop throttle and de-pandas render path" May 20, 2026
@brickbots brickbots changed the base branch from release to main May 20, 2026 01:38
@brickbots brickbots merged commit f133d80 into main May 20, 2026
2 checks passed