cache rate index to avoid lookup in form_matrix loop#3884
cache rate index to avoid lookup in form_matrix loop#3884shimwell wants to merge 1 commit intoopenmc-dev:developfrom
Conversation
|
I think this degrade readability a bit. I will be surprised if this will improve performance in a non negligible way. Do you have an example where this change affect performance significantly for the better? |
|
Yes good point, on my computer the form_matrix takes 5.3s on this branch and 5.6s on develop. With a bigger chain (perhaps a tendl chain) I guess it would take longer. But perhaps 0.3s is still worth it. import time
import urllib.request
from pathlib import Path
import numpy as np
from openmc.deplete import Chain, reaction_rates
# Download the 27MB ENDF/B-VIII.0 SFR chain
CHAIN_URL = "https://raw.githubusercontent.com/openmc-data-storage/openmc_data/refs/heads/main/src/openmc_data/depletion/chain_endf_b8.0_sfr.xml"
CHAIN_FILE = Path("chain_endf_b8.0_sfr.xml")
print(f"Downloading chain from {CHAIN_URL} ...")
urllib.request.urlretrieve(CHAIN_URL, CHAIN_FILE)
print("Download complete.")
chain = Chain.from_xml(CHAIN_FILE)
# Create reaction rates with random values
nuc_names = [nuc.name for nuc in chain.nuclides]
mats = ["1"]
rates = reaction_rates.ReactionRates(mats, nuc_names, chain.reactions)
rng = np.random.default_rng(42)
for nuc in chain.nuclides:
for r_type, target, _, br in nuc.reactions:
# Random rate in [0, 10) reactions/sec
rates.set("1", nuc.name, r_type, rng.random() * 10.0)
# Time form_matrix
# Warm-up call
_ = chain.form_matrix(rates[0])
n_trials = 100
t0 = time.perf_counter()
for _ in range(n_trials):
matrix = chain.form_matrix(rates[0])
t_total = time.perf_counter() - t0
print(f"form_matrix timing ({n_trials} calls):")
print(f" Total: {t_total:.3f} s")
print(f" Average: {t_total / n_trials * 1000:.3f} ms") |
|
IMO a 5% speedup does not justify the readability cost. |
|
Thanks for the PR @shimwell and @yrrepy - |
Description
A tiny performance improvement that caches the rates.index_nuc and rates.index_rx so that we can avoid looking them up within the loop
I have committed as @yrrepy as this is his idea, I am just facilitating by splitting up a otherwise large piece of work.
I think this small speed improvement makes sense on its own
Fixes # (issue)
Checklist