[doc] Fixathon 2: documentation sprint #22289

Open
siliataider wants to merge 15 commits into master from fixathon_docs

Conversation

Contributor

@siliataider siliataider commented May 13, 2026

Cleanup, refactoring, cheat sheets, etc.

Changes summary

General

  • Remove source files from doxygen groups (\file xxx \ingroup yyy). These clutter the overview without adding useful documentation.
  • Remove/reorder groups that are used infrequently.
  • Enable inlining of inherited functions into the overview list. This significantly simplifies viewing/searching the available interface of a class.

Python Interface

  • Updated the dedicated Python Interface top-level section in the Doxygen navigation, with a landing page covering installation, quickstart and a quick overview
  • Added a new structured RDataLoader page walking users through data preparation, loader configuration, batch iteration etc.
    • TODO: Eventually add the actual docstrings of the public facing classes instead of the custom reference table I created
  • Revamped the UHI page with an updated intro and a new Serialization section
  • Added two cheat sheets (RDataLoader and UHI) as a proof of concept: one-page PDF references that are downloadable and also embedded directly in the docs
    • TODO: update the UHI cheat sheet to make the plotting section more prominent
    • TODO: once enough cheat sheets exist, refactor them into a dedicated Cheat Sheets index page

Search Engine

  • No changes for the moment

Preview

See a preview of the doxygen page here:
https://root.cern/doc/hackathon/index.html

Note: This webpage does not contain the full doxygen run. Macros embedded in the source code are not being run.

Collaborator

@ferdymercury ferdymercury left a comment

Thanks a lot for this endeavor!

Two remarks

  • QHP generation shouldn't be disabled in the main Doxyfile, since it is used to publish the qch file, which is fundamental for the Qt Creator IDE
  • It would be a lot cleaner to use the approach from #17426, as ALICE O2 does, rather than having two huge Doxyfiles that are almost impossible to review and annoying to maintain (warnings depending on version, etc.)

Comment thread documentation/doxygen/Doxyfile Outdated
# This tag requires that the tag GENERATE_HTML is set to YES.

-GENERATE_QHP = YES
+GENERATE_QHP = NO
Collaborator

Suggested change
-GENERATE_QHP = NO
+GENERATE_QHP = YES

Member

OK, we won't touch it. I thought nobody uses this. 😅

Collaborator

:) thanks
It's nice because it allows you to press F1 in the IDE:
https://user-images.githubusercontent.com/10653970/154870916-28e4009d-eb70-46df-a52b-da81cfe3c97f.png
and that also works offline, with no need to open a web browser

Comment thread hist/hist/src/TH2.cxx Outdated
@hageboeck hageboeck added the "clean build" label (Ask CI to do non-incremental build on PR) May 15, 2026
hageboeck and others added 13 commits May 15, 2026 14:38
- Move the web widgets to the webdisplay group.
- Move webdisplay to GUI group.
- Put the parametric functions group under Math.
- Regroup I/O doxygen groups.
- Move doxygen GUI group to Graphics

Co-authored-by: martinfoell <m.foell.1999@gmail.com>
- Remove internal and detail classes from RDF group.
- Remove source files from RDF group.
- Expand docs of RDataFrame overview page.
- Structure documentation of RDataFrame API.
…xygen group.

Listing files on the doxygen page doesn't have a lot of benefit.
Instead, we will list the contained classes.
- Enable sorting of groups in the treeview
- Enable right-hand side scrolling site overview
- Add "make preview" for a fast preview mode without ROOT
  customisations, with MT processing, and without dot graphs
- Enable inlining of inherited members into the overview of class
  functions
@siliataider siliataider marked this pull request as ready for review May 15, 2026 14:19
Member

@vepadulano vepadulano left a comment

Thank you for all of this work! I have reviewed the Python part of the PR, here are some comments from my side.

Comment on lines +14 to +17
\htmlonly
<div class="install-tabs">
<div class="tab-buttons">
<button class="tab-btn active" onclick="switchTab(this, 'conda')">conda</button>
Member

As we were discussing during the hackathon, this is quite cool to have in the doxygen pages! It would be nice to have this in some compact form that can be reused in other places, though not for this PR.

Contributor Author

Noted!

Comment on lines +101 to +103
# Write it to a ROOT file
with ROOT.TFile.Open("output.root", "RECREATE") as f:
    h.Write()
Member

nitpick, I would prefer we showed f.WriteObject(h, "name_of_histo"). My reasoning is that RFile will not support the syntax object.Write because there will be no implicit object registration anyway. I do see the point of WriteObject needing the extra string argument (which could even be defaulted to h.GetName() internally for TObject-derived objects). So this is mostly to voice my opinion, I will accept what the majority prefers

Comment on lines +114 to +115
# Define a column x
rdf = rdf.Define("x", lambda : np.random.normal(0, 1))
Member

We know that this syntax will be raising a warning for a while now, until we enable pure Python callables in RDF. Perhaps best not to show it already?

Comment on lines +41 to +43
# Define a Python callback to compute a new variable
def invariant_mass(E: float, p: float) -> float:
    return math.sqrt(E**2 - p**2)
Member

Similar comment here about the implicit numba-jit API

# events with fewer than 10 jets are zero-padded
~~~

\warning Every RVec column in `columns` must appear in `max_vec_sizes`.
Member

Suggested change
-\warning Every RVec column in `columns` must appear in `max_vec_sizes`.
+\warning Every vector column in `columns` must appear in `max_vec_sizes`.

unless we really only support RVec?
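
For readers following the thread, here is a minimal sketch of what zero-padding to a fixed size means for variable-length vector columns. It is plain Python and does not use ROOT: `pad_vector_columns` and the `jet_pt` column are hypothetical names chosen for illustration, and truncating vectors longer than the configured size is an assumption, not confirmed RDataLoader behaviour.

```python
from typing import Dict, List, Sequence


def pad_vector_columns(
    events: Sequence[Dict[str, List[float]]],
    max_vec_sizes: Dict[str, int],
) -> List[List[float]]:
    """Flatten each event into one fixed-length row.

    Hypothetical helper, not part of ROOT. Vectors shorter than the
    configured size are zero-padded; longer ones are truncated
    (the truncation is an assumption made for this sketch).
    """
    rows = []
    for event in events:
        row: List[float] = []
        for column, values in event.items():
            if column not in max_vec_sizes:
                # Mirrors the warning: every vector column needs a size.
                raise KeyError(f"no entry in max_vec_sizes for column {column!r}")
            size = max_vec_sizes[column]
            clipped = list(values[:size])
            row.extend(clipped + [0.0] * (size - len(clipped)))
        rows.append(row)
    return rows


events = [
    {"jet_pt": [50.0, 30.0]},              # fewer than 3 jets: padded
    {"jet_pt": [80.0, 40.0, 25.0, 10.0]},  # more than 3 jets: clipped
]
padded = pad_vector_columns(events, {"jet_pt": 3})
```

Every row ends up with the same length, which is what allows a batch to be stacked into a rectangular tensor. A vector column without a configured size has no well-defined row width, hence the error.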

optimizer = torch.optim.Adam(model.parameters())

for epoch in range(num_epochs):
    for X, y in dl.as_torch():
Member

Any reason for the X to be capitalised?


### Eager loading

By default the loader reads data lazily, one chunk of data at a time. For small datasets that fit in memory and will be iterated many times, eager loading pays a one-time cost at construction and then serves every epoch from memory:
Member

Suggested change
-By default the loader reads data lazily, one chunk of data at a time. For small datasets that fit in memory and will be iterated many times, eager loading pays a one-time cost at construction and then serves every epoch from memory:
+By default the loader reads data lazily, one chunk of data at a time. For small datasets that fit in memory and will be iterated many times, eager loading pays a one-time cost at construction and then serves batches in every epoch from memory:
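
The trade-off described in this paragraph can be illustrated with a small sketch in plain Python. It does not use the RDataLoader API: `lazy_batches`, `EagerBatches` and `read_chunk` are hypothetical stand-ins, and the read counter only models the cost of touching the data source.

```python
from typing import Callable, Iterator, List

Chunk = List[int]


def lazy_batches(read_chunk: Callable[[int], Chunk], n_chunks: int) -> Iterator[Chunk]:
    """Lazy mode: read one chunk at a time, on every pass over the data."""
    for i in range(n_chunks):
        yield read_chunk(i)


class EagerBatches:
    """Eager mode: read every chunk once at construction, then serve from memory."""

    def __init__(self, read_chunk: Callable[[int], Chunk], n_chunks: int) -> None:
        self._chunks = [read_chunk(i) for i in range(n_chunks)]

    def __iter__(self) -> Iterator[Chunk]:
        return iter(self._chunks)


reads = {"count": 0}


def read_chunk(index: int) -> Chunk:
    # Stand-in for an expensive read from disk.
    reads["count"] += 1
    return [2 * index, 2 * index + 1]


num_epochs = 3

# Lazy: the source is read again in every epoch.
for _ in range(num_epochs):
    for batch in lazy_batches(read_chunk, n_chunks=4):
        pass
lazy_reads = reads["count"]

# Eager: the source is read once, then every epoch is served from memory.
reads["count"] = 0
eager = EagerBatches(read_chunk, n_chunks=4)
for _ in range(num_epochs):
    for batch in eager:
        pass
eager_reads = reads["count"]
```

With 4 chunks and 3 epochs, the lazy path performs one read per chunk per epoch, while the eager path performs one read per chunk in total, at the price of holding all chunks in memory.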

loss = (loss_fn(model(X), y) * w).mean()
~~~

### Eager loading
Member

Perhaps I would move this section further up, since the resampling section already refers to it before this point


## API Reference

### RDataLoader(rdataframes, ...)
Member

This is nice, but it would be nicer if this was a full doxygen function doc like https://root.cern/doc/master/group__Pythonizations.html#ga7fd79fcb9358768e7b5f9fe0a924dd77

@github-actions

Test Results

    22 files      22 suites   3d 14h 48m 10s ⏱️
 3 849 tests  3 849 ✅ 0 💤 0 ❌
76 030 runs  76 030 ✅ 0 💤 0 ❌

Results for commit cf59381.

6 participants