Validation
Set the appropriate options in default.yml and config_model_parameters.yml. The code that extracts the validation data is in tpcwithdnn/idc_data_validator.py for 1D fluctuations and in tpcwithdnn/data_validator.py for the old CNN part.
In default.yml:
- docreatendvaldata - whether to produce a ROOT file with the full validation data described on the Validation Data Format page; the result file is needed for the other options below to work
  - the output is split into parts to be more efficient with memory. The output files can be merged with the script tpcwithdnn/merge_validation_trees.sh
- docreatepdfmaps - whether to create ND histograms (as *.gzip files) and pdf maps (*.root) for all validation data:
  - old input (U-Net): mean maps with id 0, 9, 18 (scaling: 1.0, 1.1, 0.9)
  - IDC input (BDT): mean maps with id 0, 9, 18, 27, 36 (scaling: 1.00, 1.03, 0.97, 1.06, 0.94)
- docreatepdfmapforvariable - whether to create ND histograms and pdf maps for the data specified in config_model_parameters.yml
- domergepdfmaps - whether to merge pdf maps for different mean maps and factors into one file
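Because the pdf map options rely on the file produced by docreatendvaldata, it can be worth checking the flags before starting a run. The following is only a minimal sketch and not part of tpcwithdnn: it assumes the options are plain booleans from default.yml that have already been loaded into a dictionary, and the helper name check_validation_flags is hypothetical.

```python
# Hypothetical helper, not part of tpcwithdnn.
# Assumption: the options are booleans loaded from default.yml into a dict.

DEPENDENT_OPTIONS = ("docreatepdfmaps", "docreatepdfmapforvariable", "domergepdfmaps")


def check_validation_flags(config):
    """Return warnings for options that need the docreatendvaldata output."""
    warnings = []
    if config.get("docreatendvaldata", False):
        # The validation file will be produced in this run, nothing to warn about.
        return warnings
    for option in DEPENDENT_OPTIONS:
        if config.get(option, False):
            warnings.append(
                f"{option} is set but docreatendvaldata is not: "
                "the validation ROOT file must already exist from an earlier run"
            )
    return warnings


if __name__ == "__main__":
    example = {"docreatendvaldata": False, "docreatepdfmaps": True}
    for message in check_validation_flags(example):
        print(message)
```

A warning here is not necessarily an error: if the validation file was already created in a previous run, the dependent options can be used on their own.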
Note: the data created by docreatendvaldata can now be visualized interactively with Jupyter notebooks, including interactive histogramming, e.g. with notebooks/model_validation.ipynb. Therefore, creating pdf maps is no longer necessary.
In config_model_parameters.yml:
- dirtree - where to save validation data (ROOT tree files) and pdf maps
- dirhist - where to save validation histograms
- nd_val_events - number of scenarios for ND validation
- nd_val_partition - where the validation scenarios should be taken from:
  - random - sample randomly, but use only mean factors of 0.9, 1.0, 1.1 for U-Net data and 0.94, 0.97, 1.0, 1.03, 1.06 for BDT data
  - train, val, apply - train / validation / apply data
- nd_validate_model - whether the trained model (its predictions) should be evaluated as well; required for ND histograms and maps
- pdf_map_var, pdf_map_mean_id - ND histograms and pdf maps will be created for this variable and mean map id if docreatepdfmapforvariable is set in default.yml
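When choosing a value for pdf_map_mean_id, it may help to have the mean map ids and scaling factors from this page in one lookup table. The sketch below is an illustration only: it assumes that the ids and factors listed for docreatepdfmaps correspond in the order given, and the names MEAN_FACTORS and mean_factor are hypothetical, not taken from tpcwithdnn.

```python
# Mean map ids and scaling factors as listed on this page.
# Assumption: ids and factors correspond in the order in which they are listed.
MEAN_FACTORS = {
    "unet": {0: 1.0, 9: 1.1, 18: 0.9},
    "bdt": {0: 1.00, 9: 1.03, 18: 0.97, 27: 1.06, 36: 0.94},
}


def mean_factor(input_type, mean_id):
    """Look up the scaling factor for a mean map id, e.g. for pdf_map_mean_id."""
    try:
        return MEAN_FACTORS[input_type][mean_id]
    except KeyError:
        raise ValueError(
            f"mean map id {mean_id} is not listed for input type {input_type!r}"
        ) from None


if __name__ == "__main__":
    print(mean_factor("bdt", 27))
```

Raising an error for unlisted ids makes a mistyped pdf_map_mean_id visible early, before any histograms are requested.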
The easiest way to examine the result files is to use the interactive Jupyter notebook available here.
Alternatively, one can manually draw plots with ROOT, from *.root pdf maps.
Enter the notebooks directory:
cd notebooks/
Launch Jupyter without a browser (you will browse later from your local machine). It will print a URL with a token; copy and store it for the next step.
python -m notebook --no-browser --port=8887 # Or any other reasonable port number
On your local machine, tunnel the localhost to the notebook port:
ssh -N -L localhost:8888:localhost:{remote_port_number} {user@remote_machine}
You can browse the notebooks at the URL returned to you by Jupyter; just change the port number to 8888.
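Changing the port in the printed URL can also be done programmatically. This is a small sketch using only the Python standard library; the function name with_local_port and the example token are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def with_local_port(jupyter_url, local_port=8888):
    """Return the Jupyter URL with its port replaced by the local tunnel port.

    IPv6 literal hosts are not handled in this sketch.
    """
    parts = urlsplit(jupyter_url)
    host = parts.hostname or "localhost"
    # Keep scheme, path, and token query; swap only the port in the netloc.
    return urlunsplit(parts._replace(netloc=f"{host}:{local_port}"))


if __name__ == "__main__":
    # Example URL with a made-up token; Jupyter prints the real one at startup.
    print(with_local_port("http://localhost:8887/?token=abc123"))
```

The token in the query string is preserved, so the rewritten URL can be pasted directly into the local browser once the tunnel is running.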
- Prepare a list of pdf maps to be contained in the notebook:
- adjust makePDFMapsList() in TPCwithDNN/notebooks/makePDFMapsLists.sh
- run:
source makePDFMapsLists.sh
makePDFMapsList
- You should get a new file pdfmaps.list with paths to proper pdf files
- Open the notebook model_performance_evaluation.ipynb in the Jupyter web interface.
- You need to adjust the directory variables at the top
- Follow the rest of the code in the notebook, adjusting any file paths as needed. You might need to adjust the cuts if there is no matching data.
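If the notebook finds no matching data, one thing to rule out is a stale entry in pdfmaps.list. The sketch below checks that every listed file exists. It assumes the list contains one path per line, which is a guess about the file format, and the function name missing_pdf_maps is hypothetical.

```python
from pathlib import Path


def missing_pdf_maps(list_file="pdfmaps.list"):
    """Return the paths listed in list_file that do not exist on disk.

    Assumption: the list holds one path per line; blank lines are skipped.
    Relative paths are resolved against the current working directory.
    """
    missing = []
    for line in Path(list_file).read_text().splitlines():
        path = line.strip()
        if path and not Path(path).is_file():
            missing.append(path)
    return missing


if __name__ == "__main__":
    if Path("pdfmaps.list").is_file():
        for path in missing_pdf_maps():
            print("missing:", path)
```

Run it from the directory that contains pdfmaps.list, so that any relative paths in the list resolve the same way they do for the notebook.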