Commit 8907496

v0.1.6 upload

- updated scripts for the MSORF preprint v2: https://arxiv.org/abs/2505.21247v2
- added tests for MSORF
- revised SC Huber implementation
- added an environmental variable for avoiding Numba-Numpy parallelization (see https://stackoverflow.com/questions/79673925/disable-numpy-parallelization-inside-numba-jit)
- minor code revision
1 parent 7ec41fd commit 8907496

30 files changed

Lines changed: 9199 additions & 140 deletions

.pre-commit-config.yaml

Lines changed: 5 additions & 0 deletions
@@ -77,3 +77,8 @@ repos:
       - id: conventional-pre-commit
         stages: [commit-msg]
         args: []
+
+  - repo: https://github.com/kieran-ryan/pyprojectsort
+    rev: v0.4.0
+    hooks:
+      - id: pyprojectsort

README.md

Lines changed: 11 additions & 3 deletions
@@ -16,11 +16,13 @@ or, if `makefile` is installed,
 ```
 Some parts of the code depend on additional dependencies that can be installed with the corresponding optional dependencies flag. The three defined for this package are

-- `orb_ml` - for FJK (machine learning from orbital information)
+- `orb_ml` - for FJK (machine learning from orbital information).

-- `msorf` - for MSORF (everything in `qml2.multilevel_sorf`)
+- `msorf` - for MSORF (everything in `qml2.multilevel_sorf`).

-- `torch` - Torch functionality (efficiency questionable right now TBH)
+- `morfeus` - for applications dependent on the `morfeus-ml` package (everything related to conformer ensemble generation).
+
+- `torch` - Torch functionality (efficiency questionable right now TBH).

 For example, to use the `orb_ml` and `msorf` optional dependency flags in your installation use

@@ -42,6 +44,8 @@ To check that the installed repo works correctly run
 ```
 **NOTE:** The command assumes that `python` environmental variable points towards a valid executable. If you use an environment alias change definition of the `python` variable in the beginning of the Makefile.

+If you have trouble getting all tests to complete consult the `tests/qml2_test.yml` file describing the conda environment where all tests have been created and reproduced.
+
 ## :books: API documentation

 API documentation can be generated with [Doxygen](https://www.doxygen.nl/) by running
@@ -64,6 +68,10 @@ This will create `manual.html` file that can be opened with an Internet browser.

 `QML2_NUM_PROCS` - number of processes spawned by parts of the code parallelized via `python.multiprocessing` (training set representations in model-related classes, `pyscf` calculations made by `OML_Compound_list` attribute calls). For limiting number of OpenMP threads spawned in turn by these processes use suppress options (such as `KRRModel` class's `training_reps_suppress_openmp` option). Also see `parallelization.set_default_num_procs`.

+`QML2_NUM_PROCS` - number of processes spawned by parts of the code parallelized via `python.multiprocessing` (training set representations in model-related classes, `pyscf` calculations made by `OML_Compound_list` attribute calls). For limiting number of OpenMP threads spawned in turn by these processes use suppress options (such as `KRRModel` class's `training_reps_suppress_openmp` option). Also see `parallelization.set_default_num_procs`.
+
+`QML2_AVOID_NUMBA_NUMPY_PARALLELIZATION` - some Numba routines in the code call in parallel Numpy routines, which creates problems in some setups (e.g. when both Numba and Numpy try to parallelize over a large number of threads without taking each other into account). Setting this environmental variable to `1` disables Numba parallelization in such routines, leaving them to be parallelized exclusively with Numpy.
+
 ### Experimental

 `QML2_DEFAULT_JIT` - setting to `NUMBA` (default) or `TORCH` (both are case insensitive) determines whether Numba or TorchScript JIT compilation is used. Also see `jit_interfaces.set_default_jit`.

misc_commands/spython

Lines changed: 28 additions & 4 deletions
@@ -8,6 +8,8 @@
 # 1st argument - python script name.
 # 2nd argument - how the SLURM job is named.
 # Arguments starting from third (if present) - arguments for the python script.
+# If SPYTHON_NO_SLURM is defined in shell as 1, "spython" launches the submitted script in a simple shell. Introduced so that the same bash scripts can be used to spam jobs on workstations that do not have SLURM.
+

 #TODO : K.Karan: I think GPU allocation is OK according to the manuals, but I never got it to work
 # on my workstation.
@@ -16,7 +18,19 @@ function flag_val(){
     echo $1 | cut -d'=' -f2-
 }

-NCPUs=$(cat /etc/slurm/slurm.conf | tr ' ' '\n' | awk 'BEGIN {FS="="} {if ($1=="CPUs") print $2}')
+slurmconf_file="/etc/slurm/slurm.conf"
+
+if [ -f "$slurmconf_file" ]
+then
+    NCPUs=$(cat $slurmconf_file | tr ' ' '\n' | awk 'BEGIN {FS="="} {if ($1=="CPUs") print $2}')
+else
+    echo "WARNING: SLURM configuration file not found."
+    NCPUs=$OMP_NUM_THREADS
+    if [ "$NCPUs" == "" ]
+    then
+        NCPUs=1
+    fi
+fi
 NGPUs=0
 JOB_OMP_NUM_THREADS=$NCPUs
 nodelist=$(hostname)
@@ -26,6 +40,11 @@ conda_env=$CONDA_DEFAULT_ENV
 PIPE_ARGS=0
 array=""

+if [ "$SPYTHON_NO_SLURM" == "" ]
+then
+    SPYTHON_NO_SLURM=0
+fi
+
 while [[ "$1" == --* ]]
 do
     case "$1" in
@@ -196,9 +215,14 @@ then
 fi
 EOF

-SLURM_ID=$(sbatch $array_arg $DEPSTRING $EXNAME | awk '{print $4}')
-
-DEPSTRING="--dependency=afterok:$SLURM_ID"
+if [ "$SPYTHON_NO_SLURM" == "0" ]
+then
+    SLURM_ID=$(sbatch $array_arg $DEPSTRING $EXNAME | awk '{print $4}')
+    DEPSTRING="--dependency=afterok:$SLURM_ID"
+else
+    chmod +x $EXNAME
+    ./$EXNAME > $job_name.stdout_0 2> $job_name.stderr_0
+fi

 cd ..
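The CPU-count fallback introduced in this script can be sketched in Python to make the order of checks explicit. This is an illustration only, not part of the commit: `resolve_ncpus` is a hypothetical name, the configuration file is passed in as text rather than read from `/etc/slurm/slurm.conf`, and tokens are split on any whitespace, whereas the shell pipeline splits on spaces only.

```python
def resolve_ncpus(slurm_conf_text, environ):
    """Return the CPU count as a string, following the shell script's order.

    slurm_conf_text: contents of slurm.conf, or None when the file is absent.
    environ: mapping of environment variables (e.g. os.environ).
    """
    if slurm_conf_text is not None:
        # Same idea as `tr ' ' '\n' | awk -F= '$1=="CPUs" {print $2}'`:
        # look for a token of the form CPUs=<value>.
        for token in slurm_conf_text.split():
            key, sep, value = token.partition("=")
            if sep and key == "CPUs":
                return value
        # File present but no CPUs entry: the shell version also yields
        # an empty value here and does not fall back.
        return ""
    # No configuration file: use OMP_NUM_THREADS, defaulting to 1.
    return environ.get("OMP_NUM_THREADS", "") or "1"
```

Note that, as in the script, the `OMP_NUM_THREADS` fallback is only reached when the configuration file is missing, not when it exists without a `CPUs` entry.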

-3.25 KB
Binary file not shown.

pyproject.toml

Lines changed: 57 additions & 35 deletions
@@ -1,41 +1,63 @@
 [build-system]
-requires = ["setuptools >= 61.0"]
 build-backend = "setuptools.build_meta"
+requires = [
+    "setuptools >= 61.0",
+]
+
 [project]
+authors = [
+    { name = "Danish Khan" },
+    { name = "Jan Weinreich" },
+    { name = "Konstantin Karandashev" },
+    { name = "Stefan Heinen" },
+]
+dependencies = [
+    "numba",
+    "numpy",
+    "scipy",
+]
+description = "Collection of Quantum Machine Learning routines"
 name = "qml2"
-version = "0.1.5"
-dependencies=[
-    "numpy",
-    "numba",
-    "scipy",
-]
-authors=[
-    {name="Konstantin Karandashev"},
-    {name="Stefan Heinen"},
-    {name="Danish Khan"},
-    {name="Jan Weinreich"},
-]
-description="Collection of Quantum Machine Learning routines"
-readme="README.md"
-license={text="MIT License"}
-[tool.setuptools]
-packages=[
-    "qml2",
-    "qml2.orb_ml",
-    "qml2.representations",
-    "qml2.models",
-    "qml2.jit_interfaces",
-    "qml2.dataset_formats",
-    "qml2.kernels",
-    "qml2.test_utils",
-    "qml2.gpytorch",
-    "qml2.cupy",
-    "qml2.multilevel_sorf",
-    "qml2.multilevel_sorf.processed_object_constructors",
-    "qml2.multilevel_sorf.base_constructors"]
+readme = "README.md"
+version = "0.1.6"
+
+[project.license]
+text = "MIT License"
+
 [project.optional-dependencies]
-torch=["torch"]
-orb_ml=["pyscf", "geometric"]
-msorf=["aalto-boss"]
+morfeus = [
+    "morfeus-ml",
+    "rdkit",
+]
+msorf = [
+    "aalto-boss",
+]
+orb_ml = [
+    "geometric",
+    "pyscf",
+]
+torch = [
+    "torch",
+]
+
+[tool.setuptools]
+packages = [
+    "qml2",
+    "qml2.cupy",
+    "qml2.dataset_formats",
+    "qml2.gpytorch",
+    "qml2.jit_interfaces",
+    "qml2.kernels",
+    "qml2.models",
+    "qml2.multilevel_sorf",
+    "qml2.multilevel_sorf.base_constructors",
+    "qml2.multilevel_sorf.processed_object_constructors",
+    "qml2.orb_ml",
+    "qml2.representations",
+    "qml2.test_utils",
+]
+
 [tool.setuptools.package-data]
-qml2 = ["*/*.cu"]
+qml2 = [
+    "*/*.cu",
+]

qml2/ensemble.py

Lines changed: 25 additions & 0 deletions
@@ -3,6 +3,7 @@
 import os
 from copy import deepcopy

+# TODO make morfeus and rdkit dependencies here optional to allow running some MSORF tests without morfeus-ml
 from morfeus.conformer import ConformerEnsemble
 from rdkit import Chem, RDLogger

@@ -14,6 +15,7 @@
 from .orb_ml.representations import OML_rep_params
 from .utils import weighted_array, write_compound_to_xyz_file

+# disable RDKit's verbose mode.
 RDLogger.DisableLog("rdApp.*")

 base_compound_class_dict = {
@@ -23,6 +25,26 @@
     "OML_Slater_pairs": OML_Slater_pairs,
 }

+morfeus_random_seed = None
+
+
+def set_morfeus_random_seed(new_morfeus_random_seed):
+    """
+    Set RNG seed used by morfeus when generating conformers for the Ensemble class.
+    """
+    global morfeus_random_seed
+    morfeus_random_seed = new_morfeus_random_seed
+
+
+def get_morfeus_random_seed(default_seed=None):
+    """
+    Get RNG seed used by morfeus when generating conformers for the Ensemble class.
+    """
+    if default_seed is None:
+        return morfeus_random_seed
+    else:
+        return default_seed
+

 class WeightedCompound:
     def __init__(self, compound, rho, ff_energy):
@@ -53,6 +75,7 @@ def __init__(
         num_conformer_generations=1,
         savefile_prefix=None,
         compound_kwargs={},
+        random_seed=None,
     ):
         self.rdkit_obj = rdkit_obj
         self.SMILES = SMILES
@@ -70,6 +93,7 @@ def __init__(
         self.filtered_conformers = None
         self.processed_conformers = None
         self.savefile_prefix = savefile_prefix
+        self.random_seed = random_seed

     def get_nuclear_charges(self):
         return array_([a.GetAtomicNum() for a in self.rdkit_obj.GetAtoms()])
@@ -113,6 +137,7 @@ def conformer_generation(self):
                 n_confs=self.num_conformers,
                 optimize=self.ff_type,
                 n_threads=checked_environ_val("MORFEUS_NUM_THREADS", default_answer=1),
+                random_seed=get_morfeus_random_seed(default_seed=self.random_seed),
             )
         except Exception as ex:
             if not isinstance(ex, ValueError):
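The seed lookup added above gives a per-instance `random_seed` priority over the module-level seed. A minimal standalone sketch of that precedence follows. The names `set_global_seed` and `resolve_seed` are hypothetical stand-ins for `set_morfeus_random_seed` and `get_morfeus_random_seed`, so that the snippet runs without `morfeus` or `rdkit`.

```python
_global_seed = None  # stands in for the module-level `morfeus_random_seed`


def set_global_seed(seed):
    """Set the fallback seed shared by all instances."""
    global _global_seed
    _global_seed = seed


def resolve_seed(instance_seed=None):
    """Return the instance seed if one was given, else the global seed.

    The check is `is None`, so an instance seed of 0 is still honoured.
    """
    if instance_seed is None:
        return _global_seed
    return instance_seed
```

With this arrangement, calling the global setter once makes every instance reproducible, while an explicit per-instance seed still overrides it.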

qml2/jit_interfaces/__init__.py

Lines changed: 11 additions & 1 deletion
@@ -12,10 +12,11 @@
 # then it enters under the name given to it by Numpy/Numba.
 import importlib

-from ..basic_utils import checked_environ_val
+from ..basic_utils import checked_environ_val, checked_logical_environ_val
 from .jit_manager import available_jits, numba_flag

 default_jit_env_var_name = "QML2_DEFAULT_JIT"
+avoid_numba_numpy_parallellization_env_var_name = "QML2_AVOID_NUMBA_NUMPY_PARALLELIZATION"


 def set_defaults_from_interface(interface_name):
@@ -36,6 +37,15 @@ def set_default_jit(new_jit_flag):
     set_defaults_from_interface("." + used_jit_name + "_interface")


+def allow_numba_numpy_parallelization():
+    """
+    This function is used as the `numba_parallel` value in `jit_` instances for routines where the `prange` loop includes references to Numpy.
+    """
+    return not checked_logical_environ_val(
+        avoid_numba_numpy_parallellization_env_var_name, default_answer=False
+    )
+
+
 if __name__ != "__main__":
     # Check environment for default jit flag, if none pick numba
     default_flag = checked_environ_val(
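The new switch simply negates a logical environment variable. How `checked_logical_environ_val` parses its value is not shown in this diff, so the sketch below uses a hypothetical stand-in, `logical_environ_val`, with an assumed set of accepted spellings. Only the negation and the `False` default are taken from the commit.

```python
import os

ENV_VAR = "QML2_AVOID_NUMBA_NUMPY_PARALLELIZATION"

# Assumed spellings; the real helper may accept a different set.
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def logical_environ_val(name, default=False, environ=None):
    """Hypothetical stand-in for `checked_logical_environ_val`."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    raise ValueError(f"{name}={raw!r} is not a recognized logical value")


def allow_numba_numpy_parallelization(environ=None):
    """Numba parallelization stays on unless the avoid-flag is set."""
    return not logical_environ_val(ENV_VAR, default=False, environ=environ)
```

Under these assumptions an unset variable leaves Numba parallelization enabled, and exporting the variable as `1` turns it off.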

qml2/jit_interfaces/jit_manager.py

Lines changed: 8 additions & 3 deletions
@@ -1,5 +1,6 @@
 # Introduced to ensure flags for different JIT compilers do not obstruct each other.
 import inspect
+import traceback

 from ..basic_utils import checked_logical_environ_val

@@ -46,7 +47,7 @@ class JITFailureException(Exception):


 class ExceptionRaisingFunc:
-    def __init__(self, func):
+    def __init__(self, func, ex):
         self.exception_text = (
             """
 Failure in JIT compilation.
@@ -56,6 +57,10 @@ def __init__(self, func):
 function source:
 """
             + inspect.getsource(func)
+            + """
+traceback:
+"""
+            + "\n".join(traceback.format_exception(ex))
         )

     def __call__(self, *args, **kwargs):
@@ -96,10 +101,10 @@ def exception_skipping_jit(self, signature_or_function, local_skip_jit=False, **
             return signature_or_function
         try:
             return self.used_jit_func(signature_or_function, **kwargs)
-        except (*self.possible_jit_failures,):
+        except (*self.possible_jit_failures,) as ex:
             # assert type(ex) in self.possible_jit_failures
             if skip_jit_failures:
-                return ExceptionRaisingFunc(signature_or_function, ex)
+                return ExceptionRaisingFunc(signature_or_function)
             else:
                 raise JITFailureException
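The change above stores the traceback of the original JIT failure so it can be reported later. A standalone sketch of capturing and formatting a traceback is given below. `describe_failure` and `risky` are hypothetical names. Two details differ from the committed code: the three-argument form of `traceback.format_exception` is used because the single-argument call needs Python 3.10 or newer, and the pieces are joined with an empty string because each returned string already ends in a newline.

```python
import traceback


def describe_failure(ex):
    """Format an exception and its traceback as one string."""
    lines = traceback.format_exception(type(ex), ex, ex.__traceback__)
    # Each element already ends in a newline, so no separator is needed.
    return "".join(lines)


def risky():
    # Placeholder for a call that fails, e.g. a JIT compilation step.
    raise TypeError("cannot determine type")


try:
    risky()
except TypeError as ex:
    report = describe_failure(ex)
```

Keeping the formatted text rather than the exception object avoids holding references to the failing frames for the lifetime of the wrapper.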

qml2/jit_interfaces/numba_interface.py

Lines changed: 35 additions & 0 deletions
@@ -125,8 +125,43 @@ def random_array_from_rng_(size=(1,), rng=None):
         return rng.random(size)


+@jit_
+def random_from_rng_(rng=None):
+    return random_array_from_rng_(rng=rng)[0]
+
+
 standard_normal_ = np.random.standard_normal
+
+
+@jit_
+def standard_normal_array_from_rng_(size=(1,), rng=None):
+    if rng is None:
+        return standard_normal_(size)
+    else:
+        return rng.standard_normal(size)
+
+
+@jit_
+def standard_normal_from_rng_(rng=None):
+    return standard_normal_array_from_rng_(rng=rng)[0]
+
+
 randint_ = np.random.randint
+
+
+@jit_
+def randint_array_from_rng_(lbound, ubound, size=(1,), rng=None):
+    if rng is None:
+        return randint_(lbound, ubound, size=size)
+    else:
+        return rng.integers(lbound, ubound, size=size)
+
+
+@jit_
+def randint_from_rng_(lbound, ubound, rng=None):
+    return randint_array_from_rng_(lbound, ubound, rng=rng)[0]
+
+
 seed_ = np.random.seed
 permutation_ = np.random.permutation
 # copying
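The helpers added here share one pattern: draw from an explicit generator when `rng` is given, otherwise fall back to the global random state. The sketch below shows that pattern with the standard-library `random` module as a stand-in for NumPy, so that it runs without third-party packages. It is not the module's Numba-compiled code, and `randint_from_rng` is a hypothetical name.

```python
import random


def randint_from_rng(lbound, ubound, rng=None):
    """Draw an integer from [lbound, ubound).

    Uses `rng` (a random.Random instance) when provided, otherwise the
    module-level generator shared by the whole process.
    """
    source = random if rng is None else rng
    return source.randrange(lbound, ubound)


# Two generators seeded identically produce the same sequence,
# independently of the global state.
rng_a = random.Random(0)
rng_b = random.Random(0)
draws_a = [randint_from_rng(0, 100, rng_a) for _ in range(5)]
draws_b = [randint_from_rng(0, 100, rng_b) for _ in range(5)]
```

Passing the generator explicitly is what makes results reproducible per call site, since other code touching the global state no longer affects the draws.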
