
Commit 3307e39: Built site for gh-pages

Author: Quarto GHA Workflow Runner
1 parent c926692

27 files changed
Lines changed: 1117 additions & 486 deletions

.nojekyll

Lines changed: 1 addition & 1 deletion

@@ -1 +1 @@
-7bb2d613
+bd1bcd0f

_tex/index.tex

Lines changed: 192 additions & 114 deletions
@@ -295,13 +295,23 @@ \subsubsection{Unit I --- Experimental Data as a Learning Problem (Weeks
   Why ML failure modes are common in experimental science.
 \end{itemize}

-\textbf{Summary:} This unit introduces the transition from classical
-physics-based modeling to data-driven discovery in materials science. We
-explore the unique challenges of experimental materials data, including
-its multi-modal nature, high acquisition cost, and the fundamental
-Processing-Structure-Property-Performance (PSPP) relationships. Key
-concepts include data scales, measurement uncertainty, and the CRISP-DM
-process adapted for scientific workflows.
+\textbf{Summary:}
+
+\begin{itemize}
+\tightlist
+\item
+  Transition from physics-based to data-driven modeling
+\item
+  Experimental data challenges: multi-modal, high acquisition cost,
+  sparse
+\item
+  \textbf{PSPP} (Processing → Structure → Property → Performance) as a
+  data dependency graph
+\item
+  Data scales and measurement uncertainty
+\item
+  \textbf{CRISP-DM} workflow adapted for scientific labs
+\end{itemize}

 \textbf{Exercise:}\\
 Inspect real microscopy and process datasets; identify sources of bias
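The ``data dependency graph'' reading of PSPP in the new summary is concrete enough to sketch. A minimal Python illustration, with hypothetical stage annotations (nothing here comes from the course code):

    # PSPP as a directed dependency chain: data at each stage is
    # conditioned on the stage upstream of it.
    pspp = {
        "processing": ["structure"],   # e.g. annealing temperature -> grain size
        "structure": ["property"],     # e.g. grain size -> yield strength
        "property": ["performance"],   # e.g. yield strength -> component lifetime
        "performance": [],
    }

    def downstream(stage, graph=pspp):
        """All stages whose data transitively depends on `stage`."""
        found = []
        for nxt in graph[stage]:
            found.append(nxt)
            found.extend(downstream(nxt, graph))
        return found

    print(downstream("processing"))  # ['structure', 'property', 'performance']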
@@ -330,14 +340,18 @@ \subsubsection{Unit I --- Experimental Data as a Learning Problem (Weeks
   Relation to MFML refresher on PCA and covariance.
 \end{itemize}

-\textbf{Summary:} This unit bridges the gap between the physical process
-of data acquisition and the mathematical tools used to describe it. We
-analyze how signals are formed in characterization tools and how
-physical constraints (resolution, noise, sampling) act as priors for
-learning. We then introduce Principal Component Analysis (PCA) and
-Singular Value Decomposition (SVD) as fundamental techniques for
-discovering low-dimensional structure in high-dimensional experimental
-datasets.
+\textbf{Summary:}
+
+\begin{itemize}
+\tightlist
+\item
+  Physical signal formation as a learning prior
+\item
+  Resolution, noise, sampling as physical (not algorithmic) constraints
+\item
+  \textbf{PCA} and \textbf{SVD} for low-dimensional structure in
+  high-dimensional data
+\end{itemize}

 \textbf{Exercise:}\\
 Fourier inspection of micrographs; effects of sampling and filtering.
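The PCA/SVD bullet can be made concrete in a few lines. A minimal NumPy sketch on synthetic rank-2 ``spectra'' (illustrative only, not course code):

    import numpy as np

    rng = np.random.default_rng(0)
    # 200 synthetic spectra with 50 channels, secretly rank 2 plus noise
    latent = rng.normal(size=(200, 2))
    basis = rng.normal(size=(2, 50))
    X = latent @ basis + 0.05 * rng.normal(size=(200, 50))

    Xc = X - X.mean(axis=0)           # center first: PCA assumes zero mean
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)   # variance fraction per component
    print(explained[:4])              # two components dominate
    scores = Xc @ Vt[:2].T            # low-dimensional coordinates per spectrum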
@@ -363,25 +377,38 @@ \subsubsection{Unit I --- Experimental Data as a Learning Problem (Weeks
   Why ``good accuracy'' often means a broken pipeline.
 \end{itemize}

-\textbf{Summary:} This unit covers the often-overlooked half of an ML
-pipeline: data integrity, validation, and how performance is measured.
-We start with the measurement chain and systematic \textbf{data
-cleaning} --- handling missing values, outliers, and duplicates with a
-``fix at source'' mindset. We then build the \textbf{transformation
-toolbox}: centering, min--max and z-score scaling, physics-aware
-non-dimensionalisation, log transforms, differentiation, and
-frequency-domain views (FFT, triggering for time series). On the
-supervision side we examine \textbf{labels and uncertainty} ---
-inter-annotator variance, probabilistic labels, and a Bayesian view of
-priors, likelihoods, and posteriors --- and then formalize the
-\textbf{bias--variance} tradeoff with parsimony and regularization. A
-major focus is \textbf{Data Leakage} in materials workflows
-(pre-processing, temporal, and group/spatial), tackled with proper
-holdout, K-fold, LOOCV, and stratified validation. We close with the
-\textbf{error measures} that decide what ``good'' actually means:
-MAE/MSE/RMSE and \(R^2\) for regression, and confusion matrices,
-precision/recall, F1/Dice, IoU, and categorical cross-entropy for
-classification and segmentation.
+\textbf{Summary:}
+
+\begin{itemize}
+\tightlist
+\item
+  Measurement chain → \textbf{data cleaning}: missing values, outliers,
+  duplicates (``fix at source'')
+\item
+  \textbf{Transformation toolbox}: centering, min--max / z-score
+  scaling, non-dimensionalization, log, differentiation, FFT, triggering
+\item
+  \textbf{Labels and uncertainty}: inter-annotator variance,
+  probabilistic labels, Bayesian view (priors, likelihoods, posteriors)
+\item
+  \textbf{Bias--variance} tradeoff with parsimony and regularization
+\item
+  \textbf{Data leakage} in materials workflows: pre-processing,
+  temporal, group/spatial
+\item
+  \textbf{Validation}: holdout, K-fold, LOOCV, stratified
+\item
+  \textbf{Error measures}:
+
+  \begin{itemize}
+  \tightlist
+  \item
+    Regression: MAE, MSE, RMSE, \(R^2\)
+  \item
+    Classification / segmentation: confusion matrix, precision/recall,
+    F1/Dice, IoU, categorical cross-entropy
+  \end{itemize}
+\end{itemize}

 \textbf{Exercise:}\\
 Construct a deliberately flawed ML pipeline and diagnose its failure.
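The pre-processing leakage bullet is the one that most often bites in practice: fitting a scaler on the full dataset lets test-fold statistics leak into training. A sketch of the safe pattern, assuming scikit-learn is the toolkit (synthetic data):

    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import KFold, cross_val_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(120, 8))
    y = X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=120)

    # Leaky: StandardScaler().fit_transform(X) before splitting.
    # Safe: the scaler is re-fit inside each training fold only.
    model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
    cv = KFold(n_splits=5, shuffle=True, random_state=0)
    rmse = -cross_val_score(model, X, y, cv=cv,
                            scoring="neg_root_mean_squared_error")
    print("RMSE per fold:", rmse)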
@@ -410,15 +437,21 @@ \subsubsection{Unit II --- Representation Learning for Microstructures
   Transition to learned representations.
 \end{itemize}

-\textbf{Summary:} This unit marks the transition from classical,
-hand-crafted microstructure quantification (like grain size and phase
-fractions) to the modern paradigm of \textbf{learned representations}.
-We first review traditional stereological metrics and their limitations
-in capturing complex structural nuances. We then introduce the
-foundational unit of modern ML: the \textbf{artificial neuron}. By
-understanding weights, biases, and non-linear activation functions, we
-build the framework for Multi-Layer Perceptrons (MLPs) that can
-automatically learn optimal features from materials data.
+\textbf{Summary:}
+
+\begin{itemize}
+\tightlist
+\item
+  Classical stereological metrics (grain size, phase fractions) and
+  their limits
+\item
+  Transition to \textbf{learned representations}
+\item
+  The \textbf{artificial neuron}: weights, biases, non-linear
+  activations
+\item
+  \textbf{Multi-Layer Perceptrons (MLPs)} as automatic feature learners
+\end{itemize}

 \textbf{Exercise:}\\
 Compare classical features vs simple NN-based features for
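The artificial-neuron bullet translates directly into code. A NumPy sketch of one neuron and a two-layer MLP forward pass (all shapes illustrative):

    import numpy as np

    def relu(z):
        return np.maximum(0.0, z)

    def neuron(x, w, b):
        # weighted sum plus bias, passed through a non-linearity
        return relu(np.dot(w, x) + b)

    rng = np.random.default_rng(0)
    x = rng.normal(size=4)                         # 4 input features
    W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)  # hidden layer of 8 neurons
    W2, b2 = rng.normal(size=(1, 8)), np.zeros(1)  # linear read-out

    print(neuron(x, W1[0], b1[0]))                 # a single neuron's output
    hidden = relu(W1 @ x + b1)                     # the whole hidden layer
    y_hat = W2 @ hidden + b2                       # MLP prediction
    print(y_hat)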
@@ -443,15 +476,23 @@ \subsubsection{Unit II --- Representation Learning for Microstructures
   Overfitting risks with small datasets.
 \end{itemize}

-\textbf{Summary:} This unit introduces \textbf{Convolutional Neural
-Networks (CNNs)}, the workhorse of modern computer vision, and applies
-them to materials characterization. We explore how convolutions allow
-networks to automatically learn hierarchical structure detectors---from
-simple edges to complex phase morphologies---while drastically reducing
-the number of parameters compared to standard MLPs. Through case studies
-in phase segmentation and defect detection, students learn the intuition
-behind filters, pooling, and the unique challenges of applying deep
-learning to high-resolution, noisy experimental micrographs.
+\textbf{Summary:}
+
+\begin{itemize}
+\tightlist
+\item
+  \textbf{Convolutional Neural Networks (CNNs)} for materials
+  characterization
+\item
+  Hierarchical structure detectors: edges → textures → phase
+  morphologies
+\item
+  Filters and pooling; parameter efficiency vs.~MLPs
+\item
+  Case studies: phase segmentation, defect detection
+\item
+  Practical challenges: high-resolution, noisy micrographs
+\end{itemize}

 \textbf{Exercise:}\\
 Train a small CNN on microstructure images; analyze failure cases.
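For the CNN bullets, a minimal model of the kind the exercise asks for; PyTorch is assumed (the syllabus does not name a framework), and the layer sizes plus the 64x64 patch size are illustrative:

    import torch
    import torch.nn as nn

    # Tiny CNN: 1-channel micrograph patches -> 3 phase classes.
    model = nn.Sequential(
        nn.Conv2d(1, 8, kernel_size=3, padding=1),   # edge-like filters
        nn.ReLU(),
        nn.MaxPool2d(2),                             # downsample 64 -> 32
        nn.Conv2d(8, 16, kernel_size=3, padding=1),  # compose into textures
        nn.ReLU(),
        nn.MaxPool2d(2),                             # downsample 32 -> 16
        nn.Flatten(),
        nn.Linear(16 * 16 * 16, 3),                  # assumes 64x64 inputs
    )

    x = torch.randn(4, 1, 64, 64)   # batch of 4 fake patches
    print(model(x).shape)           # torch.Size([4, 3])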
@@ -474,15 +515,21 @@ \subsubsection{Unit II --- Representation Learning for Microstructures
   When transfer learning helps---and when it does not.
 \end{itemize}

-\textbf{Summary:} This unit addresses the fundamental bottleneck of
-materials informatics: \textbf{Data Scarcity}. We explore how to build
-powerful deep learning models when only a few hundred labeled images or
-signals are available. The core focus is on \textbf{Transfer Learning},
-where we leverage knowledge from models pretrained on millions of
-natural images to accelerate learning and improve generalization on
-materials tasks. We also cover \textbf{Data Augmentation} strategies
-tailored for scientific data and discuss when and why transferring
-knowledge across different physical domains succeeds or fails.
+\textbf{Summary:}
+
+\begin{itemize}
+\tightlist
+\item
+  \textbf{Data scarcity} as the materials informatics bottleneck
+\item
+  \textbf{Transfer learning} from natural-image pretrained models
+\item
+  Self-supervised pretraining as an alternative
+\item
+  \textbf{Data augmentation} tailored to scientific data
+\item
+  When cross-domain transfer succeeds vs.~fails
+\end{itemize}

 \textbf{Exercise:}\\
 Fine-tune a pretrained model; compare against training from scratch.
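The transfer-learning bullet, as a sketch: freeze an ImageNet-pretrained backbone and retrain only a new head. Assumes PyTorch with the torchvision >= 0.13 weights API; the 3-class head is illustrative:

    import torch
    import torch.nn as nn
    from torchvision import models

    net = models.resnet18(weights="IMAGENET1K_V1")
    for p in net.parameters():
        p.requires_grad = False                 # keep pretrained features fixed
    net.fc = nn.Linear(net.fc.in_features, 3)   # new head, the only trained part

    # Fine-tuning then optimizes just the head:
    optimizer = torch.optim.Adam(net.fc.parameters(), lr=1e-3)

The from-scratch comparison in the exercise is the same model with weights=None and all parameters left trainable.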
@@ -509,16 +556,21 @@ \subsubsection{Unit III --- Learning from Processing Data (Weeks
   Relation to MFML concepts of generalization.
 \end{itemize}

-\textbf{Summary:} This unit explores the application of machine learning
-to \textbf{Time-Series Data}, specifically for monitoring and predicting
-materials processing outcomes. We introduce \textbf{Recurrent Neural
-Networks (RNNs)} and their advanced variants like \textbf{LSTMs}, which
-are designed to handle sequential dependencies. We discuss the critical
-preprocessing steps of signal smoothing and triggering required to
-handle noisy experimental logs. Through case studies in additive
-manufacturing and process stability, students learn how to build models
-that ``remember'' the processing history to predict future states and
-detect anomalies in real-time.
+\textbf{Summary:}
+
+\begin{itemize}
+\tightlist
+\item
+  \textbf{Time-series ML} for process monitoring and prediction
+\item
+  \textbf{RNNs} and \textbf{LSTMs} for sequential dependencies
+\item
+  Preprocessing: signal smoothing, triggering on noisy logs
+\item
+  Case studies: additive manufacturing, process stability
+\item
+  Real-time anomaly detection from processing history
+\end{itemize}

 \textbf{Exercise:}\\
 Predict a process outcome from time-series data using regression or
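For the RNN/LSTM bullets, a minimal PyTorch sketch; the ProcessLSTM name, channel count, and sequence length are all hypothetical:

    import torch
    import torch.nn as nn

    class ProcessLSTM(nn.Module):
        """Scalar process outcome from a multi-channel sensor series."""
        def __init__(self, n_features=3, hidden=32):
            super().__init__()
            self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)

        def forward(self, x):              # x: (batch, time, features)
            out, _ = self.lstm(x)
            return self.head(out[:, -1])   # read out the last hidden state

    x = torch.randn(8, 200, 3)             # 8 runs, 200 steps, 3 sensors
    print(ProcessLSTM()(x).shape)          # torch.Size([8, 1])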
@@ -542,15 +594,23 @@ \subsubsection{Unit III --- Learning from Processing Data (Weeks
   Robustness as a design criterion.
 \end{itemize}

-\textbf{Summary:} This unit shifts the focus from model performance to
-\textbf{Model Reliability}. We explore the Bias-Variance tradeoff and
-the fundamental challenge of generalization---ensuring that an ML model
-works on new, unseen data from the factory floor. We introduce robust
-validation techniques like K-Fold and Stratified Cross-Validation to
-stabilize performance estimates on small materials datasets. A key focus
-is on \textbf{Process Robustness}, where we use sensitivity analysis to
-identify ``Process Windows''---regions in parameter space where material
-quality is maximized and insensitive to industrial noise.
+\textbf{Summary:}
+
+\begin{itemize}
+\tightlist
+\item
+  Shift from raw performance to \textbf{model reliability}
+\item
+  Bias--variance tradeoff and generalization to factory-floor data
+\item
+  Robust validation: K-fold and stratified cross-validation on small
+  datasets
+\item
+  \textbf{Process robustness} via sensitivity analysis
+\item
+  \textbf{Process windows}: parameter regions insensitive to industrial
+  noise
+\end{itemize}

 \textbf{Exercise:}\\
 Analyze model robustness under perturbed process conditions.
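The process-window idea has a direct numerical reading: an operating point is robust if predicted quality stays high when inputs are jittered with plausible industrial noise. A sketch with a toy surrogate (scikit-learn assumed, all numbers invented):

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)
    X = rng.uniform(0, 1, size=(300, 2))     # two process parameters
    y = np.sin(3 * X[:, 0]) + 0.2 * X[:, 1] + 0.05 * rng.normal(size=300)
    model = RandomForestRegressor(random_state=0).fit(X, y)

    # Sensitivity analysis: perturb candidate operating points and
    # look at the spread of predicted quality.
    for x0 in np.array([[0.2, 0.5], [0.52, 0.5], [0.9, 0.5]]):
        jitter = x0 + 0.03 * rng.normal(size=(200, 2))
        preds = model.predict(np.clip(jitter, 0, 1))
        print(x0, "mean %.3f, spread %.3f" % (preds.mean(), preds.std()))
    # A wide process window: high mean, small spread under perturbation.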
@@ -573,16 +633,22 @@ \subsubsection{Unit III --- Learning from Processing Data (Weeks
   Physics-informed vs unconstrained regression.
 \end{itemize}

-\textbf{Summary:} This unit explores \textbf{Inverse Problems}---the
-cornerstone of materials design where we seek the processing parameters
-required to achieve a target microstructure or performance. We contrast
-these with causal forward problems and discuss why they are often
-ill-posed and multi-valued. We introduce \textbf{Physics-Informed
-Learning} as a way to solve these challenges by enriching models with
-physical transformations and constraints. Students learn how to build
-and interpret \textbf{Process Maps} and ``Process Corridors,'' using
-machine learning to visualize safe operating regions in complex
-experimental spaces.
+\textbf{Summary:}
+
+\begin{itemize}
+\tightlist
+\item
+  \textbf{Inverse problems}: target microstructure / performance →
+  processing parameters
+\item
+  Forward (causal) vs.~inverse (often ill-posed, multi-valued)
+\item
+  \textbf{Physics-informed learning}: physical transformations and
+  constraints
+\item
+  \textbf{Process maps} and \textbf{process corridors} for safe
+  operating regions
+\end{itemize}

 \textbf{Exercise:}\\
 Construct a simple ML-based process map; compare constrained vs
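One standard workaround for the ill-posedness named here is not to invert at all: sweep a forward surrogate over the process space and read the ``corridor'' off the resulting map. A sketch (scikit-learn assumed, toy forward model):

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)
    X = rng.uniform(0, 1, size=(400, 2))   # (power, speed), rescaled to [0, 1]
    quality = 1 - (X[:, 0] - 0.6)**2 - (X[:, 1] - 0.4)**2   # toy physics
    surrogate = RandomForestRegressor(random_state=0).fit(X, quality)

    # Dense grid over process space: the map is the surrogate prediction,
    # the corridor is wherever it clears a quality specification.
    g = np.linspace(0, 1, 50)
    P, S = np.meshgrid(g, g)
    q = surrogate.predict(np.column_stack([P.ravel(), S.ravel()]))
    corridor = q.reshape(P.shape) > 0.9
    print("corridor covers %.1f%% of the space" % (100 * corridor.mean()))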
@@ -610,16 +676,22 @@ \subsubsection{Unit IV --- Uncertainty, Surrogates, and Automation
   Using ML without destroying physical meaning.
 \end{itemize}

-\textbf{Summary:} This unit focuses on the processing of
-high-dimensional \textbf{Characterization Signals} (like XRD, EDS, and
-EELS) using unsupervised learning. We introduce \textbf{K-Means
-Clustering} and \textbf{t-SNE} for the automatic identification and
-visualization of phases in large experimental libraries. We then explore
-\textbf{Autoencoders}---neural networks that learn to compress complex
-spectra into a low-dimensional ``latent space.'' This allows for
-advanced denoising and feature extraction, enabling scientists to handle
-the massive data volumes produced by modern high-throughput
-characterization tools without losing physical insight.
+\textbf{Summary:}
+
+\begin{itemize}
+\tightlist
+\item
+  Unsupervised ML on high-dimensional spectra (XRD, EDS, EELS)
+\item
+  \textbf{K-Means} and \textbf{t-SNE} for phase identification and
+  visualization
+\item
+  \textbf{Autoencoders}: compressing spectra into a low-dimensional
+  latent space
+\item
+  Denoising and feature extraction at high throughput without losing
+  physics
+\end{itemize}

 \textbf{Exercise:}\\
 Apply PCA/NMF to spectral datasets; interpret components physically.
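The K-Means bullet, sketched on a fake spectral library of three Gaussian ``phases'' (scikit-learn assumed, with PCA standing in for the t-SNE view):

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    channels = np.arange(100)
    peaks = np.array([20, 50, 80])               # one peak per phase
    true_phase = rng.integers(0, 3, size=300)
    X = np.exp(-0.5 * ((channels - peaks[true_phase][:, None]) / 4)**2)
    X += 0.05 * rng.normal(size=X.shape)         # measurement noise

    Z = PCA(n_components=2).fit_transform(X)     # 2-D view for plotting
    phase = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
    print(Z.shape, np.bincount(phase))           # cluster sizes ~ phase counts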
@@ -640,8 +712,26 @@ \subsubsection{Unit IV --- Uncertainty, Surrogates, and Automation
   ML as a control component, not just a predictor.
 \end{itemize}

-\textbf{Exercise:}\\
-Implement a simple ML-assisted autofocus or defect detector.
+\textbf{Summary:}
+
+\begin{itemize}
+\tightlist
+\item
+  \textbf{Autonomous characterization}: ML moves from passive analysis
+  to active instrument control
+\item
+  \textbf{Multi-modal data fusion} (SEM + EDS + process logs) via
+  Bayesian frameworks
+\item
+  \textbf{Reinforcement learning} for instrument tuning and process
+  optimization
+\item
+  Pipelines that autonomously find → characterize → decide the next
+  experiment
+\end{itemize}
+
+\textbf{Exercise:} Implement a simple ML-assisted autofocus or defect
+detector.

 \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

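The autofocus exercise has a classic non-ML baseline worth having on hand: maximize a sharpness metric over focus positions; an ML-assisted version would swap in a learned focus scorer. NumPy sketch with a fake acquisition function standing in for the microscope:

    import numpy as np

    def sharpness(img):
        """Variance of a discrete Laplacian: a standard focus metric."""
        lap = (-4 * img
               + np.roll(img, 1, 0) + np.roll(img, -1, 0)
               + np.roll(img, 1, 1) + np.roll(img, -1, 1))
        return lap.var()

    rng = np.random.default_rng(0)

    def fake_acquire(z, z_best=0.3):
        """Stand-in for the instrument: flatter contrast away from focus."""
        img = rng.normal(size=(64, 64))
        return img / (1 + 50 * (z - z_best)**2)

    # Control loop: sweep focus, keep the sharpest position.
    zs = np.linspace(0, 1, 21)
    best = max(zs, key=lambda z: sharpness(fake_acquire(z)))
    print("estimated focus position:", round(best, 2))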
@@ -767,18 +857,6 @@ \subsection{Lab Possibilities}\label{lab-possibilities}
   Multi-modal fusion of images, spectra, and process parameters.
 \end{itemize}

-\textbf{Summary:} This unit explores the cutting edge of
-\textbf{Autonomous Characterization}, where machine learning moves from
-passive data analysis to active instrument control. We introduce
-\textbf{Multi-Modal Data Fusion} techniques to combine information from
-diverse sensors like SEM images, EDS spectra, and process logs using
-Bayesian frameworks. We then discuss \textbf{Reinforcement Learning
-(RL)} as a tool for automating complex laboratory tasks, such as
-instrument tuning and process optimization. Through case studies in
-microscopy and industrial processing, students learn how to build
-integrated pipelines that can autonomously find, characterize, and
-decide the next steps of an experiment.
-
 \protect\phantomsection\label{refs}
 \begin{CSLReferences}{1}{0}
 \bibitem[\citeproctext]{ref-sandfeld2024materials}

index-meca.zip

-8.6 KB
Binary file not shown.
