This repository contains a collection of Jupyter notebooks designed for teaching the “Data Modeling and Visualization” course at HAW Hamburg. Each notebook focuses on a specific topic, illustrating both theoretical foundations and practical implementations using Python and popular data science libraries such as Pandas, NumPy, Scikit-learn, Seaborn, and PyTorch.
| Notebook | Topic | Description |
|---|---|---|
| 02_data_visualization | Data Visualization | Demonstrates how to create and customize various types of charts using Pandas and Seaborn. |
| 03_data_preprocessing | Data Preprocessing | Covers techniques for handling missing values, detecting and treating outliers, and resolving other common data quality issues. Explains how to merge, aggregate, and encode data using Pandas operations and one-hot encoding. |
| 03_scaling | Data Preprocessing | Demonstrates different approaches to scaling data. |
| 04_exploration_case | Exploratory Data Analysis (EDA) | Works through a case study in exploratory data analysis. |
| 05_regression | Regression Analysis | Introduces linear regression and model evaluation using mean squared error (MSE). |
| 06_classification | Classification | Implements logistic regression, k-nearest neighbors (kNN) and decision tree classifiers, along with performance evaluation metrics. |
| 07-1_ARIMA | Time Series Analysis (Part 1) | Presents ARIMA modeling for time series forecasting. |
| 07-2_ARIMA | Time Series Analysis (Part 2) | Demonstrates how to search for optimal ARIMA hyperparameters. |
| 08_supervised_case | Supervised Learning | Works through a case study in supervised learning. |
| 09_clustering | Clustering | Compares clustering techniques including K-Means, Agglomerative Clustering, and DBSCAN; introduces Local Outlier Factor (LOF) for anomaly detection. |
| 10_dimensionality_reduction | Dimensionality Reduction | Explores PCA and t-SNE for reducing data dimensionality and visualizing complex datasets. |
| 11_unsupervised_case | Unsupervised Learning | Works through a case study in unsupervised learning. |
| 12_deep_learning_1 | Introduction to Deep Learning | Implements a simple feed-forward neural network for supervised learning tasks. |
| 13_deep_learning_2 | Deep Learning for Sequential Data | Introduces the Long Short-Term Memory (LSTM) model for sequential data analysis. |
| 14_deep_learning_3 | Topology-Preserving Neural Networks for Data Visualization | Demonstrates the use of Self-Organizing Maps (SOM) for unsupervised learning and visualization. |
| 15_deep_learning_case | Deep Learning | Works through a case study in deep learning. |
Each notebook is self-contained and can be executed independently. The notebooks are numbered to reflect a progressive learning path, from data visualization and preprocessing through classical machine learning to deep learning models.
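Because the numeric prefixes encode the intended order, the learning path can be listed programmatically. The following is a minimal sketch, not part of the repository: it assumes the notebooks are stored as `.ipynb` files directly in the repository root, and the helper names `notebook_sort_key` and `list_notebooks` are hypothetical.

```python
import re
from pathlib import Path


def notebook_sort_key(path: Path) -> tuple[int, int, str]:
    """Build a sort key from the numeric prefix, e.g. '07-2_ARIMA' -> (7, 2, '07-2_ARIMA')."""
    match = re.match(r"(\d+)(?:-(\d+))?_", path.stem)
    if match is None:
        # Files without a numeric prefix are listed last.
        return (10**6, 0, path.stem)
    chapter = int(match.group(1))
    part = int(match.group(2) or 0)
    return (chapter, part, path.stem)


def list_notebooks(root: str = ".") -> list[Path]:
    """Return the notebooks found directly under `root`, in course order.

    Assumption: notebooks use the .ipynb extension and sit in `root`.
    """
    return sorted(Path(root).glob("*.ipynb"), key=notebook_sort_key)


if __name__ == "__main__":
    for notebook in list_notebooks():
        print(notebook.name)
```

Parsing the prefix as integers rather than sorting the raw file names keeps multi-part chapters such as `07-1_ARIMA` and `07-2_ARIMA` together and in order, and it stays correct if a prefix is ever written without a leading zero.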