| # | Name | GitHub |
|---|---|---|
| 1 | Harmandeep Pal | |
An interactive Streamlit-based benchmarking and evaluation dashboard comparing spatial acceleration frameworks for the FLUX.1-dev Diffusion Transformer (DiT).
Unlike temporal acceleration methods that reduce denoising steps, this project evaluates spatial optimizations that modify latent resolution and caching dynamically during generation (Baseline vs. TaylorSeer vs. RALU). It tracks and plots latencies, speedup factors, step counts (NFEs), and peak GPU VRAM allocations.
- Project Overview
- Supported Spatial Acceleration Methods
- Project Structure
- Quick Start — Running the Demo
- Manual Setup
- Hugging Face Checkpoints Integration
- GitHub Repository Metadata Automation
- Credits & References
- Troubleshooting
This project implements a unified evaluation framework for spatially accelerating FLUX.1-dev:
- In-Memory Caching Backend: Houses a single pipeline instance dynamically patched on-the-fly to prevent GPU out-of-memory (OOM) errors.
- Streamlit Comparative UI: Offers a Zinc-themed sidebar and grid panel to enter prompts, customize hyperparameters, trigger lazy weight loading, run comparisons in any order, and clear results.
- Output Isolation: Saves generated images locally as `{method}_seed{seed}_{prompt_slug}.png` under `outputs/demo/` so that different prompts on the same seed never overwrite one another.
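The output naming scheme described above can be illustrated with a small sketch. The helper names `prompt_slug` and `output_path`, the slug rules (lowercase, hyphen-separated, truncated to 40 characters), and the fallback value are assumptions made for illustration; the dashboard's actual slug logic may differ.

```python
import re
from pathlib import Path


def prompt_slug(prompt: str, max_len: int = 40) -> str:
    """Turn a free-form prompt into a filesystem-safe slug (hypothetical rules)."""
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")
    # Truncate long prompts and fall back to a fixed name for empty slugs.
    return slug[:max_len].rstrip("-") or "prompt"


def output_path(method: str, seed: int, prompt: str, root: str = "outputs/demo") -> Path:
    """Build {method}_seed{seed}_{prompt_slug}.png under the output folder."""
    return Path(root) / f"{method}_seed{seed}_{prompt_slug(prompt)}.png"


if __name__ == "__main__":
    print(output_path("ralu", 42, "A red fox, at dawn!"))
```

Because the slug is part of the filename, two prompts run with the same method and seed map to different files as long as their slugs differ.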
1. Baseline (Standard): Runs unmodified FLUX.1-dev inference at standard resolution using `diffusers.FluxPipeline`.
2. TaylorSeer (ICCV 2025): Evaluates a training-free Taylor-expansion cache approximation method. Adjustable hyperparameters (sidebar) include:
   - TaylorSeer Mode: `Taylor` (2nd order), `ToCa` (attention-based), `Delta`, or `original`.
   - Cache Type: Toggle between `random` and `attention` caching.
   - Fresh Ratio / Fresh Threshold: Control cache step update weights and intervals.
   - Max Order: Choose `0` (1st order approximation) or `1` (2nd order derivative).
3. RALU (CVPR 2026): Region-Adaptive Latent Upsampling targets mixed-resolution latent sampling by selectively upsampling edge regions. Adjustable parameters (sidebar) include:
   - RALU Mode: Use Default Acceleration presets (Level `4`, `5`, or `7`, representing speedup levels) or Custom Scheduling.
   - Custom Scheduling ($N$ & $e$): Input custom step stages (e.g. `5, 6, 7`) and end timesteps (e.g. `0.3, 0.45, 1.0`).
   - Upsampling Ratio: Customize latent upsampling thresholds.
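To give some intuition for Taylor-expansion caching, the sketch below extrapolates a cached value forward using backward finite differences in place of derivatives. It is a generic illustration on scalars, not the TaylorSeer implementation: the function name `taylor_forecast` and its `order` argument are hypothetical, and `order` here counts finite-difference terms, which may not match the sidebar's Max Order numbering.

```python
from math import factorial


def taylor_forecast(history, steps_ahead, order):
    """Extrapolate `steps_ahead` steps past the newest cached value.

    history: values at equally spaced past steps, oldest first.
    order:   number of finite-difference terms (0 simply reuses the cache).
    """
    if order >= len(history):
        raise ValueError("need at least order + 1 cached values")
    level = list(history)
    terms = [level[-1]]  # zeroth-order term: the newest cached value
    for _ in range(order):
        # Backward differences stand in for successive derivatives.
        level = [b - a for a, b in zip(level, level[1:])]
        terms.append(level[-1])
    # Taylor series: sum of d_k * h^k / k! over the available terms.
    return sum(d * steps_ahead**k / factorial(k) for k, d in enumerate(terms))


if __name__ == "__main__":
    cached = [1.0, 3.0, 5.0]  # a value that grows linearly per step
    print(taylor_forecast(cached, steps_ahead=2, order=0))  # plain cache reuse
    print(taylor_forecast(cached, steps_ahead=2, order=1))  # linear extrapolation
```

With `order=0` the forecast is the last cached value, which is what a plain cache would return; higher orders trade a few extra stored differences for a closer approximation between full computations.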
```
diffusion-models-for-image-generation/
│
├── .github/
│ └── workflows/
│ └── update-repo-info.yml ← Auto-syncs GitHub About & Topics on git push
│
├── assets/ ← Dashboard screenshots & previews (place images here)
│ └── dashboard_preview.png ← Readme placeholder
│
├── models/ ← Ignored by git, holds cached FLUX weights
│ └── flux1-dev/ ← Gated HF model weights folder (~34 GB)
│
├── notebooks/ ← Evaluative summary notebooks
│ ├── 03_runpod_ralu.ipynb ← Stage execution and testing
│ ├── 05_runpod_eval_summary.ipynb ← Aggregates CSV logs & generates performance plots
│ └── 06_runpod_demo_launcher.ipynb ← Background runner and log monitor for RunPod
│
├── outputs/ ← Output benchmark images (ignored by git)
│ └── demo/ ← Saves `{method}_seed{seed}_{prompt_slug}.png`
│
├── src/ ← Python code directories
│ ├── ralu/ ← RALU pipeline implementation
│ ├── taylorseer_flux/ ← TaylorSeer forwards and utils
│ ├── dashboard.py ← Zinc-themed Streamlit benchmark frontend
│ ├── dashboard_backend.py ← Backend patching logic and registry controller
│ └── download_model.py ← CLI downloader for weights
│
├── bootstrap.ps1 ← Windows Conda environment setup & runner script
├── bootstrap.sh ← Linux / Codespaces environment setup & runner script
├── repo_metadata.json ← Description & keywords config for GitHub
├── requirements.yml ← Conda environment configuration
└── README.md ← This file
```
FLUX.1-dev is a gated model. Before downloading it, you must:
- Accept the model license on Hugging Face.
- Generate an Access Token with READ permissions at huggingface.co/settings/tokens.
- Create a `.env` file in the project root:

  ```
  HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxx
  ```

  (If no `.env` file exists when the bootstrapper starts, it prompts you on the command line and creates the file for you.)
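The "prompt and create" behaviour described for the bootstrapper can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: `read_env` and `ensure_hf_token` are hypothetical helpers, and the real bootstrap scripts are shell and PowerShell, so their logic may differ.

```python
from pathlib import Path


def read_env(path):
    """Parse KEY=VALUE lines from a .env file, skipping blanks and comments."""
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values


def ensure_hf_token(path=".env", ask=input):
    """Return HF_TOKEN from `path`; if it is missing, ask for it and store it."""
    env_file = Path(path)
    if env_file.exists():
        token = read_env(env_file).get("HF_TOKEN")
        if token:
            return token
    token = ask("Hugging Face token (hf_...): ").strip()
    # Append so that any other variables already in the file are preserved.
    with env_file.open("a") as fh:
        fh.write(f"HF_TOKEN={token}\n")
    return token
```

Passing the prompt function as `ask` keeps the helper testable without a terminal; in interactive use the default `input` applies.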
Run the PowerShell bootstrap script from the root folder:
```powershell
.\bootstrap.ps1
```
If execution policies block scripts from running, bypass them for the active terminal using:
```powershell
Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
```
What it automates:
- Searches your system for Conda and creates a new virtual environment `deep_learning` from `requirements.yml` (if missing).
- Installs required Python dependencies (including `streamlit`, `psutil`, `huggingface_hub`, and `python-dotenv`).
- Loads your Hugging Face credentials.
- Validates model weights: downloads the model weights from Hugging Face directly if `models/flux1-dev/` is empty.
- Starts the Streamlit dashboard on port `8501`.
Access the UI at: http://localhost:8501
Run the bash script from the root folder:
```bash
chmod +x bootstrap.sh
./bootstrap.sh
```
Inside GitHub Codespaces:
- GitHub Codespaces will automatically forward port `8501` to your local browser.
- In Codespaces (which uses the system Python environment), the script skips Conda and installs dependencies directly via pip to speed up booting.
If you prefer to configure your environment manually without using the bootstrap scripts:
```bash
# 1. Create and activate the conda environment
conda env create -f requirements.yml
conda activate deep_learning
# 2. Install extra dashboard requirements
pip install streamlit psutil huggingface_hub python-dotenv
# 3. Download the model weights
python src/download_model.py
# 4. Start the dashboard
streamlit run src/dashboard.py --server.port 8501 --server.address 0.0.0.0 --server.enableCORS false --server.enableXsrfProtection false
```
Model weights are ignored by git to keep the repository lightweight. When `src/download_model.py` is invoked (via the bootstrappers or manually), it calls `huggingface_hub.snapshot_download` to download the weights from the official repository:
- Hugging Face Repository: black-forest-labs/FLUX.1-dev
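A download step along these lines might look like the sketch below. It is not the contents of `src/download_model.py`: the helper names `weights_missing` and `download_if_needed` are hypothetical, and the "only download when the folder is empty" check is taken from the bootstrap description rather than from the script itself. The `snapshot_download` call uses its documented `repo_id`, `local_dir`, and `token` parameters.

```python
from pathlib import Path


def weights_missing(model_dir="models/flux1-dev"):
    """True when the weights folder is absent or contains no entries."""
    folder = Path(model_dir)
    return not folder.is_dir() or not any(folder.iterdir())


def download_if_needed(model_dir="models/flux1-dev", token=None):
    """Fetch the gated FLUX.1-dev snapshot only if no local copy exists."""
    if not weights_missing(model_dir):
        return False
    # Imported lazily so the emptiness check works without the package installed.
    from huggingface_hub import snapshot_download

    snapshot_download(
        repo_id="black-forest-labs/FLUX.1-dev",
        local_dir=model_dir,
        token=token,  # READ-scoped token for the gated repository
    )
    return True
```

Calling `download_if_needed(token=...)` with the value of `HF_TOKEN` would skip the download on later runs once `models/flux1-dev/` is populated.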
This project includes a metadata syncing system that updates your GitHub repository's About section, Homepage link, and Keywords (topics) automatically on git push using the configuration file repo_metadata.json.
Because changing repository properties (like description and keywords) requires admin permissions, GitHub Actions cannot do it by default. You need to explicitly authorize it:
- Create an Access Token on GitHub:
  - Go to github.com/settings/tokens in your browser.
  - Click Generate new token (classic).
  - Enter a name (e.g., `Repo Metadata Updater`).
  - Check the box next to `repo` (this allows the token to manage repo properties).
  - Scroll to the bottom, click Generate token, and copy the token code (it starts with `ghp_`).
- Add it as a Secret in your Repository:
  - Open your project repository on GitHub in your browser.
  - Click the Settings tab at the top.
  - On the left sidebar menu, expand Secrets and variables and click Actions.
  - Click the New repository secret button.
  - Name the secret: `REPO_ACCESS_TOKEN`
  - Paste the token code (`ghp_...`) into the Secret box and click Add secret.
Now, whenever you push changes to the main or master branch, the metadata script runs, reads repo_metadata.json, and updates your GitHub repo settings instantly!
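For readers curious about what such a sync involves, the sketch below builds the two GitHub REST API requests that update a repository's description, homepage, and topics. It is an assumption-laden illustration, not the workflow's actual script: the function name `build_metadata_requests` and the JSON keys `description`, `homepage`, and `topics` are hypothetical, since the schema of `repo_metadata.json` is not shown here. The endpoints themselves (`PATCH /repos/{owner}/{repo}` and `PUT /repos/{owner}/{repo}/topics`) are part of the public GitHub REST API.

```python
import json
from urllib.request import Request

API = "https://api.github.com"


def build_metadata_requests(owner, repo, token, meta):
    """Prepare (but do not send) the requests that update About and Topics."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
        "Content-Type": "application/json",
    }
    base = f"{API}/repos/{owner}/{repo}"
    # Assumed keys; adjust to whatever repo_metadata.json really contains.
    about = {
        "description": meta.get("description", ""),
        "homepage": meta.get("homepage", ""),
    }
    topics = {"names": meta.get("topics", [])}
    return [
        Request(base, data=json.dumps(about).encode(), headers=headers, method="PATCH"),
        Request(f"{base}/topics", data=json.dumps(topics).encode(), headers=headers, method="PUT"),
    ]
```

Each prepared request could then be sent with `urllib.request.urlopen(req)`, passing the value of the `REPO_ACCESS_TOKEN` secret as `token`.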
This benchmark dashboard evaluates and incorporates the following research works:
- TaylorSeer (ICCV 2025)
- RALU (CVPR 2026 Highlight)
Ensure you have activated the environment before running `streamlit`. Run the bootstrap script or manually execute:
```bash
pip install streamlit psutil huggingface_hub python-dotenv
```
Verify that your Hugging Face account has accepted the license for FLUX.1-dev and that your token has READ access. Check the credentials in your `.env` file.
Last updated: June 2026
