SPAR: Scalable LLM-based PDDL Domain Generation for Aerial Robotics

About

SPAR is a framework that leverages the generative capabilities of LLMs to automatically produce valid, diverse, and semantically accurate PDDL domains from natural language input.

Authors: Songhao Huang*, Yuwei Wu*, Guangyao Shi, Gaurav S. Sukhatme, and Vijay Kumar

Related Paper: Songhao Huang*, Yuwei Wu*, Guangyao Shi, Gaurav S. Sukhatme, and Vijay Kumar. "SPAR: Scalable LLM-based PDDL Domain Generation for Aerial Robotics." arXiv preprint arXiv:2509.13691, 2025.

If this repo helps your research, please cite our paper:

@article{huang2025spar,
  title={SPAR: Scalable LLM-based PDDL Domain Generation for Aerial Robotics},
  author={Huang, Songhao and Wu, Yuwei and Shi, Guangyao and Sukhatme, Gaurav S and Kumar, Vijay},
  journal={arXiv preprint arXiv:2509.13691},
  year={2025}
}

Framework

Repository Layout

action_gen.py: generate a PDDL domain for one benchmark domain.
eval_syntax.py: run domain-generation experiments and aggregate syntax error counts.
problem_gen.py: generate PDDL problem files compatible with generated domains.
batch_solve.py: solve generated domain/problem pairs with ENHSP.
eval/domain_similarity.py: validate plans against generated domains with VAL.
pddl_validator.py: syntax and semantic checks used during iterative correction.
llm_model.py: model wrapper, embedding lookup, and retrieval utilities.
planner/: ENHSP wrapper plus the bundled enhsp-20.jar.
prompts/: prompt templates, retrieval assets, and embedding-cache scripts.
uav_domain_benchmark/: benchmark and dataset for UAV task-planning domains.

Requirements

Python 3.10+
Java 17+ available as java, or set JAVA_BIN
At least one model API key:
- OPENAI_API_KEY for OpenAI models
- DEEPSEEK_API_KEY for DeepSeek models
Local sentence-transformer checkpoint for retrieval:
- local_model/all-mpnet-base-v2

Optional:

VAL_BIN for eval/domain_similarity.py
local_model/bge-reranker-v2-m3 for regenerating BGE retrieval caches

Install dependencies:

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Set the environment variables you need:

export OPENAI_API_KEY=your_key_here
export DEEPSEEK_API_KEY=your_key_here
export JAVA_BIN=java
export VAL_BIN=/path/to/Validate
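Before launching a long run, it can help to confirm that the prerequisites listed above are in place. The following is a minimal sketch, not part of the repository: the function name `check_environment` and its `env` and `root` parameters are hypothetical, and it only checks the variables and the checkpoint path named in this README.

```python
import os
import shutil
from pathlib import Path


def check_environment(env=None, root="."):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []

    # At least one model API key must be present.
    if not (env.get("OPENAI_API_KEY") or env.get("DEEPSEEK_API_KEY")):
        problems.append("set OPENAI_API_KEY or DEEPSEEK_API_KEY")

    # JAVA_BIN, when set, replaces the default `java` lookup on PATH.
    java_bin = env.get("JAVA_BIN", "java")
    if shutil.which(java_bin) is None:
        problems.append(f"Java executable not found: {java_bin}")

    # The retrieval checkpoint is expected as a local directory.
    checkpoint = Path(root) / "local_model" / "all-mpnet-base-v2"
    if not checkpoint.is_dir():
        problems.append(f"missing retrieval checkpoint: {checkpoint}")

    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Run it from the repository root so that the relative `local_model/` path resolves. It does not verify the Java version or VAL_BIN, which is only needed for the optional VAL step.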

Quick Start

1. Generate one domain

Edit the variables in the __main__ block of action_gen.py:

_domain_name_str
_engine
_prompt_method
_result_log_dir

Then run:

python action_gen.py

Outputs include the generated domain, intermediate LLM transcripts, extracted predicates/functions, and validation error counts.

2. Run syntax evaluation

eval_syntax.py has two entry points:

syntax_eval() to generate domains and log validation errors
total_error_count() to aggregate existing results

Before running, edit the module-level settings in eval_syntax.py, especially:

engine_list
prompt_method_restart_list
restart_domain
restart_method
restart_engine

Then run:

python eval_syntax.py

Results are written under results/<timestamp>/<engine>/<domain>/<prompt_method>/.
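To inspect what a run produced, the directory layout above can be walked directly. This is a small sketch that assumes only the `results/<timestamp>/<engine>/<domain>/<prompt_method>/` nesting described here; the helper name `iter_runs` is hypothetical and does not exist in the repository.

```python
from pathlib import Path


def iter_runs(results_root, timestamp):
    """Yield (engine, domain, prompt_method, path) for each run directory."""
    base = Path(results_root) / timestamp
    # Three nested levels: <engine>/<domain>/<prompt_method>/
    for path in sorted(base.glob("*/*/*")):
        if path.is_dir():
            engine, domain, prompt_method = path.relative_to(base).parts
            yield engine, domain, prompt_method, path
```

For example, `for engine, domain, method, path in iter_runs("results", "<timestamp>"): print(engine, domain, method)` lists every engine, domain, and prompt method combination that has output for that timestamp.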

3. Generate problems for generated domains

problem_gen.py expects generated domains to already exist under results/. Update the module-level variables near the top of the file:

date_str
engine
gpt_engine
restart controls such as restart_domain, restart_method, and restart_problem

Then run:

python problem_gen.py

Generated problems are written under:

results/<date_str>/<engine>/<domain>/<prompt_method>/pddl/

4. Solve with ENHSP

Edit the module-level configuration in batch_solve.py:

date_str
engine
prompt_method_restart_list
optional restart filters such as restart_domain

Then run:

python batch_solve.py

The script writes .plan files next to generated problems and prints per-method success rates.
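Because plans are written next to the problems, a success rate can also be recomputed after the fact from the files on disk. The sketch below is an illustration rather than the logic in batch_solve.py: it assumes each problem `p.pddl` gets a sibling `p.plan`, that an empty plan file counts as a failure, and that problem files match the hypothetical pattern `problem*.pddl`. Adjust these to the actual file names in your results folder.

```python
from pathlib import Path


def success_rate(pddl_dir, problem_glob="problem*.pddl"):
    """Fraction of problem files that have a non-empty sibling .plan file."""
    problems = sorted(Path(pddl_dir).glob(problem_glob))
    if not problems:
        return 0.0
    solved = 0
    for problem in problems:
        # Assumed naming convention: p.pddl -> p.plan in the same directory.
        plan = problem.with_suffix(".plan")
        if plan.is_file() and plan.stat().st_size > 0:
            solved += 1
    return solved / len(problems)
```

Pointing it at one `results/<date_str>/<engine>/<domain>/<prompt_method>/pddl/` directory gives the rate for that single configuration.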

5. Validate plans with VAL

eval/domain_similarity.py compares generated plans and generated domains using VAL.

Requirements:

source plans in the benchmark domain folders under uav_domain_benchmark/<domain>/pddl/*.plan
generated domains and problems under results/
VAL_BIN pointing to the Validate executable

Edit the module-level variables in eval/domain_similarity.py, then run:

python eval/domain_similarity.py
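For checking a single plan by hand, VAL's Validate executable takes a domain file, a problem file, and a plan file as positional arguments. The following is a minimal sketch of such a call, not the code in eval/domain_similarity.py: the function names are hypothetical, and interpreting Validate's output is left to the caller.

```python
import os
import subprocess


def val_command(domain, problem, plan, val_bin=None):
    """Build the argument list for VAL's Validate executable."""
    # Fall back to VAL_BIN, then to a `Validate` binary on PATH.
    val_bin = val_bin or os.environ.get("VAL_BIN", "Validate")
    return [val_bin, str(domain), str(problem), str(plan)]


def validate_plan(domain, problem, plan, timeout=60):
    """Run Validate and return (exit_code, stdout) without interpreting them."""
    result = subprocess.run(
        val_command(domain, problem, plan),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout
```

Building the command separately from running it makes it easy to print the exact invocation when a validation result looks surprising.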

6. Regenerate retrieval embeddings

If you need to rebuild the retrieval cache, update the options in prompts/save_action_embed.py and run:

python prompts/save_action_embed.py

OPENAI_API_KEY is required only when the script is configured with use_llm=True.

Acknowledgements

ENHSP: planner-based evaluation of generated domain and problem pairs.
VAL: plan validation.
all-mpnet-base-v2: retrieval-based prompting and action similarity search.
OpenAI API and DeepSeek API

Maintenance

For any technical issues, please contact Yuwei Wu (yuweiwu@seas.upenn.edu, yuweiwu20001@outlook.com).
