
Automatically generate valid, diverse, and semantically accurate PDDL domains from natural language input for single and multiple UAV missions.


SPAR: Scalable LLM-based PDDL Domain Generation for Aerial Robotics

About

SPAR is a framework that leverages the generative capabilities of LLMs to automatically produce valid, diverse, and semantically accurate PDDL domains from natural language input.

Authors: Songhao Huang*, Yuwei Wu*, Guangyao Shi, Gaurav S. Sukhatme, and Vijay Kumar

Related Paper: Songhao Huang*, Yuwei Wu*, Guangyao Shi, Gaurav S. Sukhatme, and Vijay Kumar. "SPAR: Scalable LLM-based PDDL Domain Generation for Aerial Robotics." arXiv preprint arXiv:2509.13691.

If this repo helps your research, please cite our paper:

@article{huang2025spar,
  title={SPAR: Scalable LLM-based PDDL Domain Generation for Aerial Robotics},
  author={Huang, Songhao and Wu, Yuwei and Shi, Guangyao and Sukhatme, Gaurav S and Kumar, Vijay},
  journal={arXiv preprint arXiv:2509.13691},
  year={2025}
}

Framework

Repository Layout

  • action_gen.py: generate a PDDL domain for one benchmark domain.
  • eval_syntax.py: run domain-generation experiments and aggregate syntax error counts.
  • problem_gen.py: generate PDDL problem files compatible with generated domains.
  • batch_solve.py: solve generated domain/problem pairs with ENHSP.
  • eval/domain_similarity.py: validate plans against generated domains with VAL.
  • pddl_validator.py: syntax and semantic checks used during iterative correction.
  • llm_model.py: model wrapper, embedding lookup, and retrieval utilities.
  • planner/: ENHSP wrapper plus the bundled enhsp-20.jar.
  • prompts/: prompt templates, retrieval assets, and embedding-cache scripts.
  • uav_domain_benchmark/: benchmark and dataset for UAV task-planning domains.

Requirements

  • Python 3.10+
  • Java 17+ available as java, or set JAVA_BIN
  • At least one model API key:
    • OPENAI_API_KEY for OpenAI models
    • DEEPSEEK_API_KEY for DeepSeek models
  • Local sentence-transformer checkpoint for retrieval:
    • local_model/all-mpnet-base-v2

Optional:

  • VAL_BIN for eval/domain_similarity.py
  • local_model/bge-reranker-v2-m3 for regenerating BGE retrieval caches

Install dependencies:

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Set the environment variables you need:

export OPENAI_API_KEY=your_key_here
export DEEPSEEK_API_KEY=your_key_here
export JAVA_BIN=java
export VAL_BIN=/path/to/Validate

Quick Start

1. Generate one domain

Edit the variables in the __main__ block of action_gen.py:

  • _domain_name_str
  • _engine
  • _prompt_method
  • _result_log_dir
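A filled-in __main__ block might look like the following sketch; the variable names come from the list above, but every value is illustrative, not a default shipped with the repo:

```python
# Illustrative settings for the __main__ block of action_gen.py.
# All values below are placeholders to adapt to your setup.
_domain_name_str = "uav_inspection"     # benchmark domain to generate
_engine = "gpt-4o"                      # LLM backend to query
_prompt_method = "spar"                 # prompting strategy to use
_result_log_dir = "results/demo"        # where domains and transcripts are written
```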

Then run:

python action_gen.py

Outputs include the generated domain, intermediate LLM transcripts, extracted predicates/functions, and validation error counts.

2. Run syntax evaluation

eval_syntax.py has two entry points:

  • syntax_eval() to generate domains and log validation errors
  • total_error_count() to aggregate existing results

Before running, edit the module-level settings in eval_syntax.py, especially:

  • engine_list
  • prompt_method_restart_list
  • restart_domain
  • restart_method
  • restart_engine
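As a sketch, the module-level settings might be edited like this; the engine and method names are examples only, not values confirmed by the repository:

```python
# Illustrative module-level configuration for eval_syntax.py.
engine_list = ["gpt-4o", "deepseek-chat"]         # LLM backends to sweep
prompt_method_restart_list = ["baseline", "spar"] # prompting strategies to compare
restart_domain = None   # set to a domain name to resume a partial run
restart_method = None   # set to a prompt method to resume mid-sweep
restart_engine = None   # set to an engine to resume mid-sweep
```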

Then run:

python eval_syntax.py

Results are written under results/<timestamp>/<engine>/<domain>/<prompt_method>/.

3. Generate problems for generated domains

problem_gen.py expects generated domains to already exist under results/. Update the module-level variables near the top of the file:

  • date_str
  • engine
  • gpt_engine
  • restart controls such as restart_domain, restart_method, and restart_problem
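For example, the settings near the top of the file might be filled in as follows (values illustrative; the date string must match a folder produced by an earlier run):

```python
# Illustrative settings near the top of problem_gen.py.
date_str = "2025-01-01"   # timestamp folder from an earlier eval_syntax.py run
engine = "gpt-4o"         # engine whose generated domains to target
gpt_engine = "gpt-4o"     # model used to generate the problem files
restart_domain = None     # resume controls; None starts from the beginning
restart_method = None
restart_problem = None
```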

Then run:

python problem_gen.py

Generated problems are written under:

results/<date_str>/<engine>/<domain>/<prompt_method>/pddl/

4. Solve with ENHSP

Edit the module-level configuration in batch_solve.py:

  • date_str
  • engine
  • prompt_method_restart_list
  • optional restart filters such as restart_domain

Then run:

python batch_solve.py

The script writes .plan files next to generated problems and prints per-method success rates.
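Success rates can also be recomputed offline by counting the emitted plan files; a minimal sketch, assuming only the results layout described above:

```python
from pathlib import Path

def count_plans(results_root: str) -> int:
    """Count ENHSP .plan files anywhere under a results tree."""
    return sum(1 for _ in Path(results_root).rglob("*.plan"))
```

Pointing it at results/&lt;date_str&gt;/&lt;engine&gt; gives the total number of solved problems across all prompt methods.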

5. Validate plans with VAL

eval/domain_similarity.py compares generated plans and generated domains using VAL.

Requirements:

  • source plans in the benchmark domain folders under uav_domain_benchmark/<domain>/pddl/*.plan
  • generated domains and problems under results/
  • VAL_BIN pointing to the Validate executable

Edit the module-level variables in eval/domain_similarity.py, then run:

python eval/domain_similarity.py
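Under the hood, VAL's Validate binary takes a domain, a problem, and a plan file. A hypothetical wrapper showing how a VAL_BIN-style lookup could work (the function names and structure are assumptions for illustration, not the script's actual code):

```python
import os
import subprocess

def val_command(domain: str, problem: str, plan: str) -> list[str]:
    """Build a Validate invocation; VAL_BIN selects the executable."""
    val = os.environ.get("VAL_BIN", "Validate")
    return [val, domain, problem, plan]

def validate(domain: str, problem: str, plan: str) -> bool:
    """Run VAL and report whether the plan was accepted."""
    result = subprocess.run(val_command(domain, problem, plan),
                            capture_output=True, text=True)
    return result.returncode == 0
```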

6. Regenerate retrieval embeddings

If you need to rebuild the retrieval cache, update the options in prompts/save_action_embed.py and run:

python prompts/save_action_embed.py

This requires OPENAI_API_KEY only when the script is configured with use_llm=True.

Acknowledgements

Maintenance

For any technical issues, please contact Yuwei Wu (yuweiwu@seas.upenn.edu, yuweiwu20001@outlook.com).
