Build machine learning models using natural language.
Quickstart | Features | Installation | Documentation
plexe lets you create machine learning models by describing them in plain language. Simply explain what you want, provide a dataset, and the AI-powered system builds a fully functional model through an automated agentic approach. Also available as a managed cloud service.
```shell
pip install plexe

export OPENAI_API_KEY=<your-key>
export ANTHROPIC_API_KEY=<your-key>
```

Provide a tabular dataset (Parquet, CSV, ORC, or Avro) and a natural language intent:
```shell
python -m plexe.main \
  --train-dataset-uri data.parquet \
  --intent "predict whether a passenger was transported" \
  --max-iterations 5
```

Or, equivalently, from Python:

```python
from plexe.main import main
from pathlib import Path

best_solution, metrics, report = main(
    intent="predict whether a passenger was transported",
    data_refs=["train.parquet"],
    max_iterations=5,
    work_dir=Path("./workdir"),
)
print(f"Performance: {best_solution.performance:.4f}")
```

The system uses 14 specialized AI agents across a 6-phase workflow to:
- Analyze your data and identify the ML task
- Select the right evaluation metric
- Search for the best model through hypothesis-driven iteration
- Evaluate model performance and robustness
- Package the model for deployment
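The "hypothesis-driven iteration" in the workflow above can be pictured as a propose-evaluate-keep-best loop. The sketch below is purely illustrative of that idea, not plexe's actual internals: the two stand-in functions replace the LLM agents and the training step.

```python
import random

def propose_hypothesis(history):
    """Stand-in for the agent that proposes the next model configuration."""
    return {"model_type": random.choice(["xgboost", "catboost"]),
            "max_depth": random.randint(3, 8)}

def evaluate(config):
    """Stand-in for training and scoring; returns a validation metric."""
    return random.random()

def search(max_iterations=5):
    best_config, best_score, history = None, float("-inf"), []
    for _ in range(max_iterations):
        config = propose_hypothesis(history)   # propose a hypothesis
        score = evaluate(config)               # test it empirically
        history.append((config, score))        # feed results back in
        if score > best_score:                 # keep the best solution
            best_config, best_score = config, score
    return best_config, best_score

best_config, best_score = search()
print(best_config)
```

In plexe, the proposal step is informed by the full experiment history, which is what makes the search hypothesis-driven rather than random.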
Build complete models with a single call. Plexe supports XGBoost, CatBoost, and Keras for tabular data:

```python
best_solution, metrics, report = main(
    intent="predict house prices based on property features",
    data_refs=["housing.parquet"],
    max_iterations=10,                # Search iterations
    allowed_model_types=["xgboost"],  # Or let plexe choose
    enable_final_evaluation=True,     # Evaluate on a held-out test set
)
```

Run `python -m plexe.main --help` for all CLI options.
The output is a self-contained model package at `work_dir/model/` (also archived as `model.tar.gz`).
The package has no dependency on plexe: build the model with plexe, then deploy it anywhere.
```
model/
├── artifacts/    # Trained model + feature pipeline (pickle)
├── src/          # Inference predictor, pipeline code, training template
├── schemas/      # Input/output JSON schemas
├── config/       # Hyperparameters
├── evaluation/   # Metrics and detailed analysis reports
├── model.yaml    # Model metadata
└── README.md     # Usage instructions with example code
```
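Because the package is pickled artifacts plus plain source files, consuming it needs only the standard library and the model's own runtime dependencies. The sketch below shows the loading pattern; the artifact file names (`model.pkl`, `input.json`) are assumptions, and the package's own README.md documents the real entry points. To keep the sketch runnable end-to-end, it first writes a toy package standing in for `work_dir/model/`.

```python
import json
import pickle
import tempfile
from pathlib import Path

class ToyModel:
    """Toy stand-in for a trained model artifact."""
    def predict(self, rows):
        return [0 for _ in rows]

# --- Write a toy package mimicking the work_dir/model/ layout ---
model_dir = Path(tempfile.mkdtemp()) / "model"
(model_dir / "artifacts").mkdir(parents=True)
(model_dir / "schemas").mkdir()
(model_dir / "artifacts" / "model.pkl").write_bytes(pickle.dumps(ToyModel()))
(model_dir / "schemas" / "input.json").write_text(
    json.dumps({"type": "object", "properties": {"age": {"type": "number"}}})
)

# --- The consuming side: unpickle the artifact, read the input schema ---
with open(model_dir / "artifacts" / "model.pkl", "rb") as f:
    model = pickle.load(f)
with open(model_dir / "schemas" / "input.json") as f:
    input_schema = json.load(f)

print(model.predict([{"age": 30.0}]))    # -> [0]
print(list(input_schema["properties"]))  # -> ['age']
```

The input/output schemas let a serving layer validate requests before they reach the model, without importing any plexe code.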
Run plexe with everything pre-configured — PySpark, Java, and all dependencies included.
A Makefile is provided for common workflows:

```shell
make build        # Build the Docker image
make test-quick   # Fast sanity check (~1 iteration)
make run-titanic  # Run on the Spaceship Titanic dataset
```

Or run directly:
```shell
docker run --rm \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \
  -v $(pwd)/data:/data -v $(pwd)/workdir:/workdir \
  plexe:py3.12 python -m plexe.main \
  --train-dataset-uri /data/dataset.parquet \
  --intent "predict customer churn" \
  --work-dir /workdir \
  --spark-mode local
```

A `config.yaml` in the project root is automatically mounted. A Databricks Connect image is also available: `docker build --target databricks .`
Customize LLM routing, search parameters, Spark settings, and more via a config file:
```yaml
# config.yaml
max_search_iterations: 5
allowed_model_types: [xgboost, catboost]
spark_driver_memory: "4g"
hypothesiser_llm: "openai/gpt-5-mini"
feature_processor_llm: "anthropic/claude-sonnet-4-5-20250929"
```

Point plexe at the file via the `CONFIG_FILE` environment variable:

```shell
CONFIG_FILE=config.yaml python -m plexe.main ...
```

See `config.yaml.template` for all available options.
Plexe uses LLMs via LiteLLM, so you can use any supported provider:
```yaml
# Route different agents to different providers
hypothesiser_llm: "openai/gpt-5-mini"
feature_processor_llm: "anthropic/claude-sonnet-4-5-20250929"
model_definer_llm: "ollama/llama3"
```

> **Note**
> Plexe should work with most LiteLLM providers, but we actively test only with `openai/*` and `anthropic/*` models. If you encounter issues with other providers, please let us know.
Visualize experiment results, search trees, and evaluation reports with the built-in Streamlit dashboard:
```shell
python -m plexe.viz --work-dir ./workdir
```

Connect plexe to custom storage, tracking, and deployment infrastructure via the `WorkflowIntegration` interface:

```python
main(intent="...", data_refs=[...], integration=MyCustomIntegration())
```

See `plexe/integrations/base.py` for the full interface.
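An integration is a class whose hooks the workflow calls at key points. The sketch below is illustrative only: the hook names are assumptions (consult `plexe/integrations/base.py` for the real interface), and a stand-in base class is defined so the sketch runs without plexe installed.

```python
class WorkflowIntegration:
    """Stand-in for plexe's base class; hook names here are ASSUMPTIONS."""
    def on_iteration_end(self, iteration, score): ...
    def on_model_packaged(self, package_path): ...

class LoggingIntegration(WorkflowIntegration):
    """Records search progress and the final package location."""

    def __init__(self):
        self.events = []

    def on_iteration_end(self, iteration, score):
        # e.g. push the score to an experiment tracker
        self.events.append(("iteration", iteration, score))

    def on_model_packaged(self, package_path):
        # e.g. upload the archive to object storage
        self.events.append(("packaged", package_path))

# Exercise the hooks the way the workflow would call them.
integration = LoggingIntegration()
integration.on_iteration_end(1, 0.82)
integration.on_model_packaged("workdir/model.tar.gz")
print(integration.events)
```

The same pattern lets you swap in MLflow tracking, S3 uploads, or a deployment trigger without touching plexe's workflow code.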
```shell
pip install plexe           # Core (XGBoost, CatBoost, Keras, scikit-learn)
pip install plexe[pyspark]  # + Local PySpark execution
pip install plexe[aws]      # + S3 storage support (boto3)
```

Requires Python >= 3.10, < 3.13.

```shell
export OPENAI_API_KEY=<your-key>
export ANTHROPIC_API_KEY=<your-key>
```

See LiteLLM providers for all supported providers.
For full documentation, visit docs.plexe.ai.
See CONTRIBUTING.md for guidelines. Join our Discord to connect with the team.
If you use Plexe in your research, please cite it as follows:
```bibtex
@software{plexe2025,
  author       = {De Bernardi, Marcello and Dubey, Vaibhav},
  title        = {Plexe: Build machine learning models using natural language.},
  year         = {2025},
  publisher    = {GitHub},
  howpublished = {\url{https://github.com/plexe-ai/plexe}},
}
```