AuctioNN is a mechanism-design project that explores how neural networks can be leveraged to outperform traditional auctions (like second-price auctions) in the context of online advertising. Our goal is to allocate ad impressions among multiple advertisers to maximize overall value (in terms of conversions, revenue, fairness, etc.).
We use a real-world ad impression dataset (provided by Claritas) with features such as device information, geolocation data, time since last impression, and conversion outcomes. By learning complex relationships in this data, our model aims to deliver better allocation decisions compared to traditional rule-based systems.
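As a toy illustration of how the two data sources combine into supervised training data, the sketch below labels each impression by whether a conversion was observed for it. The column names are invented for the example, not the actual Claritas schema:

```python
# Invented schema: each impression gets a boolean "converted" label.
impressions = [
    {"impression_id": 1, "device": "mobile", "geo": "US-CA"},
    {"impression_id": 2, "device": "desktop", "geo": "US-NY"},
    {"impression_id": 3, "device": "mobile", "geo": "US-TX"},
]
conversions = [{"impression_id": 2}]  # impression 2 led to a conversion

converted_ids = {c["impression_id"] for c in conversions}
labeled = [
    {**imp, "converted": imp["impression_id"] in converted_ids}
    for imp in impressions
]
labels = [row["converted"] for row in labeled]
print(labels)  # [False, True, False]
```

These labels are what the model learns to predict from the impression features.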
AuctioNN provides a command line interface for data wrangling, model training, and allocation simulation. You can view the available commands by running:
```bash
python main.py --help
```

The `preprocess` command cleans the raw dataset and saves the cleaned data to a Parquet file.
```bash
python main.py preprocess --impressions-file data/raw/impressions.parquet \
    --conversions-file data/raw/conversions.parquet \
    --output-file data/processed/clean_data.parquet
```

The `fit-preprocessors` command fits the preprocessors (encoders, scalers) on the training split of the cleaned data and saves them to a directory.
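Concretely, "fitting" here means learning the statistics needed to transform features the same way at training, evaluation, and simulation time. A pure-Python sketch of the idea, assuming invented column names (the real preprocessors in `src/data_processing/preprocess.py` may differ):

```python
# Sketch only: "device" and "hours_since_last_impression" are made-up columns,
# not the actual Claritas schema.
train_rows = [
    {"device": "mobile", "hours_since_last_impression": 2.0},
    {"device": "desktop", "hours_since_last_impression": 6.0},
    {"device": "mobile", "hours_since_last_impression": 4.0},
]

# "Fit" the categorical encoder: assign each category a stable integer id.
categories = sorted({row["device"] for row in train_rows})
device_to_id = {cat: i for i, cat in enumerate(categories)}

# "Fit" the numeric scaler: record the training mean and standard deviation.
values = [row["hours_since_last_impression"] for row in train_rows]
mean = sum(values) / len(values)
std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5

def transform(row: dict) -> tuple:
    """Apply the fitted preprocessors to one row."""
    return (device_to_id[row["device"]],
            (row["hours_since_last_impression"] - mean) / std)

print(transform({"device": "mobile", "hours_since_last_impression": 4.0}))  # (1, 0.0)
```

Saving these fitted objects to `./preprocessors` is what lets later stages apply exactly the same transformation.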
```bash
python main.py fit-preprocessors --cleaned-data-file data/processed/clean_data.parquet \
    --output-dir ./preprocessors
```

The `train` command trains the neural network model on the cleaned data using the fitted preprocessors.
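Part of what the train step does is carve the cleaned data into train/validation/test splits. For example, with `--test-split-ratio 0.15` and `--val-split-ratio 0.15`, roughly 70% of the rows remain for training. A sketch of the arithmetic (the CLI's exact rounding behavior may differ):

```python
def split_sizes(n_rows: int, test_ratio: float, val_ratio: float) -> tuple[int, int, int]:
    """Return (train, val, test) row counts for the given split ratios."""
    n_test = int(n_rows * test_ratio)
    n_val = int(n_rows * val_ratio)
    n_train = n_rows - n_val - n_test  # whatever is left over trains the model
    return n_train, n_val, n_test

print(split_sizes(1000, test_ratio=0.15, val_ratio=0.15))  # (700, 150, 150)
```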
```bash
python main.py train \
    -d data/clean_data.parquet \
    -p ./preprocessors \
    -o ./data/processed_splits \
    -s ./models/best_model.pth \
    --epochs 10 \
    --batch-size 2048 \
    --test-split-ratio 0.15 \
    --val-split-ratio 0.15
```

The `evaluate` command evaluates the model's performance on the test split of the data.
```bash
python main.py evaluate -d ./data/processed_splits -p ./preprocessors -m ./models/best_model.pth
```

The `simulate` command runs the allocation mechanism using the trained model.
```bash
python main.py simulate \
    --data data/clean_data.parquet \
    --model models/best_model.pth \
    --preproc preprocessors/ \
    --out runs/simulation_results.parquet \
    --num-imps 1000 \
    --num-users 100 \
    --device auto
```

The `analyze` command analyzes the results of the allocation mechanism.
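As an illustration of the kind of summary such an analysis can produce, the sketch below computes wins per advertiser and total expected conversions. The column names (`winner`, `p_conversion`) are invented for the example and may not match the real `simulation_results.parquet` schema:

```python
# Hypothetical simulation rows; the real results schema may differ.
results = [
    {"impression_id": 1, "winner": "adv_a", "p_conversion": 0.02},
    {"impression_id": 2, "winner": "adv_b", "p_conversion": 0.05},
    {"impression_id": 3, "winner": "adv_a", "p_conversion": 0.01},
]

# How many impressions each advertiser won.
wins_per_advertiser: dict[str, int] = {}
for row in results:
    wins_per_advertiser[row["winner"]] = wins_per_advertiser.get(row["winner"], 0) + 1

# Sum of predicted conversion probabilities = expected number of conversions.
expected_conversions = sum(row["p_conversion"] for row in results)

print(wins_per_advertiser)                 # {'adv_a': 2, 'adv_b': 1}
print(round(expected_conversions, 2))      # 0.08
```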
```bash
python main.py analyze -d ./runs/simulation_results.parquet
```

The following is a typical directory structure for the AuctioNN project:
```text
├── docs                - LaTeX setup for the project write-up
│   └── README.md
├── main.py             - Main CLI entry point for the project
├── notebooks           - Jupyter notebooks for exploratory data analysis and model development
│   └── README.md
├── requirements.txt    - List of dependencies
├── setup.py            - Setup script for the project (allows statements like `import auctionn.models.neural_net` anywhere in the project)
├── src
│   ├── __init__.py
│   ├── data_processing - Code for processing the dataset
│   │   ├── __init__.py
│   │   └── preprocess.py
│   ├── mechanism       - Code for the mechanism design (e.g. second-price auction)
│   │   ├── __init__.py
│   │   └── allocation.py
│   └── models          - Code for the neural network models
│       ├── __init__.py
│       └── neural_net.py
├── web_app             - Code for the web app (optional: if time allows)
│   └── README.md
├── .gitignore
└── README.md
```
- Clone the repository:
```bash
git clone https://github.com/paramkpr/AuctioNN.git
```

- Set up a virtual environment and install the dependencies:

```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install -e .
```

- Run the main script:

```bash
python main.py
python main.py --help  # to see the available options
```

- IMPORTANT: If you want to run the code in the notebooks, your Jupyter kernel needs to use this virtual environment so that it's the same for all users. Install the kernel in the venv by running the following command:
```bash
python -m ipykernel install --user --name auctionn --display-name "Python (auctionn)"
```

Now when you run Jupyter Notebook, you should see the kernel (or the option to select it) listed as Python (auctionn).
If you're using VSCode, you can select the kernel by going to the Python: Select Interpreter option in the command palette.
Before making any changes, please create a new branch:
```bash
git checkout -b feature/new-feature
```

Make your changes and commit them:

```bash
git add .
git commit -m "Add new feature"
```

Push your changes to the branch:

```bash
git push origin feature/new-feature
```

Then create a pull request on GitHub. You can self-review your own changes, but opening a PR for any change is encouraged so that we all stay on the same page.
- Docstrings: Please use PEP 257 style docstrings for all functions and classes.
- Type hints: Please use type hints for all functions and classes, e.g. `def function_name(param1: int, param2: str) -> bool:`.
- Code readability: Please follow the PEP 8 style guide for Python code.
- Comments: Please add comments to the code to explain "why" behind the code in more complex functions.
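Putting these conventions together (PEP 257 docstring, type hints, a "why" comment), a hypothetical helper might look like this; the function and its threshold are illustrative, not part of the codebase:

```python
def is_high_value(predicted_conversion_rate: float,
                  value_per_conversion: float,
                  threshold: float = 0.01) -> bool:
    """Return True if an impression's expected value exceeds the threshold.

    PEP 257 style: a one-line summary, a blank line, then detail. The
    expected value is the conversion probability times the value of a
    conversion, so a low-probability impression can still be high value
    when the advertiser pays a lot per conversion.
    """
    # Why expected value rather than raw probability: advertisers differ
    # widely in what a single conversion is worth to them.
    return predicted_conversion_rate * value_per_conversion > threshold

print(is_high_value(0.002, 10.0))  # 0.002 * 10.0 = 0.02 > 0.01 -> True
```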
You'll need to set up AWS CLI on your machine and create a profile with access to the Claritas bucket. Credentials are stored in the repository secrets.
- Install AWS CLI:

```bash
brew install awscli
```

- Create a profile:

```bash
aws configure
```

- Commands you'll want:
```bash
aws --profile auctionn s3 ls s3://calpoly-artsai --human-readable --summarize --recursive
aws --profile auctionn s3 ls s3://calpoly-artsai/prediction_allocation_logic/ --human-readable --summarize --recursive
aws --profile auctionn s3 cp s3://calpoly-artsai/prediction_allocation_logic/ ./data --recursive
aws --profile auctionn s3 cp s3://calpoly-artsai/prediction_allocation_logic/snapshot_20250429/ ./data/snapshot_20250429 --recursive
aws --profile auctionn s3 ls s3://calpoly-artsai/prediction_allocation_logic/snapshot_20250429/ --human-readable --summarize --recursive
aws --profile auctionn \
    s3 sync s3://calpoly-artsai/prediction_allocation_logic/snapshot_20250429/ \
    ./data/snapshot_20250429 \
    --exact-timestamps --no-progress
```
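For reference, `aws s3 sync` mirrors the key hierarchy under the source prefix into the destination directory. A small sketch of that path mapping, where the object name `impressions.parquet` is hypothetical:

```python
import posixpath


def local_path_for_key(key: str, prefix: str, dest: str) -> str:
    """Map an S3 object key under `prefix` to its local path under `dest`."""
    relative = key[len(prefix):].lstrip("/")  # key relative to the synced prefix
    return posixpath.join(dest, relative)


print(local_path_for_key(
    "prediction_allocation_logic/snapshot_20250429/impressions.parquet",
    "prediction_allocation_logic/snapshot_20250429/",
    "./data/snapshot_20250429",
))  # ./data/snapshot_20250429/impressions.parquet
```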
We operate in an online setting: impressions arrive as a stream from an ad exchange. Our model represents a marketing company that manages a group of advertisers and places bids on their behalf on the exchange. Alongside the impression stream, we also observe a stream of conversions.
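The allocation decision at each step of the stream can be sketched as a greedy rule: score every advertiser for the incoming impression and bid on behalf of the best one. In this sketch the trained network is replaced by a hard-coded stub, and the advertiser names and numbers are invented:

```python
def score(advertiser: str, impression: dict) -> float:
    """Stub for the trained network: predicted conversion probability
    times the advertiser's value per conversion (all numbers invented)."""
    p_conversion = {"adv_a": 0.01, "adv_b": 0.03}[advertiser]
    value_per_conversion = {"adv_a": 5.0, "adv_b": 2.0}[advertiser]
    return p_conversion * value_per_conversion


def allocate(impression: dict, advertisers: list[str]) -> str:
    """Greedy allocation: bid on behalf of the highest-scoring advertiser."""
    return max(advertisers, key=lambda adv: score(adv, impression))


# Stand-in for the exchange's impression stream.
stream = [{"impression_id": i} for i in range(3)]
winners = [allocate(imp, ["adv_a", "adv_b"]) for imp in stream]
print(winners)  # ['adv_b', 'adv_b', 'adv_b']
```

The real mechanism in `src/mechanism/allocation.py` may weigh fairness and budget constraints as well; this shows only the basic score-and-pick loop.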
```bash
python prepare_inmemory_tensors.py           # build in-memory training tensors
tensorboard --logdir runs/wad --port 6006 &  # launch TensorBoard in the background
python train_wide_deep.py                    # train the wide & deep model
```