Code for "DIFER: Differentiable Automated Feature Engineering"
Accepted at the 1st Conference on Automated Machine Learning.
The code is implemented in PyTorch and has been tested in the environment specified in requirements.txt.
- data: 23 of the 25 datasets (the medium-sized ones that can be pushed to git) and their meta information.
- NFS_sklearn_c: the open-source implementation of "Neural Feature Search: A Neural Architecture for Automated Feature Engineering".
- autolearn: the core code for DIFER, located in autolearn/feat_selection/nfo. It contains the feature optimizer in controller.py, the feature space in search_space.py, the end-to-end training process in iter_train.py, and the three forms of a feature (i.e., the original form, the parse tree, and the traversal string) in feat_tree.py.
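To make the three forms of a feature more concrete, here is a minimal sketch of how one feature could be held as a parse tree and flattened into a traversal string. It is not the code in feat_tree.py: the names `FeatNode`, `to_postfix`, `from_postfix`, and `evaluate`, the operator tables, and the choice of a post-order traversal are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical operator tables; the operators DIFER actually uses may differ.
UNARY = {"log": lambda a: math.log(abs(a) + 1.0)}
BINARY = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


@dataclass
class FeatNode:
    """Parse-tree node: a raw column name (leaf) or an operator with children."""
    name: str
    left: Optional["FeatNode"] = None
    right: Optional["FeatNode"] = None

    def to_postfix(self) -> List[str]:
        """Post-order traversal: children first, then the operator."""
        if self.left is None:
            return [self.name]
        tokens = self.left.to_postfix()
        if self.right is not None:
            tokens = tokens + self.right.to_postfix()
        return tokens + [self.name]

    def evaluate(self, row: Dict[str, float]) -> float:
        """Compute the feature value for one row of raw column values."""
        if self.left is None:
            return row[self.name]
        a = self.left.evaluate(row)
        if self.right is None:
            return UNARY[self.name](a)
        return BINARY[self.name](a, self.right.evaluate(row))


def from_postfix(tokens: List[str]) -> FeatNode:
    """Rebuild the parse tree from a post-order token list using a stack."""
    stack: List[FeatNode] = []
    for tok in tokens:
        if tok in BINARY:
            right = stack.pop()
            left = stack.pop()
            stack.append(FeatNode(tok, left, right))
        elif tok in UNARY:
            stack.append(FeatNode(tok, stack.pop()))
        else:
            stack.append(FeatNode(tok))  # raw column name
    if len(stack) != 1:
        raise ValueError("malformed traversal string")
    return stack[0]


# Original form: mul(add(x1, x2), log(x3))
tree = FeatNode(
    "mul",
    FeatNode("add", FeatNode("x1"), FeatNode("x2")),
    FeatNode("log", FeatNode("x3")),
)
traversal = " ".join(tree.to_postfix())
value = tree.evaluate({"x1": 1.0, "x2": 2.0, "x3": 0.0})
```

In this sketch the traversal string is a flat token sequence that a sequence model can consume, and `from_postfix` recovers the same tree from it, so the two forms carry the same information.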
We provide script files for convenience in conducting experiments.
- run_iter.sh: after specifying the dataset and the CUDA device, you can run DIFER to automate feature engineering for Random Forest.
- run_rq3.sh: the script for RQ3 in the paper.
- run_rq4_*.sh: the scripts for the different machine learning algorithms in RQ4 of the paper.