Automated Data Cleaning for Faster Analytics
cleanframe-data is a lightweight, fast, and intuitive Python library designed to automate dataset diagnostics and cleaning. It helps data analysts, scientists, and beginners clean messy datasets, handle missing values, drop low-quality columns, and cap outliers in just one line of code.
- One-Line Auto-Clean: Drop duplicates, remove low-quality columns, impute missing values, and handle outliers instantly using cf.auto_clean(df).
- Advanced Outlier Handling: Automatically detects and caps extreme numerical outliers using the Interquartile Range (IQR) method.
- Smart Column Dropping: Drops columns automatically if their missing data percentage crosses your defined threshold.
- Dataset Diagnostics: Get a quick, comprehensive report of data types, missing values, and percentages.
- Modern Pandas Ready: Built from the ground up for pandas 2.0+ and its Copy-on-Write behavior, so cleaning runs without the associated warnings.
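
To make the column-dropping and IQR capping features above more concrete, here is a minimal sketch of both techniques in plain pandas. It is not cleanframe-data's actual implementation: the helper names `drop_sparse_columns` and `cap_outliers_iqr`, the 0.5 missing-data threshold, and the 1.5 IQR multiplier are all assumptions chosen for illustration (1.5 is simply the conventional multiplier for IQR fences).

```python
import pandas as pd


def drop_sparse_columns(df: pd.DataFrame, threshold: float = 0.5) -> pd.DataFrame:
    """Hypothetical helper: drop columns whose missing fraction exceeds `threshold`."""
    missing_fraction = df.isna().mean()  # per-column share of missing values
    keep = missing_fraction[missing_fraction <= threshold].index
    return df.loc[:, keep].copy()


def cap_outliers_iqr(df: pd.DataFrame, factor: float = 1.5) -> pd.DataFrame:
    """Hypothetical helper: clip numeric columns to [Q1 - factor*IQR, Q3 + factor*IQR]."""
    out = df.copy()  # work on a copy so the caller's frame is untouched
    for col in out.select_dtypes(include="number").columns:
        q1 = out[col].quantile(0.25)
        q3 = out[col].quantile(0.75)
        iqr = q3 - q1
        out[col] = out[col].clip(lower=q1 - factor * iqr, upper=q3 + factor * iqr)
    return out


# Illustrative data: one extreme value in "age", and a mostly empty "notes" column.
raw = pd.DataFrame({
    "age": [25, 27, 26, 28, 300],
    "notes": [None, None, None, "ok", None],
})
cleaned = cap_outliers_iqr(drop_sparse_columns(raw, threshold=0.5))
print(cleaned)
```

In this sketch the "notes" column is dropped because 80% of its values are missing, and the extreme "age" value is capped at the upper IQR fence rather than removed, so the row count stays the same.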
You can install the official stable release directly from PyPI:
pip install cleanframe-data