TejasAnalyst/cleanframe

cleanframe-data 🪄

License: MIT

Automated Data Cleaning for Faster Analytics

cleanframe-data is a lightweight Python library that automates dataset diagnostics and cleaning. It helps data analysts, data scientists, and beginners clean messy datasets: handle missing values, drop low-quality columns, and cap outliers, all in one line of code.


🚀 Features

  • One-Line Auto-Clean: Drop duplicates, remove low-quality columns, impute missing values, and handle outliers instantly using cf.auto_clean(df).
  • Advanced Outlier Handling: Automatically detects and caps extreme numerical outliers using the Interquartile Range (IQR) method.
  • Smart Column Dropping: Automatically drops any column whose percentage of missing values exceeds a threshold you define.
  • Dataset Diagnostics: Get a quick, comprehensive report of data types, missing values, and percentages.
  • Modern Pandas Ready: Built from the ground up for modern Pandas (2.0+) and its Copy-on-Write behavior, so cleaning runs without spurious warnings.
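
To make the features above concrete, here is a minimal plain-pandas sketch of the kind of steps they describe: a diagnostics report, duplicate removal, threshold-based column dropping, imputation, and IQR capping. This is not cleanframe-data's implementation. The function names (`diagnose`, `cap_outliers_iqr`, `auto_clean_sketch`), the 0.5 missing-data threshold, the 1.5 IQR multiplier, and the median/mode imputation are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd


def diagnose(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical report: dtype, missing count, and missing percentage per column."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "missing_pct": (df.isna().mean() * 100).round(2),
    })


def cap_outliers_iqr(s: pd.Series, k: float = 1.5) -> pd.Series:
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] to those bounds."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return s.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)


def auto_clean_sketch(df: pd.DataFrame,
                      missing_threshold: float = 0.5,
                      k: float = 1.5) -> pd.DataFrame:
    """Illustrative pipeline only; the defaults here are assumptions."""
    out = df.drop_duplicates()
    # Keep only columns whose missing fraction is at or below the threshold.
    out = out.loc[:, out.isna().mean() <= missing_threshold].copy()
    numeric_cols = out.select_dtypes(include="number").columns
    for col in out.columns:
        if col in numeric_cols:
            # Assumed strategy: median imputation, then IQR capping.
            out[col] = out[col].fillna(out[col].median())
            out[col] = cap_outliers_iqr(out[col], k)
        else:
            # Assumed strategy: fill with the most frequent value, if any.
            mode = out[col].mode(dropna=True)
            if not mode.empty:
                out[col] = out[col].fillna(mode.iloc[0])
    return out.reset_index(drop=True)


df = pd.DataFrame({
    "age": [25, 30, 30, np.nan, 28, 27, 400],
    "city": ["Pune", "Delhi", "Delhi", "Pune", None, "Pune", "Pune"],
    "notes": [None, None, None, None, None, None, "x"],
})

report = diagnose(df)
cleaned = auto_clean_sketch(df)
print(report)
print(cleaned)
```

In this toy frame, the duplicate row is removed, the mostly empty `notes` column is dropped, the missing `age` and `city` values are filled, and the extreme `age` value is pulled back to the upper IQR bound. With the library itself, the README's `cf.auto_clean(df)` call is the intended one-line entry point.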

🛠️ Installation

You can install the official stable release directly from PyPI:

pip install cleanframe-data
