masood2004/data_profiler

Data Profiler

Data Profiler is a web-based tool for uploading datasets and generating visualizations and data reports. It provides an interface for visualizing data, computing summary statistics, and exporting the results to Excel. It is designed to help data analysts and scientists explore and understand their datasets quickly.

Features

  • User authentication (mock authentication in this version)
  • Data upload functionality for CSV or Excel files
  • Statistical analysis of the dataset (mean, median, standard deviation, etc.)
  • Visualizations:
    • Scatter plots, bar charts, and box plots
    • Correlation heatmaps
    • Histograms
  • Export of data insights to Excel format using xlsxwriter
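As an illustration of the statistical analysis and correlation features listed above, the following is a minimal sketch using pandas. The function name `summarize_numeric` and the sample columns are hypothetical and are not taken from the project's source; the actual implementation in `data_profiler.py` may differ.

```python
import pandas as pd


def summarize_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Return mean, median, and standard deviation for each numeric column.

    Hypothetical helper for illustration only; not the project's own code.
    """
    # Keep only numeric columns; text columns such as labels are skipped.
    numeric = df.select_dtypes(include="number")
    # agg() yields one row per statistic; transpose so each row is a column.
    return numeric.agg(["mean", "median", "std"]).T


# Small made-up dataset standing in for an uploaded file.
df = pd.DataFrame(
    {
        "height": [1.0, 2.0, 3.0, 4.0],
        "weight": [10.0, 20.0, 30.0, 40.0],
        "label": ["a", "b", "c", "d"],
    }
)

summary = summarize_numeric(df)

# Pairwise Pearson correlations, the kind of matrix a heatmap would display.
correlation = df.select_dtypes(include="number").corr()

print(summary)
print(correlation)
```

Note that pandas computes the sample standard deviation (`ddof=1`) by default, so results can differ slightly from tools that use the population formula.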

Installation

  1. Clone the repository:
git clone https://github.com/masood2004/data_profiler.git
cd data_profiler
  2. Install the required dependencies:
pip install -r requirements.txt
  3. Run the app:
streamlit run data_profiler.py

Usage

  1. Start the application by running the command mentioned above.

  2. Log in using one of the mock credentials:

    • Username: user1, Password: password1

    • Username: user2, Password: password2

  3. Upload a CSV or Excel file containing your dataset.

  4. Explore the statistical summary and visualizations generated by the tool.

  5. Download the report as an Excel file.
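The mock login in step 2 could be sketched roughly as follows. This is an assumption about how such a check might look, not the project's actual code: the names `MOCK_USERS` and `check_login` are hypothetical, and only the two credential pairs come from this README.

```python
import hmac

# Mock credentials as listed in the Usage section. Hard-coded plain-text
# passwords are acceptable only for a demo, never for real authentication.
MOCK_USERS = {
    "user1": "password1",
    "user2": "password2",
}


def check_login(username: str, password: str) -> bool:
    """Return True if the username/password pair matches a mock account.

    Hypothetical helper for illustration only.
    """
    expected = MOCK_USERS.get(username)
    if expected is None:
        # Unknown username.
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected.encode(), password.encode())


print(check_login("user1", "password1"))
print(check_login("user1", "wrong"))
```

In a Streamlit app, a check like this would typically gate the upload and reporting views until it returns `True`.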

Dependencies

• Streamlit

• Pandas

• Matplotlib

• Seaborn

• xlsxwriter (automatically installed if not present)

Contributing

Contributions are welcome! Please fork the repository and create a pull request with your changes. Be sure to follow the standard code style and include tests for new features.

License

This project is licensed under the MIT License.

About

A Python project for automated data profiling, providing insights into datasets through summary statistics, data quality checks, and visualization.
