Data Profiler is a web-based tool, built with Streamlit, for uploading datasets and generating visualizations and statistical reports. It provides an interface for exploring data, computing summary statistics, and exporting the results to Excel. It is intended to help data analysts and data scientists explore and understand a dataset quickly.
- User authentication (mock authentication in this version)
- Data upload functionality for CSV or Excel files
- Statistical analysis of the dataset (mean, median, standard deviation, etc.)
- Visualizations:
  - Scatter plots, bar charts, and box plots
  - Correlation heatmaps
  - Histograms
- Export of data insights to Excel format using xlsxwriter
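The statistics and export features listed above can be sketched with pandas. This is a minimal illustration, not the project's actual code: the function names `summarize`, `correlations`, and `to_excel_bytes` are hypothetical, the sample data is made up, and restricting the analysis to numeric columns is an assumption.

```python
import io

import pandas as pd


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Return mean, median, and standard deviation for each numeric column."""
    numeric = df.select_dtypes(include="number")
    # agg() yields one row per statistic; transpose so each row is a column of df.
    return numeric.agg(["mean", "median", "std"]).T


def correlations(df: pd.DataFrame) -> pd.DataFrame:
    """Return the Pearson correlation matrix that a heatmap would display."""
    return df.select_dtypes(include="number").corr()


def to_excel_bytes(summary: pd.DataFrame) -> bytes:
    """Write the summary to an in-memory .xlsx file using the xlsxwriter engine."""
    buffer = io.BytesIO()
    with pd.ExcelWriter(buffer, engine="xlsxwriter") as writer:
        summary.to_excel(writer, sheet_name="Summary")
    return buffer.getvalue()


# Small made-up dataset, for illustration only.
sample = pd.DataFrame(
    {
        "height": [1.0, 2.0, 3.0, 4.0],
        "weight": [10.0, 20.0, 30.0, 40.0],
        "label": list("wxyz"),  # non-numeric column, skipped by summarize()
    }
)
summary = summarize(sample)
corr = correlations(sample)
```

In a Streamlit app, the bytes returned by `to_excel_bytes` could be handed to `st.download_button` to offer the report as a download; whether the project does it this way is not stated in the source.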
- Clone the repository:
    git clone https://github.com/masood2004/data_profiler.git
    cd data_profiler
- Install the required dependencies:
    pip install -r requirements.txt
- Run the app:
    streamlit run data_profiler.py

- Start the application by running the command above.
- Log in using one of the mock credentials:
  • Username: user1, Password: password1
  • Username: user2, Password: password2
- Upload a CSV or Excel file containing your dataset.
- Explore the statistical summary and visualizations generated by the tool.
- Download the report as an Excel file.
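The mock login step can be pictured as a lookup against a hard-coded table. The sketch below is an assumption about how such a check might look, not the project's implementation: `MOCK_USERS` and `check_credentials` are hypothetical names, and only the two credential pairs come from the steps above.

```python
import hmac

# Mock credentials from the login step; hard-coded passwords are for demos only.
MOCK_USERS = {"user1": "password1", "user2": "password2"}


def check_credentials(username: str, password: str) -> bool:
    """Return True only if the username exists and the password matches."""
    expected = MOCK_USERS.get(username)
    if expected is None:
        return False
    # compare_digest avoids leaking match position through timing differences.
    return hmac.compare_digest(expected.encode("utf-8"), password.encode("utf-8"))
```

A real deployment would replace the table with hashed passwords or an external identity provider.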
• Streamlit
• Pandas
• Matplotlib
• Seaborn
• xlsxwriter (automatically installed if not present)
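The source does not say how xlsxwriter is installed automatically. One common pattern, shown here purely as an assumption, is to check whether a module is importable and call pip for anything missing. The helper names `missing_packages` and `ensure_installed` are hypothetical, and the sketch assumes each package's import name matches its pip name.

```python
import importlib.util
import subprocess
import sys


def missing_packages(names):
    """Return the subset of top-level module names that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def ensure_installed(names):
    """Install any missing packages with pip and return the list that was missing."""
    missing = missing_packages(names)
    if missing:
        # Use the current interpreter's pip so packages land in the active environment.
        subprocess.check_call([sys.executable, "-m", "pip", "install", *missing])
    return missing
```

Calling `ensure_installed(["xlsxwriter"])` at startup would then be a no-op whenever the package is already present.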
Contributions are welcome! Please fork the repository and create a pull request with your changes. Be sure to follow the standard code style and include tests for new features.
This project is licensed under the MIT License.