Welcome to my portfolio repository for the Applied Data Science with Python course on Coursera (University of Michigan). This repository showcases my progress, hands-on assignments, and the practical data science skills I've developed using Python, Pandas, and NumPy.
Click the image above to view my official verified certificate.
- Data Manipulation & Cleaning: Expert manipulation of DataFrames and Series using pandas.
- Numerical Computing: Efficient array operations and mathematical computations with numpy.
- Text Processing: Pattern matching and text extraction using Regular Expressions (regex).
- Data Aggregation: Advanced grouping, merging, and pivot tables.
- Statistical Analysis: Basic statistical testing and hypothesis testing.
- Topics: Lambda functions, List Comprehensions, Regular Expressions (Regex), and an introduction to numpy.
- Highlights: Numpy_ed.ipynb, Regex_ed.ipynb, assignment1.ipynb
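As a quick illustration of how the topics above fit together, here is a minimal sketch that combines a regular expression, a list comprehension, a lambda, and a NumPy array. The sample text and variable names are hypothetical and are not taken from the course notebooks.

```python
import re

import numpy as np

# Hypothetical sample text, not taken from the course notebooks.
text = "Alice scored 91, Bob scored 78, and Carol scored 85."

# Regex: capture each (name, score) pair as a tuple of strings.
pairs = re.findall(r"([A-Z][a-z]+) scored (\d+)", text)

# List comprehension: convert the captured score strings to integers.
scores = [int(score) for _, score in pairs]

# Lambda: sort the pairs by numeric score, highest first.
ranked = sorted(pairs, key=lambda pair: int(pair[1]), reverse=True)

# NumPy: vectorized mean and a boolean mask over the scores.
arr = np.array(scores)
mean_score = arr.mean()
above_mean = arr[arr > mean_score]

print(ranked)
print(above_mean)
```

The boolean mask `arr > mean_score` is evaluated element-wise, which is the main idiom that separates NumPy code from plain Python loops.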
- Topics: Introduction to pandas, Series and DataFrame data structures, querying and indexing DataFrames, handling missing values.
- Highlights: DataFrameManipulation_ed.ipynb, assignment2.ipynb (Analyzing census and Olympics data).
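The querying, indexing, and missing-value topics above can be sketched on a small made-up table. The column names and values below are illustrative assumptions and are unrelated to the census or Olympics datasets used in the assignment.

```python
import numpy as np
import pandas as pd

# Hypothetical data; column names and values are illustrative only.
df = pd.DataFrame(
    {
        "country": ["A", "B", "C", "D"],
        "gold": [10, np.nan, 3, 0],
        "population_m": [50.0, 12.5, np.nan, 8.0],
    }
).set_index("country")

# Boolean-mask query: rows with at least one gold medal.
# Comparisons against NaN are False, so country B is excluded.
winners = df[df["gold"] > 0]

# Label-based indexing with .loc: a single cell by row and column label.
gold_a = df.loc["A", "gold"]

# Missing values: count them per column, then fill one column
# and drop any rows that still contain gaps.
missing_per_column = df.isna().sum()
cleaned = df.fillna({"gold": 0}).dropna()

print(winners)
print(cleaned)
```

Filling `gold` with 0 while dropping rows with a missing `population_m` is only one possible policy; the right choice depends on what a missing value means in the real dataset.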
- Topics: Merging DataFrames, GroupBy idioms, Pivot Tables, Date Functionality, and Scales.
- Highlights: MergingDataFrame_ed.ipynb, GroupBy_ed.ipynb, assignment3.ipynb (Processing and merging World Bank datasets).
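To make the merge, GroupBy, and pivot-table topics above concrete, here is a small sketch on two invented tables. The table names, columns, and figures are assumptions for illustration and do not come from the World Bank datasets.

```python
import pandas as pd

# Hypothetical tables; names and numbers are made up for illustration.
gdp = pd.DataFrame(
    {
        "country": ["A", "A", "B", "B"],
        "year": [2019, 2020, 2019, 2020],
        "gdp": [100.0, 110.0, 50.0, 45.0],
    }
)
regions = pd.DataFrame(
    {"country": ["A", "B", "C"], "region": ["North", "South", "South"]}
)

# Merge: an inner join keeps only countries present in both tables,
# so country C (no GDP rows) does not appear in the result.
merged = gdp.merge(regions, on="country", how="inner")

# GroupBy: mean GDP per region.
mean_by_region = merged.groupby("region")["gdp"].mean()

# Pivot table: countries as rows, years as columns, GDP as values.
pivot = merged.pivot_table(
    index="country", columns="year", values="gdp", aggfunc="sum"
)

print(mean_by_region)
print(pivot)
```

Switching `how="inner"` to `how="outer"` would keep country C with missing GDP values, which is often useful for spotting gaps before aggregating.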
- Topics: Distributions, Hypothesis Testing, and T-tests.
- Highlights: BasicStatisticalTesting.ipynb, assignment4.ipynb (Analyzing sports data: MLB, NBA, NFL, NHL, and Wikipedia).
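A hypothesis test of the kind listed above can be sketched with SciPy on synthetic data. The two groups, their parameters, and the 0.05 threshold are assumptions chosen for illustration; this is not the analysis from the assignment.

```python
import numpy as np
from scipy import stats

# Synthetic data, seeded so the run is reproducible.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=50.0, scale=5.0, size=200)
group_b = rng.normal(loc=55.0, scale=5.0, size=200)

# Independent two-sample t-test; equal_var=False selects Welch's variant,
# which does not assume the two groups share the same variance.
result = stats.ttest_ind(group_a, group_b, equal_var=False)

alpha = 0.05  # conventional significance level, an assumption here
reject_null = result.pvalue < alpha

print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3g}")
print("Reject the null hypothesis of equal means:", reject_null)
```

Because the groups are drawn with clearly different means, the test rejects the null hypothesis here; with real data the p-value should be read alongside the effect size and sample size.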
- Language: Python 3
- Libraries: Pandas, NumPy, SciPy
- Environment: Jupyter Notebooks
To explore the notebooks in this repository:
- Clone the repo:
git clone https://github.com/SsemuliJoseph/Applied-Data-Science-Python-Uni-Michigan.git
- Install the required dependencies:
pip install pandas numpy scipy jupyterlab
- Launch JupyterLab:
jupyter lab
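After installing, one optional way to confirm the environment is ready is to import each library and print its version. This is a small sketch, not part of the repository; run it in a notebook cell or a Python shell.

```python
import importlib

# Import each required library and record its installed version.
# An ImportError here means the corresponding package is not installed.
versions = {
    name: importlib.import_module(name).__version__
    for name in ("pandas", "numpy", "scipy")
}

for name, version in versions.items():
    print(f"{name}: {version}")
```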
Feel free to explore the code! Connect with me on LinkedIn.