COVID-19 Dataset Analysis Project (Using Kaggle Dataset)
(Python · Pandas · Matplotlib)
1. Project Title
COVID-19 Data Analysis Using Python (Exploratory Data Analysis on Global COVID Statistics)
2. Dataset Source
Download from Kaggle:
“Novel Corona Virus 2019 Dataset” or
“COVID-19 World Vaccination Progress”
Search on Kaggle → Download CSV.
You will mainly use these files from the “Novel Corona Virus 2019 Dataset”:
- covid_19_data.csv
- time_series_covid_19_confirmed.csv
- time_series_covid_19_deaths.csv
- time_series_covid_19_recovered.csv
3. Project Objectives
✔ Understand global COVID-19 spread through data
✔ Identify most affected countries
✔ Visualize daily vs cumulative cases
✔ Analyze death & recovery trends
✔ Explore correlation between features
✔ Perform country-wise time-series analysis
4. Python Libraries Required
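The project header names Pandas and Matplotlib as the libraries used; both can be installed with `pip install pandas matplotlib`. As a small sketch, the check below reports whether they are importable in the current environment. The `required` list is an assumption based on that header, so extend it if you add other libraries.

```python
import importlib.util

# Import names of the libraries this project relies on (assumed from the header).
required = ["pandas", "matplotlib"]

# find_spec returns None when a top-level package cannot be found.
missing = [name for name in required if importlib.util.find_spec(name) is None]

if missing:
    print("Missing libraries:", ", ".join(missing))
else:
    print("All required libraries are installed.")
```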
5. Import Libraries
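A minimal import cell for this project might look like the following. The aliases `pd` and `plt` are the usual conventions, and the display option is only an optional convenience, not something the analysis depends on.

```python
import pandas as pd                # tabular data loading and analysis
import matplotlib.pyplot as plt    # plotting

# Optional display tweak (an illustrative choice): show every column when printing.
pd.set_option("display.max_columns", None)

print("pandas", pd.__version__)
```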
6. Load Dataset
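Loading comes down to one `pd.read_csv` call. The sketch below wraps it in a hypothetical helper, `load_covid_data`, and runs it on a tiny in-memory sample so that it works without the download. The sample rows are made-up stand-ins, and the column names are an assumption about the layout of covid_19_data.csv; check them against your copy of the file.

```python
import io
import pandas as pd

# Illustrative stand-in rows (NOT real figures) in the column layout that
# covid_19_data.csv is assumed to use.
SAMPLE_CSV = """SNo,ObservationDate,Province/State,Country/Region,Last Update,Confirmed,Deaths,Recovered
1,01/22/2020,Anhui,Mainland China,1/22/2020 17:00,3.0,0.0,0.0
2,01/22/2020,,Japan,1/22/2020 17:00,2.0,0.0,0.0
3,01/23/2020,Anhui,Mainland China,1/23/2020 17:00,7.0,0.0,1.0
4,01/23/2020,,Japan,1/23/2020 17:00,4.0,0.0,0.0
"""


def load_covid_data(source):
    """Read the CSV from a file path or file-like object into a DataFrame."""
    return pd.read_csv(source)


# With the real download, use: df = load_covid_data("covid_19_data.csv")
df = load_covid_data(io.StringIO(SAMPLE_CSV))
print(df.head())
```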
7. Basic Data Information
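A first look at the data usually covers its size, column types, summary statistics, and missing values. The sketch below runs those checks on a small made-up sample; the column names are assumed to match covid_19_data.csv, and the values are placeholders rather than real figures.

```python
import io
import pandas as pd

# Illustrative stand-in rows (NOT real figures); column names are assumed.
SAMPLE_CSV = """SNo,ObservationDate,Province/State,Country/Region,Last Update,Confirmed,Deaths,Recovered
1,01/22/2020,Anhui,Mainland China,1/22/2020 17:00,3.0,0.0,0.0
2,01/22/2020,,Japan,1/22/2020 17:00,2.0,0.0,0.0
3,01/23/2020,Anhui,Mainland China,1/23/2020 17:00,7.0,0.0,1.0
4,01/23/2020,,Japan,1/23/2020 17:00,4.0,0.0,0.0
"""
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

print(df.shape)        # (rows, columns)
print(df.head())       # first rows
df.info()              # dtypes and non-null counts, printed directly
print(df.describe())   # summary statistics for numeric columns

# Count of missing values in each column.
missing_per_column = df.isna().sum()
print(missing_per_column)
```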
8. Data Cleaning
Rename messy column names:
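Column names containing slashes and spaces are awkward to type, so one option is to map them to short names with `DataFrame.rename`. The raw names below are assumed to match covid_19_data.csv, and the new names in `rename_map` are only a suggestion.

```python
import pandas as pd

# One illustrative row (NOT real figures) with the assumed raw column names.
df = pd.DataFrame(
    {
        "SNo": [1],
        "ObservationDate": ["01/22/2020"],
        "Province/State": ["Anhui"],
        "Country/Region": ["Mainland China"],
        "Last Update": ["1/22/2020 17:00"],
        "Confirmed": [3.0],
        "Deaths": [0.0],
        "Recovered": [0.0],
    }
)

# New names are a suggestion; any consistent, space-free scheme works.
rename_map = {
    "ObservationDate": "Date",
    "Province/State": "State",
    "Country/Region": "Country",
    "Last Update": "Last_Update",
}
df = df.rename(columns=rename_map)
print(df.columns.tolist())
```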
Convert date:
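Dates arrive as text, so they need converting before any time-series work. This sketch assumes the date column has already been renamed to `Date` and uses the `MM/DD/YYYY` layout; if your file differs, adjust the `format` string.

```python
import pandas as pd

# Illustrative rows (NOT real figures); "Date" is the assumed renamed column.
df = pd.DataFrame(
    {
        "Date": ["01/22/2020", "01/23/2020", "02/01/2020"],
        "Confirmed": [3.0, 7.0, 20.0],
    }
)

# An explicit format avoids day/month ambiguity and fails loudly on bad rows.
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

print(df.dtypes)
print(df["Date"].min(), "to", df["Date"].max())
```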
Fill missing values:
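One simple policy, shown here as a sketch, is to label a missing state as "Unknown" and treat missing counts as zero. Filling counts with 0 is an assumption that suits summing; it can understate totals if the gaps are really unreported data. The rows are made-up placeholders.

```python
import pandas as pd

# Illustrative rows (NOT real figures) with gaps in both text and numeric columns.
df = pd.DataFrame(
    {
        "State": ["Anhui", None, None],
        "Country": ["Mainland China", "Japan", "Country X"],
        "Confirmed": [9.0, 2.0, float("nan")],
        "Deaths": [0.0, float("nan"), 0.0],
        "Recovered": [1.0, 0.0, float("nan")],
    }
)

# Missing state names get a placeholder label.
df["State"] = df["State"].fillna("Unknown")

# Missing counts are treated as zero (an assumption, see the note above).
count_cols = ["Confirmed", "Deaths", "Recovered"]
df[count_cols] = df[count_cols].fillna(0)

print(df.isna().sum())
```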
9. Global Numbers Overview
Total Confirmed, Deaths, Recovered
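If the counts in the file are cumulative, as they are assumed to be here, summing every row would count each day again and again. A safer sketch is to keep only the latest date and sum that snapshot. Country names and numbers below are placeholders.

```python
import pandas as pd

# Illustrative cumulative counts (NOT real figures) with placeholder countries.
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2020-03-01", "2020-03-01", "2020-03-02", "2020-03-02"]
        ),
        "Country": ["Country A", "Country B", "Country A", "Country B"],
        "Confirmed": [100, 50, 150, 80],
        "Deaths": [2, 1, 4, 2],
        "Recovered": [10, 5, 30, 20],
    }
)

# Keep only the most recent snapshot, then add up across all regions.
latest = df[df["Date"] == df["Date"].max()]
totals = latest[["Confirmed", "Deaths", "Recovered"]].sum()
print(totals)
```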
10. Country-wise Summary
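A country-wise summary can be sketched as a `groupby` over the latest snapshot, which also folds state-level rows into one row per country. The cleaned column names (`Date`, `Country`, `State`) and all values are assumptions for illustration.

```python
import pandas as pd

# Illustrative cumulative counts (NOT real figures); Country A is split by state.
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2020-03-01"] * 3 + ["2020-03-02"] * 3),
        "Country": ["Country A", "Country A", "Country B"] * 2,
        "State": ["North", "South", "Unknown"] * 2,
        "Confirmed": [60, 40, 200, 90, 60, 260],
        "Deaths": [1, 1, 5, 2, 2, 8],
        "Recovered": [5, 5, 20, 20, 10, 60],
    }
)

# Latest snapshot only, since the counts are assumed to be cumulative.
latest = df[df["Date"] == df["Date"].max()]

# One row per country, ordered from most to fewest confirmed cases.
country_summary = (
    latest.groupby("Country")[["Confirmed", "Deaths", "Recovered"]]
    .sum()
    .sort_values("Confirmed", ascending=False)
)
print(country_summary)
```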
11. Plot: Top 10 Countries by Confirmed Cases
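Given a per-country table of totals, the ten largest can be picked with `nlargest` and drawn as a bar chart. The sketch below uses twelve placeholder countries with made-up totals; colors and figure size are arbitrary choices.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Illustrative totals (NOT real figures) for twelve placeholder countries.
country_summary = pd.DataFrame(
    {"Confirmed": [1200, 950, 870, 640, 610, 500, 430, 390, 300, 250, 120, 90]},
    index=[f"Country {chr(65 + i)}" for i in range(12)],
)

# Ten countries with the highest confirmed totals, largest first.
top10 = country_summary["Confirmed"].nlargest(10)

fig, ax = plt.subplots(figsize=(10, 5))
ax.bar(top10.index, top10.values, color="tab:red")
ax.set_title("Top 10 Countries by Confirmed Cases")
ax.set_xlabel("Country")
ax.set_ylabel("Confirmed cases")
ax.tick_params(axis="x", rotation=45)
fig.tight_layout()
# plt.show()  # uncomment when running interactively
```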
12. Death Rate and Recovery Rate
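Both rates can be taken as a percentage of confirmed cases: death rate = Deaths / Confirmed × 100 and recovery rate = Recovered / Confirmed × 100. This definition is an assumption, since the post does not spell one out. The sketch also guards against division by zero for countries with no confirmed cases; all values are placeholders.

```python
import pandas as pd

# Illustrative per-country totals (NOT real figures).
country_summary = pd.DataFrame(
    {
        "Confirmed": [1000, 500, 0],
        "Deaths": [20, 5, 0],
        "Recovered": [800, 450, 0],
    },
    index=["Country A", "Country B", "Country C"],
)

# Replace zero denominators with NaN so the rate is undefined instead of infinite.
confirmed = country_summary["Confirmed"].replace(0, float("nan"))

country_summary["Death_Rate"] = country_summary["Deaths"] / confirmed * 100
country_summary["Recovery_Rate"] = country_summary["Recovered"] / confirmed * 100

print(country_summary.round(2))
```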
13. Heatmap (Correlation)
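A correlation heatmap can be drawn with Matplotlib alone by passing the output of `DataFrame.corr()` to `imshow`. This is a sketch of one way to do it, not necessarily the plotting call the original post used; the numbers are placeholders.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Illustrative values (NOT real figures) for the three numeric features.
df = pd.DataFrame(
    {
        "Confirmed": [100, 250, 400, 800, 1500],
        "Deaths": [2, 6, 9, 20, 35],
        "Recovered": [10, 60, 150, 500, 1100],
    }
)

# Pairwise Pearson correlation between the columns.
corr = df.corr()

fig, ax = plt.subplots(figsize=(5, 4))
im = ax.imshow(corr.values, cmap="coolwarm", vmin=-1, vmax=1)

# Label both axes with the feature names.
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns)
ax.set_yticks(range(len(corr.index)))
ax.set_yticklabels(corr.index)

# Write each coefficient inside its cell.
for i in range(corr.shape[0]):
    for j in range(corr.shape[1]):
        ax.text(j, i, f"{corr.iloc[i, j]:.2f}", ha="center", va="center")

fig.colorbar(im, ax=ax, label="Pearson correlation")
ax.set_title("Correlation between features")
fig.tight_layout()
# plt.show()  # uncomment when running interactively
```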
14. Time Series Analysis (Global Trend)
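For the global trend, summing confirmed cases per date gives the cumulative curve, and the day-to-day difference gives new cases. The sketch below assumes cumulative counts and cleaned column names, and uses placeholder data.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Illustrative cumulative counts (NOT real figures) for two placeholder countries.
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2020-03-01", "2020-03-01", "2020-03-02",
             "2020-03-02", "2020-03-03", "2020-03-03"]
        ),
        "Country": ["Country A", "Country B"] * 3,
        "Confirmed": [6, 4, 15, 10, 25, 20],
    }
)

# Global cumulative total per day.
global_cumulative = df.groupby("Date")["Confirmed"].sum()

# Daily new cases: difference between consecutive days. The first day has no
# predecessor, so it falls back to that day's cumulative value.
daily_new = global_cumulative.diff().fillna(global_cumulative)

fig, (ax_cum, ax_new) = plt.subplots(2, 1, sharex=True, figsize=(10, 6))
ax_cum.plot(global_cumulative.index, global_cumulative.values, marker="o")
ax_cum.set_ylabel("Cumulative confirmed")
ax_new.bar(daily_new.index, daily_new.values)
ax_new.set_ylabel("Daily new confirmed")
fig.autofmt_xdate()
# plt.show()  # uncomment when running interactively
```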
15. Country-Specific Analysis (e.g., India)
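A single country can be isolated by filtering on its name and then summing its state-level rows per date. `country_series` is a hypothetical helper written for this sketch, and the rows labeled India hold placeholder numbers, not real counts.

```python
import pandas as pd
import matplotlib.pyplot as plt


def country_series(df, country):
    """Return cumulative Confirmed/Deaths/Recovered per date for one country."""
    subset = df[df["Country"] == country]
    return (
        subset.groupby("Date")[["Confirmed", "Deaths", "Recovered"]]
        .sum()
        .sort_index()
    )


# Illustrative cumulative counts (NOT real figures).
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2021-04-01", "2021-04-01", "2021-04-02",
             "2021-04-02", "2021-04-01", "2021-04-02"]
        ),
        "Country": ["India", "India", "India", "India", "Country B", "Country B"],
        "State": ["State 1", "State 2", "State 1", "State 2", "Unknown", "Unknown"],
        "Confirmed": [100, 50, 130, 70, 500, 510],
        "Deaths": [1, 0, 2, 1, 9, 9],
        "Recovered": [40, 20, 50, 30, 300, 310],
    }
)

india = country_series(df, "India")

fig, ax = plt.subplots(figsize=(10, 5))
for column in india.columns:
    ax.plot(india.index, india[column], marker="o", label=column)
ax.set_title("COVID-19 trend: India")
ax.set_ylabel("Cumulative count")
ax.legend()
fig.autofmt_xdate()
# plt.show()  # uncomment when running interactively
```

Passing a different name to `country_series` gives the same view for any other country in the file.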
16. Key Findings (Write in Report)
✔ The USA, India, and Brazil recorded the most confirmed cases
✔ Confirmed cases grow roughly exponentially in the early months
✔ Confirmed cases and deaths are strongly positively correlated
✔ Death rate varies by country (possibly reflecting health system differences)
✔ Recovery rate increased after the vaccination rollout began
✔ India shows rapid case growth during its second wave