Data Analysis and Visualization: IPL
In this article, we will learn to explore data using Python, which will help us understand the data better.
Data Analysis: Data analysis is a process of inspecting, cleansing, transforming and modeling data with the goal of discovering useful information, informing conclusions and supporting decision-making.
For this analysis, I have taken the IPL (Indian Premier League) data set from Kaggle. The data set covers matches played from 2008 to 2019. Let's see what we can find in it.
Data Preparation and Cleaning
The Python libraries I used for data preparation and cleaning are NumPy and pandas.
Reading the data:
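A minimal sketch of the loading step. The file name `matches.csv` and the lower-case column names are assumptions about the Kaggle download, and the two rows are made-up placeholders so the snippet runs on its own.

```python
import io
import pandas as pd

# Tiny inline stand-in for the Kaggle file (made-up rows, not real results).
sample_csv = io.StringIO(
    "id,season,city,date,team1,team2,winner\n"
    "1,2017,Hyderabad,2017-04-05,Team A,Team B,Team A\n"
    "2,2017,Pune,2017-04-06,Team C,Team D,Team D\n"
)

# With the real data this would be pd.read_csv("matches.csv") (file name assumed).
df = pd.read_csv(sample_csv)

print(df.head())  # first few rows
df.info()         # column names, non-null counts and dtypes
```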
Now that we have loaded the data into a data frame, let's take an overview:
This dataset, as we can see, contains 17 columns: Id, Season, City, Date, Team1, Team2, Toss Winner, Toss Decision, Result, DL Applied, Winner, Winner by Runs, Winner by Wickets, Player of Match, Venue, Umpire1 and Umpire2.
When we look at the info of the data frame, we can see that the data type of the date column is listed as object. So let's change that.
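One way to do the conversion is `pd.to_datetime`. This is a sketch on two made-up rows; the column name `date` is an assumption, and if the real file mixes date formats the `format=` or `dayfirst=` arguments may be needed.

```python
import pandas as pd

# Made-up rows with the date stored as plain text.
df = pd.DataFrame({"id": [1, 2], "date": ["2017-04-05", "2017-04-06"]})

# Parse the text column into proper datetime values.
df["date"] = pd.to_datetime(df["date"])

print(df.dtypes)
```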
Let's now see how we can split the date into separate columns for day, month, year and weekday.
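A small sketch of the split using the `.dt` accessor, which is available once the column holds datetime values. The new column names are my own choice for illustration.

```python
import pandas as pd

# Two example dates, already parsed as datetimes.
df = pd.DataFrame({"date": pd.to_datetime(["2017-04-05", "2017-04-08"])})

df["day"] = df["date"].dt.day
df["month"] = df["date"].dt.month
df["year"] = df["date"].dt.year
df["weekday"] = df["date"].dt.day_name()  # e.g. "Wednesday"

print(df)
```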
Dropping columns in the dataset
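Columns that are not needed for the analysis can be removed with `DataFrame.drop`. Which columns to drop is a judgment call; the umpire columns below are only an assumed example, and the rows are placeholders.

```python
import pandas as pd

# Placeholder rows; umpire1/umpire2 stand in for columns we do not need.
df = pd.DataFrame({
    "id": [1, 2],
    "winner": ["Team A", "Team B"],
    "umpire1": ["Umpire X", "Umpire Y"],
    "umpire2": ["Umpire Z", "Umpire W"],
})

# drop returns a new data frame, so assign the result back.
df = df.drop(columns=["umpire1", "umpire2"])

print(df.columns.tolist())
```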
Now, let's look at the dimensions of the dataset.
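The dimensions come from the `shape` attribute, which is a `(rows, columns)` tuple. A quick sketch on a placeholder data frame:

```python
import pandas as pd

# Placeholder data: three matches, two columns.
df = pd.DataFrame({"id": [1, 2, 3], "season": [2008, 2008, 2009]})

n_rows, n_cols = df.shape
print(f"{n_rows} rows x {n_cols} columns")
```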
Now, let's check which teams have played.
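Since a team can appear in either team column, one simple approach is to take the union of both. The column names `team1` and `team2` are assumed, and the team names are placeholders.

```python
import pandas as pd

# Placeholder fixtures.
df = pd.DataFrame({
    "team1": ["Team A", "Team C"],
    "team2": ["Team B", "Team A"],
})

# Union of both columns gives every distinct team exactly once.
teams = sorted(set(df["team1"]) | set(df["team2"]))
print(teams)
```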
Here, in the team columns, we can see that Pune's team appears under three names and Delhi's team under two. So we will replace those with one proper name each.
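A sketch of the renaming with `DataFrame.replace` and a mapping dictionary. The exact spellings and the names chosen to keep are assumptions for illustration, and the rows are made up; only the team-name strings matter here.

```python
import pandas as pd

# Made-up rows; only the team-name spellings matter here.
df = pd.DataFrame({
    "team1": ["Rising Pune Supergiant", "Delhi Daredevils", "Pune Warriors"],
    "team2": ["Mumbai Indians", "Rising Pune Supergiants", "Delhi Capitals"],
    "winner": ["Rising Pune Supergiant", "Delhi Daredevils", "Delhi Capitals"],
})

# Assumed mapping: old spelling -> single name to keep.
name_map = {
    "Rising Pune Supergiant": "Rising Pune Supergiants",
    "Pune Warriors": "Rising Pune Supergiants",
    "Delhi Daredevils": "Delhi Capitals",
}

# Apply the same mapping to every column that holds a team name
# (the full data would also include the toss-winner column).
team_cols = ["team1", "team2", "winner"]
df[team_cols] = df[team_cols].replace(name_map)

remaining = sorted(set(df[team_cols].to_numpy().ravel()))
print(remaining)
```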
So now we have replaced the team names with proper names.
Let's Visualize Our Data Set
First, we will import the necessary libraries for visualization.
Now let's see:
Which team has won the most matches from 2008 to 2019?
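Counting wins per team is a one-liner with `value_counts`, which also sorts the result from most to fewest. The `winner` column name is assumed and the rows are placeholders.

```python
import pandas as pd

# Placeholder results.
df = pd.DataFrame({
    "winner": ["Team A", "Team B", "Team A", "Team C", "Team A", "Team B"],
})

wins = df["winner"].value_counts()  # sorted, most wins first
print(wins)
print("Most wins:", wins.index[0])
```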
Plotting these values
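A minimal bar chart sketch. I am assuming matplotlib as the plotting library here, and the data is again a placeholder. In a notebook the figure renders inline; in a script, call `plt.show()` at the end.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder results.
df = pd.DataFrame({
    "winner": ["Team A", "Team B", "Team A", "Team C", "Team A", "Team B"],
})
wins = df["winner"].value_counts()

fig, ax = plt.subplots(figsize=(8, 4))
wins.plot.bar(ax=ax)  # one bar per team, tallest first
ax.set_xlabel("Team")
ax.set_ylabel("Matches won")
ax.set_title("Matches won per team")
fig.tight_layout()
```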
Now let's see how it looks in a pie chart.
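The same counts can be drawn as a pie chart with `Axes.pie`. This is a sketch assuming matplotlib and placeholder data; `autopct` prints each slice's share as a percentage.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder results.
df = pd.DataFrame({
    "winner": ["Team A", "Team B", "Team A", "Team C", "Team A", "Team B"],
})
wins = df["winner"].value_counts()

fig, ax = plt.subplots(figsize=(6, 6))
ax.pie(wins.values, labels=wins.index.tolist(), autopct="%1.1f%%")
ax.set_title("Share of matches won")
```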
Number of matches played in each IPL season
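One way to get the number of matches per season is to group by season and count the match ids. The column names `season` and `id` are assumed, and the rows are placeholders.

```python
import pandas as pd

# Placeholder data: six matches over three seasons.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "season": [2008, 2008, 2008, 2009, 2009, 2010],
})

# groupby keeps the seasons in ascending order.
matches_per_season = df.groupby("season")["id"].count()
print(matches_per_season)
```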
Toss decisions
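A sketch of two views of the toss decisions: the overall counts, and a season-by-decision table built with `pd.crosstab`. The column name `toss_decision` and its values `field` and `bat` are assumptions, and the rows are made up.

```python
import pandas as pd

# Made-up toss decisions.
df = pd.DataFrame({
    "season": [2008, 2008, 2009, 2009, 2009],
    "toss_decision": ["field", "bat", "field", "field", "bat"],
})

overall = df["toss_decision"].value_counts()
by_season = pd.crosstab(df["season"], df["toss_decision"])

print(overall)
print(by_season)
```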
Now let's explore more.
Number of matches played in each city
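The per-city count follows the same pattern. The `city` column name is assumed, and the counts below come from made-up rows, not from the real data.

```python
import pandas as pd

# Made-up rows; the counts are illustrative only.
df = pd.DataFrame({
    "city": ["Mumbai", "Kolkata", "Mumbai", "Delhi", "Mumbai", "Kolkata"],
})

city_counts = df["city"].value_counts()
print(city_counts)
```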
Who has been awarded Player of the Match the most times?
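A sketch that counts the awards per player and picks the leader with `idxmax`. The column name `player_of_match` is assumed and the player names are placeholders.

```python
import pandas as pd

# Placeholder award winners.
df = pd.DataFrame({
    "player_of_match": ["Player A", "Player B", "Player A", "Player C", "Player A"],
})

awards = df["player_of_match"].value_counts()
top_player = awards.idxmax()  # label with the highest count

print(awards)
print("Most awards:", top_player, "with", awards.max())
```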
So let's plot the top five players.
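Because `value_counts` sorts from highest to lowest, `head(5)` gives the top five. This is a sketch assuming matplotlib for the plot and placeholder player names.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder awards: six players with 6, 5, 4, 3, 2 and 1 awards.
names = (["Player A"] * 6 + ["Player B"] * 5 + ["Player C"] * 4
         + ["Player D"] * 3 + ["Player E"] * 2 + ["Player F"])
df = pd.DataFrame({"player_of_match": names})

top_five = df["player_of_match"].value_counts().head(5)

fig, ax = plt.subplots(figsize=(8, 4))
top_five.plot.bar(ax=ax)
ax.set_xlabel("Player")
ax.set_ylabel("Player of the Match awards")
fig.tight_layout()
```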
How many matches ended normally and how many ended in a tie?
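A sketch of the count by result type. The column name `result` and the labels `normal`, `tie` and `no result` are assumptions, and the rows are made up. Passing `normalize=True` gives shares instead of counts.

```python
import pandas as pd

# Made-up results.
df = pd.DataFrame({
    "result": ["normal", "normal", "tie", "normal", "no result"],
})

result_counts = df["result"].value_counts()
result_share = df["result"].value_counts(normalize=True)

print(result_counts)
print(result_share)
```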
Which venue has hosted the most matches?
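The busiest venue can be read off the venue counts. A sketch with an assumed `venue` column and placeholder venue names:

```python
import pandas as pd

# Placeholder venues.
df = pd.DataFrame({
    "venue": ["Venue A", "Venue B", "Venue A", "Venue C", "Venue A", "Venue B"],
})

venue_counts = df["venue"].value_counts()
busiest = venue_counts.idxmax()

print(venue_counts)
print("Most matches hosted at:", busiest)
```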
With that, I have come to the end of the data analysis and visualization of the IPL dataset. I hope you found this blog interesting and that it gives you an idea of how data analysis works.