Toss To Win

T20 is a dynamic format where every decision has an instant and apparent outcome. Every ball, every run and every toss matters. We often talk about how, in some stadiums, winning the toss has a HUGE impact on the outcome of the game. What if we look at some statistics?

Here we have IPL data with some interesting parameters that could help us learn a thing or two about the impact of the toss on match outcomes. The data covers 2008–2019 for the different stadiums that hosted IPL games. However, I have picked out only the stadiums that hosted at least 40 games. Here are the cities and stadium names:

  1. M Chinnaswamy Stadium, Bengaluru
  2. Punjab Cricket Association IS Bindra Stadium, Mohali, Chandigarh
  3. M. A. Chidambaram Stadium (aka Chepauk), Chennai
  4. Arun Jaitley Stadium (aka Feroz Shah Kotla), Delhi
  5. Rajiv Gandhi International Stadium, Hyderabad
  6. The Sawai Mansingh Stadium, Jaipur
  7. Eden Gardens, Kolkata
  8. Wankhede Stadium, Mumbai
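
The "at least 40 games" cut can be sketched in a few lines of base R. This is only an illustration, not the exact steps used for the article: the toy `matches` table, its made-up rows and the `min_games` name are all assumptions, and the threshold is lowered to 2 so that the tiny example has something to filter.

```r
# Toy stand-in for the Kaggle matches table; the rows are made up.
matches <- data.frame(
  city = c("Chennai", "Chennai", "Chennai", "Pune", "Mumbai", "Mumbai"),
  stringsAsFactors = FALSE
)

# Hypothetical threshold: the article uses 40, the toy data uses 2.
min_games <- 2

# Count the games per city and keep the cities that meet the threshold.
games_per_city <- table(matches$city)
busy_cities <- names(games_per_city[games_per_city >= min_games])

# Keep only the matches played in those cities.
filtered <- matches[matches$city %in% busy_cities, , drop = FALSE]
```

With the real data, `min_games` would be set to 40 and the count would be taken over the full match list.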

First things first — sort the data!

A peek at the original data set (Source:Kaggle)

To figure out the number of matches won by the team that also won the toss, we need to filter the data a little. We go city-wise first…

Modifying dataset

Then we filter it based on “toss_decision.” The dataset does not give us the outcome right away, so for each match we work out what the result was after the toss winner chose to bat or field. I added a column, “Win(1)/Loss(0),” which records whether that choice led to a win or a loss: “1” indicates a win and “0” a loss.
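
A minimal sketch of how such a win/loss flag could be derived in R. It assumes the table has `toss_winner` and `winner` columns alongside `toss_decision` (an assumption about the Kaggle file, not something shown above); the team names and rows are placeholders, and `toss_win` is a hypothetical, syntax-friendly name for the “Win(1)/Loss(0)” column.

```r
# Toy rows with placeholder team names; not real match results.
matches <- data.frame(
  city          = c("Chennai", "Chennai", "Mumbai", "Mumbai"),
  toss_winner   = c("Team A", "Team B", "Team C", "Team D"),
  toss_decision = c("bat", "field", "field", "bat"),
  winner        = c("Team A", "Team A", "Team C", "Team E"),
  stringsAsFactors = FALSE
)

# 1 if the team that won the toss also won the match, 0 otherwise.
matches$toss_win <- as.integer(matches$toss_winner == matches$winner)

# Filter by toss decision, as described in the text.
chose_bat   <- matches[matches$toss_decision == "bat", ]
chose_field <- matches[matches$toss_decision == "field", ]
```

Matches with no recorded winner would come out as `NA` here and would need to be dropped or handled separately.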

This will further help me figure out the following —

For those who hate numbers, here is a simple explanation of what this data means. For each stadium, we are trying to find the percentage of matches in which the team that won the toss and chose to bat (or field) went on to win.
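
In code, that percentage is just the mean of the 0/1 flag within each city and toss decision, multiplied by 100. The sketch below uses base R on made-up rows; `flagged`, `toss_win` and the city labels "X" and "Y" are hypothetical names chosen for illustration.

```r
# Toy flagged data: one row per match, toss_win is the 0/1 flag.
flagged <- data.frame(
  city          = c("X", "X", "X", "X", "Y", "Y", "Y"),
  toss_decision = c("bat", "bat", "field", "field", "bat", "field", "field"),
  toss_win      = c(1, 0, 1, 1, 0, 1, 0),
  stringsAsFactors = FALSE
)

# Rows are cities, columns are toss decisions, cells are win percentages.
win_pct <- tapply(
  flagged$toss_win,
  list(city = flagged$city, decision = flagged$toss_decision),
  function(x) 100 * mean(x)
)
```

For city "X" in this toy data, one of the two "bat" matches was won, so `win_pct["X", "bat"]` is 50. The result has one row per city and one column per decision, which is the wide shape that gets pivoted before plotting.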

Here is the final dataset that we will plot using R —

Final Dataset (values in %)
# load the tidyverse (tidyr for pivot_longer, ggplot2 for the plot)
library(tidyverse)

# pivoting the table
finaldata <- TossWin %>%
  pivot_longer(!city, names_to = "WinMethod", values_to = "WinPercentage")

# making the graph using ggplot2
ggplot(finaldata) +
  geom_col(aes(city, WinPercentage, fill = WinMethod), position = "dodge") +
  labs(title = "Percentage of Matches Won",
       subtitle = "Selecting: Batting vs Fielding",
       x = "City", y = "Percentage(%)", caption = "MK") +
  # dashed reference line at 50%
  geom_hline(aes(yintercept = 50), color = "#D6AA98", lty = 2)
Pivoted table
Graph depicting match results for different stadiums


We can look at each of these cities and compare the data against its pitch conditions to see if there is a pattern. In most cases, a captain has to rely on pitch reports (and several other parameters). Historical data like this can also help with such decisions.

  1. Out of all the stadiums, Chennai clearly stands out for wins when batting first. We can make sense of this by looking at the kind of pitch Chepauk has. The pitch is known to be hard and dry, with almost no grass. Over the course of a match, it starts to wear and a lot of footmarks are left on it, so teams batting second often struggle. When playing at Chennai, it is ideal to bat first and post a solid total. The historical data shows one of the lowest win percentages, 38%, for teams that opted to bowl first, as the pitch slows down and chasing under pressure becomes a mammoth task.
  2. Another interesting pick is Hyderabad. There is a huge disparity between the win percentages for the two choices. The pitch at Rajiv Gandhi Stadium is somewhat similar to Chepauk's, but the ground is much smaller. According to experts, a huge total can be set and still be chased down quite comfortably. However, this depends heavily on the “dew factor.” While the historical data suggests that bowling first is favourable, it might not always be the best choice. Hence, weather conditions and reports for the day would give us a more holistic picture.
  3. Now, let's look at a case where there is only a slight difference between the win percentages for bowling and batting first. The pitch at Wankhede (Mumbai) is made up of red clay, which makes the surface slightly hard. This gives good bounce for batsmen to play their shots. However, it also gives medium-paced and fast bowlers an advantage. It is reported that the impact of batting or bowling first is not huge, and the game can go either way. This is consistent with our findings, where the win percentage is 53% when fielding first and 49% when batting first. There is not much difference.

This initial analysis may not give the big picture. Several other factors come into play when choosing to bat or bowl first. Pitch conditions, historical data and weather conditions are some of the primary ones. A captain also has to evaluate the strengths of the team and its mindset when chasing versus defending.

It would be interesting to look at these variables together and build a model that gives a reasonably accurate prediction of the match, but for now, this is all we've got, folks!


