R to gain insights about Ranji Players!

R for Ranji…

So, ESPNCricinfo has a wealth of data that we can use. While I was experimenting with data and plotting it in R, I stumbled upon some Ranji Trophy stats (2019–2020). There are many parameters I could have used to analyse the data. However, for simplicity, I focused on understanding how a player gathers runs:

  1. Boundaries — hitting 4s or 6s
  2. Other methods — running between wickets (1s, 2s or 3s)

These Ranji players often become crucial at IPL auctions later in their careers. Some of them are underdogs and dark horses, tucked away from the glory of the Kohlis, Sharmas and Sachins of our country. In the IPL, one of the most crucial, game-changing factors is how fast one can score, whether by running between the wickets or by hitting the ball out of the park. So while the fancy boardrooms discuss which player to take, we may want to look at how a player gathers these gems. Yes, batting average and strike rate are crucial, but let's look at the fundamental method of scoring these runs.

Dataset (Courtesy: ESPNCricinfo)

Here we have the runs scored from 6s and 4s. To actually work with the data, we need to reduce it further, which is how I got this:

The values are in %

The dataset represents the % of runs scored by hitting 4s, 6s or through other avenues. For simplicity, we can assume that 4s and 6s are the fastest ways to score runs. Hence we break it down further into just two categories: boundaries and others.
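As a rough illustration of this reduction, here is a minimal sketch in base R. The data frame, its column names (Runs, Fours, Sixes) and every number in it are made-up placeholders, not the ESPNCricinfo figures; only the arithmetic is the point.

```r
# Hypothetical sample: player names and numbers are placeholders, not real stats
batting <- data.frame(
  Players = c("Player A", "Player B", "Player C"),
  Runs    = c(800, 650, 500),
  Fours   = c(90, 50, 40),
  Sixes   = c(20, 5, 10)
)

# Runs that came from each kind of boundary
batting$RunsFromFours <- 4 * batting$Fours
batting$RunsFromSixes <- 6 * batting$Sixes

# Percentage share of each scoring method in the player's total runs
batting$PctFours <- round(100 * batting$RunsFromFours / batting$Runs, 2)
batting$PctSixes <- round(100 * batting$RunsFromSixes / batting$Runs, 2)
batting$PctOther <- round(100 - batting$PctFours - batting$PctSixes, 2)

# Collapse the three shares into just two categories
batting$Boundaries <- batting$PctFours + batting$PctSixes
batting$Others     <- batting$PctOther

print(batting[, c("Players", "Boundaries", "Others")])
```

Whatever the real numbers are, the two columns for each player should add up to 100.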

The values are in %

Now you may wonder why we took negative values for the “Other” method. Well, it is to help us plot the graph. We want to see the % share of each method side by side, so we let the values run from -X through 0 to +X: the “Others” share extends to the left of zero and the “Boundaries” share to the right.
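The sign flip itself is a one-liner. A minimal sketch, assuming a data frame named cricket with Players, Boundaries and Others columns (the names used in the plotting call in this post) and placeholder percentages rather than the real data:

```r
# Hypothetical percentages; column names follow the plotting call in this post
cricket <- data.frame(
  Players    = c("Player A", "Player B", "Player C"),
  Boundaries = c(60, 35, 44),
  Others     = c(40, 65, 56)
)

# Flip the sign of "Others" so its bars extend to the left of zero
cricket$Others <- -abs(cricket$Others)

# Each player's bar now runs from Others (negative) to Boundaries (positive),
# so the total width is still 100 percentage points
cricket$Span <- cricket$Boundaries - cricket$Others

print(cricket)
```

Using -abs() rather than a plain minus sign keeps the step safe to run twice, since values that are already negative stay negative.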

library(ggplot2)

# Two layers per player: Boundaries (positive x) and Others (negative x, semi-transparent)
ggplot(data = cricket) +
  geom_col(mapping = aes(y = Players, x = Boundaries, fill = Players)) +
  geom_col(mapping = aes(y = Players, x = Others, fill = Players), alpha = 0.5) +
  labs(
    title = "How top 18 Ranji Players score their runs?",
    subtitle = "Others vs Boundaries",
    x = "0 to -60: % runs scored in other ways & 0 to 60: % runs scored in boundaries",
    caption = "Source Data = ESPNCricinfo"
  )

Graph using R for top 18 Ranji players of 2019–20


At a glance, we see that most players gather a higher % of their run share via boundaries.

  1. Players like P Bisht and SN Khan have scored >60% of their runs in boundaries. Their strike rates are 75.60 and 78.64 respectively (in the long format), which is formidable even in comparison to our national team. This means they score runs quickly compared to the rest.
  2. SN Khan ranks 6th when we consider just the total runs scored (RR Dalal is the top scorer of 2019–20). He has also hit the highest number of 6s among the top 20. He makes runs at a fast pace, and his average of 154.66 (the highest of the lot) tells us that he does not get out that often. He is reliable and a power hitter. He is worth placing our bets on.
  3. Players like AV Vasavada, T Kohli and MK Tiwary have gathered most of their runs by running between the wickets. The number of 6s they hit in the season are 4, 8 and 1 respectively.
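For readers less familiar with the two headline numbers quoted in this list: strike rate is runs scored per 100 balls faced, and batting average is runs scored per dismissal. A small sketch of both in R, with helper names of my own choosing and inputs that are purely illustrative, not any player's real figures:

```r
# Strike rate: runs scored per 100 balls faced
strike_rate <- function(runs, balls) {
  100 * runs / balls
}

# Batting average: runs scored per dismissal
# (left as NA when the batter was never dismissed)
batting_average <- function(runs, dismissals) {
  ifelse(dismissals > 0, runs / dismissals, NA_real_)
}

# Illustrative inputs only
example_sr  <- strike_rate(runs = 150, balls = 200)
example_avg <- batting_average(runs = 600, dismissals = 4)

print(example_sr)
print(example_avg)
```

A very high average alongside a modest number of dismissals is exactly the "does not get out that often" pattern described above.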

These statistics may not help us understand the whole picture, but they tell us a little about a player's natural tendency or style of play in the long format. We do understand that most players adjust their style of play and adapt to the given situation. Hence, we cannot rely on just these measures to make our pick. A more holistic analysis using other parameters might give us more insight into a player.


