Why India’s Got Talent Season 2 got it wrong

IGT Khoj 2
So, India’s Got Talent Season 2 got it wrong. The best team did not win. What happened? The rules said that the judges’ scores and the public votes would get equal (50%) weightage. However, it’s apparent to any Tom, Dick or Harry that the final winners were influenced much more by the votes, and to a lesser extent by the judges’ scores.

I did not find any official details on the mathematics behind the final result. However, my guess is that votes were converted into scores (using percentiles, for example) and averaged with the judges’ scores. This would yield a result similar to what we got, because the judges’ scores were in the range 23 to 30, whereas prorated vote scores would span roughly 0 to 30 (they might not start at zero, but the point is that they would start very low). So, while the worst performer as per the judges got a score of 23, the contestant with the fewest votes got a very low voting score, say 10.

I have written about normalization before. I believe this technique would have yielded fairer results. Normalization brings each set of values to a common mean and a common standard deviation. Below is a comparison; the vote counts for each contestant are my guesses. First, prorating:

Group               Judges score   Votes    Voting score   Voting score (pro rata)   Final score
Teji Toko           23             100000   0.2786         30                        26.5
Choir               27             80000    0.2228         24                        25.5
Fictitious          30             60000    0.1671         18                        24
Bir Khalsa          23             70000    0.195          21                        22
Sanjay Mandal       29             15000    0.0418         4.5                       16.75
Diwakar (Acroduo)   30             10000    0.0279         3                         16.5
Manas Kumar Sahu    28             9000     0.0251         2.7                       15.35
Underground Auth    26             8000     0.0223         2.4                       14.2
Haridass            26             7000     0.0195         2.1                       14.05
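The prorated figures above can be reproduced with a short Python sketch. It assumes, as the table's numbers suggest, that the top vote-getter is given the full 30 points, everyone else is scaled in proportion, and the final score is the plain average of the two. The vote counts are the same guesses as in the table; the names `MAX_SCORE` and `prorated_final` are made up for illustration.

```python
# Guessed figures from the table: group -> (judges' score, votes).
contestants = {
    "Teji Toko": (23, 100000),
    "Choir": (27, 80000),
    "Fictitious": (30, 60000),
    "Bir Khalsa": (23, 70000),
    "Sanjay Mandal": (29, 15000),
    "Diwakar (Acroduo)": (30, 10000),
    "Manas Kumar Sahu": (28, 9000),
    "Underground Auth": (26, 8000),
    "Haridass": (26, 7000),
}

MAX_SCORE = 30  # assumption: the top vote-getter receives the full 30 points
top_votes = max(votes for _, votes in contestants.values())

def prorated_final(judges, votes):
    """Average the judges' score with a vote score scaled to the top vote count."""
    vote_score = MAX_SCORE * votes / top_votes
    return (judges + vote_score) / 2

finals = {name: prorated_final(j, v) for name, (j, v) in contestants.items()}
ranking = sorted(finals, key=finals.get, reverse=True)

for name in ranking:
    print(f"{name:20s} {finals[name]:.2f}")
```

Because the vote scores stretch from 2.1 to 30 while the judges' scores only cover 23 to 30, the ranking is driven almost entirely by votes.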

 

Next, with the use of normalization:

Group               Judges score   Votes    Voting score   Normalized judges score   Normalized voting score   Final score
Fictitious          30             60000    0.167          27.916667                 26.35                     27.133
Choir               27             80000    0.223          25.104167                 27.692                    26.398
Diwakar (Acroduo)   30             10000    0.028          27.916667                 22.994                    25.455
Teji Toko           23             100000   0.279          21.354167                 29.035                    25.194
Sanjay Mandal       29             15000    0.042          26.979167                 23.329                    25.154
Manas Kumar Sahu    28             9000     0.025          26.041667                 22.927                    24.484
Bir Khalsa          23             70000    0.195          21.354167                 27.021                    24.188
Underground Auth    26             8000     0.022          24.166667                 22.86                     23.513
Haridass            26             7000     0.019          24.166667                 22.793                    23.48
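Here is a hedged Python sketch of the normalized calculation. The post does not state the target mean and standard deviation; the values 25 and 2.5 below are an assumption, inferred because they reproduce the normalized columns when the sample standard deviation is used. The vote counts remain guesses.

```python
from statistics import mean, stdev

# Guessed figures from the table: group -> (judges' score, votes).
contestants = {
    "Teji Toko": (23, 100000),
    "Choir": (27, 80000),
    "Fictitious": (30, 60000),
    "Bir Khalsa": (23, 70000),
    "Sanjay Mandal": (29, 15000),
    "Diwakar (Acroduo)": (30, 10000),
    "Manas Kumar Sahu": (28, 9000),
    "Underground Auth": (26, 8000),
    "Haridass": (26, 7000),
}

# Assumption: targets inferred from the table, not stated in the post.
TARGET_MEAN, TARGET_SD = 25.0, 2.5

def normalize(values, target_mean=TARGET_MEAN, target_sd=TARGET_SD):
    """Rescale values to the target mean and (sample) standard deviation."""
    m, s = mean(values), stdev(values)
    return [target_mean + target_sd * (v - m) / s for v in values]

names = list(contestants)
nor_judges = normalize([contestants[n][0] for n in names])
nor_votes = normalize([contestants[n][1] for n in names])

# Both series now share a mean and spread, so a 50/50 average is a fair mix.
finals = {n: (j + v) / 2 for n, j, v in zip(names, nor_judges, nor_votes)}
ranking = sorted(finals, key=finals.get, reverse=True)

for name in ranking:
    print(f"{name:20s} {finals[name]:.3f}")
```

Once both columns have the same spread, neither the judges nor the voters can dominate simply by using a wider range of numbers.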

 

Several improvements can be seen: Fictitious has deservedly moved up to first place, Teji Toko has moved down, Acroduo has moved up, and so on. Note again that the vote counts are just a guess (an educated one, keeping relative popularity in mind). The point, however, is that with normalization added to the calculation, the same number of votes can produce a fairer result.


Number crunching through clustering

Scatter

Separating your data into buckets is useful in a lot of problems, especially fraud detection. How do you mathematically ‘cluster’ your data? One statistical way is k-means clustering.

Without delving into too much statistics, here is a spreadsheet you can use to do this for your own data.

This sheet accepts pairs of values for two variables, for example age versus the number of sick leaves taken by each person in a group over a year. It then sorts this data into buckets, the number of buckets being specified by you. Once the sheet gives you the bucket classification, you can analyse it for problems. You should see the following cases as worthy of further attention:

  1. too many datapoints falling in a single group
  2. only one or two datapoints in a single group
  3. any point that does not belong to the group it’s in (this is only
    possible if the data has a subjective background)

This method can be used to sample data for further analysis wherever there is simply too much data to analyse. In our example it can be used to isolate people who may be feigning sickness to take leave. In a test for people with different levels of capability it can be used to grade scores, and so on. It may also be used to solve the needle-in-a-haystack problem.
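For readers who prefer code to a spreadsheet, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is not the spreadsheet's implementation, just an illustration of the same idea. The starting centroids are picked at random from the data, and the `staff` pairs are invented for the example.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Plain Lloyd's algorithm on (x, y) pairs; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # start from k of the data points
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins the bucket with the nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2)
            for x, y in points
        ]
        if new_labels == labels:        # nothing moved, so we have converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its bucket.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:                 # an empty bucket keeps its old centroid
                centroids[c] = (
                    sum(px for px, _ in members) / len(members),
                    sum(py for _, py in members) / len(members),
                )
    return labels, centroids

# Hypothetical (age, sick-leave days) pairs, made up for illustration.
staff = [(25, 2), (27, 3), (26, 1), (52, 14), (55, 15), (50, 13)]
labels, centroids = kmeans(staff, k=2)
print(labels)
print(centroids)
```

With real data the two variables usually sit on different scales, so it may help to normalize each column before clustering; otherwise the larger-valued variable dominates the distance.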


Magic square

It’s really easy to make a magic square: one where the horizontal totals and the vertical totals all add up to the same number. For example, here is a 5×5 magic square:

Magic square

How do you build such a magic square? You could build a 3×3 or a 7×7 one, for example; the method works for any odd size. The rules:

1. Start with writing 1 in the middle square
2. Keep moving diagonally upwards, increasing the number by 1 each time
3. If you move out of the square, wrap around to the opposite side
4. If you hit a square already filled, move vertically downwards one square and keep moving
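The four rules translate almost line for line into Python. This sketch makes two assumptions that the rules leave open: "diagonally upwards" is taken as up and to the right, and by default the 1 goes in the middle of the top row, which is the classic form of this method and makes the two diagonals add up as well. Passing the exact centre as `start` (a parameter invented for this sketch) still gives equal row and column totals, but not equal diagonals.

```python
def magic_square(n, start=None):
    """Fill an n x n grid (n odd) by stepping diagonally up-right with wrap-around."""
    if n % 2 == 0:
        raise ValueError("this construction needs an odd size")
    grid = [[0] * n for _ in range(n)]
    # Assumption: default start is the middle cell of the top row.
    row, col = start if start is not None else (0, n // 2)
    for value in range(1, n * n + 1):
        grid[row][col] = value
        up, right = (row - 1) % n, (col + 1) % n   # rules 2 and 3: move and wrap
        if grid[up][right]:
            row = (row + 1) % n                    # rule 4: blocked, so drop down one
        else:
            row, col = up, right
    return grid

square = magic_square(5)
centre_start = magic_square(5, start=(2, 2))

for line in square:
    print(" ".join(f"{v:2d}" for v in line))
```

For a 5×5 square every row and column should total 65, which is n(n² + 1)/2.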

All the best!


D.R.Kaprekar: Indian Mathematician

Readers may recall that I talked about Ramanujan some time back. Today I introduce a much less known Indian mathematician.

Kaprekar constant
Dattatreya Ramchandra Kaprekar, born in 1905, worked on number theory. He had no formal postgraduate training and worked as a schoolteacher in Nasik, India.

His claim to fame is the Kaprekar constant, 6174. Start with any four-digit number whose digits are not all the same; call it Z. Let A and B be the two numbers formed by rearranging the digits of Z, such that A is the largest number possible and B the smallest (keeping any leading zeros). Subtract B from A. If the result is not 6174, repeat the process with this result as the new Z. For example, starting with the Ramanujan number 1729:

9721-1279 = 8442
8442-2448 = 5994
9954-4599 = 5355
5553-3555 = 1998
9981-1899 = 8082
8820-0288 = 8532
8532-2358 = 6174
7641-1467 = 6174
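The routine is easy to check in Python. This is a small sketch (the function name `kaprekar_chain` is made up) that pads each number to four digits so leading zeros are kept, as in the 8820-0288 step above.

```python
def kaprekar_chain(z):
    """Return the successive differences from z until 6174 is reached."""
    if not 0 < z < 10000 or len(set(f"{z:04d}")) == 1:
        raise ValueError("need a four-digit number with at least two different digits")
    chain = []
    while z != 6174:
        digits = sorted(f"{z:04d}")          # pad so that 288 is treated as 0288
        a = int("".join(reversed(digits)))   # largest arrangement
        b = int("".join(digits))             # smallest arrangement
        z = a - b
        chain.append(z)
    return chain

steps = kaprekar_chain(1729)
print(steps)
```

Running it on 1729 walks through the same differences as the worked example and stops at 6174.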

He also gave the world Harshad numbers: numbers that are divisible by the sum of their digits. For example, 12 is a Harshad number because it is divisible by 1 + 2 = 3.
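A quick way to list Harshad numbers, as a minimal Python sketch of the definition (base 10 only; the helper name is invented):

```python
def is_harshad(n):
    """True if n is divisible by the sum of its decimal digits."""
    return n % sum(int(d) for d in str(n)) == 0

harshads = [n for n in range(1, 31) if is_harshad(n)]
print(harshads)
```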


Srinivasa Ramanujan: Indian mathematician

Ramanujan

A leading Indian daily started a series on not-so-ordinary Indian people just before Independence Day on August 15th. On the day itself, former Indian President APJ Abdul Kalam wrote a piece about Srinivasa Ramanujan, one of the greatest mathematicians of modern times, from the land that created zero.

He is the person behind the Ramanujan number, 1729, which is the smallest number that can be written as a sum of two positive cubes in two different ways:

1729 = 9*9*9 + 10*10*10
1729 = 1*1*1 + 12*12*12
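The claim that 1729 is the smallest such number can be checked by brute force. The sketch below collects every sum of two positive cubes up to a small limit and picks the smallest total that appears twice. The limit of 20 is an arbitrary choice for illustration; it is safe here because any sum no larger than 1729 only needs cubes of numbers up to 12.

```python
from collections import defaultdict
from itertools import combinations_with_replacement

def smallest_two_way_cube_sum(limit=20):
    """Smallest number that is a sum of two positive cubes in two ways."""
    sums = defaultdict(list)
    for a, b in combinations_with_replacement(range(1, limit + 1), 2):
        sums[a ** 3 + b ** 3].append((a, b))
    n = min(total for total, pairs in sums.items() if len(pairs) >= 2)
    return n, sums[n]

number, pairs = smallest_two_way_cube_sum()
print(number, pairs)
```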

He was given books on advanced trigonometry which he mastered by the age of 13. While still in India, Ramanujan recorded the bulk of his results in four notebooks of loose leaf paper. These results were mostly written up without any derivations. This is probably the origin of the misperception that Ramanujan was unable to prove his results and simply thought up the final result directly. Mathematician Bruce C. Berndt, in his review of these notebooks and Ramanujan’s work, says that Ramanujan most certainly was able to make the proofs of most of his results, but chose not to.

Ramanujan is generally hailed as an all-time great mathematician, like Leonhard Euler, Carl Friedrich Gauss, and Carl Gustav Jacob Jacobi, for his natural mathematical genius. G. H. Hardy wrote: “The limitations of his knowledge were as startling as its profundity. Here was a man who could work out modular equations and theorems… to orders unheard of, whose mastery of continued fractions was… beyond that of any mathematician in the world, who had found for himself the functional equation of the zeta function and the dominant terms of many of the most famous problems in the analytic theory of numbers; and yet he had never heard of a doubly-periodic function or of Cauchy’s theorem, and had indeed but the vaguest idea of what a function of a complex variable was…”. Hardy went on to claim that his own greatest contribution to mathematics was discovering Ramanujan.


Making scores comparable

How to compare scores rated under different tests, or by different people? Find out, and use the ready spreadsheet to crunch your own numbers.

Assume that we have feedback on team members from two different project managers.

Manager   U    V    W    X    Y    Z
A         70   63   82   91   56   77
B         68   60   80   80   55   60

Can we say that W performs better under manager A than under manager B? It looks that way, but analyse the scores a bit deeper. Project manager B has rated all team members lower than manager A has. It may be that B is simply a stricter scorer.

In a different situation, it may be that two teams (of different people) have gone through two different tests, and we want to compare the people against each other. Or a university might want to compare students who graduated in 2001 with those who graduated in 2009.

The question is: how do we compare people when the scores we have do not share the same basis?

Bell Curve - CC Attribution License
Bell Curve - Created using HSB Grapher

The answer is: normalization. We fix the mean score and the degree of deviation (the standard deviation) that we would like to see. For a 100-mark test, we may want 50 as the mean score and a deviation of 20 marks (20%). We then compare these targets with the actual mean and deviation of each set of scores, and adjust the scores accordingly. Please refer to the attached spreadsheet which helps you do this.

 

 

 

In the given example, A’s scores have a mean of 73 and a standard deviation of 13. B’s scores have a mean of 67 and a standard deviation of 11. Let us bring both to a mean of 70 and a standard deviation of 15.

Manager   U      V      W      X      Y      Z
A         66.3   58.1   80.4   91     49.9   74.5
B         71.2   60     87.9   87.9   53.1   60

So W, who looked slightly better under manager A on the raw scores, actually comes out better under manager B (87.9 against 80.4). Similarly, from the initial figures we might have concluded that U is better in project A, but the reverse is true.
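The adjustment itself is one line of arithmetic: each score x becomes target mean + target deviation × (x − actual mean) / actual deviation. Here is a minimal Python sketch of that step. It is not the spreadsheet's formula, and it assumes the sample standard deviation, which is what matches the deviations of 13 and 11 quoted above.

```python
from statistics import mean, stdev

scores = {
    "A": [70, 63, 82, 91, 56, 77],   # people U, V, W, X, Y, Z
    "B": [68, 60, 80, 80, 55, 60],
}

def rescale(values, target_mean=70, target_sd=15):
    """Shift and stretch values so they have the target mean and deviation."""
    m, s = mean(values), stdev(values)   # assumption: sample standard deviation
    return [round(target_mean + target_sd * (v - m) / s, 1) for v in values]

rescaled = {manager: rescale(row) for manager, row in scores.items()}

for manager, row in rescaled.items():
    print(manager, row)
```

The output should agree with the table above to within rounding.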

Mathematically this process is called Normalization and is useful in fitting scores to a bell curve. Read more about it here if you are interested. However the spreadsheet attached is sufficient to get you started.


Drawing through Iterated Fractal Function

Leafy Mathematics

Have a look at the image on the right. What do you think it is?

Ok, you got it: it’s a fern. What’s the big deal about that? The big deal is that it has been generated mathematically. Generated mathematically? Yes, mimicking what happens in nature.

It also happens to be a ‘fractal’: each part of it is similar to the whole – no matter how much you zoom in.

I read about this on Wikipedia and decided to give it a try, so I built a spreadsheet. Download it here.

Start with x_0 = 0, y_0 = 0. In each iteration n, choose t randomly from {1, 2, 3, 4} such that it has value 1 with a probability of 0.01, values 2 and 3 with a probability of 0.07 each, and value 4 with a probability of 0.85. The next point (x_{n+1}, y_{n+1}) is then computed from the current point (x_n, y_n) as:

(0, 0.16 y_n) if t=1

(0.2 x_n - 0.26 y_n, 0.23 x_n + 0.22 y_n + 1.6) if t=2

(-0.15 x_n + 0.28 y_n, 0.26 x_n + 0.24 y_n + 0.44) if t=3

(0.85 x_n + 0.04 y_n, -0.04 x_n + 0.85 y_n + 1.6) if t=4

So this is what I replicated in the spreadsheet. The first column, A, contains random numbers generated using the function RAND. The second column contains the calculation for ‘t’: if the value in column A is less than 0.85, ‘t’ will be selected as 4. If not, and it’s less than 0.92, ‘t’ will be selected as 3, and so on. This ensures the required probability distribution for the random values of ‘t’. In columns C and D, the values of x and y respectively are calculated using the values in the previous row.

Move over to the second worksheet in the spreadsheet, called ‘Plot’. Can you see the fern similar to the one I showed above? Press F9 to refresh – this will cause the entire set of random numbers to be regenerated, the x & y values to be correspondingly recalculated and the graph redrawn. Even so, the fern shape will be retained. Mathematics is beautiful, isn’t it? The ‘fern’ we just generated is called Barnsley’s fern.
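The same recipe fits in a few lines of Python. This is a sketch of the spreadsheet's logic rather than a copy of it: one random number per step picks the rule using the same cumulative thresholds (0.85, 0.92, 0.99), and the new point is computed from the previous one. The function name and the `seed` parameter are invented for the example.

```python
import random

def barnsley_fern(n_points, seed=None):
    """Generate n_points of the fern by randomly applying the four rules."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    points = []
    for _ in range(n_points):
        r = rng.random()
        if r < 0.85:      # t = 4, probability 0.85
            x, y = 0.85 * x + 0.04 * y, -0.04 * x + 0.85 * y + 1.6
        elif r < 0.92:    # t = 3, probability 0.07
            x, y = -0.15 * x + 0.28 * y, 0.26 * x + 0.24 * y + 0.44
        elif r < 0.99:    # t = 2, probability 0.07
            x, y = 0.2 * x - 0.26 * y, 0.23 * x + 0.22 * y + 1.6
        else:             # t = 1, probability 0.01
            x, y = 0.0, 0.16 * y
        points.append((x, y))
    return points

points = barnsley_fern(5000, seed=1)
print(points[:3])
```

Feeding `points` to any scatter-plot tool should show the fern; changing the seed plays the role of pressing F9.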


Limericks II

Algebra?

Is Algebra fruitless endeavor?
It seems they’ve been trying for ever
To find x, y, and z
And it’s quite clear to me,
If they’ve not found them yet then they’ll never

Credit: Graham Lester

A wonderful bird is the pelican
His bill can hold more than his belican
He can take in his beak
Food enough for a week
But I’m damned if I see how the helican

Credit: Dixon Lanier Merritt

A canner exceedingly canny
One morning remarked to his granny:
“A canner can can
Any thing that he can
But a canner can’t can a can, can he?”

Credit: Carolyn Wells

There was a young fellow of Wheeling
Endowed with such delicate feeling
When he read on the door,
“Don’t spit on the floor”
He jumped up and spat on the ceiling!


The Spirograph

I became a kid again and purchased a Spirograph set. The Spirograph is a mathematical toy that you can use for drawing nice figures. In the simplest case it consists of a fixed circle, used as a template, and a smaller rolling circle with holes. The result of my experiment is at the end of this post.
Thereafter, I turned my attention to the mathematical generation of Spirograph figures, and to my surprise I was able to find a number of good resources on the net.
The parametric equations for a Spirograph are:
x(t)=(R+r)cos(t) + p*cos((R+r)t/r)
y(t)=(R+r)sin(t) + p*sin((R+r)t/r)
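To see what these equations trace out, here is a minimal Python sketch that samples them. The post does not define the symbols, so the reading used here is an assumption: R is the radius of the fixed circle, r the radius of the rolling circle, and p the distance of the pen hole from the rolling circle's centre. The sketch also assumes R and r are whole numbers, so that the curve closes after r / gcd(R, r) trips around the fixed circle.

```python
import math

def spirograph_points(R, r, p, steps=2000):
    """Sample x(t), y(t) over one full closed figure (R and r must be integers)."""
    turns = r // math.gcd(R, r)          # trips around the fixed circle before closing
    points = []
    for i in range(steps + 1):
        t = 2 * math.pi * turns * i / steps
        x = (R + r) * math.cos(t) + p * math.cos((R + r) * t / r)
        y = (R + r) * math.sin(t) + p * math.sin((R + r) * t / r)
        points.append((x, y))
    return points

curve = spirograph_points(R=5, r=3, p=2)
print(curve[0], curve[-1])
```

The first and last points should coincide (up to floating-point error), confirming that the figure closes. The list can be written to a file and plotted with gnuplot or any other plotting tool.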

More explanation here:
http://www.mathematische-basteleien.de/spirographs.htm

A digital spirograph can be created using, for example, the following equations:
(5-1.02)*cos(1.02*x/5)+2*cos((5-1.02/5)*x)
(5-1.02)*sin(1.02*x/5)+2*sin((5-1.02/5)*x)
with HSB Grapher.

An applet, and some data to play with it can be found here.
However, what I found more useful and interesting is this page: http://linuxgazette.net/133/luana.html. It provides a Spirograph compiler (an awk script). This awk script takes a Spirograph specification and generates a gnuplot script to create the Spirograph. Go try it; it’s very interesting. If you are on Windows, read these to help you get started on awk and gnuplot.

Spirograph


Ravan burning

Yet again India witnessed the effigy-burning of Ravana during the festival of Dussehra. This is how the stage was set initially with Ravana in the middle and Kumbhkaran (Ravana’s brother) and Meghnath (Ravana’s son) on the sides:

Ravana in the middle

Some pictures of the fireworks that happened to initiate the ceremony:

Fireworks 1

Fireworks 2

Fireworks 3

Fireworks 4

Fireworks 5

In short, the story goes as follows: Ravana kidnaps Sita, the wife of Rama. Rama launches a battle and, after a few days of fighting, comes face to face with Ravana on the day of Dussehra. He shoots a burning arrow and Ravana turns into a fireball:

The Fireball

This is what remains a bit later:

Remains

So Ravana is killed by Rama, and Sita saved. For more details visit the Wikipedia link.
Watch a video by clicking here.


Benford’s law – how to detect fraud through numbers

How to detect problems pertaining to randomness of numbers using Benford’s law. The theory can be used to detect fraud, evasion of rules etc.

Benford's law

Suppose you have some financial data, let us say all the vouchers paid by the company in a given month, and you want to run some tests to determine if there are any anomalies. For example, are employees beating the approval process by entering, say, $24.99 vouchers when the limit is $25? Is any fraud being committed? One way to find out is to use Benford’s law.

Benford’s law states that in a given list of numbers generated naturally (for example stock prices or census figures), the probability of a number starting with 1 is 30.1%. The probability of a number starting with 2 is 17.6%, and so on; it keeps decreasing as the leading digit increases. The rationale behind it is explained as follows: it takes a 100% increase to take a number from 100 to 200, but only a 50% increase to go from 200 to 300. A 100% increase is harder to achieve (and thus less likely to happen) than a 50% increase, so numbers tend to spend longer with a leading 1.

In this way, the probability of having a number starting with digit d is given by log(1+1/d), with the log taken to base 10. More information is available here. It’s usually extended to the first two digits for analysis in the real world.
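A first-digit check is simple to sketch in Python. The code below is an illustration of the idea, not the Numeric Truth spreadsheet's logic: it computes the expected share for each leading digit from log10(1 + 1/d) and compares it with the observed share in a list of amounts. The voucher amounts are invented, with a deliberate pile-up just under a $25 limit.

```python
import math
from collections import Counter

def benford_expected(d):
    """Expected share of numbers whose first digit is d (1 to 9)."""
    return math.log10(1 + 1 / d)

def first_digit(value):
    """Leading non-zero digit of a number, or None for zero."""
    text = str(abs(value)).lstrip("0.")
    return int(text[0]) if text else None

def first_digit_report(amounts):
    """Map each digit to (observed share, expected share)."""
    digits = [d for d in map(first_digit, amounts) if d is not None]
    counts = Counter(digits)
    return {d: (counts[d] / len(digits), benford_expected(d)) for d in range(1, 10)}

# Hypothetical voucher amounts, made up for illustration.
vouchers = [24.99, 24.50, 112.00, 19.75, 24.95, 310.40, 24.99, 8.20, 1450.00, 24.80]
report = first_digit_report(vouchers)

for d, (observed, expected) in report.items():
    print(f"{d}: observed {observed:.1%}, expected {expected:.1%}")
```

In this toy data the digit 2 shows up far more often than the 17.6% the law predicts, which is the kind of gap worth a closer look. A real analysis needs far more than ten data points before such a gap means anything.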

Download from here a spreadsheet (called Numeric Truth) to carry out this analysis for you. All you have to do is to paste your data into the green cells. After that, on the first sheet it will show the results of first digit analysis, and on the second sheet, two digit analysis. Have a look at the graph, the variances for the individual digits, and the total variance. That should give you a starting point for your analysis/audit.


Licensing and information about the blog available here.