Using SQL potential

How to use the database SQL to its full potential. The idea is to reduce procedural coding and thereby improve performance, reduce defects.

I like to use the database to its full potential. For example, suppose someone has a list of vouchers and needs to find the vouchers that were paid later than the due date. One way to do this might be to read the vouchers one by one from the database, compare the due date with the payment date and determine the results. The other, recommended method will be to add the required criteria to the query itself so that only the exact result is obtained. With the second method, only 5% or 10% of the vouchers will need to be transferred from the database to the application while in the first method, all vouchers will need to be transferred.

In other words, the exact business requirements should determine the query. While you are at it, you should also keep in mind the indexes. Queries should always be written to minimise Disk I/O and transfers between the DB and the Application (server).

The database itself is quite powerful (esp Oracle) and I feel that its potential is always under-utilised. Let me show through an example.

I once had a requirement that there is a table having first, middle and last names of employees and the email ID. Something like this, ignoring the datatypes – assume all are VARCHAR2:

create table userlist(fname,mname,lname,emailid);

Each employee has middle name blank. Its possible that multiple employees have identical fname, lname with each other. For example, there can be two people having name ‘Hardeep Singh’. In this case, if the emailid of the two employees is same that means they are the same person having multiple rows, else they are different persons having the same name.

For example:

  1. Hardeep Singh alpha@gmail.com
  2. Hardeep Singh beta@gmail.com
  3. Hardeep Singh beta@gmail.com
  4. Satinder Singh gamma@gmail.com
  5. Satinder Singh gamma@gmail.com
  6. Gorakh Nath gn@gmail.com

In this case, 2 & 3 are the same person and 4 & 5 are also the same person. 1 & 2 are two different people.

Now the requirement is that we have to modify the middle name by adding a number such that every different person has a unique name. In the example above, the names should be:

  1. Hardeep Singh '1' alpha@gmail.com
  2. Hardeep Singh ‘ ′ beta@gmail.com
  3. Hardeep Singh ‘ ′ beta@gmail.com
  4. Satinder Singh ‘ ′ gamma@gmail.com
  5. Satinder Singh ‘ ′ gamma@gmail.com
  6. Gorakh Nath ‘ ′ gn@gmail.com

Now we know that ‘1’ is different from ‘2’ and ‘3’ because he has a different middle name.

The middle name to be added is given at the end of the name, in quotes. Gorakh Nath does not get any middle name since his name is unique. Any Tom, Dick or Harry would do this requirement in the following way: Read all the details one by one, look for people having the same name, then check the emailID then issue an UPDATE like this:

UPDATE userlist SET mname='1' where emailID='alpha@gmail.com';

Such UPDATES would need to be issued one for each person. However, this can be done through just a single UPDATE statement, without reading the list of employees at all. Here is the query:

update userlist a
set mname=(select x from (select rownum x,emailid,fname,
                                 lname
                          from userlist xa
                          where exists
                          (select 1
                           from userlist xb
                           where xa.lname=xb.lname and
                           xa.mname=xb.mname and
                           xa.fname=xb.fname and
                           xa.emailid<>xb.emailid))
                          ord
           where ord.emailid=a.emailid and
                 ord.fname=a.fname and
                 ord.lname=a.lname)
where exists(select 1
             from userlist b
             where a.lname=b.lname and
                   a.mname=b.mname and
                   a.fname=b.fname and
                   a.emailid<>b.emailid);
  
                
I guess an explanation is owed as to how it works. To my knowledge this query would work only in Oracle – but there would be ways to make it work in other Databases as well.

‘rownum’ returns the number of that particular row in the result set. The ‘exists’ clause at the end makes sure only people with same names are processed (‘gn@gmail.com’ is ignored). The part:

(select x from (select rownum x,emailid,fname,lname
from userlist xa
where exists
(select 1
from userlist xb
where xa.lname=xb.lname and
xa.mname=xb.mname and
xa.fname=xb.fname and
xa.emailid<>xb.emailid))
ord

                                             
creates a temporary view having the number, the email ID and the firstname. In the given scenario the result from this will be something like:

  1. 1, alpha@gmail.com, Hardeep, Singh
  2. This row will be absent because of the xa.emailid<>xb.emailid clause
  3. This row will be absent because of the xa.emailid<>xb.emailid clause
  4. This row will be absent because of the xa.emailid<>xb.emailid clause
  5. This row will be absent because of the xa.emailid<>xb.emailid clause
  6. This row wont even be considered, as I explained above

 

Had there been yet another ‘Hardeep Singh’ with a different email ID, he would have got a middle name of ‘2’.
Now the last step is to copy over the numbers based on the first and last names only – that part is pretty simple. Please post any questions in the comments area.

Share

Ishmeet lost

The Star voice of India, Ishmeet Singh is no longer with us. He died in Maldives where he had gone to perform, and was only 18 years old. He had a great career in front of him. Its very sad, and we cannot even begin to understand what his family must be going through.

I liked him a lot especially he was the only turbaned Sikh to reach the finals in any singing competition. During the VoI final, when Asha had to announce the winner, everybody including Harshit was praising her. However, Ishmeet informed the audience that it being Gurpurab that day, the Guru was with him and invoked the name of Waheguru. Ishmeet understood that Asha would only read what was written – praising her would not do any help.

Share

Online judge

Sometime back I came across Sphere Online Judge (SPOJ). A ‘judge’ is a mechanism through which you can verify your programming solutions.
The SPOJ contains a huge range of programming problems. You need to code a solution in one of the supported languages (from Unix shell scripting to C to Java to esoteric languages like B**fcuk). When you submit the solution, the judge will execute your code with predetermined input and match against predetermined output. If all goes well, your solution will be accepted. Otherwise, it will say ‘compilation error’, or ‘incorrect output’ or ‘time limit exceeded’. It will not tell you the input values or expected output, nor will it tell you the input for which your program is going wrong.

Its very interesting, the problems are very challenging and I gave some problems a try. At the end of it, you can see the report generated by the judge specific to you:

http://www.spoj.pl/users/hardeeps/

Also, you can see the submission history:

http://www.spoj.pl/status/hardeeps/

or, showcase a signed certificate that the judge provides:

http://www.spoj.pl/status/hardeeps/signedlist/

You can see that I have tried various languages, including Java, Bash (Unix shell) and Perl. I am yet to prove my programming skills in C. 🙂

The result AC means that the solution is accepted. The certificate also shows that I was able to calculate PI correctly to 2500 places of decimal in less than 24 seconds – I used the identity described in this blogpost.

I have tried to understand how to verify the digital signature on the certificate but have not been successful. The author informs me that a PHP function is being used to generate the certificate. It would be good if someone can get this working (please post in comments).

Share

Rupee Devaluation

How do currency exchanges work?

Sometimes there would be news that the Reserve Bank of India (RBI) has decided to devalue the Rupee. I would interpret “devalue” to mean that the value of Rupee against other currencies has been decreased. However, I also knew that the Rupee changes value everyday and not just those days when there was news from RBI. I would feel curious as to how then the Rupee kept changing its value everyday.

I understood the answer only when I did my MBA. Here is a short summary:

In the good old days, every country would specify its currency in terms of Gold. However, due to the problems associated with this mechanism, especially during times of Gold scarcity – this system was scrapped.

What happens now is that each country chooses what control it wants to have over its currency. It can let the currency float – which means that the market will determine the price based on demand, supply, balance of trade etc. This is the least intrusive method and requires little action from the country’s central bank.

The other is choice the central bank can make is to tie the currency to a fixed set of other currencies. This includes deciding upon a weighted mean formula (based on the currencies in the set) to determine the currency value. All that is needed then, is to find out the values of those currencies on a daily basis and calculate the value of our currency using the formula. This lets the country control its export and import prices to some extent. In such a case, the central bank needs to have the currency partially or fully convertible – and carry out currency exchanges based on the determined formula.

When RBI changes the formula, that is called devaluation – although the rates change daily based on the formula.

Share

Value of Pi

Pi
Pi

Wikipedia defines Pi or p as “a mathematical constant which represents the ratio of any circle’s circumference to its diameter in Euclidean geometry”. As a kid I used to be interested in the calculation of Pi. The value of Pi, to 100 places of decimal is:

3.1415 9265 3589 7932 3846 2643 3832 7950 2884 1971 6939 9375 1058 2097 4944 5923 0781 6406 2862 0899 8628 0348 2534 2117 0664

Value to 100 places
Value to 100 places

I have calculated this using the Unix command bc. The command for this is based on an identity (that I think is credited to Ramanujan) and is:
24*a(1/8)+8*a(1/57)+4*a(1/239)
where a stands for the arctangent function.

The formula is:

pi = 24*arctan(1/8)+8*arctan(1/57)+4*arctan(1/239)

First, load the bc language with associated library using “bc -l”. Then, set the scale to the number of digits using “scale=100”. Afterwards run the identity I gave above.

To experimentally calculate Pi experimentally, there are two ways: One using a random number generator and the other by physically measuring the circumference of a given circle.

Consider a square of length unity. Within that, a circle is drawn, having unity diameter. If a point is taking within the square at random, the probability of that point also lying within the circle is Pi*(1/2)*(1/2) which is Pi/4. Now, start taking points at random (x,y) and see if x*x+y*y<=1/4 or not (if it is, then that means the point lies within the circle). Maintain the count of total number of points taken (t), and the number that fell within the circle(c). Now Pi can be calculated as:

Pi=4*c/t

The other method is to take a circular bottle (measure the radius r) or tin and tie a thread around its circumference. Measure the circumference(c). Now Pi is c/2r.

The following two sentences contain the value of Pi: the number of letters in each word indicates the corresponding number in the value of Pi.

1. “May I have a large cup of coffee.”

2. “How I want a drink, alcoholic of course, after the heavy chapters involving quantum mechanics.”

This makes the value of Pi easy to remember.

Lastly, how useful are the digits of Pi as a source of random numbers? Not bad, according to a study: “while sequences of digits from pi are indeed an acceptable source of randomness – often an important factor in data encryption and in solving certain physics problems – pi’s digit string does not always produce randomness as effectively as manufactured generators do”.

Always wanting to do my own thing, I downloaded the value of Pi to one million places from a website, split it into (x,y,z) coordinates, each having 5 digit precision. Here is the graph that got generated:

Pi-randomness
Pi-randomness

A bell curve, but could have been better – seems to mimic the results from the study.

Share

Continuous Compounding

I feel there is something mathematically wrong with the way compound interest is defined today. We say “5% per annum, semi-annually compounded”. If I have money invested, mathematically speaking, it should get compounded all the time, not just in fixed intervals.

I would like to define the term “absolute percentage” – the annual interest rate at which the money must grow all the time in order to reach the same amount as it does using normal interest rate calculation. Let me explain what that means.

If I invest amount ‘p’ for period ‘t’ years, at an ‘absolute’ interest rate of ‘rdash’, then,

a=p*e^(rdash*t/100)

where ‘e’ is a mathematical constant having value approximately 2.71828182845904.

This means, that amount ‘p’ will grow to amount ‘a’ in ‘t’ years if it compounds continuously.

Here, rdash is the absolute equivalent of rate r if
a=p*(1+r/100)^t

Here is a table that explains it. Try and understand, ok?

Share