DOE: WHAT IS IT?
International Pi Day
Today is Pi Day — it’s 3/14 as per the American style of writing dates. Wiki says that 22nd July is an alternate date for Pi Day. |
I was reminded of those lovely mnemonics involving π. Let me start with the simplest, and perhaps the most appropriate one due to one C. Heckman which goes thus: HOW I WISH I COULD CALCULATE PI. |
The number of letters in each word gives the sequence of digits for π, so π is 3.141592 as per the above mnemonic. |
The one which we knew since our high school days was of course MAY I HAVE A LARGE CONTAINER OF COFFEE, which gives the value of π to 8 digits which is 3.1415926. |
The mnemonic by Sir James Jeans is perhaps the most scintillating: HOW I WANT A DRINK, ALCOHOLIC OF COURSE, AFTER THE HEAVY LECTURES INVOLVING QUANTUM MECHANICS which is π to 15 digits: 3.14159265358979. I first read this in the book by Posamentier and Lehmann referred to below. |
If we add the phrase “All of thy geometry, Herr Planck, is fairly hard,” to Sir James Jeans’s mnemonic, we get a total of 24 digits in all: 3.14159265358979323846264. |
Posamentier and Lehmann in their book their book “π: Biography of the World’s Most Mysterious Number” describe the wealthy Frenchman George Buffon’s mind blowing attempt in 1777 to calculate π which goes like this. |
Suppose you have a piece of paper with ruled parallel lines equally spaced at a distance d between lines, and a thin needle of length I where l < d. You then toss the needle onto the paper many times. |
Buffon claimed that the probability that the needle will touch one of the ruled lines is 2l/πd. However Buffon wasn’t a famous chap, so this claim languished in obscurity. |
About 35 yr later, the great French mathematician Pierre Simeon Laplace popularized it, and that’s how the world came to know of it. |
Posamentier and Lehmann say that we can try this out for ourselves. If for the sake of simplicity, we let l = d, then the value of π = 2/p where p is the probability that the needle touches a line. The value of p is given by |
p = no of times the needle touches a line/total number of tosses. |
An Italian mathematician by name Mario Lazzarini ACTUALLY did this in 1901. He tossed the needle 3408 times, and came up with a value of π = 3.1415929. |
BTW, the concept of Pi Day was first implemented in 1988 by American physicist cum curator cum artist who worked in the San Francisco Science Museum. |
UNESCO declared Pi Day as the International Day of Mathematics in Nov 2019. |
![]() |
Covid and Rev Bayes
I wrote this piece on LinkedIn in Oct 20 when we were just recovering from the pandemic. The Hindu of Chennai was publishing daily mortality rates. |
This is dedicated to the Most Rev Thomas Bayes (1701-1761) who was an English statistician and philosopher, not to mention a Presbyterian minister. The Reverend's theorem is used to calculate posterior probabilities given prior ones. In these dreadful Covid times, let me give a very simple application. |
Suppose you have developed a Covid test which has a sensitivity of 99%. A sensitivity of 99% means that if you were to administer this test to a hundred KNOWN cases of Covid, the test will give a positive signal for 99 people. In other words, the test will correctly identify 99 people as having the disease, however 1 person will pass through undetected. 1% therefore is the false negative rate. Tragic, isn't it? |
Another equally important concept is that of specificity. Suppose your test has a specificity of 98%. This means that if you were to administer this test to 100 people who DON'T have the disease, then it will pass 98 people. In other words, out of a hundred "clean" people, the test will correctly identify 98 as free of the disease, but will falsely identify 2 persons as having the disease when they don't! 2% is the false positive rate. Dangerous, isn't it?The one which we knew since our high school days was of course MAY I HAVE A LARGE CONTAINER OF COFFEE, which gives the value of π to 8 digits which is 3.1415926. |
So far so good. I congratulate you on your patience for having stayed this far! |
Let's define some stuff now. Let us denote the disease as D, and the probability of having the disease as p(D). The Hindu online edition on 4 Oct 2020 stated that the total number of ACTIVE Covid-19 cases in India is 937856. The Worldometer gives the total Indian population to be 1380 million (MM), so this works out to a prevalence of (0.938 MM/1380 MM) = 0.068%. In other words, p(D) = 0.068% = 0.00068 |
This value of p(D) is contested by many, who say that the actual prevalence is many times more, may be even a hundred times. Since I have no clue what the real number is, I will go with the calculated figure of zero point zero six eight percent. Some experts are of the opinion that the prevalence is about a hundred times more, say in the vicinity of 5-7%, probably with good reason too given the nature of our reporting and the state of our healthcare system. |
Let the event of getting a positive signal from the test you have developed be denoted as S. Since your sensitivity is 99%, it means that p(S | D) is 0.99; this is read as "the probability that the test gives a positive signal GIVEN THAT a person has the disease is 99%". This is an example of what is called conditional probability. |
Since your test has 98% specificity, it means that p(S | D') = 1 - 0.98 = 0.02 which reads as "the probability that the test signals positive GIVEN THAT a person DOESN'T have the disease is 2%". Why? Like I explained earlier, the test correctly passes 98 people as clean out of a total of 100 clean people, which means 2 clean persons out of a 100 clean are incorrectly classified as having the disease when they don't. In other words, the test gives a positive signal S when the person doesn't have the disease. |
Please note that any medical test of the YES/NO variety MUST indicate the sensitivity and the specificity. In our case we have assumed sensitivity to be 99% and specificity to be 98%. |
Bayes' Theorem tells you how to calculate p (D | S) if you know p(S | D), and of course p(D) and the specificity. For further details, look up page 55 of Applied Statistics and Probability for Engineers by DC Montgomery and George Runger, 6th edition. I will just give the result here: |
p(D | S) = { p(S | D) * p(D) } /{ p(S | D) * p(D) + p(S | D') * p(D')}
Plugging in the values of p(S | D) = 0.99, p(D) = 0.00068 (0.068% calculated above), p(S | D') = 0.02 and p(D') = 1 - 0.00068 = 0.99932, we have p(D | S) = 3.25%. This is also called the PPV or the Positive Predicted Value. |
What this means is that if a test is 99% sensitive and 98% specific, AND if the total incidence of the disease in the population is 0.068%, then the probability that the person has the disease GIVEN THAT he has tested positive is ONLY 3.25%. This implies that under these circumstances, if the test declares 100 people as having the disease, only 3 will actually have it!! |
![]() |
However if you change the value of p(D) from 0.068% to 7%, which is what many prophets say is the true prevalence rate, then the PPV works out to 78.8%. Under this circumstance, if the test declares 100 people as positive, only 79 actually have the disease! |
The result is STRONGLY dependent on the specificity of the test, and of course on the prevalence of the disease. |
Please don't blame Rev. Bayes, or worse still ME, if you don't like the numbers. The result that we are interested in i.e. the probability that a person HAS the disease GIVEN THAT he has tested positive, also called the Positive Predicted Value (PPV), is a function of (i) the prevalence of the disease in the general population (ii) the sensitivity of the test used and most importantly (iii) the specificity of the test. |
I have made an excel sheet for those who want to play around, let me know if you want it. |
Of course, it is very difficult to get an idea of p(D) at the outset. |
Terms and Conditions | Copyright © 2021. All rights Reserved | www.ekaagra.com