Probability Problem

12-19-2014 , 05:30 PM
I'll be up front: this is a homework assignment for uni. Normally I hate asking for help with assignments, but I've spent quite some time on it and haven't been able to get it correct. Any input would be appreciated.

The reliability of a particular skin test for tuberculosis (TB) is as follows: If the subject has TB, the test comes back positive 98% of the time. If the subject does not have TB, the test comes back negative 99% of the time. (Another way to say this is that the sensitivity of the test is 0.98, and the specificity of the test is 0.99.) From a large population, in which 2 in every 10,000 people have TB, a person is selected at random and given the test, which comes back positive. What is the probability that the person actually has TB?

How do you go about calculating the exact value for P in this problem?

Thanks.
12-19-2014 , 05:48 PM
OK, this has now changed from a query on the problem to checking an answer.

I ran a formula of (.0002)(.98)/[(.9998)(.01)+(.0002)(.98)] to return an answer of a 1.92% chance of the patient having TB. This was one of my first attempts, which I dismissed as clearly wrong. Upon further research, though, I worked out that there are going to be far more false positives than true positives given the rarity of the disease. Worked out over a sample of 1,000,000 people, I got an expected 196 people with TB who were correctly identified, vs about 9,998 (roughly 10,000) people who were falsely identified as having it. Am I correct here?
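The head count can be checked with a few lines of Python. This is only a minimal sketch: the rates are the ones from the problem statement, the variable names are made up for illustration, and the population of 1,000,000 is just a convenient size for counting.

```python
# Natural-frequency check of the TB test problem.
# Rates come from the problem statement; the population size is arbitrary.
population = 1_000_000
prevalence = 2 / 10_000      # 2 in every 10,000 people have TB
sensitivity = 0.98           # P(positive | TB)
specificity = 0.99           # P(negative | no TB)

with_tb = population * prevalence                  # expected people with TB
without_tb = population - with_tb                  # expected people without TB

true_positives = with_tb * sensitivity             # have TB and test positive
false_positives = without_tb * (1 - specificity)   # no TB but test positive

# Among everyone who tests positive, the share that really has TB.
share_with_tb = true_positives / (true_positives + false_positives)

print(f"true positives:  {true_positives:.0f}")
print(f"false positives: {false_positives:.0f}")
print(f"P(TB | positive) = {share_with_tb:.4f}")
```

Counting heads this way gives the same ratio as the formula, because the population size cancels out of the fraction.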
12-19-2014 , 05:56 PM
Next time you can post these under homework help or in the probability forum; this thread may get moved there now.


Have you tried the standard approach? http://en.wikipedia.org/wiki/Bayes%27_theorem


I.e. P(A|B) (the probability of statement A being true given that statement B is true) is
P(A|B)=P(B|A)*P(A)/P(B)

You are looking for P(has it|+)=P(+|has it)*P(has it)/P(+)

For anyone else interested (since OP added a result that looks OK): do not read the spoiler below if you want to work it out on your own with some additional effort.

Spoiler:
Notice that P(+)=P(one has it)*0.98+P(one doesn't have it)*0.01=2/10000*0.98+(1-2/10000)*0.01=0.010194

P(+|has it)=0.98, P(has it)=2/10000

so overall P(has it|+)=2/10000*0.98/0.010194=1.923%

Which means you still need more tests, because there is still a significant chance that you are OK and the test simply failed.
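For anyone who prefers code to algebra, the same calculation could look something like the sketch below. The function name and parameter names are hypothetical, chosen only for this example; the numbers are the ones from the problem.

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """Return P(has it | +) using Bayes' theorem."""
    p_pos_given_has = sensitivity          # P(+ | has it)
    p_pos_given_not = 1 - specificity      # P(+ | doesn't have it)
    # Total probability of a positive result, P(+).
    p_pos = prior * p_pos_given_has + (1 - prior) * p_pos_given_not
    return p_pos_given_has * prior / p_pos


result = posterior_given_positive(prior=2 / 10_000, sensitivity=0.98, specificity=0.99)
print(f"P(has it | +) = {result:.3%}")
```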
12-19-2014 , 06:06 PM
Sorry, I thought I was in the probability forum, and I didn't realise there was a homework forum. (I was first planning on posting here, then remembered the probability forum existed, and must have had two tabs open and closed the wrong one.)

And thanks for the input also!
12-20-2014 , 12:39 PM
Wanted to move this to the proper forum for added content, discussion, etc.
12-20-2014 , 01:15 PM
Now ask yourself how many tests you need to run back to back to make the probability the subject really has it over 90%.

And ask yourself this too: can you design a method that gives you over 99% confidence in over 99.99% of the cases where it is applied (starting from the kind of limits the test already has)? Or do you run into some natural barrier?
12-21-2014 , 12:00 PM
1st Test - (.0002)(.98)/[(.9998)(.01)+(.0002)(.98)] = .192 (1.92%)

After two tests we get (.192)(.98)/[(.9808)(.01)+(.192)(.98)] = .9504 (95.04%)

3rd Test - (.9504)(.98)/[(.0496)(.01)+(.9504)(.98)] = .9994 (99.94%)

So realistically there would need to be 4 tests carried out in succession in order to obtain a result with an accuracy of above 99.99%, is this correct?
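Chaining the tests like this, with each posterior used as the prior for the next test, can be sketched as below. It is only a sketch under the assumption that repeated tests on the same person are independent, and the function name is made up for the example. Note that the value carried forward has to be the decimal probability (1.92% is 0.0192).

```python
def update_on_positive(prior, sensitivity=0.98, specificity=0.99):
    """One Bayes update after a positive result (tests assumed independent)."""
    numerator = prior * sensitivity
    return numerator / (numerator + (1 - prior) * (1 - specificity))


belief = 2 / 10_000          # prevalence before any test
posteriors = []
for test_number in range(1, 5):
    # The posterior after this test becomes the prior for the next one.
    belief = update_on_positive(belief)
    posteriors.append(belief)
    print(f"after {test_number} positive test(s): {belief:.6f}")
```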
12-21-2014 , 07:00 PM
Quote:
Originally Posted by fergrberger
1st Test - (.0002)(.98)/[(.9998)(.01)+(.0002)(.98)] = .192 (1.92%)

After two tests we get (.192)(.98)/[(.9808)(.01)+(.192)(.98)] = .9504 (95.04%)

3rd Test - (.9504)(.98)/[(.0496)(.01)+(.9504)(.98)] = .9994 (99.94%)

So realistically there would need to be 4 tests carried out in succession in order to obtain a result with an accuracy of above 99.99%, is this correct?
Assuming test results on the same subject are independently successful.

Which they aren't, for skin TB tests.
12-21-2014 , 07:09 PM
Start with a clearer definition of the probabilities after 2 tests, to avoid any errors. Then plug in numbers only when needed.

For example, after 2 tests there are two things to ask. One is: if someone has it, what is the chance of getting 2 positive tests? But you also have to worry that someone who does have it gets a --, -+ or +- result because the test failed him. That's a side concern you can calculate later. You could also ask things like what the chance is that one has it given 1 positive out of 2, or 2 negatives out of 2.

In the end, for the ++ outcome you need to ask what P(has it|++) is, which is:

P(has it|++)=P(++|has it)*P(has it)/P(++)

So you need to calculate afresh, for example, what P(++|has it) and P(++) are.

P(++)=P(has it)*0.98^2+P(doesn't have it)*0.01^2=2/10000*0.98^2+(1-2/10000)*0.01^2=0.000292

P(has it)=2/10000,P(++|has it)=0.98^2 so

P(has it|++)=P(++|has it)*P(has it)/P(++)=0.98^2*2/10000/(0.000292)=0.658

so you go to 65.8%

I mean, if you were told that someone had been selected randomly and had 2 tests back to back that were both positive, there is a 65.8% chance he has it. It is getting there, but it's still not super confident.

Here is another way to visualize why it's not as high as you calculated. A fraction of (1-2/10000)*0.01^2=0.00009998~0.0001 of people will show 2 positives even if they don't have it. Those that have it start as 2/10000 and after 2 runs will be 2/10000*0.98^2~0.00019

So it is like having a population where, in relative terms, one group has a weight of 0.00019 and the other 0.0001, i.e. about 10 out of 29 who show ++ are seen as having it without it being true. That is the intuitive way to see the result above: 19 out of 29 is about 65.5% with these rounded weights, or 65.8% without rounding. Basically some 65.8% of people who got 2 positive tests are the real thing and the others are still the error, false double positives.
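The same relative-weights picture extends to any number of all-positive results. A minimal sketch, again assuming the tests are independent; the function name is hypothetical and the defaults are the numbers from the problem.

```python
def weights_after_all_positive(n_tests, prior=2 / 10_000,
                               sensitivity=0.98, specificity=0.99):
    """Population fractions that show n positives in a row (tests assumed independent)."""
    weight_has_it = prior * sensitivity ** n_tests                # ill and all positive
    weight_false = (1 - prior) * (1 - specificity) ** n_tests     # healthy but all positive
    return weight_has_it, weight_false


for n in (1, 2, 3):
    has_it, false_all_pos = weights_after_all_positive(n)
    # Share of the all-positive group that really has it.
    share = has_it / (has_it + false_all_pos)
    print(f"n={n}: has it {has_it:.8f}, false {false_all_pos:.8f}, share {share:.4f}")
```

Each extra positive multiplies the "has it" weight by 0.98 but the false weight by 0.01, which is why the false all-positives shrink so quickly.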

You of course also have to worry about those who have it but got a negative on the first or second test (--, -+ or +-): after 2 tests they do not show 2 positives but are in fact ill. The process has failed those. They are about 2/10000*0.02^2+2*2/10000*0.98*0.02~0.000008, i.e. about 8 in 1 million people are failed by the process. That is of course a small number, and there are other ways to find out they are ill anyway, but for some other illnesses, or for tests for future illnesses, a population left undetected like this is a concern.

Notice, however, that if someone has 2 tests and one is positive (most of the rare people above), that person is already concerned (although they don't need to be very concerned) and will likely go for a 3rd test anyway to make sure.

Still, any strategy with a fixed number of tests will leave some people who have it but fail to come back positive on every test, which introduces uncertainty. The false all-positives, on the other hand, are quickly reduced to tiny levels after many tests.

Last edited by masque de Z; 12-21-2014 at 07:14 PM.

      