Ask a probabilist - Page 17 - Science, Math and Philosophy Forum

I see I left out some important information. Delivery time on a new component is 2 weeks.

So the problem should be: How many spare parts do I need to ensure that I won't run out during a two-week period?

I made an Excel sheet with the following input.
Failure rate during the two weeks is 1/260 = 0.0038...
I have 16 components so I use 9/130 as "failure rate".

I get the following:

I have used the following formula
=POISSON(A2;0,0038*18;True)

From the chart I can read that I'd need two components to be 99.95% certain that we won't run out of spares.

Am I in the wrong?


05-11-2010 , 06:48 PM

#402

Vael

Carpal \'Tunnel

Join Date: Feb 2009 Posts: 6,377

What's the best introductory reading for learning probability? (preferably online & free :>) I want to really understand the fundamental concepts, not just be able to do all sorts of calculations.


05-12-2010 , 07:36 AM

#403

genjix

journeyman

Join Date: Apr 2009 Posts: 314

The thing is, the Poisson distribution is a discrete distribution that counts occurrences among many events, each of which is very unlikely, like X = {Red Vehicle On Road, Blue, LightGreen, DarkGreen, Black, Silver, ...}, where you want the probability of n occurrences of blue vehicles in a fixed time limit.


05-15-2010 , 09:27 AM

#404

L.Von Mises

stranger

Join Date: May 2010 Posts: 2

Quote:

Originally Posted by jason1990

I would like to recommend that you seriously consider a minor in math, instead of statistics. It might be more helpful to you, both in terms of what you learn and how it is perceived by potential employers. You might enjoy it more. And you might be able to take some probability and statistics as a math minor, which would essentially give you the best of both worlds.

Just curious:

Suppose a budding probabilist has room for an extra course and all that is available is topology, geometry, and algebra. Furthermore, assume said student has already taken intro classes in these three areas. Which of these three (just these three) would be most beneficial for a probabilist? Not sure it is relevant, but the textbooks for these courses are by Munkres, Spivak & Lee, and Artin, respectively.

Thanks in advance for any/all feedback.


05-16-2010 , 07:00 PM

#405

jason1990

old hand

Join Date: Sep 2004 Posts: 1,889

Quote:

Originally Posted by genjix

[snipped]

It seems you are okay now with the first part of your question. Regarding the second part, note that the derivation in the archived posting is incomplete. It correctly shows that the risk of ruin, r, satisfies

r = q + pr².

But this equation has two solutions: r = q/p and r = 1. If p ≤ 1/2, then the only solution which is in the interval [0,1] is r = 1, so the risk of ruin is 1. But if p > 1/2, then both solutions are in [0,1]. An additional argument is needed to show that, in this case, the risk of ruin is r = q/p, and not r = 1. One could prove this, for example, by using the argument you outlined with the recurrence relations, and then letting N → ∞.

Note that many people actually believe that r = 1, even when p > 1/2. It is fairly common to see people present the naive argument that "whatever can happen will happen, given infinite time".
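The limiting argument can be checked numerically. The sketch below is not the recurrence derivation itself: it assumes the standard closed-form gambler's-ruin probability for a player who starts with 1 unit, bets 1 unit per round, and stops on reaching a finite target N, and then lets N grow. The function name `ruin_probability` and the sample values of p are illustrative only.

```python
def ruin_probability(p: float, target: int, start: int = 1) -> float:
    """Chance of hitting 0 before `target`, starting from `start` units,
    winning each 1-unit bet with probability p (standard gambler's ruin)."""
    q = 1.0 - p
    if p == 0.5:
        return 1.0 - start / target
    ratio = q / p
    return (ratio**start - ratio**target) / (1.0 - ratio**target)


for p in (0.4, 0.5, 0.6):
    q = 1.0 - p
    # Both candidate roots satisfy r = q + p*r^2; print the residual for q/p.
    root = q / p
    residual = q + p * root**2 - root
    limits = [ruin_probability(p, n) for n in (10, 100, 1000)]
    print(f"p={p}: q/p={root:.4f}, residual={residual:.1e}, ruin for N=10,100,1000: {limits}")
```

For p = 0.6 the values settle near q/p = 2/3 as N grows, while for p ≤ 1/2 they approach 1, which is the behaviour the extra argument is meant to establish.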


05-16-2010 , 07:26 PM

#406

jason1990

old hand

Join Date: Sep 2004 Posts: 1,889

Quote:

Originally Posted by NeedExpertHalp

Probably don't need to be a probabilist to solve it, but I still need some help with it.

Problem:
There are 16 installed units on a machine. They fail independently and have an average lifespan of 10 years.

What assumption do I need to make about the failure rate?

How many spare units do you need to have a 99.95% chance of having more spare parts than broken units?

Is my problem as presented a decent way to find how many spare parts you need to have if you're willing to take a certain risk (in this case you're out of spare parts 1 of 1000 times), or am I doing it wrong?

Quote:

Originally Posted by NeedExpertHalp

I have used the following formula
=POISSON(A2;0,0038*18;True)

From the chart I can read that I'd need two components to be 99.95% certain that we won't run out of spares.

Am I in the wrong?

For 1 ≤ j ≤ 16, let N_j be the number of times the j-th part fails in the two-week period. (For example, N_j = 2 means it failed, we replaced it, then its replacement failed, we replaced that, and then we were good until the end of the two-week period.) Let N = N₁ + ... + N₁₆, and let k be the number of spare parts we have. We wish to choose (the smallest) k such that P(N ≤ k) ≥ 0.9995.

Let us model each N_j as a Poisson distributed random variable with mean 1/260. Since the N_j's are independent, this implies N is also Poisson distributed with mean 16/260. (It appears you used 18/260, for some reason.) Then, in the syntax of Excel,

P(N ≤ k) = POISSON(k,16/260,TRUE).

In this case, although the probabilities differ slightly from what is in your calculations, it is still true that k = 2 is the smallest choice of k that works, and we have P(N ≤ 2) ≈ 0.999962908.
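The same calculation can be reproduced outside Excel. The following is a minimal sketch using only the Python standard library; the helper names `poisson_cdf` and `spares_needed` are made up for illustration, and the model is the one stated above (N is Poisson with mean 16/260).

```python
import math


def poisson_cdf(k: int, mean: float) -> float:
    """P(N <= k) for a Poisson random variable N with the given mean."""
    return math.exp(-mean) * sum(mean**i / math.factorial(i) for i in range(k + 1))


def spares_needed(mean: float, target: float) -> int:
    """Smallest k with P(N <= k) >= target."""
    k = 0
    while poisson_cdf(k, mean) < target:
        k += 1
    return k


mean = 16 / 260  # 16 parts, each with mean 1/260 failures per two weeks
k = spares_needed(mean, 0.9995)
print(k, poisson_cdf(k, mean))

# The mean of 18/260 used in the original spreadsheet leads to the same k.
print(spares_needed(18 / 260, 0.9995))
```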


05-16-2010 , 08:56 PM

#407

jason1990

old hand

Join Date: Sep 2004 Posts: 1,889

Quote:

Originally Posted by Vael

What's the best introductory reading for learning probability? (preferably online & free :>) I want to really understand the fundamental concepts, not just be able to do all sorts of calculations.

I feel inspired to give you a very specific reading recommendation, despite knowing nothing about your background.

I suggest that the first thing you do is begin to read Probability Theory: The Logic of Science. I have linked you to the first three chapters, but if you ever need more, you can find it here.

If you reach a point where you are weary of this book and want to stop reading it, then simply stop. Otherwise, try to finish at least the first two or three chapters.

When you have had enough of this book, then start reading a basic, standard textbook such as this one: Introduction to Probability. Supplemental material for this textbook can be found here.


05-16-2010 , 09:49 PM

#408

tcc1

veteran

Join Date: Jan 2010 Posts: 3,003

The contents of this thread remind me how little I understand about mathematics, and how foreign and complex it can sound to someone with no training in it.


05-17-2010 , 06:45 AM

#409

jason1990

old hand

Join Date: Sep 2004 Posts: 1,889

Quote:

Originally Posted by L.Von Mises

Suppose a budding probabilist has room for an extra course and all that is available is topology, geometry, and algebra. Furthermore, assume said student has already taken intro classes in these three areas. Which of these three (just these three) would be most beneficial for a probabilist? Not sure it is relevant, but the textbooks for these courses are by Munkres, Spivak & Lee, and Artin, respectively.

I am going to go with topology. In probability, even at the elementary level, one must deal with various non-standard modes of convergence (in probability, in distribution). Studying topology might give a person helpful insight and intuition in these matters. At the more advanced level, especially in stochastic processes, one must frequently deal with random variables taking values not in the real line, but in some more abstract topological spaces.


05-17-2010 , 08:48 AM

#410

LongLiveYorke

grinder

Join Date: May 2008 Posts: 527

Quote:

Originally Posted by tcc1

The contents of this thread remind me how little I understand about mathematics, and how foreign and complex it can sound to someone with no training in it.

Your medical threads do the same to me, though I find them very interesting.


05-17-2010 , 12:57 PM

#411

L.Von Mises

stranger

Join Date: May 2010 Posts: 2

Quote:

Originally Posted by jason1990

Ah, good to know. Thank you very much for the insight and quick answer, much appreciated!


05-17-2010 , 03:55 PM

#412

Vael

Carpal \'Tunnel

Join Date: Feb 2009 Posts: 6,377

Quote:

Originally Posted by jason1990

Started with the first book today, it's great & exactly what I'm interested in. Thank you!


05-17-2010 , 07:18 PM

#413

FBGHooper

adept

Join Date: Jun 2005 Posts: 974

I have a full ring situation to ask about. Is it still cool to throw out a question here? I'll just hope so and do it anyway. I've done the math, but would like someone else to confirm my calculations.

QUESTION: GIVEN THAT SOMEONE AT A FULL RING (10 SEAT) TABLE IS DEALT AA, WHAT IS THE LIKELIHOOD OF SOMEONE ELSE BEING DEALT KK?

The question arises from playing a lot of low limit holdem. Anecdotal evidence indicates to me that it happens more frequently than I think most would expect. Again, this isn't heads up, and the hero in the hand doesn't have to be involved. It's only that we see AA vs KK at a table we are playing on.

Odds of pocket pair: 1 in 17
Odds of AA: 1 in 221 (that's 13 x 17)

In a ten seat full ring game we should expect to see AA dealt to SOMEONE (not specifically you) once every 22.1 hands that are dealt by the dealer.

For the other 9 players, they don't have the same 1:221 odds of getting kings because two of the aces are out of the deck. It becomes (4/50)(3/49) or 1:204.16 , 204.16/9 = 22.69

By my logic we should expect to see two random players in a full ring game have AA vs KK once approximately every 501.34 hands.

Thanks for any and all feedback.


05-17-2010 , 09:24 PM

#414

bellatrix

Rigged for her pleasure

Join Date: Dec 2005 Posts: 4,993

Quote:

Originally Posted by FBGHooper

That sounds right. I still need to go through CT's calculation in that other thread; there's a distribution problem as he's putting only the probabilities on the first two players and not at any random players at the table.

One thing you are wrong about is this:

Quote:

Not sure why you are using combination/permutation notation to show this. You have 52 possible cards for the first one, and 51 possible cards for this second one. 52x51=2652 possible hole card combinations. Do you disagree with this?

You have to divide by 2 as A[card image] = A[card image]. That's it


05-17-2010 , 10:04 PM

#415

Cranberry Tea

Carpal \'Tunnel

Join Date: Jan 2009 Posts: 6,241

Quote:

Originally Posted by bellatrix

You have to divide by 2 as A[card image] = A[card image]. That's it

You could theoretically do it without the 2! and say A[card image] != A[card image], but it felt natural to have them the same.

Quote:

Originally Posted by bellatrix

there's a distribution problem as he's putting only the probabilities on the first two players and not at any random players at the table.

I think you're right, and the answer should be multiplied by 90 bc there are 10 spots for the player with AA to be distributed, 9 for KK.


05-17-2010 , 11:14 PM

#416

FBGHooper

adept

Join Date: Jun 2005 Posts: 974

Quote:

Originally Posted by Cranberry Tea

I think you're right, and the answer should be multiplied by 90 bc there are 10 spots for the player with AA to be distributed, 9 for KK.

I had this same thought, which would increase your answer by nearly a factor of 100 making it virtually the same as mine. I think we're on the same page.


05-17-2010 , 11:15 PM

#417

FBGHooper

adept

Join Date: Jun 2005 Posts: 974

Quote:

Originally Posted by bellatrix

= A[card image]. That's it

That makes sense. Thanks for pointing that out.


05-18-2010 , 04:39 AM

#418

DocOfDan

journeyman

Join Date: Jan 2005 Posts: 246

Quote:

QUESTION: GIVEN THAT SOMEONE AT A FULL RING (10 SEAT) TABLE IS DEALT AA, WHAT IS THE LIKELIHOOD OF SOMEONE ELSE BEING DEALT KK?

are we not really asking here what is the probability that KK is dealt to one of 9 players from a 50 card deck (i.e. the two aces removed)?
i.e. ~ 1 in 23?
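The "~ 1 in 23" figure can be checked with a short calculation. This sketch assumes we condition on one specific player (call them Player 1) holding AA, so the other 9 players draw from the remaining 50 cards with all four kings live; since at most two players can hold KK, inclusion-exclusion stops after two terms. The variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Player 1 holds AA: 50 cards remain, including all four kings.
p_one_player_kk = Fraction(comb(4, 2), comb(50, 2))  # a given opponent holds KK
p_two_players_kk = p_one_player_kk * Fraction(1, comb(48, 2))  # two given opponents both hold KK

# Inclusion-exclusion over the 9 opponents (three players cannot all hold KK).
p_some_kk = 9 * p_one_player_kk - comb(9, 2) * p_two_players_kk

print(float(p_some_kk), 1 / float(p_some_kk))
```

Under these assumptions the result is a little under 1 in 23, in line with the estimate above.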


05-18-2010 , 05:31 AM

#419

DocOfDan

journeyman

Join Date: Jan 2005 Posts: 246

Maybe I can throw a question into the mix (I asked it elsewhere but got no responses!)

A simplified version of the problem is as follows:

I have n products on sale and I want to predict which of these will be top seller in a given week. For each of the n products I have 4 (binary) attributes - let's call them A, B, C & D.

n can vary from ~4 to ~20

Attributes A & B are available for all products, while attributes C & D will be unavailable/not applicable for new products (i.e. they are based on historical performance, e.g. C might be 'has been a top seller before' etc.)

I have large amounts of historical data.

My initial approach
==============
For simplicity, if we neglect attributes C & D, so that we have no missing data, I would create a 'rating' for each product, which would be a linear combination of A and B, e.g. R_i = alpha_1 * A_i + alpha_2 * B_i

I would define the probability of product i being the top seller as:

p_i = Exp(R_i)/Sum_i[Exp(R_i)]

and run a regression on my data to get the most appropriate parameter values for alpha

The heart of the problem
===================
What do I do about (partially) missing data?
Should I

a) Define a new indicator variable E ('is new product'), set C_i and D_i to zero for all new products and redefine R_i = Sum(alpha_1 * A + alpha_2 * B + alpha_3 * C + alpha_4 * D + alpha_5 * E) etc

or

b) take some other approach?
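For concreteness, the rating model described under "My initial approach" can be sketched as follows. The attribute values and the alpha coefficients here are hypothetical placeholders, not fitted values, and the function name `top_seller_probabilities` is made up for illustration.

```python
import math


def top_seller_probabilities(attributes, alphas):
    """p_i = exp(R_i) / sum_j exp(R_j), with R_i a linear combination of attributes."""
    ratings = [sum(a * x for a, x in zip(alphas, row)) for row in attributes]
    shift = max(ratings)  # subtracting the max avoids overflow and leaves p_i unchanged
    weights = [math.exp(r - shift) for r in ratings]
    total = sum(weights)
    return [w / total for w in weights]


products = [(1, 0), (0, 1), (1, 1)]  # hypothetical (A_i, B_i) for three products
alphas = (0.8, 0.3)                  # hypothetical alpha_1, alpha_2
probs = top_seller_probabilities(products, alphas)
print(probs)
```

Fitting the alphas to historical data would be a separate step (for example by maximum likelihood), which this sketch does not attempt.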


05-18-2010 , 09:25 PM

#420

FBGHooper

adept

Join Date: Jun 2005 Posts: 974

Quote:

Originally Posted by DocOfDan

are we not really asking here what is the probability that KK is dealt to one of 9 players from a 50 card deck (i.e. the two aces removed)?
i.e. ~ 1 in 23?

No, but I think you're correct in pointing out that GIVEN was a poor choice of words. What I'm asking about is a compound probability problem. Or rather, the likelihood of both A and B occurring.

What is the probability that (1) a player was dealt AA and (2) another player was dealt KK?


05-19-2010 , 03:53 AM

#421

DocOfDan

journeyman

Join Date: Jan 2005 Posts: 246

Quote:

Originally Posted by FBGHooper

Yes, I agree

It is the difference between asking "how many hands would I typically have to wait to see someone get dealt kings when someone else gets aces" and "what fraction of the time does someone get kings when someone else has aces"


05-27-2010 , 11:19 AM

#422

jason1990

old hand

Join Date: Sep 2004 Posts: 1,889

Quote:

Originally Posted by FBGHooper

QUESTION: GIVEN THAT SOMEONE AT A FULL RING (10 SEAT) TABLE IS DEALT AA, WHAT IS THE LIKELIHOOD OF SOMEONE ELSE BEING DEALT KK?

Quote:

Originally Posted by FBGHooper

... I think you're correct in pointing out that GIVEN was a poor choice of words. What I'm asking about is a compound probability problem. Or rather, the likelihood of both A and B occurring.

What is the probability that (1) a player was dealt AA and (2) another player was dealt KK?

Quote:

Originally Posted by FBGHooper

By my logic we should expect to see two random players in a full ring game have AA vs KK once approximately every 501.34 hands.

Let

A = "Someone was dealt AA, and someone else was dealt KK."

Then we want P(A). Let

A_i = "Player i was dealt AA, and someone else was dealt KK."

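Then by inclusion-exclusion and symmetry (at most two players can hold AA, so intersections of three or more of the A_i are empty),

P(A) = 10P(A₁) - 45P(A₁ ∩ A₂).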

Now let

B_i = "Player 1 was dealt AA, and Player i was dealt KK."

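Then, since A₁ is the union of B₂, ..., B₁₀ and at most two players can hold KK,

P(A₁) = 9P(B₂) - 36P(B₂ ∩ B₃).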

Finally, let

C_i = "Player 1 was dealt AA, Player 2 was dealt AA, and Player i was dealt KK."

Then, since A₁ ∩ A₂ is the union of C₃, ..., C₁₀,

P(A₁ ∩ A₂) = 8P(C₃) - 28P(C₃ ∩ C₄).

Combining these, and observing that P(B₂ ∩ B₃) = P(C₃), we have

P(A) = 90P(B₂) - 720P(C₃) + 45(28)P(C₃ ∩ C₄).

Since B₂ = "Player 1 was dealt AA, and Player 2 was dealt KK.", we can make the notation more suggestive by writing B₂ = A,K. Similarly, let us write C₃ = A,A,K and C₃ ∩ C₄ = A,A,K,K. With this notation, the formula may become more intuitive, as it says

P(A) = 90P(A,K) - 720P(A,A,K) + 45(28)P(A,A,K,K).

Your logic only computes the first term, 90P(A,K). But the other terms are small and have only a slight effect on the final answer. I calculated a final answer of 0.0019805, which is about 1 in 504.9.
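The three terms are easy to evaluate exactly. This is a sketch using Python's `fractions` and `math.comb`; the helper `pair_prob` and the variable names are illustrative, and each factor is the chance that a given player is dealt two specific-rank cards from what remains of the deck.

```python
from fractions import Fraction
from math import comb


def pair_prob(cards_left: int, rank_cards_left: int = 4) -> Fraction:
    """Chance that a 2-card hand dealt from `cards_left` cards is a pair of a
    rank that still has `rank_cards_left` cards in the deck."""
    return Fraction(comb(rank_cards_left, 2), comb(cards_left, 2))


p_ak = pair_prob(52) * pair_prob(50)                    # P(A,K)
p_aak = pair_prob(52) * pair_prob(50, 2) * pair_prob(48)  # P(A,A,K)
p_aakk = p_aak * pair_prob(46, 2)                       # P(A,A,K,K)

first_term = 90 * p_ak
p_total = first_term - 720 * p_aak + 45 * 28 * p_aakk

print(float(first_term), 1 / float(first_term))  # first term alone
print(float(p_total), 1 / float(p_total))        # full inclusion-exclusion
```

The first term alone reproduces the "once every 501.34 hands" estimate, and the full sum gives roughly 0.0019805.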


05-27-2010 , 01:01 PM

#423

jason1990

old hand

Join Date: Sep 2004 Posts: 1,889

Quote:

Originally Posted by DocOfDan

I have n products on sale and I want to predict which of these will be top seller in a given week. For each of the n products I have 4 (binary) attributes - let's call them A, B, C & D...

My initial approach
==============
For simplicity, if we neglect attributes C & D, so that we have no missing data, I would create a 'rating' for each product, which would be a linear combination of A and B, e.g. R_i = alpha_1 * A_i + alpha_2 * B_i

I would define the probability of product i being the top seller as:

p_i = Exp(R_i)/Sum_i[Exp(R_i)]

For even more simplicity, let us suppose we have only two products, Product 1 and Product 2. Suppose we know B₁, A₂, and B₂. But all we know about A₁ is that it is very likely to be 0. So P(A₁ = 0) = 1 - ε. Let

T = "Product 1 is the top seller."

For us, the odds of T are approximately

P(T) / P(not T) ≈ exp(α₂B₁ - α₁A₂ - α₂B₂).

Now suppose we receive some information J which implies we were wrong about A₁. In other words, P(A₁ = 1 | J) = 1 - ε. Then our odds of T change to

P(T | J) / P(not T | J) ≈ exp(α₁ + α₂B₁ - α₁A₂ - α₂B₂).

But according to Bayes' theorem,

P(T | J) / P(not T | J) = [P(J | T) / P(J | not T)] · [P(T) / P(not T)].

Therefore,

P(J | T) / P(J | not T) ≈ exp(α₁).

Very loosely speaking, we can interpret this as follows. Suppose α₁ > 0. We had a product in our hands (Product 1) and we thought that A₁ was zero. The probability that we would discover we were wrong about that would be higher if the product in our hands was the top seller, than if it was not. In fact, it is roughly exp(α₁) times higher.

But more importantly, the ratio of these probabilities did not depend on B₁. It is in this sense that the attributes have independent effects on the product, and this behavior is built right into the structure of the model.

Your option (a) below does not seem consistent with this behavior, so it does not seem like a viable choice.

Quote:

Originally Posted by DocOfDan

What do I do about (partially) missing data?
Should I

a) Define a new indicator variable E ('is new product'), set C_i and D_i to zero for all new products and redefine R_i = Sum(alpha_1 * A + alpha_2 * B + alpha_3 * C + alpha_4 * D + alpha_5 * E) etc

or

b) take some other approach?
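The claim that the likelihood ratio is about exp(α₁) whatever the value of B₁ can be checked numerically in the two-product setting. The sketch below takes the ε → 0 limit (A₁ is treated as exactly 0 before J and exactly 1 after), and the alpha values and the attributes of Product 2 are hypothetical.

```python
import math


def odds_product1_top(a1, b1, a2, b2, alpha1, alpha2):
    """Odds p_1 / p_2 under p_i = exp(R_i) / (exp(R_1) + exp(R_2))."""
    r1 = alpha1 * a1 + alpha2 * b1
    r2 = alpha1 * a2 + alpha2 * b2
    return math.exp(r1 - r2)


alpha1, alpha2 = 0.7, 1.2  # hypothetical coefficients
a2, b2 = 1, 0              # hypothetical attributes of Product 2

ratios = []
for b1 in (0, 1):
    before = odds_product1_top(0, b1, a2, b2, alpha1, alpha2)  # believing A_1 = 0
    after = odds_product1_top(1, b1, a2, b2, alpha1, alpha2)   # after learning A_1 = 1
    ratios.append(after / before)  # equals P(J | T) / P(J | not T) by Bayes' theorem
    print(b1, ratios[-1], math.exp(alpha1))
```

Both values of B₁ give the same ratio, exp(α₁), which is the independence property described above.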


05-27-2010 , 03:47 PM

#424

egj

grinder

Join Date: Sep 2004 Posts: 673

Quote:

Originally Posted by DocOfDan

My initial approach
==============
For simplicity, if we neglect attributes C & D, so that we have no missing data, I would create a 'rating' for each product, which would be a linear combination of A and B, e.g. R_i = alpha_1 * A_i + alpha_2 * B_i

I would define the probability of product i being the top seller as:

p_i = Exp(R_i)/Sum_i[Exp(R_i)]

and run a regression on my data to get the most appropriate parameter values for alpha

This looks similar to logistic regression. Is your initial approach in fact identical to logistic regression? If not, maybe you should consider logistic regression.


05-27-2010 , 04:04 PM

#425

egj

grinder

Join Date: Sep 2004 Posts: 673

Quote:

Originally Posted by egj

This looks similar to logistic regression. Is your initial approach in fact identical to logistic regression? If not, maybe you should consider logistic regression.

On second thought, I wonder if you are framing the problem the right way. You seem to be setting up a regression where the dependent variable is the probability of a binary event (being the top-seller or not). If you have raw sales data, I wonder if you should instead be doing a regression where the dependent variable is the real-valued quantity "total sales". Your predicted top-seller would then of course just be the product with the highest predicted total sales.


Page 17 of 24
