The Riddler

03-04-2016 , 12:03 PM
This is from a weekly column on FiveThirtyEight called The Riddler. How would one go about solving this?
Two players go on a hot new game show called “Higher Number Wins.” The two go into separate booths, and each presses a button, and a random number between zero and one appears on a screen. (At this point, neither knows the other’s number, but they do know the numbers are chosen from a standard uniform distribution.) They can choose to keep that first number, or to press the button again to discard the first number and get a second random number, which they must keep. Then, they come out of their booths and see the final number for each player on the wall. The lavish grand prize — a case full of gold bullion — is awarded to the player who kept the higher number.

Which number is the optimal cutoff for players to discard their first number and choose another? Put another way, within which range should they choose to keep the first number, and within which range should they reject it and try their luck with a second number?
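
For anyone who wants to experiment before reading further, here is a minimal Python sketch of the game as stated. The function names and the 200,000-trial count are arbitrary illustrative choices, not anything from the column.
Code:
import random

def final_number(cutoff):
    # Keep the first draw if it is at least `cutoff`, otherwise discard it
    # and keep a fresh uniform draw.
    first = random.random()
    return first if first >= cutoff else random.random()

def p1_win_rate(cutoff1, cutoff2, trials=200_000):
    # Monte Carlo estimate of the chance that player 1's kept number
    # beats player 2's (ties have probability zero).
    wins = sum(final_number(cutoff1) > final_number(cutoff2) for _ in range(trials))
    return wins / trials

print(p1_win_rate(0.5, 0.5))   # identical strategies, so about 0.5
print(p1_win_rate(0.5, 0.6))   # try any pair of cutoffs you like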
03-04-2016 , 01:04 PM
based only on intuition and logic i'd say you have to keep the number if it's > (or maybe =) 0.5. if you get 0.49, you have a better chance of drawing a better number on the second try than a worse one, so you should discard 0.49. if you get 0.51, the second try is more likely to give you a worse number, so you would keep it. i don't know if this is correct, but it makes sense to me.

0.5 is kind of 'whatever', since the chance for a bigger number is the same as for a lower one.
03-04-2016 , 01:45 PM
The other guy gets to replace also. Can he beat your strategy?
03-04-2016 , 02:20 PM
if what i posted is GTO solution for this problem, opponent has to play exactly the same and it would be EV=0 for both players. i think it can also be EV=0 for both players if they play exactly the same way non-GTO, for example both change when it's lower than 0.6.

but if two players play differently, one has to have an edge. let's say that one player changes the number when it's lower than 0.6. he should then have a bigger quantity of low numbers on a large sample than the guy playing GTO and he would be EV-. GTO strategy would be EV+ against that player, but there would exist a strategy that's even better than GTO against that player, to exploit him. that strategy would have a limit lower than 0.5 for us keeping the number we get.
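
To put a number on that kind of exploitation, one rough approach is a grid search by simulation: fix the opponent's cutoff at 0.6 (the example above) and try every hero cutoff. This is only a sketch; the grid, seed, and trial count are arbitrary choices.
Code:
import numpy as np

rng = np.random.default_rng(0)

def kept(cutoff, trials):
    # Final numbers for a player who redraws anything below `cutoff`.
    first = rng.random(trials)
    return np.where(first >= cutoff, first, rng.random(trials))

def win_rate(hero_cutoff, villain_cutoff, trials=200_000):
    return float(np.mean(kept(hero_cutoff, trials) > kept(villain_cutoff, trials)))

villain = 0.6                               # the example cutoff from the post
hero_grid = np.linspace(0, 1, 101)
rates = [win_rate(h, villain) for h in hero_grid]
best = hero_grid[int(np.argmax(rates))]
print(f"most profitable reply to {villain}: ~{best:.2f}, win rate {max(rates):.4f}")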
03-04-2016 , 03:50 PM
Quote:
Originally Posted by md46135
if what i posted is GTO solution for this problem, opponent has to play exactly the same and it would be EV=0 for both players. i think it can also be EV=0 for both players if they play exactly the same way non-GTO, for example both change when it's lower than 0.6.

but if two players play differently, one has to have an edge. let's say that one player changes the number when it's lower than 0.6. he should then have a bigger quantity of low numbers on a large sample than the guy playing GTO and he would be EV-. GTO strategy would be EV+ against that player, but there would exist a strategy that's even better than GTO against that player, to exploit him. that strategy would have a limit lower than 0.5 for us keeping the number we get.
Nobody is doubting your "if-then" logic. But simply saying that x=0.5 is the best strategy is highly doubtful.

Consider if you knew the other player's cutoff = Y. Would you still choose your own cutoff X = 0.5? If your cutoff is X and your opponent's cutoff is Y, what is the probability that you'll win?

Anyway, I will post my answer in the next post.
03-04-2016 , 04:03 PM
Spoiler:
I worked through all the cases ({(discard, discard), (discard, keep), (keep, discard), and (keep, keep)}) using simple probability. In each case you need to calculate the prob of the case occurring and the prob that you win.

By doing so, if you knew your opponent's cutoff is Y and your cutoff is X, it is straightforward to calculate your probability of winning P[(X,Y)] using simple calculus. Then it is a bit more simple calculus to determine your best-response cutoff X*(Y) that maximizes your probability of winning. It takes a bit of care, but it is pretty easy to do. Of course, this gives your probability of winning P[(X*(Y),Y)].

Finally, your opponent can now counter-respond by changing his cutoff to Y*(X*(Yo)) where the Yo denotes his initial cutoff. Of course, this response - counter-response dance continues until an equilibrium is reached which is easy to calculate.

I am hesitant to post my answer for two reasons: (i) I don't want to stifle anyone else from working on this neat problem; and (ii) I may have made a silly math error which would be totally embarrassing (lol). Of course, I have double-checked my answer so I am quite confident that it is correct.

I will leave it at that for now.

Last edited by whosnext; 03-04-2016 at 04:58 PM.
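
Here is a Python sketch of that best-response iteration, for anyone who wants to follow along. The win probability is computed by numerically integrating the kept-number distributions implied by the rules: a cutoff-c player's kept number has density c + 1 above c and density c below it, hence CDF c*x + max(0, x - c). That little derivation is worked out here rather than quoted from any post, so treat it as an assumption to check.
Code:
import numpy as np

def win_prob(hero_cut, villain_cut, n=20_000):
    # P(hero's kept number > villain's): integrate hero's density times
    # villain's CDF over [0, 1] with a midpoint sum.
    x = (np.arange(n) + 0.5) / n
    hero_density = hero_cut + (x >= hero_cut)
    villain_cdf = villain_cut * x + np.maximum(0.0, x - villain_cut)
    return float(np.mean(hero_density * villain_cdf))

def best_response(villain_cut, grid=np.linspace(0, 1, 501)):
    # Hero cutoff that maximizes win_prob against a fixed villain cutoff.
    return float(grid[np.argmax([win_prob(h, villain_cut) for h in grid])])

# The response / counter-response "dance": take turns re-optimizing.
cutoff = 0.5
for _ in range(20):
    cutoff = best_response(cutoff)
print(cutoff)   # should settle near the equilibrium cutoff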
03-04-2016 , 05:34 PM
interesting, it's not as simple as it first seemed
i wrote a matlab program to test it. i tried it for cutoff 0.5 and here are the results.
Spoiler:

hero cutoff is fixed at 0.5. on the x axis you have opponent's cutoff from 0 to 1 in steps of 0.01. for every opponent cutoff (101 of them total) i ran 100000 trials and plotted hero's chance of winning on the y axis. if you compare it to the red 50% line, you can see hero's probability of winning drops below 50% when the opponent's cutoff is somewhere near 0.6.

here are the exact numbers
Spoiler:
Columns 1 through 20

62.3560 61.8290 61.7730 61.4030 60.9380 60.7110 60.4990 59.6530 59.5600 59.4150 59.1760 58.7550 58.1140 57.8450 58.0640 57.4120 57.2210 57.0540 56.4340 56.2590

Columns 21 through 40

55.6770 55.5510 55.3180 55.3690 55.0120 54.6390 54.1940 53.8670 54.0030 53.7710 53.6380 52.9920 52.9180 52.8590 52.6610 52.4830 52.1000 51.9020 51.6750 52.0130

Columns 41 through 60

51.1090 51.1840 50.9040 50.9480 50.8780 50.6550 50.5010 50.2440 50.2880 49.9700 50.0770 49.9090 49.8290 49.6250 49.5610 49.8000 49.4220 49.6140 49.5240 49.2890

Columns 61 through 80

49.4970 49.6930 49.5600 49.7350 49.6600 49.7910 50.1050 49.9830 50.1250 50.3910 50.5350 50.4720 50.8750 51.1520 51.4420 51.4980 52.2020 52.1480 52.4000 52.3470

Columns 81 through 100

52.7550 53.7550 53.7530 53.9320 54.2190 54.7270 55.2310 55.7060 56.1420 56.5390 56.9240 57.8190 57.9410 58.5980 59.0590 59.6020 60.0890 60.7640 61.1280 61.7290

Column 101

62.6480


it does worst against opponent cutoff 0.59, so i ran it for 0.59. results in spoiler.
Spoiler:


you can see there are fewer values below the red line and they're not far below it.
again, exact results
Spoiler:
Columns 1 through 20

62.2500 62.0370 61.2830 60.9140 60.6090 60.3220 59.9540 59.7220 59.1110 58.8640 58.5090 58.3660 57.9490 57.5870 57.3900 57.1410 56.8850 56.3510 56.4140 55.9940

Columns 21 through 40

55.5700 55.6020 54.8370 54.7780 54.6560 54.8100 54.2450 54.1160 53.8060 53.5220 53.4700 53.3480 52.9750 53.0200 52.8000 52.3020 52.3110 52.2600 52.2540 51.6910

Columns 41 through 60

51.7050 51.3710 51.4170 51.2630 51.0530 51.5080 51.0080 50.7350 50.4230 50.5270 50.5900 50.2870 50.1430 50.1870 50.1190 50.3860 50.2630 49.6850 49.9140 49.9500

Columns 61 through 80

50.0990 49.8170 49.8210 50.1750 50.1610 50.1380 50.1010 50.1070 50.3060 50.3830 50.5700 50.6860 50.8320 51.4340 51.1740 51.5430 51.7750 52.1320 52.3940 52.5230

Columns 81 through 100

52.9820 53.1670 53.4450 53.7020 54.2690 54.4140 55.1620 55.3590 55.7100 56.1310 56.6400 57.3370 57.8630 58.0890 58.6040 59.4350 59.8280 60.2820 60.9090 61.5890

Column 101

62.3720


this one does worst against 0.57. finding the minimum of that curve should give an exact result.
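
the same sweep is easy to reproduce without MATLAB; here is a rough Python equivalent. it is a re-creation of the experiment described above, not the original script, and the seed, trial count, and grid are arbitrary.
Code:
import numpy as np

rng = np.random.default_rng(1)

def kept(cutoff, trials):
    # Final numbers for a player who redraws anything below `cutoff`.
    first = rng.random(trials)
    return np.where(first >= cutoff, first, rng.random(trials))

hero_cutoff = 0.5
villain_cutoffs = np.linspace(0, 1, 101)        # step 0.01, as in the post
rates = [float(np.mean(kept(hero_cutoff, 100_000) > kept(v, 100_000)))
         for v in villain_cutoffs]
worst = villain_cutoffs[int(np.argmin(rates))]
print(f"hero at {hero_cutoff} does worst vs villain cutoff ~{worst:.2f} "
      f"(win rate {min(rates):.4f})")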
03-04-2016 , 06:43 PM
You are in the right ballpark.
03-05-2016 , 12:41 AM
Quote:
Originally Posted by md46135
0.5 is kind of 'whatever', since the chance for a bigger number is the same as for a lower one.
But you are playing against a thinking opponent. There is an equilibrium somewhere but it likely isn't 0.5.
03-05-2016 , 06:24 AM
i think i have found an exact solution for this problem. it seems to match the simulated numbers pretty well. anyway, here it is:

suppose you have two players - hero and villain. let h be hero's cutoff point (meaning that if a generated number is lower than that point, it gets discarded). also, let v be villain's cutoff point.

like whosnext said, there are 4 possible scenarios: both players keep, hero discards villain keeps, hero keeps villain discards and both discard. i used geometric probability (calculus) and a bit of conditional probability.

1) both players keep
(1-h)*(1-v) is the chance for both of them to keep. chance that hero's number is greater is 0.5*(1-v^2) - v*(1-v), but this has to be divided by (1-h)*(1-v) because it's conditional probability and (1-h)*(1-v) is the total observed geometric area. i hope that my writing makes sense

so, the chance for both players to keep and hero to win is simply 0.5*(1-v^2) - v*(1-v)


2) hero keeps villain discards
again, similar thinking. (1-h)*v for that scenario to occur multiplied by 0.5*(1-h^2) for hero to win divided by (1-h) because of conditional probability.

probability for this case v*0.5*(1-h^2)

3) hero discards villain keeps
h*(0.5* (1-h^2) - v*(1-h) )


4) both discard
h*v*0.5

total probability for hero to win is the sum of all 4 of those, and that's
(h*v)/2 + v*(v - 1) - v^2/2 + h*(v*(h - 1) - h^2/2 + 1/2) - (v*(h^2 - 1))/2 + 1/2
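
one way to sanity-check a closed form like this is to type it in next to a brute-force simulation and compare a few (h, v) pairs. the pairs below are arbitrary, and agreement at a handful of points obviously doesn't prove the derivation.
Code:
import random

def posted_formula(h, v):
    # The expression from the post above, typed in verbatim.
    return ((h * v) / 2 + v * (v - 1) - v**2 / 2
            + h * (v * (h - 1) - h**2 / 2 + 1 / 2)
            - (v * (h**2 - 1)) / 2 + 1 / 2)

def simulated(h, v, trials=300_000):
    # Direct Monte Carlo estimate of hero's win probability.
    wins = 0
    for _ in range(trials):
        x = random.random()
        if x < h:
            x = random.random()
        y = random.random()
        if y < v:
            y = random.random()
        wins += x > y
    return wins / trials

for h, v in [(0.5, 0.5), (0.5, 0.6), (0.6, 0.5), (0.62, 0.62), (0.7, 0.4)]:
    print(h, v, round(posted_formula(h, v), 4), round(simulated(h, v), 4))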
03-05-2016 , 06:34 AM
some results:
if v=0.5, this is p(h).
Spoiler:


exact maximum: probability to win = 0.5070, h=0.607625
check my post from yesterday, seems correct.


also, it seems the equilibrium point is for both players to use cutoff 0.618034

Last edited by md46135; 03-05-2016 at 06:48 AM.
03-05-2016 , 08:54 AM
Spoiler:
I think you have some errors in your probability derivations. For one, hero's probability of winning should depend upon whether h>v or h<v in one or more of the four cases.
03-05-2016 , 09:22 AM
Quote:
Originally Posted by md46135
1) both players keep
(1-h)*(1-v) is the chance for both of them to keep. chance that hero's number is greater is 0.5*(1-v^2) - v*(1-v), but this has to be divided by (1-h)*(1-v) because it's conditional probability and (1-h)*(1-v) is the total observed geometric area. i hope that my writing makes sense
How did you get this value?
03-05-2016 , 09:53 AM
i only said the numbers looked close to simulation results, i'm not an expert (obviously) and i don't know if it's 100% correct. i derived the first case the way i've written it on this image.
Spoiler:

(1-h)*(1-v) is the probability that you are somewhere in that green rectangle, meaning that both players keep their numbers. you have to multiply that with p(v<h) to get the probability for hero to win.

p(v<h) should be red area (triangle) divided by green area (rectangle). when you calculate that, h just disappears from the formula.

Last edited by md46135; 03-05-2016 at 10:00 AM.
03-05-2016 , 10:22 AM
I haven't done the math yet, just came up with an intuitive guess, one that's too simple for me to be very confident in.

Pick a cutoff g that, once reached, gives Villain a 50% chance of beating g when aiming for g.

1 - g^2 = .5
g = sqrt(.5) =~ .707

If Villain has any higher cutoff, say .75, then Hero profits because of the times Villain starts with something like .73 and then squanders it (which results in Villain surpassing g <50% of the time).

If Villain has any lower cutoff, say .65, then Hero profits because some of the 50% of the time Hero doesn't reach g, Hero will still surpass .65, so overall Hero will surpass Villain >50% of the time.

I'll be more confident once I do the math. I think a proof will require some basic calculus.
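
A quick way to stress-test the guess before doing the math: fix one player at sqrt(0.5) and simulate against a handful of other cutoffs. This is just a sketch; the list of opposing cutoffs, the seed, and the trial count are arbitrary.
Code:
import math
import numpy as np

rng = np.random.default_rng(2)

def kept(cutoff, trials):
    # Final numbers for a player who redraws anything below `cutoff`.
    first = rng.random(trials)
    return np.where(first >= cutoff, first, rng.random(trials))

g = math.sqrt(0.5)                                  # the guessed cutoff, ~0.707
for v in (0.55, 0.60, 0.62, 0.65, 0.75):
    rate = float(np.mean(kept(g, 500_000) > kept(v, 500_000)))
    print(f"cutoff {g:.3f} vs {v:.2f}: win rate {rate:.4f}")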
03-05-2016 , 11:47 AM
I get the same number as md46135, which is actually 1/phi, phi being the golden ratio,
but in a different way which can probably be formalised.

Let a be the cutoff. The expected value is E(X)=E(A)+E(B) where A is drawing twice (so P(A)=a) and B is drawing once. E(A)=a*0.5 as the second try has expectation 0.5.
Also E(B)=(1-a)*(1+a)/2 since P(B)=1-a and (1+a)/2 is the "average" (This can be formalised by considering uniform distribution and integrating from a to 1)

then E(X)=-0.5*(a^2-a-1)

Here is where I got stuck, as I thought you have to find the maximum, which leads to a=0.5. But you should instead set E(X)=a, which leads to the familiar (cf. Fibonacci sequence) equation a^2+a-1=0 and the answer above

So getting the same result as md and the (inverse) golden ratio makes me more confident. I still need to read in more depth md's answer, mine could maybe have some oversimplifications
03-05-2016 , 01:23 PM
Seems to me in GTO it would be that we lose (1-X) of the time on try 1, with X being where we chose to hold; then on his second try, if we are both rational, he will win (1-X)X. So our opponent's total probability is (1-X)+(1-X)X, or 1-X^2.
03-05-2016 , 02:15 PM
Spoiler:
md46135, your evaluations of the integrals are not correct in some cases. I gave a hint in an earlier spoiler. Plus I think there is another mistake in another integral.

kapw7, this is a clever solution technique. If I am following you are essentially saying that at the equilibrium point, a player's average value will be equal to the equilibrium cutoff point. I haven't thought of it that way.

Can anybody shed light on why this condition is true (I am not doubting that it is true, I just need some enlightenment)?

And, yes, (sqrt(5)-1)/2 is the equilibrium solution I found too.
03-05-2016 , 02:27 PM
the paint image i posted is valid only for the first case, others are different. i don't know, but it would be very lucky to just stumble onto the right solution with 4 different integrals...
03-05-2016 , 07:26 PM
Quote:
Originally Posted by whosnext
kapw7, this is a clever solution technique. If I am following you are essentially saying that at the equilibrium point, a player's average value will be equal to the equilibrium cutoff point. I haven't thought of it that way.

Can anybody shed light on why this condition is true (I am not doubting that it is true, I just need some enlightenment)?
You are right, it is handwavy. This (that it is a necessary AND sufficient condition) has to be proved, and I could not think of a convincing argument for it. You probably have to go through double integrals etc., but then it doesn't look that charming anymore. For me the best part of the problem is that maximizing the final score is not the best strategy. Hero can choose a=0.5 and maximize his final score, but then villain can still beat him by having a higher number of wins. So hero will be winning more comfortably the times he is ahead, but villain will be making more wins overall (narrower ones).
03-06-2016 , 09:46 PM
If my opponent and I have the same strategy, cutoff Q, and if it is unexploitable, that means that when I flip against his unknown number I will have the same chance of winning as when I stand those times I am exactly at the cutoff and choose to stand. When I stand with Q, my chances are the probability he needs to flip, which is Q, times the probability my cutoff hand will beat his flip, which is also Q.

So my chances are Q squared

When I have a flipping hand, my chances are 1/2 those Q times he flips (thus Q/2), plus half the value of the distance between Q and 1 (e.g. .15 if Q was .7), which is [(1-Q)/2], times the probability he stands pat (which is 1-Q)

So my chances by flipping are (Q + 1 - 2Q + Qsquared)/2

But that has to equal the chance of winning if you are dealt exactly Q and stand. Do you see why?

Thus Qsquared = (1-Q+ Qsquared)/2

Turning into Qsquared + Q-1=0

Which means that Q is the [(square root of 5)-1]/2
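
The indifference condition above is a one-liner to check symbolically. This uses sympy, which is not part of the thread, just a convenient checker.
Code:
from sympy import Eq, solve, symbols

Q = symbols("Q", positive=True)

stand = Q**2                        # win by standing on exactly Q
redraw = (Q + 1 - 2*Q + Q**2) / 2   # win by flipping from exactly Q, as derived above

solution = solve(Eq(stand, redraw), Q)
print(solution)                 # [-1/2 + sqrt(5)/2]
print(solution[0].evalf())      # 0.618033988749895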
03-08-2016 , 05:31 PM
Damn. That's awesome. Well done.
03-11-2016 , 10:18 AM
The Answer from FiveThirtyEight
Let C be the optimal cutoff the players use. The key observation is that if the first number revealed is exactly C, then the probability of winning by keeping C equals the probability of winning by pressing the button again — you are indifferent. We can compute each of these probabilities, keeping in mind that the other player is also using C as their cutoff.

probability player 1 wins by keeping C = probability player 2 gets a number below C for both presses = C⋅C

probability player 1 wins by pressing again = (probability player 2 presses again) * (probability player 1 wins by pressing again | player 2 presses again) + (probability player 2 keeps first number) * (probability player 1 wins by pressing again | player 2 keeps first number) = (C)⋅(1/2)+(1−C)⋅((1−C)⋅1/2)

As noted above, the above two are equal, so

C⋅C = (C)⋅(1/2) + (1−C)⋅((1−C)/2), which simplifies to

C^2 + C − 1 = 0. The quadratic formula gives the solution.

C = (√5 − 1)/2 = 0.618034…

Note that this cutoff is the golden ratio minus one, known as the golden ratio conjugate. So using the golden ratio gives the best chance to win the gold bullion!

Looks a lot like Sklansky's answer to me!
03-11-2016 , 02:50 PM
Quote:
Originally Posted by Didace
The Answer from FiveThirtyEight
Let C be the optimal cutoff the players use. The key observation is that if the first number revealed is exactly C, then the probability of winning by keeping C equals the probability of winning by pressing the button again — you are indifferent. We can compute each of these probabilities, keeping in mind that the other player is also using C as their cutoff.

probability player 1 wins by keeping C = probability player 2 gets a number below C for both presses = C⋅C

probability player 1 wins by pressing again = (probability player 2 presses again) * (probability player 1 wins by pressing again | player 2 presses again) + (probability player 2 keeps first number) * (probability player 1 wins by pressing again | player 2 keeps first number) = (C)⋅(1/2) + (1 − C)⋅((1 − C)⋅1/2)

As noted above, the above two are equal, so

C⋅C = (C)⋅(1/2) + (1 − C)⋅((1 − C)/2), which simplifies to

C^2 + C − 1 = 0. The quadratic formula gives the solution.

C = (√5 − 1)/2 = 0.618034…

Note that this cutoff is the golden ratio minus one, known as the golden ratio conjugate. So using the golden ratio gives the best chance to win the gold bullion!

Looks a lot like Sklansky's answer to me!
I made a couple of corrections and did a little reformatting in the posted solution above.

You missed the following from the article:
...I’m also proud to report that David Sklansky, poker god, correctly solved this problem, although he was not the randomly chosen winner. Sorry, David.