CoTW: Why all-in-EV is a horrible measure of overall luck - Page 2 - Micro Stakes Pot Limit and No Limit

First of all, luck-a and luck-b are not events, but statistics. Furthermore, events that are mutually exclusive are never independent (except for trivial anomalies like events with probability 0). Most importantly, however, I seriously doubt that luck-a and luck-b as random variables are independent. If you catch better hands as per luck-b you're bound to get it in more often, which influences luck-a.

Thanks for pointing out my imprecise language regarding events and statistics.

Let's take my 107k hands mentioned in the OP as an example. In that sample I had 389 AIEV situations and 13733 VPIPs for non-AIEV. Let us assume that, with the same luck-a and luck-b, my next 107k hands would break down the same way (389 AIEV, 13733 non-AIEV VPIPs).

changing luck-b clearly can change how many AIEVs I get in the sample (e.g. perhaps I get 420 AIEVs next 107k hands instead of 389)
changing luck-b does not affect whether I run above or below EV for each all-in. I.e. the fact that when I have AA vs KK, the KK villain is a drooler more often (improved luck-b) does not change my equity when we get it AI preflop nor does it affect whether I run hot or cold in this situation.
running hot or cold for AIEV does not affect whether droolers or nits wake up with medium, 2nd or 3rd best hands when I have a strong hand.

Luck-a and luck-b are mutually exclusive in the sense that luck-b is defined as "all the luck other than luck-a". I stick by my (implicit?) assertion that no part of luck-b changes the probability that my AA vs your KK preflop will win 81% of the time. I.e. that luck-a and luck-b are independent.

Quote:

All in all it looks interesting, but a lot of the things you stated are rather murky.

Please be specific so I have the possibility of addressing a murky area. Of course it is possible that I leave the area murky because it is too much work to address it. After all, it is easier to look where the light is good.

EDIT: I see that my definition of luck-a is too tersely worded. luck-a is defined as how far above or below EV you run in AIEV situations. I.e. luck-a is what you see when you compare an AIEV graph with your actual results.

07-07-2010 , 03:30 PM

#28

mpethybridge

Carpal \'Tunnel

Join Date: Jul 2005 Posts: 16,997

Quote:

Originally Posted by zachvac

No it's not and it's a super-common misconception among a lot of people. If you think that money won is a better indication than the all-in ev line is of expected winnings you have a super flawed understanding of what luck means and what all-in ev is. I have since read the entire post and it doesn't refute my point at all. Money won uses 0% of luck. Even if all-in ev incorporates 1% of luck incorporating that 1% will still be more accurate than only using the 0% of money won.

This of course all is ignoring card removal which may be a flaw in all-in ev.

edit: btw not criticizing the OP in any way and I would assume that OP agrees with me? I agree all-in ev is a small amount of luck but it's the only fully quantifiable luck and quantifying 1% > 0%

I don't have a lot of time, so I will probably come back to this thread. The most important thing I have to say so far is +1 to Zachvac. He is absolutely correct. If all you have to go on is money won and ev adjusted money won, using ev adjusted money won is preferable, simply because controlling for some luck gives you a better result than controlling for no luck.

The second thing I would say is that the OP is correct, but is stating a very limited thesis. All in EV is a horrible measure of all luck in poker. But it doesn't purport to be anything like that. It only purports to be a measure of a specific type of luck, and its somewhat flawed methodology of measuring that specific type of luck makes it less reliable than perfect knowledge would be, but does not mean that it is useless or counterproductive to use.

The third thing I would say is that the OP is correct that there are a ton of ways to run bad. I just finished a database analysis for a guy whose win rate with AA was 1/3 below average, and, after detailed analysis, it turned out that the reason for this was that he was running bad in the frequency his opponents had a hand they could call him with.

07-07-2010 , 04:40 PM

#29

BaldAdonis

grinder

Join Date: Jul 2009 Posts: 499

Quote:

Originally Posted by funkyj

Problem 2:

Given luck-a and luck-b definitions above ...

If you could run 1/2 standard deviation below EV for one of the types of luck and 1 standard deviation above EV for the other type of luck for the rest of your life, which would you choose to run hot in, luck-a or luck-b?

This is interesting because the right answer depends a lot on playing style. If you are 20bb shortstacking, then you'll be in a lot more AI situations with a few cards to come, so being 1 std dev up on your chances there would be fantastic. It would certainly outweigh the benefit of opponents having strong hands when you have very strong hands, like KK v AA, because you can only win 20bb when those hands do show up.

07-07-2010 , 04:58 PM

#30

funkyj

Carpal \'Tunnel

Join Date: Jun 2008 Posts: 6,416

Quote:

Originally Posted by mpethybridge

Quote:

Originally Posted by Funkyj's original post

The problem with all-in-EV is not that it is a bad statistic but that many folks treat all-in-EV as if it is a measure of their overall poker luck.

fixed my post. Not everyone misuses AIEV but many do.

Quote:

Originally Posted by mpethy

... , but does not mean that it is useless or counterproductive to use.

Agreed. One of the areas where we are in the dark with respect to AIEV is what portion of poker luck it is measuring.

I think that comparing EV adjusted stddev/100 and regular stddev/100 might provide some insight into how big a factor AIEV variance is in a player's overall results. I need to learn how to calculate this new (?) stat.
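A minimal sketch of one way such a stat could be computed; this is not an established tracker feature. It assumes the actual result per 100 hands equals the EV-adjusted result plus an all-in noise term that is uncorrelated with it, so that the variances add. The function name and the example numbers are hypothetical.

```python
def allin_variance_share(sd_actual: float, sd_ev_adjusted: float) -> float:
    """Estimated share of results variance attributable to all-in luck.

    Assumption: Var(actual) = Var(EV-adjusted) + Var(all-in noise),
    i.e. the all-in noise is uncorrelated with the EV-adjusted result.
    Both inputs are standard deviations per 100 hands in the same unit.
    """
    if sd_actual <= 0 or sd_ev_adjusted < 0:
        raise ValueError("sd_actual must be > 0 and sd_ev_adjusted >= 0")
    share = 1.0 - (sd_ev_adjusted / sd_actual) ** 2
    # Sampling error can make the EV-adjusted SD come out slightly larger
    # than the actual SD, so clamp the share into [0, 1].
    return max(0.0, min(1.0, share))


# Hypothetical inputs, not taken from anyone's database.
print(allin_variance_share(sd_actual=45.0, sd_ev_adjusted=40.0))
```

Note that this would estimate a share of variance in results, which is narrower than "share of all poker luck": luck that shifts the EV-adjusted line itself (luck-b) stays inside the EV-adjusted variance and is not separated out.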

Until we have a better idea what fraction of overall luck AIEV is factoring out we don't know how much clearer controlling for AIEV luck is making the overall picture.

If mere results were the best measure of skill then we would have to agree that Hellmuth is the world's best NLHE tournament player ever by a huge margin...

By all means, prefer your AIEV line over your actual winnings line. I stick by my claim that comparing results graphs (actual or AIEV adjusted) to determine who is the better player for all but very large samples is a bad idea.

Nowadays I use my EV graph as a psychological tool to:

avoid overconfidence (sorry buddy, you are just running hot)
patience (yes, you have been running as bad in AIEV as you think you have)
motivate study (ugh, the graph of your AIEV only hands is horrible -- you need to stop stacking off bad)

Using AIEV for anything more than a tool to direct further investigation is flawed.

A while back in an epeen coaching thread where some high stakes (?) player was saying mpethy was not qualified to coach 200NL players for leak finder sessions he (mpethy) explained that seeing certain combinations of stats was not a guarantee that a leak was present. Instead it is a signpost that he should review hand histories from a particular type of situation to see if there was truly a leak or if the unusual stats were the result of this other luck (all the area outside of the well lit AIEV area). This is what I'm saying about results graphs (actual and AIEV) -- it doesn't tell us much by itself. It is merely a signpost that we may want to look into some things.

07-07-2010 , 05:08 PM

#31

funkyj

Carpal \'Tunnel

Join Date: Jun 2008 Posts: 6,416

Quote:

Originally Posted by BaldAdonis

Even shortstacking, luck-b affects what ranges your opponents have when they call your shove. You are correct that luck-a has a bigger impact on short stackers since a larger percentage of their VPIP hands are all in.

If you are a shortstacker and you shove and get called 10 times do you want those 10 calls to be from droolers and otherwise competent full stack players who don't know how to play effectively against short stacks or do you want those 10 calls to be from folks who play perfectly in the BTN, SB, BB against your HJ shove?

luck-b greatly impacts what winrate the short stacker's EV adjusted line shows.

07-07-2010 , 06:26 PM

#32

mpethybridge

Carpal \'Tunnel

Join Date: Jul 2005 Posts: 16,997

Quote:

Originally Posted by funkyj

Agreed. One of the areas where we are in the dark with respect to AIEV is what portion of poker luck it is measuring.

I think this is unknowable. The potential number of ways you can run bad is basically infinite, right? I mean, there is going to be a normal distribution of players who semi-bluff raise a flush draw with T8s--statistically speaking, some poor bastard is going to be in against AQ of his suit every time.

The one thing you do know about all in ev is that it tracks your luck in a spot that is, on average, a way bigger pot than most situations. So, while we may not be able to quantify what percentage of luck all in ev represents, we can describe it qualitatively as a pretty damn big deal.

Quote:

Originally Posted by funkyj

I think that comparing EV adjusted stddev/100 and regular stddev/100 might provide some insight into how big a factor AIEV variance is in a player's overall results. I need to learn how to calculate this new (?) stat.

No comment. I am not a math guy. I took a few stats courses twenty-odd years ago in college, and that knowledge has almost completely atrophied. So good luck, but I'm no help at all.

Quote:

Originally Posted by funkyj

Until we have a better idea what fraction of overall luck AIEV is factoring out we don't know how much clearer controlling for AIEV luck is making the overall picture.

This may be true, but to me, it goes into the "so what," category. If you could figure it out, I'd give you mad props, but I don't think it is important to our development as players.

Quote:

Originally Posted by funkyj

If mere results were the best measure of skill then we would have to agree that Hellmuth is the world's best NLHE tournament player ever by a huge margin...

Wait, you mean he's not?

Quote:

Originally Posted by funkyj

By all means, prefer your AIEV line over your actual winnings line. I stick by my claim that comparing results graphs (actual or AIEV adjusted) to determine who is the better player for all but very large samples is a bad idea.

I disagree. I think you just have to remember confidence intervals and the probabilities of a normal distribution when you do the comparison. If I am winning at 2ptbb at $50 and you are winning at 3ptbb, then the odds are that you are a better player than I am. Using SDs and sample sizes we can actually quantify the probabilities, but the simple fact of the matter is that most of the time the graphs will be a correct gauge.
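To make "using SDs and sample sizes we can actually quantify the probabilities" concrete, here is a rough sketch. It assumes each player's observed win rate is approximately normal around the true win rate (a statement about the average over many hands, not about single-hand results), no prior knowledge about either player, and a made-up standard deviation of 40 ptbb/100 over 100k hands each. The function name and all inputs are hypothetical.

```python
import math


def prob_b_better(wr_a, wr_b, sd_a, sd_b, hands_a, hands_b):
    """Approximate probability that B's true win rate exceeds A's.

    wr_*    observed win rates per 100 hands
    sd_*    standard deviations per 100 hands (same unit as wr_*)
    hands_* number of hands in each sample
    Uses a normal approximation for the difference of the two sample means.
    """
    se_a = sd_a / math.sqrt(hands_a / 100)   # standard error of A's win rate
    se_b = sd_b / math.sqrt(hands_b / 100)   # standard error of B's win rate
    se_diff = math.hypot(se_a, se_b)         # standard error of the difference
    z = (wr_b - wr_a) / se_diff
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal CDF at z


# Hypothetical: 2 vs 3 ptbb/100, SD 40 ptbb/100, 100k hands each.
print(round(prob_b_better(2, 3, 40, 40, 100_000, 100_000), 3))
```

With these made-up inputs the answer comes out at roughly 71%: better than a coin flip, so the graph is more likely right than wrong, but a long way from 95%.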

You calling this a bad idea leads me back to the point I seem to make a lot on these forums, which is that I think people have unreasonably high standards of proof with respect to poker. It is inherently a game of incomplete information, yet when people talk about win rates and how quickly HUD stats converge and whatnot, now, all of a sudden, anything less than a 95% confidence interval is dismissed as unreliable. It makes me smile, is all.

If you are going through life expecting to make your decisions at the 95% confidence level, you are going to have a hard life (and lose at poker along the way, too).

Quote:

Originally Posted by funkyj

Nowadays I use my EV graph as a psychological tool to:

avoid overconfidence (sorry buddy, you are just running hot)
patience (yes, you have been running as bad in AIEV as you think you have)
motivate study (ugh, the graph of your AIEV only hands is horrible -- you need to stop stacking off bad)

Using AIEV for anything more than a tool to direct further investigation is flawed.

I basically agree with this section of your post, I think. The last sentence seems overstated to me. All in EV controls for a specific type of variance. There are lots of times you want to control for variance to the extent you can. Basically, any time you are using your tracker to analyze your game, you want to be looking at your EV adjusted win rate, not your actual win rate, because you are incrementally closer to an accurate description of your profitability.

Basically, the ev adjusted win rate filters out a specific type of noise. It is always useful to filter out that type of noise, even if it leaves other types of noise behind.

The only bad use of ev adjusted win rates that people engage in is using it as an excuse to not work on their game--"My EV adjusted win rate is 2ptbb/100, therefore, I am beating the game even though I am b/e, therefore I don't need to work on my game." That thinking is flawed because their win rate is close enough to b/e that maybe they are running hot enough in other situations to give them a 2ptbb win rate when, in fact, they are a loser in the game. And it is flawed simply because it is lazy.

Plenty of people use all in ev in just this way. It's a leak that needs plugging in their game.

Quote:

Originally Posted by funkyj

A while back in an epeen coaching thread where some high stakes (?) player was saying mpethy was not qualified to coach 200NL players for leak finder sessions he (mpethy) explained that seeing certain combinations of stats was not a guarantee that a leak was present. Instead it is a signpost that he should review hand histories from a particular type of situation to see if there was truly a leak or if the unusual stats were the result of this other luck (all the area outside of the well lit AIEV area). This is what I'm saying about results graphs (actual and AIEV) -- it doesn't tell us much by itself. It is merely a signpost that we may want to look into some things.

This was Spino1i in the witch hunt thread in the coaching advice forum. It wasn't really an e-peen thread.

But, yeah, what I said in that thread is true. When I use all in ev, I use it like this. OK, your win rate with AA is x; that is 1/3 below normal for a solid winning reg. OK, let's check your ev adjusted winrate. OK, it is higher than your raw win rate, so we know you are running bad, at least to an extent. Now let's check to see if you got it in good in those spots. Then check other spots, etc. etc.

So checking all in ev is an important step in almost every analysis that I do. But that is all it is--a step. It is very rarely the be all and end all of an analysis.

But, by the same token, sometimes it is. If I see a guy whose win rate is 1/3 below average for winning regs, but his ev adjusted win rate is right at average, that's it. I am saying game over, you are just running bad with AA, and I doubt you have a significant leak there. Usually, though, the answer is not that clear cut.

07-07-2010 , 06:36 PM

#33

mpethybridge

Carpal \'Tunnel

Join Date: Jul 2005 Posts: 16,997

Quote:

Originally Posted by funkyj

Here is a query to mpethy: What proportion of the average leak finder client's list of 3 biggest leaks identified in a leak finder session are leaks involving all-in-EV type hands?

For players playing, say, NL $100 and below, I would say that for 80-90% of them, at least one of their three biggest leaks is a situation that is significantly influenced by all in luck. Usually, I see precisely that their second and fourth biggest leaks are both situations that are heavily influenced by all in ev.

If you are wondering how I can be this precise, it is because almost everybody up to NL $100 has all 3 of the three biggest leaks, and almost everybody playing $100 has the 4th biggest leak.

With players playing above $100, spots that can be influenced by all in ev become less of a leak, and I would say that the average NL $200 player may only have one small leak that can be substantially affected by all in ev. Sometimes I see 2, but in these cases they are usually the third and fourth biggest leaks, not the second and fourth biggest leaks as they are for NL $100 and below.

07-07-2010 , 07:05 PM

#34

Husker

Carpal \'Tunnel

Join Date: Feb 2007 Posts: 12,883

Quote:

Originally Posted by mpethybridge

Spill the beans/ leaks

07-07-2010 , 07:36 PM

#35

Vanilla Thunder

NVG All-Star

Join Date: Mar 2010 Posts: 16,783

good stuff, keeeeep it real

07-07-2010 , 10:03 PM

#36

venice10

Referee

Join Date: Nov 2007 Posts: 25,852

Quote:

Originally Posted by funkyj

Until we have a better idea what fraction of overall luck AIEV is factoring out we don't know how much clearer controlling for AIEV luck is making the overall picture.

I filtered for all hands that I won. Then I broke them out into hands where there was an all in and call on any street. About 20% of the total winnings was in AI situations. While not 4 digit accurate, I think one can say that the luck measured by AIEV is about 20% of the total luck one has.

I think that's enough to want to include it in any analysis, but not enough to complain that you can't win because of it.

07-07-2010 , 10:33 PM

#37

stry67

veteran

Join Date: Feb 2009 Posts: 2,542

Quote:

Originally Posted by venice10

Very good post overall. You can't do a damn thing about running bad. However in the micros, you've got so many leaks that if you fixed them, you can't lose over the medium term.

Much, much more important than sparing more than 5 seconds of thought for your all in EV numbers.

07-07-2010 , 10:34 PM

#38

stry67

veteran

Join Date: Feb 2009 Posts: 2,542

Quote:

Originally Posted by Husker

Spill the beans/ leaks

+1000. Give it up Mpethy!!!

07-07-2010 , 11:37 PM

#39

300zxrider

veteran

Join Date: Jul 2008 Posts: 3,085

Quote:

Originally Posted by stry67

+1000. Give it up Mpethy!!!

+1001

07-08-2010 , 08:08 AM

#40

Pahvak

centurion

Join Date: Dec 2008 Posts: 149

Quote:

Originally Posted by funkyj

How often are you dealt AA, KK? There is a normal distribution for this.
How often do your pocket pair flop sets? Normal distribution.

WTF??? The first one is a discrete distribution and we all know that the normal distribution is not discrete.

As for the second one: you either flop a set or you don't. It's called a Bernoulli distribution (or binomial, whichever you prefer).

And are you sure poker winnings (pot sizes) are normally distributed? Have you made some statistical tests? I have and the tests failed. Even Mathematics of Poker assumes that winnings are normally distributed, but I have never seen any proof of this.

Don't just assume that there's only one distribution in the world that describes our life... And if we don't know the distribution, then we can't say much about confidence intervals etc.

07-08-2010 , 08:28 AM

#41

spadebidder

Actually Shows Proof

Join Date: Aug 2008 Posts: 7,905

Quote:

Originally Posted by Pahvak

That's a bit nitty. For reasonable sample sizes both events will closely approach a Gaussian distribution ("normal") to the point where the difference can be ignored unless you need very high confidence levels.
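A quick way to check how close the two are, as a sketch only: the count of times you are dealt AA in n hands is binomial with p = 6/1326, and the code below compares the exact binomial CDF with the normal approximation (with a continuity correction). The sample size of 10,000 hands and the function names are assumptions chosen for illustration.

```python
import math


def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p), with each term built in log space."""
    total = 0.0
    for i in range(k + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log(1 - p)
        )
        total += math.exp(log_pmf)
    return total


def normal_cdf(x, mu, sigma):
    """P(Y <= x) for Y ~ Normal(mu, sigma^2)."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))


n, p = 10_000, 6 / 1326          # hypothetical sample size; p = chance of AA
mu = n * p                        # expected number of times dealt AA
sigma = math.sqrt(n * p * (1 - p))

for k in (35, 45, 55):
    exact = binom_cdf(k, n, p)
    approx = normal_cdf(k + 0.5, mu, sigma)   # +0.5 is the continuity correction
    print(k, round(exact, 4), round(approx, 4))
```

The approximation gets better as the expected count n*p grows and is weakest far out in the tails, which is where the caveat about very high confidence levels comes in.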

I agree with your other point, and I don't think it is likely that poker winnings are normally distributed. There's no reason they should be.

Last edited by spadebidder; 07-08-2010 at 08:33 AM.

07-08-2010 , 08:53 AM

#42

Pahvak

centurion

Join Date: Dec 2008 Posts: 149

Quote:

Originally Posted by spadebidder

How can one say that getting AA, KK is normally distributed? Random variables which are normally distributed take real number values. So how is a reasonable sample size going to help?

Quote:

Originally Posted by spadebidder

I agree with your other point, and I don't think it is likely that poker winnings are normally distributed. There's no reason they should be.

Ty. And this is far more important than the AA/KK point.

07-08-2010 , 09:11 AM

#43

Cangurino

Carpal \'Tunnel

Join Date: Apr 2008 Posts: 13,476

Quote:

Originally Posted by Pahvak

This was one of my main concerns as well when reading the OP. Terms like "normal distribution", "standard deviation", etc. are used a lot without explanation, motivation, or description of the related parameters.

E.g.:

Quote:

You could run 1 standard deviation above all-in-EV for your entire life and 2 standard deviations below expectation in all the forms of luck that EV does not measure and, if you thought all-in-EV was the beginning and end of luck you would think you were running hot but suck at poker.

What is one standard deviation of all-in EV for a lifetime of poker? We assume that it's close to normally distributed with a mean of 0, but how far is it spread out? This really depends on the playing style.

The main reason why all-in EV is useful is that in theory you can compute it accurately. Some of the other factors are hard to quantify. However, things like getting dealt aces or hitting sets happen much more often than all-in pots, so we would expect them to converge a lot faster. Moreover, getting dealt aces is worth only something like 5ptbb. Winning an all-in pot is worth 100ptbb.

Just as an example, I played 185222 hands this year. I was supposed to see 10895 pocket pairs, or 838 of each rank. In fact I saw 10996 pairs, 888 times 66, and 778 times 55. Statisticians can probably deduce from that if I ran incredibly hot in that aspect, or if it is just a slight deviation from the expected value.
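For what it's worth, a rough sketch of how one might check those numbers: treat each count as binomial (a pocket pair is 78 of the 1326 starting combos, one specific rank is 6 of 1326) and express the deviation in standard deviations. The helper name `z_score` is made up for this example; the hand counts are the ones from the paragraph above.

```python
import math


def z_score(observed, n, p):
    """Deviation of a binomial count from its mean, in standard deviations."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return (observed - mean) / sd


hands = 185222
p_any_pair = 13 * 6 / 1326   # 13 ranks, 6 combos each
p_one_rank = 6 / 1326        # one specific pair, e.g. 66

z_pairs = z_score(10996, hands, p_any_pair)
z_66 = z_score(888, hands, p_one_rank)
z_55 = z_score(778, hands, p_one_rank)
print(f"all pairs: {z_pairs:+.2f}  66: {z_66:+.2f}  55: {z_55:+.2f}")
```

Under these assumptions that works out to about +1.0 SD for total pairs, +1.7 SD for 66 and -2.1 SD for 55. Since 66 and 55 were picked as the extremes out of 13 ranks, one of them landing near 2 SD is not especially surprising.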

I kind of forgot what my point was, but I'm posting this anyways.

07-08-2010 , 09:22 AM

#44

diseage

centurion

Join Date: Feb 2009 Posts: 135

Quote:

Originally Posted by Husker

Spill the beans/ leaks

07-08-2010 , 09:40 AM

#45

pokerbiker

veteran

Join Date: Sep 2009 Posts: 2,589

Quote:

Originally Posted by funkyj

Here is a list of aspects (situations) of no limit hold’em where there is an expected value and normal distribution that is not measured by all-in-EV. The point of this exercise is to see how much bigger (qualitatively, if not quantitatively) the elephant of poker luck is than the “elephant tail” of all-in-EV that folks like to use as their main barometer of luck.

How often are you dealt AA, KK? There is a normal distribution for this.
How often do your pocket pair flop sets? Normal distribution.
...

i do agree that there are a lot of other aspects to poker luck, but i do not support your conclusion, and it also lacks evidence compared to your overall statistical approach.

----

i am pretty sure your database will be the best example; analyse your database, how often you got dealt aa/kk vs how often you should have; how often you flopped a set vs how often you should have. you should be quite near the theoretical optimum.

the things you mentioned are influenced by poker luck, but the sample size is WAY WAY bigger than the one for all-ins; so the likelihood of a large relative deviation is smaller than with a smaller sample size.

furthermore, there is a huge difference in the absolute amount of bb which is influenced by this deviation. getting dealt fewer aces influences the winnings by ~0.6bb/ace; winning one all-in less influences the winnings by ~50bb (average pot AI?).

so yes, while there are other influencing factors to poker luck, ai-ev does have one of the biggest impacts on the overall winnings.

---

that is also the problem with your luck-a/luck-b exercise; of course you would choose to have the non-aiev luck in your favour. but it's not about choosing, it's about the likelihood of a deviation therein and its influence on the winnings. this is directly influenced by (1) the sample size and (2) the impact on the winnings.

07-08-2010 , 11:56 AM

#46

DDAWD

Carpal \'Tunnel

Join Date: Aug 2009 Posts: 6,879

Quote:

Originally Posted by spadebidder

PTR was trying to determine at one time whether FR or 6max games are tougher. They did it by graphing winnings of players. Both were normally distributed, I believe.

I'm pretty fuzzy on the details, so if someone could link to this if they know how to find it, that would be good.

07-08-2010 , 12:11 PM

#47

DDAWD

Carpal \'Tunnel

Join Date: Aug 2009 Posts: 6,879

http://www.pokergurublog.com/content...-6-max-softest

I couldn't find the original article.

I'm not posting this to hijack to a discussion on 6max vs FR. Just to show the graphs which are at the bottom of the link. Not quite a normal distribution, but close.

07-08-2010 , 12:49 PM

#48

SammyG-SD

Carpal \'Tunnel

Join Date: Aug 2007 Posts: 12,600

Quote:

Originally Posted by Pahvak

WTF??? The first one is a discrete distribution and we all know that the normal distribution is not discrete.

There are several Fields Medal winning mathematicians who will disagree with this. Some believe anything studied is discrete.

Gets into my theory that anything studied and analyzed will be periodic even if its not in nature.

07-08-2010 , 01:37 PM

#49

spadebidder

Actually Shows Proof

Join Date: Aug 2008 Posts: 7,905

Quote:

Originally Posted by DDAWD

Not quite a normal distribution, but close.

I wouldn't call it close at all. All three have the expected negative skew to about 65/35 (losers/winners) and a fat tail on the left. And the full ring graph has a lot of extra kurtosis (sharp peak). Heads-up is the closest to normal if we center it on -4bb/100 but it still has a bit of a fat tail on the left. None of these could be called normal distributions.

They look a lot like Gumbel distributions to me.

07-08-2010 , 01:48 PM

#50

Money022

old hand

Join Date: Nov 2008 Posts: 1,809

Quote:

Originally Posted by mpethybridge

The second thing I would say is that the OP is correct, but is stating a very limited thesis. All in EV is a horrible measure of all luck in poker. But it doesn't purport to be anything like that. It only purports to be a measure of a specific type of luck, and its somewhat flawed methodology of measuring that specific type of luck makes it less reliable than perfect knowledge would be, but does not mean that it is useless or counterproductive to use.

Thank you for pointing out something I could not have said better myself.

It's just a stat, it is what it is. Sure I look to see if I'm above or below my expected value but I don't consider this one stat by any stretch of the imagination to be an indicator of a person's poker success or abilities. They have nothing to do with one another. I think the EV graph was added so math nerds could have something else to contemplate and analyze. No offense to math nerds.

Hell it's a great topic for discussion but I don't see why this was a topic for a CoTW? I guess it's still a "concept".

Either way, props to the OP for spending the time to formulate his thoughts on the topic with such detail.
