Winning a tournament takes less skill than making the final table - Poker Theory





11-16-2017 , 02:49 AM

robert_utk

Not From the UK

Join Date: Jan 2005 Posts: 4,822

I believe that it could be shown that it requires less skill to win a tournament than it does to make the final table.

This comes from my pet theory about what I call "Decision Gates".

In addition to our own style and skill, to play perfect poker we need to know three things about our opponents. We need to know their bias (or preference of poker), we need to know the sophistication (or naivete) of a player, and we need to properly understand the decisions that player is making with that bias/sophistication combination.

That last part is where decision gates happen. A player that has just made a decision involving many possible choices has stepped through the decision gate and will reap the reward or suffer the consequences. Not all decisions in poker, played by humans, are equal. Some spots in a poker hand are just more difficult, with more viable options that may have similar perceived EV. In cash poker, I believe this communicates actionable information about ranges. If you know what poker a player plays, how well they can play it, AND what level of difficulty decision they have just made, you should be able to play perfectly against that player, unless you are just worse at those same poker decisions.

But what about tournament poker? There are certainly difficult decisions within each hand of poker played, but there is also an overarching tournament drama playing out around the player, and most of that is totally out of the control of that player. All that player can do is make the best decisions possible. But, the decision gates get narrower after the tournament starts. Each decision, regardless of any other factors, gets more difficult in a tournament, up to a certain crisis point deep in the money and then releases while the added tournament pressure on the gates diminishes.

Right off at the start, let's put forth a working definition of "tournament poker skill". It will be shown that this is not necessary, I believe, but let's do it anyway.

Let "tournament poker skill" be: The ability of a player to correctly choose the optimal series of bets, calls, folds, or raises that maximizes her chip stack, given the random series of starting hands she is dealt by the dealer(s).

Tournament: Any tournament where up to 10 percent of the field will constitute a "final table". So, 9-max poker will be a tournament starting with at least 90 participants.

So, why do we care how much skill it takes at any given point of a tournament anyway?

Not only should this be interesting, but it is also relevant to tournament strategy. Poker players vary by skill, and since poker is a game of skill, the beginning of a tournament is the softest field of players, by definition. Sure, weak players can go deep and even win a tournament. But on average, the field of players starts out the softest and gets tougher as the tournament progresses. Plus, the payouts are heavily weighted to pay the final table participants. As always in poker, it's the money that matters.

Can it be shown that there should be a tipping point in the course of a tournament where the skill to reach the final table is equal to the skill to win once at that final table? If so then the optimal poker decisions before such a point may be different than the poker decisions after such a point.

So, if the field gets tougher as the tournament progresses how can it be said that Winning a tournament takes less skill than making the final table?

Well, not cumulatively, that is to say that it is obvious that if we could sum up all the more difficult decisions a player has made over the course of a tournament, this number can only be added to once the player does indeed make a final table. I am not talking about simple addition. What I am saying is that THE DECISIONS GET EASIER once a player has reached the final table versus the decisions the player had to make at an earlier stage of the tournament. Even though the payouts are huge at the final table, the decisions are easier. This would seem to conflict with a lot of advice about how to play the middle stages of tournament poker.

"Tournament Skill", regardless of how that is defined, will have a nonlinear function across time as the hands of poker are played. The decisions of a participant will be the most *easy* at the start of the tournament and increase in difficulty to a peak point somewhere in the tournament and then drop back down in difficulty as the later stages of the tournament are reached. By the time the final table is reached, the remaining decisions to win 1st place in the tournament will require less skill than the same duration of hands required under the peak of the difficulty curve.

This should hold true for all participants, regardless of skill, regardless of stack size, regardless of the money bubble, regardless of luck.

Whatever happened that a player has made it to the final table, that player has pulled it off, and cleared the high hurdles of tournament poker, but has not won the big money yet.

For starters, let's take a look at the first hand of poker dealt. Every player has identical chip stacks. There are no antes. Every player probably has at least 100BB, usually lots more. The field is the softest it will ever be. Decisions made in this hand will be the easiest in the whole tournament. Whatever difficulty a participant experiences at tournament poker, this is the starting point.

Now let us jump forward to the point in the tournament where 80 percent of the field remains. Still far away from the money, still plenty of average BB's in chip stacks. What is more difficult? Well now we have chip stack dynamics. Now we have antes. Now, some players will begin to make decisions for their tournament life because they will be all in and potentially covered by a larger stack. Now players are getting some info about how different players play and may choose to respond to that info. But still, being so early in the tournament the players do not (and can not) have to consider the exact consequences of each decision. They are just gathering chips, and trying not to spew them off in turn.

Let's jump ahead to 60 percent of the field remaining. Average BB count is falling, while average chip stack continues to rise. Some stacks are huge, most are below average. The short stacks will have under 10BB. The biggest stacks are now becoming much more likely to earn a paycheck for a day's work. Some players are correctly mucking the same holdings that other players are correctly shoving.

But still, it gets more difficult.

Now, let's talk about the ten percent before the bubble, and the ten percent after the bubble. On many sites this could be the play from 40 percent down to 20 percent of the field remaining. The dynamics involved with stack sizes when nearing the money bubble make having an average stack the most difficult. Short stacks are just waiting to shove, which is a binary and largely solved decision. Large stacks are feasting on the average stacks. Furthermore, most players have the average stacks or below, which is not a strong position to be in during money bubble poker.
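One way to see why the short-stack shove is described as largely solved: its chip EV reduces to a single formula. The sketch below is a simplification, assuming at most one caller who covers us, chip EV only (no ICM), and made-up inputs for fold probability and equity; the function name `shove_ev_bb` is hypothetical.

```python
def shove_ev_bb(stack_bb, pot_bb, fold_prob, equity_when_called):
    """Chip EV (in big blinds) of open-shoving, relative to folding.

    Assumptions: at most one caller, who covers us; pot_bb is the dead
    money (blinds and antes) already in the middle; ICM is ignored.
    """
    # Everyone folds: we pick up the dead money.
    ev_folds = fold_prob * pot_bb
    # We get called: win the pot plus a matching stack, or lose our stack.
    ev_called = (1.0 - fold_prob) * (
        equity_when_called * (pot_bb + stack_bb)
        - (1.0 - equity_when_called) * stack_bb
    )
    return ev_folds + ev_called


# Hypothetical spot: 10BB stack, 2.5BB of blinds and antes in the pot,
# opponents fold 60% of the time, 35% equity when called.
ev = shove_ev_bb(stack_bb=10, pot_bb=2.5, fold_prob=0.6, equity_when_called=0.35)
print(f"shove EV: {ev:+.2f} BB")
```

With these assumed inputs the shove is slightly profitable in chips even though the hand is an underdog when called, which is why the decision collapses to "shove or fold" for a short stack.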

Remember, regardless of the individual skill level of the participants, each decision made by the players has gotten more and more difficult, thus far in the tournament. There is simply more and more relevant information continually piling up that should be considered.

So, then what is the high point of difficulty in the tournament?

I believe it is in the money, after the short bubble squeakers are cleared out. The shortest stacks to survive and make the money (purposefully or by circumstance) will be shoving wide to either double up or take a minimum payout and go home. After these players are cleared out, we are STILL not at the final table, and the payouts STILL are minuscule. To make it this far and then get put out may in fact be a net loss of utility, depending on whatever else a player could have been playing to earn money at poker instead of devoting so much time to a tournament. This is it. This is maximum investment for the least reward. This is where the field is simultaneously full of skilled opponents and littered with landmines.

The dollar per decision here is one way to look at it. If you advance in pay from 1.2 BI to 1.4 BI, you may have to win two flips just to do so. If you win a flip with a 26BB stack and get a 52BB stack, you are still at least 18 places away from the final table, maybe 27 places away depending on the tournament size. After 2 increases of blind levels, you will likely be below 20BB and still not at the final table. Your chip stack is shrinking, and YOU MAY NOT EVEN GET THE OPPORTUNITY TO DO ANYTHING ABOUT IT!!
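The "52BB today, under 20BB two levels later" arithmetic can be checked with a rough sketch. Everything numeric here is an assumption rather than something from the post: blinds grow 1.5x per level, the player sits through two orbits per level, and each orbit costs 2.5BB in blinds and antes.

```python
def stack_in_bb_after_levels(stack_bb, levels, blind_growth,
                             orbits_per_level, orbit_cost_bb):
    """Project a stack, measured in big blinds, through blind increases,
    assuming the player folds every hand (a deliberately card-dead case)."""
    for _ in range(levels):
        # Pay blinds and antes for each orbit at the current level.
        stack_bb -= orbits_per_level * orbit_cost_bb
        if stack_bb <= 0:
            return 0.0
        # The big blind grows, so the same chips are worth fewer BB.
        stack_bb /= blind_growth
    return stack_bb


# Hypothetical structure: 1.5x blind increases, 2 orbits per level,
# 2.5BB paid per orbit in blinds and antes.
after = stack_in_bb_after_levels(52, levels=2, blind_growth=1.5,
                                 orbits_per_level=2, orbit_cost_bb=2.5)
print(f"52BB becomes about {after:.1f}BB")
```

Under those assumed numbers the doubled stack does fall below 20BB without the player losing a single pot; a slower structure would soften the effect.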

This is an important point, I think. The number of hands played in the tournament is finite, and you are running out of opportunities to play poker. At the start of the tournament the number of hands to be played is sufficiently large that the player can ignore the relative opportunity cost of choosing to play a hand versus choosing to muck a hand. Once again, regardless of the player, the skill required to make each decision is higher because now in the money the number of future decisions is getting much smaller. Each decision will have a greater influence on that player's tournament, more than at any other point in the tournament...

...even than at the final table.

...even when heads up to win the tournament.

This last part should be relatively simple, compared to the maze of gates that the player navigated about 20 places ago. Once you arrive at the final table, your role is clearly defined by your chip stack. Also, this is the last time you will face a full table of opponents. Each eliminated opponent means more clarity of decision. Shorthanded play is wider ranges and higher variance, but this is because the hands being dealt are closer in effective strength. The money pay jumps are clearly defined and are now noticeable amounts of money that matter. The number of hands left to play is still getting less and less, however the player now has the clarity of the payouts and the clarity of the endgame of the tournament with which to make those last few decisions.

And when it gets to heads up poker with the highest payouts of the tournament on the line, THAT is when the least skill is required. I know some people might be offended by this, but heads up tournament poker is not a strong indicator of skill. The likelihood of winning a tournament between the final two participants is highly correlated to the size of their stack when reaching the final two.
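The link between heads-up stack size and win probability can be illustrated with a gambler's-ruin simulation. This is a toy sketch that assumes both players are equally skilled and every pot is a fair, even-money bet of fixed size; real heads-up play is neither, and the function name is hypothetical.

```python
import random


def heads_up_win_rate(my_stack, villain_stack, bet=1, trials=20_000, seed=1):
    """Estimate how often we win a heads-up match made of fair coin flips."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    wins = 0
    for _ in range(trials):
        mine, theirs = my_stack, villain_stack
        while mine > 0 and theirs > 0:
            stake = min(bet, mine, theirs)
            if rng.random() < 0.5:
                mine, theirs = mine + stake, theirs - stake
            else:
                mine, theirs = mine - stake, theirs + stake
        wins += mine > 0
    return wins / trials


# A 3:1 chip lead, measured in arbitrary units.
rate = heads_up_win_rate(3, 1)
print(f"estimated win rate with 75% of the chips: {rate:.3f}")
```

For a fair game the known gambler's-ruin result is that the chance of winning equals the share of chips held, so the estimate should land near 0.75. Any skill edge would shift that number, but the starting stack share remains the baseline.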

The real feat was getting the opportunity to have such a winning stack, back in the money just after the bubble. Back when the player survived the gauntlet of crushing tournament decision gates.


11-16-2017 , 08:52 AM

Didace

Carpal \'Tunnel

Join Date: Nov 2009 Posts: 19,866

Winning a tournament takes different skills than merely making the final table.


11-16-2017 , 11:12 AM

recondite7

old hand

Join Date: Jul 2005 Posts: 1,971

Your analysis is spot on, but I think 99% of players intuitively knew this anyway.


11-16-2017 , 12:41 PM

ArtyMcFly

Carpal \'Tunnel

Join Date: Dec 2014 Posts: 13,256

Aren't you just saying that early on stacks are deep and there are more opponents, so therefore the game has more variables and is therefore more complex than later on when there are shorter stacks and fewer opponents?
I don't really like the idea of trying to quantify skill when the game itself is constantly changing. i.e. A 'GTO final table ICM bot' could play perfectly on a final table, but it wouldn't do as well on level 1 in comparison to a Deepstack bot designed to play optimally with 300bb. And vice versa.
In addition, the stacks might be shorter on a final table, so the mistakes there might be measured in fractions of big blinds (you can't make a 35bb mistake if you only have 20bb), but a fraction of a big blind will vary from "insignificant" in the early stages to "tens of buy-ins" on the FT.
Thus it's hard to argue that one phase of a tournament is more important or requires more skill than another. The most skillful human would be the one that can navigate all phases of a tournament better than most.


11-16-2017 , 02:06 PM

robert_utk

Not From the UK

Join Date: Jan 2005 Posts: 4,822

Quote:

Originally Posted by ArtyMcFly

If it can be shown that there is a point inside the tournament that is more difficult and requires more skill (however we define skill) than at both the beginning and the end of the tournament, then there must be a subset of decisions under this point that are crucial to the overall performance in the tournament. Decisions made here will close entire branches of future events that can lead to victory or failure. To use the exact same decision process here as early or later in the tournament will be -EV versus recognizing such a focal point. Maybe, I think.


11-18-2017 , 01:00 PM

BaseMetal2

adept

Join Date: May 2012 Posts: 1,060

I am not sure at which point of a large field tournament could be considered maximum difficulty and at what point could skill become most valuable.

I think you are saying that when the field size drops from 100% through 80%, 60%... down to perhaps 20% the decisions are getting harder, but by 10% or the FT stage "THE DECISIONS GET EASIER".

Quote:

Originally Posted by robert_utk

But, the decision gates get narrower after the tournament starts. Each decision, regardless of any other factors, gets more difficult in a tournament, up to a certain crisis point deep in the money and then releases while the added tournament pressure on the gates diminishes.

I think one problem with this viewpoint is that the size and typical complexity of the later decisions are strongly affected by the earlier ones. A poker tournament has a very strong compound interest. If you win some chips early there is a much stronger chance of getting deeper and of also having far more choices or decisions at these middle stages (when well stacked), so the skill of all the earlier play affects the later play very strongly.

I would say that HU play when shallow does perhaps give the smallest decision trees and even the greats at this stage are less of a threat than at the start of a final table but I think they are probably the greatest threat from the very first hand as their skill will compound.

Here is, I hope a not too distracting similar related question:
If for every future big field tournament you play an entity could improve your luck by a 1% hand equity boost for a section of (up to) 10 vpip hands would you want this to begin at the start, the middle, near FT, or HU, stage?


11-18-2017 , 02:56 PM

robert_utk

Not From the UK

Join Date: Jan 2005 Posts: 4,822

Quote:

Originally Posted by BaseMetal2

Here is, I hope a not too distracting similar related question:
If for every future big field tournament you play an entity could improve your luck by a 1% hand equity boost for a section of (up to) 10 vpip hands would you want this to begin at the start, the middle, near FT, or HU, stage?

That is a great way to reformulate the question! But instead of luck, let's say that a player increased their relative skill by 1 percent such that they have a range versus range extra 1 percent advantage for a duration of ten hands vpip.

I think that is saying the same thing, but moving decision information sets from the random luck tree into the decision tree of a player is one way to define skill in poker anyway, especially in heads up pots.

So, if we take this proposition, when would we want it? At the final table where the payouts are huge, or at some point earlier when it can be compounded into much larger chip stacks at the final table?

Let's look at the actual events of the tournament, just on stack sizes and with no regard for tournament skill, however that could be defined.

If we start at the beginning of the tournament, we will be hard pressed to follow each stack in a large field tournament and show how they are related. They are related, I believe, such that every outcome of every hand somehow affects the future of every other stack. But the futures are so many that the perceptible effects are lost in a haze of futures. Also, similar futures are almost interchangeable, every player that doubles up early in the tournament can go into the double up futures together and share a similar chance to win the tournament.

Early in a tournament, a player can notice on another table that a player has doubled up, and not care and not change strategy even if they did care (maybe that player is a nemesis).

Now, let's do the opposite and start at the end of the tournament and follow the chip stacks backward.

Now we clearly see how the big stacks got bigger and the small stacks got smaller. We know who doubled up whom*. We can take the payouts and assign a monetary value to each stack. We can look back for crucial coinflips (or crucial folds) and say, THERE is where they really needed a 1 percent boost that could compound into a result in the tournament that is much better than just 1 percent better.

Let's say we are at the final three tables. With 27 players remaining, a player could notice that their table has much smaller stacks than the other two tables. Or, a player could notice that an opponent who is clearly recognized as a threat to win any tournament has just doubled up or been crippled. A player could, in theory, adjust their strategy to a more optimal strategy with this information.

Does a 1 percent skill/luck advantage pay us off more at the very beginning of a large field tournament, or at the final table, or somewhere in between? Great question! I think it is "in between".

* I friggin' googled the who/whom grammar and probably still got it wrong


11-18-2017 , 08:08 PM

BaseMetal2

adept

Join Date: May 2012 Posts: 1,060

Quote:

Originally Posted by robert_utk

Does a 1 percent skill/luck advantage pay us off more at the very beginning of a large field tournament, or at the final table, or somewhere in between? Great question! I think it is "in between".

I too don't really know. My guess would also be somewhere in the middle, but perhaps early/middle. Maybe it's possible to do some sort of simple simulation to shed some light on it, but I suspect it may not be easy.

As far as value of stacks in tournaments go icm is pretty good but in big field MTTs before any payouts happen a fairly good estimate imo is just simply that the value of chips is quite stable/flat until itm.

At the first hand (ignoring any skill edge) one starting stack is worth (prize pool)/(number of players)

At the mid stages I don't think the value changes much at all, 1 starting stack is still worth approx. (prize pool)/(number of players). The thing that changes is the average stack size is maybe 2 or 3 times the size of the starting stack. Now if 50% of starting players have lost then the current average stack = (prize pool)/(0.5 x number of starters) and this means that 1 starting stack still = (prize pool)/(number of players).
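The arithmetic in that paragraph can be written out with concrete numbers. The field size, buy-in and starting stack below are hypothetical, and the sketch assumes chip value is simply proportional to chips before the money (no ICM, no skill edge, no rake).

```python
def average_stack(starting_chips, entrants, remaining):
    """All chips stay in play, so the average grows as players bust."""
    return starting_chips * entrants / remaining


# Hypothetical tournament: 1000 entrants, 100 buy-in, 10,000 starting chips.
entrants, buy_in, starting_chips = 1000, 100, 10_000
prize_pool = entrants * buy_in
dollars_per_chip = prize_pool / (starting_chips * entrants)

remaining = entrants // 2  # half the field has busted
avg = average_stack(starting_chips, entrants, remaining)

value_of_starting_stack = starting_chips * dollars_per_chip
value_of_average_stack = avg * dollars_per_chip

print(f"average stack: {avg:.0f} chips")
print(f"one starting stack is still worth about {value_of_starting_stack:.0f}")
print(f"the average stack is worth about {value_of_average_stack:.0f}")
```

The average stack doubles when half the field is gone, but the value of one starting stack's worth of chips is unchanged under this proportional assumption.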

The value of an individual chip hasn't changed much.

Near the itm bubble the spread of different sized stacks will affect the valuation.

After you go past the itm bubble the value of each chip starts dropping as the remaining prize pool to fight for is dropping. At the HU stage the last two will fight it out for about 10% of the initial prize pool and at this HU stage we can get an accurate value of chips so:
Getting all the chips in HU you get all of this 10% prize pool left so at this stage all the chips are worth 10% of the total prize pool.
As this is HU there is no icm like effect so 1 starting stack is now worth 10% of the initial 1st hand value.

The changing value of chips as the field shrinks during the itm stages also may affect when you would want a skill boost. Although at HU you are playing for high $ value, perhaps contesting 20 or 30 buyins, this is still only 10% of the prize money.

If an early skill boost of 1% may get you into the money 1% extra, could this early boost also get you to HU 1% more often? If so, it's not worth waiting for HU.

If you wait for HU you get a 1% boost on this last 10% only so it seems to me that it would be better earlier than HU and by this logic anytime before itm is better than after itm.
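A toy model makes this comparison concrete. It is a sketch under heavy assumptions, none of which come from the thread: 1024 players, every round is a double-or-bust coin flip, the buy-in is 1 unit, only the last eight places are paid, and the "boost" raises one flip from 50% to 51%. There is no ICM and no compounding of skill.

```python
def toy_tournament_ev(bust_prizes, winner_prize, boost_round=None, boost=0.01):
    """Expected payout when each round is a coin flip.

    bust_prizes[r] is the prize for busting in round r (0-indexed);
    boost_round is the single round in which we flip at 50% + boost.
    """
    reach = 1.0  # probability of still being alive at the start of the round
    ev = 0.0
    for rnd, prize in enumerate(bust_prizes):
        p_win = 0.5 + (boost if rnd == boost_round else 0.0)
        ev += reach * (1.0 - p_win) * prize
        reach *= p_win
    return ev + reach * winner_prize


# Hypothetical payouts (prize pool = 1024 buy-ins): rounds 1-7 pay nothing,
# 5th-8th get 64 each, 3rd-4th get 128 each, 2nd gets 192, 1st gets 320.
bust_prizes = [0] * 7 + [64, 128, 192]
winner_prize = 320

baseline = toy_tournament_ev(bust_prizes, winner_prize)
early = toy_tournament_ev(bust_prizes, winner_prize, boost_round=0)
first_paid = toy_tournament_ev(bust_prizes, winner_prize, boost_round=7)
heads_up = toy_tournament_ev(bust_prizes, winner_prize, boost_round=9)

for label, ev in [("no boost", baseline), ("first hand", early),
                  ("first paid round", first_paid), ("heads up", heads_up)]:
    print(f"{label:>16}: EV = {ev:.4f} buy-ins")
```

In this toy model a boost anywhere before the money adds 2% of a buy-in, a boost in the first paid round adds 1%, and a heads-up boost adds only 0.25%, which agrees with the reasoning that before the money beats after it. The model cannot say where before the money is best, since it ignores stack dynamics entirely.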


11-18-2017 , 08:18 PM

robert_utk

Not From the UK

Join Date: Jan 2005 Posts: 4,822

I think you have convinced me that stack size can no longer be left out of my theory. A difficult decision in a tournament must depend at least partly on our chip stack. With a polar stack, very large or very small, decisions would seem to be easier.

So, adding chip stack into the theory would mean that there is some “trouble size” of a tournament stack and if you are happily with a large stack or unhappily with a small stack at least you should have more clarity to your decisions.


11-19-2017 , 07:02 AM

#10

ArtyMcFly

Carpal \'Tunnel

Join Date: Dec 2014 Posts: 13,256

Perhaps the decisions that are most important to get right are simply the ones where the ICM pressure is greatest. e.g. On/near the bubble, or on a payjump. A lot of the best players simply make better (more profitable) decisions at those crucial stages. (e.g. Using a big stack to hoover up chips when short-stacks are clinging on to get a mincash, or 3-betting constantly against the second biggest stack because he can't risk going broke when others are much shorter, or alternatively folding very often when survival has a higher EV than chip accumulation).
To put it another way, a "GTO chip EV bot" (that has mastered cashgame play) might play perfectly in terms of chip EV ranges, but if it has no understanding of ICM, it's not going to have a very good ROI in tournaments, because it will take "flips" that are +EV in terms of chips, but -EV in terms of actual money. Such a bot might win a lot on level 1 with 100bb, and also do well HU (if it gets there and it's trained for that) but commit ICM suicide on the final table bubble.
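A flip that is +EV in chips but -EV in money can be shown with the Malmuth-Harville form of ICM, where the chance of taking each remaining place is proportional to stack size. The stacks, payouts and 52% equity below are hypothetical, and blinds are ignored to keep the sketch short.

```python
def icm_equities(stacks, payouts):
    """Malmuth-Harville ICM: prize-pool equity for each stack."""
    equities = [0.0] * len(stacks)

    def assign(remaining, place, prob):
        if place == len(payouts) or not remaining:
            return
        total = sum(stacks[i] for i in remaining)
        for i in remaining:
            # Chance that player i takes this place, given who is left.
            p = prob * stacks[i] / total
            equities[i] += p * payouts[place]
            assign([j for j in remaining if j != i], place + 1, p)

    assign([i for i, s in enumerate(stacks) if s > 0], 0, 1.0)
    return equities


payouts = [50, 30, 20]        # hypothetical: 4 players left, 3 get paid
hero_equity_in_hand = 0.52    # assumed chance to win the all-in

# Hero (index 0) faces an all-in from an equal stack on the bubble.
fold_ev = icm_equities([4000, 4000, 1000, 1000], payouts)[0]
win_ev = icm_equities([8000, 1000, 1000], payouts)[0]  # villain busts 4th
call_ev = hero_equity_in_hand * win_ev                 # losing pays nothing

chip_ev_call = hero_equity_in_hand * 8000              # versus 4000 for folding

print(f"chips: call {chip_ev_call:.0f} vs fold 4000")
print(f"money: call {call_ev:.2f} vs fold {fold_ev:.2f}")
```

Under these assumed numbers the call gains chips on average but gives up prize-pool equity, which is the kind of spot where a pure chip-EV bot would commit ICM suicide.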

Last edited by ArtyMcFly; 11-19-2017 at 07:11 AM.


11-19-2017 , 10:42 PM

#11

robert_utk

Not From the UK

Join Date: Jan 2005 Posts: 4,822

Quote:

Originally Posted by ArtyMcFly

I was thinking of ways to model poker players in a tournament with a bot. Then I was wondering how to make some bots better than others. ICM popped as a great way to do this. As usual, ArtyMcFly, you are one step ahead!

Although this was not where I thought we would be going ITT, this is really interesting, but let's take it a few steps farther and it could really be awesome.

First, let us call the bots with strong Chip EV skills Standard Bots. Then the bots with Chip EV plus ICM will be called +ICM Bots. The +ICM Bots are just as strong as the standard bots, but have an additional ICM subroutine. If an action fails ICM analysis, the +ICM bots won't act until a more money-biased action satisfies ICM analysis.

Now we run simulations of 90 bots where 60 bots are standard and 30 bots are +ICM.

Let us theorize about the results:

Which bots earn the most money in the long run? Since the +ICM bots are otherwise just as strong as the standard bots, the +ICM bots must have a better ROI, by definition.

Which bots have the higher average finishing position? Hmm. Not so clear. This is because ICM is about getting paid for your gambles, not about maximizing your chip stack. If ICM did inform and create larger chip stacks, then ICM would be a clearly dominant strategy to win the tournament overall. This is not the case. Even though we cannot perceive all the ways these bots' stacks are related, they are in fact related. If a +ICM bot makes an ICM fold that benefits a standard bot, this increases the overall chance that a standard bot will have a higher finish position on average, but get there by making ICM suicidal risky gambles.

Which bots would WIN IT ALL more often? This would have to be the standard bots. They care not for laddering, or getting paid, they just want the most chips period. They sacrifice ICM, simply because they do not know ICM. They lose money more often, but embrace luck to the fullest.

Or do they?

Now let us say that each bot has a name that is known to all other bots and describes the bot as standard or +ICM. Further, SOME of the +ICM bots get another subroutine that allows them to differentiate between the ranges of hands that the standard bots are playing, and the correct ranges that the +ICM bots are playing. Let's call these bots ICM plus exploitive, or +ICMe. We make 10 of the +ICM bots into +ICMe bots. The standard bots get no extra subroutine since they are unaware of ICM and cannot differentiate the ranges anyway.

Now the +ICMe bots know who they are playing against. Teaching any bot to know how to react to different opponents should be an important step in modeling human players.

Since they are just as strong in every other aspect, they can play perfectly, at least in heads up pots, against every other bot in the tournament. Now these bots will win the most money with the best ROI, and win the tournament just as often as well. These will outperform all other bots.

All fine and dandy, but how would any of this be more than just an analysis of ICM and a demonstration of its effects on ROI?

Well we are not satisfied with just knowing which bots to wager our money on in fantasy bot tournament poker. Our office pool only pays out the winning bot, no money for second place. So we sneak into the lab and make a secret improvement to one of the standard bots, that we call Bracelet Bot. This bot only wants to win it all.

How do we program such a bot?

We teach the bot biases. We grant the bot aversions.

It turns out that having biases is not by definition irrational. It can be a part of a von Neumann-Morgenstern utility function. So long as other axioms are met, a VNM Rational agent will follow its utility function that happens to maximize utility, and the agent does not even have to be aware of this. Having aversions is almost automatic in any VNM u function that is complicated and describes complex decisions. A VNM rational bot could know that 2+2=4 but that 2u+2u =/= 4u, and all of this can still satisfy VNM rationality.

We make the Bracelet bot averse to having zero chips.

If zero chips is the worst utility, then having 2 chips is second worst, so on and so forth. We teach the bot that chips increase in value the farther down they are located in the stack such that bets are placed with the chips from the top of the stack. Larger bets risk more valuable chips farther down the stack. The relationship is non-linear, such that a very large stack has chips that are near the top that are worth almost the same as chips located nearby in the stack, where the bottom has a run of chips that are dramatically different in value.

Now we teach the bot that having more chips later in the tournament is worth more than having chips earlier in the tournament. This is equalized by the M factor, such that each stack is viewed in multiples of Big Blinds. The bot knows that 100BB is worth more later than 100BB is worth now. So, whatever risk/reward gambles the bot takes are informed by how many BB the bot has and when it has those BB. The Bracelet bot is just as good at standard non-ICM range analysis as any other bot, it just has the win it all subroutine.
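The "chips near the bottom of the stack are worth more" idea can be sketched with any concave utility function. The power curve and the exponent 0.8 below are assumptions chosen only for illustration; the post does not specify a functional form, and the time-dependent part (chips being worth more later) is not modeled here.

```python
def utility(chips, alpha=0.8):
    """Concave chip utility with u(0) = 0; alpha < 1 is an assumed shape."""
    return chips ** alpha


def marginal_value(position, alpha=0.8):
    """Utility added by the chip at `position`, counted from the bottom
    of the stack (position 1 is the very last chip)."""
    return utility(position, alpha) - utility(position - 1, alpha)


def equity_needed_to_risk_stack(alpha=0.8):
    """Break-even win probability for a double-or-bust gamble.

    Solving p * u(2s) = u(s) with u(0) = 0 gives p = 2 ** -alpha,
    independent of the stack size s.
    """
    return 2.0 ** -alpha


print(f"bottom chip: {marginal_value(1):.4f}")
print(f"chip number 10,000: {marginal_value(10_000):.4f}")
print(f"equity needed to flip for the whole stack: "
      f"{equity_needed_to_risk_stack():.3f}")
```

With a curve like this the bot declines an exactly even flip for its tournament life and needs roughly 57% to take it, while a linear (chip EV) bot with alpha = 1 would need only 50%. How strong the aversion should be is the tunable part.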

Lastly, we are really wanting to skew the results so we grant the Bracelet bot the knowledge of ICM but teach it to ignore ICM. It will only know what bots are playing what strategy and balance ranges accordingly.

Such a bot could have the worst overall ROI of any of the bots. Remember, the happy go lucky standard bots are already pretty good at winning tournaments. And the +ICM bots are great earners, while the +ICMe bots are perfect exploiters. Bracelet bot is going to pass up on lots of gambles based on its aversion to going bust.

However, Bracelet bot will win the most tournaments overall and sacrifice a lot of money finishes along the way.

Or will it?

Last edited by robert_utk; 11-19-2017 at 10:48 PM.


11-21-2017 , 05:05 PM

#12

BaseMetal2

adept

Join Date: May 2012 Posts: 1,060

I am not sure why Bracelet Bot would play any differently from a chip Ev Bot although I didn't really understand the aversion part mentioned of the utility, and also whether these chip EV bots would also have the knowledge of the other bot ranges via their names.

Isn't Bracelet Bot's goal to collect all the chips and chip EV bot also seems to want to collect them, what is the difference?

If BBot has an aversion to losing and so maintains an 'edge' to reduce risk I don't think this would actually be sensible in an effectively simulated massive number of repeated runs.

In a real world situation you don't take a big risk for a very small improvement, you wait for better spots (as you have a skill edge). In an effectively simulated run of infinite repeated tournies you would be better off grabbing any, even micro, improvement at all and at any stage, there is no merit to maintaining an 'edge'. (In my mind I think this is so but I could be completely off here and it is assuming you can somehow calculate the value of your state and the micro improvement mentioned.)


11-22-2017 , 05:10 PM

#13

robert_utk

Not From the UK

Join Date: Jan 2005 Posts: 4,822

Quote:

Originally Posted by BaseMetal2

The Chip EV bots en masse are the more likely winners of the tournament. But to make any individual Chip EV bot the MOST likely to win the tournament, we make it averse to having zero chips. To what extent this bias overrides its normal tendency to gamble can be tested in the lab, and we can use experimental runs to settle on a good mix of aversion and gamble.

Bracelet Bot wears a nameplate of a regular Chip EV bot. However, it also knows the strategy of the +ICM and +ICMe bots, but it cares not for money finishes, it just factors that into the same brute force calculations as all the other bots.

As such, that bot is breaking one of my rules about bots:

All bots wear name plates that describe their solution.

But hey, we are trying to win the office pool!


11-24-2017 , 09:03 PM

#14

just_grindin

Pooh-Bah

Join Date: Dec 2007 Posts: 5,263

Quote:

Originally Posted by BaseMetal2

To me it seems the difference between Bbot and ChipBot is:

1. ChipBot is naive to a priori information relevant to ICM. It does not understand how an opponent would use ICM factors to derive ranges and actions that would be different than maximizing chip EV and cannot anticipate those decisions given a scenario.

2. Bbot values his tournament life more than accumulating chips. It is unclear how that translates to decisions and risk aversion except for maybe all in situations.

In my opinion the given abstractions are not very good in the sense the abstractions as of right now don't add insight into how to model or reason about poker.

At this point there is no way to determine how the strategies actually manifest themselves when playing a certain scenario. It is very unclear how much overlap there is between the abstracted strategies and how significant deviations would be when there are deviations.

The above is in no way meant as an insult. Just my thoughts on what I have read so far.


11-24-2017 , 09:50 PM

#15

robert_utk

Not From the UK

Join Date: Jan 2005 Posts: 4,822

Quote:

Originally Posted by just_grindin

Hey jg,

with regard to 2)..

How much Bracelet Bot values its tournament life varies by how many chips it has, and when it has them. So, it still values gaining chips just like the Standard Bots, but with some extra "weight" placed on avoiding getting knocked out of the tournament. This weight could be so slight as to be unnoticeable, or could be quite heavy and very noticeable. This is because the valuation of the chips is non-linear.

As for any use to be found in my abstractions, I think where I wound up is that you can model tournament strategy by granting all the bots the same strategy, then giving some of the bots additional strategy and watching those new bots perform.

Thus, it is a way to prove whether a strategy is actually a winning strategy at all. If all the bots are equally smart or dumb at tournament poker and you then teach some of them ICM, those bots will finish in slightly higher positions in the tournament, especially at the final table.

However, everything beyond that was just more abstraction.

I would hope that the +ICMe bots would know that the Standard bots are still gambling at the final table and be able to exploit such behavior by entering the full range of a cash poker player into the ICM analysis...

And Bracelet bot would hopefully be able to prove my "weighted chips" theory and at least have the best shot to finish in first place.

So, the whole idea is to set up a method to test conjectured theory in an experimental laboratory of poker bots.

I am reallllly long winded, and sometimes this leads to lots of stuff on the page that was not thought out or should have been just redacted, so I completely get what you are saying. No need to qualify your statements, just_grindin, I did not take any offense at all.

-Rob

Quote

11-24-2017 , 10:21 PM

#16

just_grindin

Pooh-Bah

Join Date: Dec 2007 Posts: 5,263

I guess what I was trying to get at with my comments was that thinking in these abstractions doesn't seem to be that helpful because there's no way for us to reason about the differences in strategies, most notably because we don't even know what the baseline Chip Bot's strategy would look like. That makes it hard to even say how significantly the rest of the bots' strategies would vary.

Quote

11-25-2017 , 08:25 AM

#17

BaseMetal2

adept

Join Date: May 2012 Posts: 1,060

Quote:

Originally Posted by BaseMetal2

If BBot has an aversion to losing and so maintains an 'edge' to reduce risk I don't think this would actually be sensible in an effectively simulated massive number of repeated runs.

In a real world situation you don't take a big risk for a very small improvement, you wait for better spots (as you have a skill edge). In an effectively simulated run of infinite repeated tournies you would be better off grabbing any, even micro, improvement at all and at any stage, there is no merit to maintaining an 'edge'. (In my mind I think this is so but I could be completely off here and it is assuming you can somehow calculate the value of your state and the micro improvement mentioned.)

I no longer think this is right. Even in infinite runs it is probably better to have some edge and not take every simple +ve outcome. If you could calculate the exact $ gain, and on average you increase your $ value by $0.05 per hand, it would be better not to chase a $0.01 gain that could often knock you out. Wait for a probably more valuable spot next hand.
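A rough sketch of that trade-off in Python, under a loudly simplified assumption: busting on this hand forfeits the average $ gain you would otherwise have made over some number of future hands. The function names, the `hands_forfeited` parameter and the bust probabilities are made up for illustration; only the $0.05 and $0.01 figures come from the paragraph above.

```python
def required_edge(bust_prob, avg_gain_per_hand, hands_forfeited=1):
    """Expected future $ given up by risking elimination on this hand.

    Assumption (hypothetical model): if we bust, we lose the average
    per-hand gain we would have earned over `hands_forfeited` hands.
    """
    return bust_prob * avg_gain_per_hand * hands_forfeited


def worth_taking(spot_gain, bust_prob, avg_gain_per_hand, hands_forfeited=1):
    """Take the spot only if its $ gain beats the edge we must maintain."""
    return spot_gain > required_edge(bust_prob, avg_gain_per_hand, hands_forfeited)


if __name__ == "__main__":
    # $0.01 spot, busts us 30% of the time, we average $0.05 per hand.
    print(worth_taking(0.01, 0.30, 0.05))
    # Same spot, but it only busts us 10% of the time.
    print(worth_taking(0.01, 0.10, 0.05))
```

Under this toy model the same $0.01 spot is a pass when it busts you often and a take when it rarely does, which is one way to read "edge" as a moving threshold rather than a fixed number.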

This 'edge' is fairly complicated and it changes throughout the tournament but I suppose in a simulated world it is possible to calculate quite accurately and then once you have such knowledge you can take any action larger than this edge.

For a chipEv bot or an icmBot the edge would be 0.0, as the EV calc or ICM alg would just give a value. An edge doesn't make sense here, as they are following simple algorithms.

An agent with a long term goal could have a much different way to calculate the current state though and this would effectively generate an 'edge'.

For example, the Bbot could track how its own stack size after a given number of hands played translates into a tournament win probability, and use this as the value rather than chips when calculating the EV formula.

After playing millions of games, BBot would know at any point the average value of its current stack.
e.g., for say hand 50:
Stack, Prob of winning tourn
1, 0.00000001
2, 0.000000021
3, 0.000000031
::
5000, 0.1
::
20000, 0.4
::
49999, 0.999
50000, 1.0

For each hand using the ev formula but using the above 'TWin Probs' as the input values may give a better result than just the chip counts as input. The difference with the standard chip calc would be the 'edge'.
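A minimal sketch of that calculation in Python, using only the rows listed above. Everything else is an assumption made for illustration: the rows elided by "::" are filled in by straight-line interpolation, a (0 chips, 0.0) row is added so busting has a value, and the names `TWIN_TABLE`, `twin_prob`, `chip_ev_gain` and `twin_ev_gain` are hypothetical.

```python
from bisect import bisect_right

# (stack, probability of winning the tournament) for "hand 50".
# The (0, 0.0) row is an added assumption; the rest are the rows from the post.
TWIN_TABLE = [
    (0, 0.0),
    (1, 0.00000001),
    (2, 0.000000021),
    (3, 0.000000031),
    (5000, 0.1),
    (20000, 0.4),
    (49999, 0.999),
    (50000, 1.0),
]


def twin_prob(stack):
    """Look up a win probability, interpolating linearly between known rows."""
    stacks = [s for s, _ in TWIN_TABLE]
    if stack <= stacks[0]:
        return TWIN_TABLE[0][1]
    if stack >= stacks[-1]:
        return TWIN_TABLE[-1][1]
    i = bisect_right(stacks, stack)
    (s0, p0), (s1, p1) = TWIN_TABLE[i - 1], TWIN_TABLE[i]
    return p0 + (p1 - p0) * (stack - s0) / (s1 - s0)


def chip_ev_gain(risk, win_prob):
    """Standard chip EV of an even-money bet of `risk` chips."""
    return win_prob * risk - (1 - win_prob) * risk


def twin_ev_gain(stack, risk, win_prob):
    """Same bet, valued in tournament win probability instead of chips."""
    after_win = twin_prob(stack + risk)
    after_loss = twin_prob(stack - risk)
    return win_prob * after_win + (1 - win_prob) * after_loss - twin_prob(stack)


if __name__ == "__main__":
    # A 52/48 all-in for a 5000 stack, valued both ways.
    print(chip_ev_gain(5000, 0.52))
    print(twin_ev_gain(5000, 5000, 0.52))
```

One thing the sketch makes visible: the example rows at 5000 and 20000 are exactly stack/50000, so in that range the two calculations agree and the 'edge' is zero. Any real difference would have to come from a table built from actual simulated results.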

This stack vs number of hands played value is still a very simple viewpoint. A perfect 'edge' calc would be a much more complicated thing involving the number of players left and many more factors.

Quote

11-25-2017 , 02:12 PM

#18

robert_utk

Not From the UK

Join Date: Jan 2005 Posts: 4,822

Quote:

Originally Posted by BaseMetal2

Agreed. The best strategy will be very complex, that is for sure. It is a process of adding more and more complex strategy and seeing if it improves over the previous strategy.

One advantage of simple logarithmic chip value is that you can put it on a logx and represent it with simple numbers that humans could implement at the table.
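As a hedged illustration of what a simple logarithmic chip value could look like, here is a small Python sketch. It assumes the value of a stack is just log(stack), which is one reading of "logarithmic chip value" and not something spelled out in the thread; the function name `required_win_prob` is hypothetical.

```python
import math


def required_win_prob(stack, risk):
    """Break-even win probability for an even-money bet under log chip value.

    Assumes value(stack) = log(stack). The bet is worth taking when
    p*log(stack + risk) + (1 - p)*log(stack - risk) > log(stack).
    """
    if not 0 < risk < stack:
        # log(0) is undefined, so a full all-in has no finite break-even here.
        raise ValueError("risk must be positive and smaller than the stack")
    lost = math.log(stack) - math.log(stack - risk)
    swing = math.log(stack + risk) - math.log(stack - risk)
    return lost / swing


if __name__ == "__main__":
    print(required_win_prob(10000, 100))   # risking 1% of the stack
    print(required_win_prob(10000, 5000))  # risking half the stack
```

With linear chips both bets break even at 50%. Under the log assumption the small bet needs barely more than 50%, while risking half the stack needs log(2)/log(3), roughly 63%. Those are the kind of simple numbers a human could carry to the table.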

Quote

11-30-2017 , 02:51 PM

#19

6471849653

old hand

Join Date: Oct 2004 Posts: 1,634

It is pretty easy to get in the money and somewhat further, but after that the skill level needed increases hugely. Whether the final table is a flip is another story.

Quote
