Suppose you want to measure the difference between two players' strategies (e.g. you think someone is multiaccounting and not varying their play much, or someone's playstyle is so unusual compared to the average reg that it's suspicious). You could often do this by eye via HEM or similar if you had enough data, but I'm curious how you would model it formally.
If you're using this measure for more detailed analysis, you also want it to satisfy some other properties, for instance:
- Distance between parts of the distribution matters as well as the total distance: if two strategies are mostly similar but one player takes a certain action much more often, that's still strange
- We might want to assign extra 'weight' to certain actions where a discrepancy is a good indication of what you're trying to prove
- Even where there's an obvious way of 'ordering' parts of a distribution, that might not apply in all cases (e.g. 1/4 pot and 1/3 pot are 'closer' than 1/4 pot and 2/3 pot but that's misleading when building a range of bet sizes; or maybe the site you're using has a 1/2 pot button so, for most human players, bet sizes in the vicinity of 1/2 would be collapsed into 1/2 and a ~52% pot bet would 'differ' more from 1/2 than a 40% pot bet).
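To make these properties concrete, here is a minimal sketch of one possible measure, not a recommendation: a weighted total-variation-style distance that also returns the per-action breakdown. The function name, the action labels and all frequencies are made up for illustration. Actions are treated as unordered labels, so no notion of 'closeness' between bet sizes is assumed.

```python
def weighted_action_distance(p, q, weights=None):
    """Compare two action-frequency distributions for the same spot.

    p, q: dicts mapping action label -> frequency (each should sum to 1).
    weights: optional dict mapping action label -> importance (default 1.0).
    Returns (total, per_action): the weighted total discrepancy and the
    weighted discrepancy for each individual action.
    """
    weights = weights or {}
    actions = set(p) | set(q)
    per_action = {
        a: weights.get(a, 1.0) * abs(p.get(a, 0.0) - q.get(a, 0.0))
        for a in actions
    }
    # With all weights equal to 1 this is the total variation distance.
    total = 0.5 * sum(per_action.values())
    return total, per_action


# Made-up frequencies for one spot; labels are arbitrary categories.
player_a = {"check": 0.50, "bet_33": 0.30, "bet_75": 0.20}
player_b = {"check": 0.50, "bet_33": 0.10, "bet_75": 0.40}

# Hypothetical choice: a discrepancy in the large bet counts double.
total, per_action = weighted_action_distance(
    player_a, player_b, weights={"bet_75": 2.0}
)
worst = max(per_action, key=per_action.get)  # action with the largest gap
```

Returning the breakdown alongside the total is what lets you flag a single action that differs a lot even when the overall distance is small. How to pick the weights is left open here.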
In some of the academic work on poker, earth mover's distance is used to group hands based on hand strength for the purpose of abstraction; someone suggested I should use it here, but it doesn't conform to the properties above (afaik?).
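For reference, here is a minimal sketch of earth mover's distance in the one-dimensional case, assuming bet sizes are placed on a line by their fraction of the pot (that placement is my assumption, as are the example sizes). In one dimension the distance is the area between the two cumulative distributions.

```python
def emd_1d(positions, p, q):
    """Earth mover's distance between two histograms on the same ordered bins.

    positions: sorted bin locations (here, bet size as a fraction of pot).
    p, q: probability mass in each bin (each should sum to 1).
    """
    distance = 0.0
    cdf_gap = 0.0  # running difference between the two CDFs
    for i in range(len(positions) - 1):
        cdf_gap += p[i] - q[i]
        # Mass still "in transit" has to travel across this gap.
        distance += abs(cdf_gap) * (positions[i + 1] - positions[i])
    return distance


sizes = [1 / 4, 1 / 3, 2 / 3]
only_quarter = [1.0, 0.0, 0.0]
only_third = [0.0, 1.0, 0.0]
only_two_thirds = [0.0, 0.0, 1.0]

near = emd_1d(sizes, only_quarter, only_third)      # mass moves 1/4 -> 1/3
far = emd_1d(sizes, only_quarter, only_two_thirds)  # mass moves 1/4 -> 2/3
```

Here `near` comes out smaller than `far` purely because of where the sizes sit on the line, which is exactly the built-in ordering that the third property above is wary of.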
Some of the independent exposés on bot use have used the squared Euclidean distance between common stats. This works well enough in those cases - mostly where the conclusion is clear from looking at the stats and they are just trying to add some rigour - but it has some apparent issues. It's sensitive to the choice of stats: adding or removing certain stats can shift the result, so you need some way of deciding which stats are useful. Also, many of those stats aren't independent of each other; a small change to the strategy will affect a bunch of stats at the same time, so the distance can overstate how much the strategy actually changed.
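A small sketch of the double-counting issue, using made-up numbers (the stat names and values are illustrative, not taken from any real player or exposé). Suppose player B simply opens a few more hands by raising: that single change moves both VPIP and PFR by the same amount.

```python
def squared_euclidean(a, b, stats):
    """Squared Euclidean distance between two players over the chosen stats."""
    return sum((a[s] - b[s]) ** 2 for s in stats)


# Made-up stats, expressed as fractions of hands.
player_a = {"vpip": 0.24, "pfr": 0.20, "three_bet": 0.08}
player_b = {"vpip": 0.30, "pfr": 0.26, "three_bet": 0.08}

d_vpip = squared_euclidean(player_a, player_b, ["vpip"])
d_both = squared_euclidean(player_a, player_b, ["vpip", "pfr"])
# d_both is twice d_vpip, although only one thing changed in the strategy.
```

The same underlying change is counted once per stat it touches, so the result depends on how many overlapping stats happen to be in the list.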
Please let me know if you have corrections to the above or thoughts on the problem.