Two Plus Two Publishing LLC

Sports Betting Discussions related to wagering on sporting events.

06-14-2015, 10:41 PM   #1
sharks88
stranger
 
Join Date: Jun 2015
Posts: 1
A winning formula in Horse Racing (hopefully)

Hi all,

This is my first post. I've been an avid reader of the forum for some time, but with not much to add in the way of valuable information I've kept quiet until now.

I'll just preface this with the fact that I am not naive enough to think that I have just cracked the betting code or found the holy grail; however, I do believe I've found a genuine edge.

So here it is...

I concentrate on UK Horse Racing only, specifically handicap races. What I've done is develop a method for calculating variables ('pace', 'trainer form', etc.) that I believe are significant for finding the winner of a race at a better rate than the betting public. I've returned an 8% ROI from 2009 to date. Not much? Perhaps, but it's sustainable. Below are the vital statistics.

Runners: 8,577
Winners: 1,549
Strike Rate: 18.06%
P/L: 668.53
ROI: 7.79%

Exp Wins: 1,401
A/E: 1.11
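
As a quick sanity check, the figures above are consistent with each other if you assume level stakes of 1 unit per runner (an assumption; the staking isn't stated) and that "Exp Wins" is the number of winners the market prices would predict. A minimal sketch:

```python
# Figures copied from the statistics above.
runners = 8577
winners = 1549
profit = 668.53        # P/L in units
expected_wins = 1401   # assumed to be the winners implied by the market prices

strike_rate = winners / runners
roi = profit / runners           # assumes a level stake of 1 unit per runner
a_over_e = winners / expected_wins

print(f"Strike rate: {strike_rate:.2%}")   # 18.06%
print(f"ROI: {roi:.2%}")                   # 7.79%
print(f"A/E: {a_over_e:.2f}")              # 1.11
```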

So, what am I asking? While I'm no stranger to computational techniques, I'm asking for advice on backtesting this method, specifically to see how viable it is going forward. I've performed a confidence interval on my "test" sample (I've split up into 3 equal parts; Test, Development and Forecast) and the results are below:

CumPL (descriptive statistics)

Mean: 95.50430493
Standard Error: 0.836316975
Median: 93.63225
Mode: 90.195
Standard Deviation: 44.70972765
Sample Variance: 1998.959747
Kurtosis: -0.41541084
Skewness: 0.209192238
Range: 212.404
Minimum: -4
Maximum: 208.404
Sum: 272951.3035
Count: 2858
Confidence Level (95.0%): 1.639845865

I also performed a CHI.SQ test with the following results:

Chi Sqr: 0.52%
Exp Wins: 462.05
A/E: 1.12

I'm not sure where to start with interpreting the above results. If a better statistician can help, or suggest a better way to test the data, I'd be ever so grateful.


Thanks in advance.
06-18-2015, 03:06 PM   #2
Wamy Einehouse
rack 'em
 
Join Date: Dec 2008
Posts: 4,652
Re: A winning formula in Horse Racing (hopefully)

I can possibly help, but need some more info if you're happy to answer the below:

The 8% ROI here is from your actual bets over that period, correct (ie: this number doesn't come from the backtest)?

If yes, what bookmaker/exchange were they with, and what was the bet sizing/timing (was this at SP or at the morning opening prices)?

Was it jump or flat and what's the typical grade?
06-19-2015, 08:10 AM   #3
jukofyork
Carpal 'Tunnel
 
Join Date: Sep 2004
Posts: 11,595
Re: A winning formula in Horse Racing (hopefully)

Quote:
Originally Posted by sharks88
I've performed a confidence interval on my "test" sample (I've split up into 3 equal parts; Test, Development and Forecast)

1. How strict were you with this (ie: was the test set kept completely separate during development or did you ever go back to the development stage to "tune" anything and then return to the test set...)?

2. Did you come up with a single hypothesis, develop it and then test it (ie: as opposed to coming up with a whole bunch of possible hypotheses, developing them all and then seeing if any had good test results...)?

Assuming you answered yes to both of the above, then:

How variable are the odds and the bet-sizes? If you're testing wildly disparate odds and/or are using some sort of variable-sized staking system, then the parametric statistics you've posted above might be (very) overly optimistic... Try Bootstrapping and see what fraction of the bootstrapped samples have an ROI lower than zero, etc.
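
A minimal sketch of the bootstrap idea, using only the Python standard library. The bet data here is entirely made up (the function make_fake_bets and its 5% edge are hypothetical stand-ins for a real bet history); the part that matters is resampling the bets with replacement and counting how often the resampled ROI comes out below zero:

```python
import random

def make_fake_bets(n_bets=500, seed=1):
    """Hypothetical data: (stake, profit) pairs with widely varying odds."""
    rng = random.Random(seed)
    bets = []
    for _ in range(n_bets):
        odds = rng.uniform(2.0, 21.0)       # decimal odds
        win_prob = min(1.0, 1.05 / odds)    # made-up 5% edge, for illustration only
        stake = 1.0                         # level stakes
        won = rng.random() < win_prob
        profit = stake * (odds - 1.0) if won else -stake
        bets.append((stake, profit))
    return bets

def bootstrap_rois(bets, n_resamples=1000, seed=0):
    """Resample the bets with replacement; return the ROI of each resample."""
    rng = random.Random(seed)
    n = len(bets)
    rois = []
    for _ in range(n_resamples):
        sample = [bets[rng.randrange(n)] for _ in range(n)]
        total_stake = sum(s for s, _ in sample)
        total_profit = sum(p for _, p in sample)
        rois.append(total_profit / total_stake)
    return rois

bets = make_fake_bets()
rois = bootstrap_rois(bets)
frac_losing = sum(r < 0 for r in rois) / len(rois)
print(f"Fraction of bootstrap samples with ROI below zero: {frac_losing:.1%}")
```

Note this resamples individual bets, which treats them as independent. If you often back more than one horse in the same race, resampling whole races is probably the safer choice.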

Also, is your system producing a predicted win probability (as opposed to just making binary yes/no selections)? If yes, then you can do further tests with this in mind (eg: compare Pseudo R-squared vs the public's implied probabilities, etc).
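
As a rough illustration of that comparison, here is a sketch of one common form of Pseudo R-squared (McFadden-style, measured against a baseline that gives every runner in a race an equal 1/N chance). The three toy races and both helper functions are hypothetical, and normalising 1/odds to sum to 1 is only one simple way of removing the overround:

```python
import math

def implied_probs(decimal_odds):
    """Turn decimal odds into probabilities that sum to 1 (removes the overround)."""
    raw = [1.0 / o for o in decimal_odds]
    total = sum(raw)
    return [r / total for r in raw]

def pseudo_r2(races):
    """races: list of (probabilities for each runner, index of the winner)."""
    ll_model = sum(math.log(probs[winner]) for probs, winner in races)
    ll_null = sum(math.log(1.0 / len(probs)) for probs, _ in races)
    return 1.0 - ll_model / ll_null

# Hypothetical toy data: (decimal odds, model probabilities, index of winner)
toy = [
    ([2.5, 4.0, 6.0, 10.0], [0.45, 0.25, 0.20, 0.10], 0),
    ([3.0, 3.5, 5.0, 12.0], [0.30, 0.25, 0.35, 0.10], 2),
    ([2.0, 5.0, 8.0, 15.0], [0.50, 0.25, 0.15, 0.10], 1),
]
model_races = [(model, w) for _, model, w in toy]
public_races = [(implied_probs(odds), w) for odds, _, w in toy]

print("Model  pseudo R2:", round(pseudo_r2(model_races), 4))
print("Public pseudo R2:", round(pseudo_r2(public_races), 4))
```

A value of 0 means no better than the 1/N baseline. On a real out-of-sample set, what you are looking for is the model scoring higher than the public.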

If you're interested then here are a couple books you might find helpful:

Precision: Statistical and Mathematical Methods in Horse Racing by C X Wong

Assessing and Improving Prediction and Classification by Timothy Masters

You also might find this interesting, with regard to the dangers of bias during systems creation:

Evidence-Based Technical Analysis: Applying the Scientific Method and Statistical Inference to Trading Signals by David Aronson

Juk

Last edited by jukofyork; 06-19-2015 at 08:15 AM.
06-19-2015, 08:23 AM   #4
ExaMeter
Pooh-Bah
 
Join Date: Jul 2006
Posts: 3,985
Re: A winning formula in Horse Racing (hopefully)

Juk,

could you briefly explain why #2 would be a problem please?
06-19-2015, 08:48 AM   #5
jukofyork
Carpal 'Tunnel
 
Join Date: Sep 2004
Posts: 11,595
Re: A winning formula in Horse Racing (hopefully)

Quote:
Originally Posted by ExaMeter
Juk,

could you briefly explain why #2 would be a problem please?

Your test set is no longer an unbiased estimate of future performance... For a really simple example of why, consider this:

Imagine you generate a single sample from a uniform random number generator (ie: between 0 and 1). Then on average the number will be 1/2 (ie: it's an unbiased estimate of the mean of the distribution).

Now consider that you generate two uniformly distributed random numbers and take the maximum of them. Then on average the number found to be the maximum will be 2/3 (and is thus no longer an unbiased estimate - see the picture in the second post of this thread to see why).

and it gets even worse if you were to take the maximum of 3 or more uniformly distributed random numbers...
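
If you want to check this yourself, here's a quick simulation sketch. The expected maximum of n independent uniform(0, 1) numbers is n/(n+1), so 1/2, 2/3, 3/4 and so on; the function name and trial count are just illustrative choices:

```python
import random

def mean_of_max(n_draws, n_trials=100_000, seed=0):
    """Average, over many trials, of the maximum of n_draws uniform(0, 1) numbers."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        total += max(rng.random() for _ in range(n_draws))
    return total / n_trials

for n in (1, 2, 3, 10):
    simulated = mean_of_max(n)
    theory = n / (n + 1)
    print(f"n={n:2d}  simulated={simulated:.3f}  theory={theory:.3f}")
```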


So now imagine instead of random numbers you have a bunch of systems/models that you've data-mined, but (unbeknown to you) none have any predictive power at all and they also all show significant variance about their test results (ie: due to the test sample size being too small, etc). If you take the best system produced from these, then the performance estimated on the test set will likely be nothing like what you'll get in reality (ie: it's been biased to give an overly optimistic estimate).

One way round the above problem is to save yet another independent "final" test set and then re-test the best model you've found on it, BUT: for this test to be truly unbiased you have to resist the temptation to go back and try the second best if this proves to be a poor performer, etc...
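
To make the above concrete, here is a rough simulation sketch under made-up assumptions: every "system" bets level stakes at fair odds of 5.0 with a true 20% win chance, so none of them has any edge at all. It picks the best of 20 such systems on a test set and then re-tests that one system on a fresh "final" set:

```python
import random

def simulate_roi(rng, n_bets, odds=5.0, win_prob=0.2):
    """ROI of a zero-edge system: fair odds, so the true expected ROI is 0."""
    profit = 0.0
    for _ in range(n_bets):
        profit += (odds - 1.0) if rng.random() < win_prob else -1.0
    return profit / n_bets

def best_of_k(n_systems=20, n_bets=500, n_repeats=200, seed=0):
    """Average ROI of the best system on the test set vs. on a fresh final set."""
    rng = random.Random(seed)
    best_test_total = 0.0
    final_total = 0.0
    for _ in range(n_repeats):
        test_rois = [simulate_roi(rng, n_bets) for _ in range(n_systems)]
        best_test_total += max(test_rois)
        # No system has any edge, so the chosen system's result on unseen
        # data is just another independent draw from the same process.
        final_total += simulate_roi(rng, n_bets)
    return best_test_total / n_repeats, final_total / n_repeats

avg_best_test, avg_final = best_of_k()
print(f"Best system, test set ROI:  {avg_best_test:+.1%}")
print(f"Same system, final set ROI: {avg_final:+.1%}")
```

The test set ROI of the "best" system comes out clearly positive even though nothing here can win in the long run, while the final set ROI sits close to zero.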

You can also try to use the statistical methods outlined in the David Aronson book listed above to correct the overly optimistic estimates found by mass-testing hypotheses.

If it's still not clear then the books by Timothy Masters and David Aronson listed above give a much better explanation of this than I can.

Juk

Last edited by jukofyork; 06-19-2015 at 08:57 AM.