Official Advanced Math Help Thread - Sports Betting - SportsBetting Strategy and Forum

Two Plus Two Forums Sports and Games Sports Betting

Official Advanced Math Help Thread


01-27-2012 , 07:44 PM

illfuuptonight

banned

Join Date: Jul 2010 Posts: 241

After hours of searching I still can't find anything on the internet pertaining to my question. So hopefully the statisticians here can lend a hand.

Simplest example:

I think a basketball player's shooting % is determined by his distance from the hoop (feet), how well he is being guarded on a scale from 0 to 1, and his height (inches).

I believe that shot distribution when the player is 17 feet from the hoop, being guarded at .67 level, and is 75 inches tall goes as such:
-2% of the time he shoots at each level from 20% to 29%. So, 2x10=20% of the distribution.
-6% of the time he shoots at each level from 30% to 34%.
-4% at each level from 35% to 39%
-3% at each level from 40% to 49%
-0% at every other level.

Quick question: Is that the correct way to do this, or would it be better to say something like 40% of the time he shoots at level 100% and 60% of the time he shoots at level 0%?

Now say I believe that shot distribution when the player is 14 feet from the hoop, being guarded at .54 level, and is 77 inches tall goes as such:
-2% of the time he shoots at each level from 30%-39%.
-6% of the time he shoots at each level from 40% to 44%.
-4% at each level from 45% to 49%
-3% at each level from 50% to 59%
-0% at every other level.

Last one...I believe that shot distribution when the player is 12 feet from the hoop, being guarded at .33 level, and is 78 inches tall goes as such:
-2% of the time he shoots at each level from 35%-44%.
-6% of the time he shoots at each level from 45% to 49%.
-4% at each level from 50% to 54%
-3% at each level from 55% to 64%
-0% at every other level.

I see a player who is 75 inches tall, 17 feet from the hoop, and being guarded at the .67 level (the first player described) miss a shot.

Now I can update that particular distribution using Bayesian inference. But how do I update the other distributions given that same information (the miss)? It isn't reasonable to think I should have to wait to see a 78-inch player shoot from 12 feet while being guarded at .33 before I can update that particular distribution.
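
For reference, the single-distribution update mentioned above could look something like the sketch below. It assumes each "level" is a candidate make probability p, so a miss has likelihood 1 - p. The names `build_prior`, `update_on_shot` and `prior_17ft` are made up for illustration, and the sketch does not address how the miss should carry over to the other two distributions.

```python
def build_prior(bands):
    """bands: list of (low_pct, high_pct, weight_per_level), integer levels inclusive."""
    prior = {}
    for low, high, weight in bands:
        for level in range(low, high + 1):
            prior[level / 100] = weight
    return prior


def update_on_shot(prior, made):
    """Bayes' rule on a discrete prior: posterior is proportional to prior times likelihood."""
    unnormalized = {p: w * (p if made else 1 - p) for p, w in prior.items()}
    total = sum(unnormalized.values())
    return {p: w / total for p, w in unnormalized.items()}


def mean(dist):
    """Expected shooting level under a discrete distribution."""
    return sum(p * w for p, w in dist.items())


# 17 feet, guarded at .67, 75 inches tall: the first distribution in this post
prior_17ft = build_prior([(20, 29, 0.02), (30, 34, 0.06), (35, 39, 0.04), (40, 49, 0.03)])
posterior_17ft = update_on_shot(prior_17ft, made=False)

print("prior mean:", round(mean(prior_17ft), 4))
print("posterior mean after one miss:", round(mean(posterior_17ft), 4))
```

A single miss only shifts weight slightly toward the lower levels, so the posterior mean ends up a little below the prior mean.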


01-28-2012 , 01:48 AM

whytetittie

journeyman

Join Date: Sep 2011 Posts: 264

Dear diary,

No one is going to do your homework for you.


01-29-2012 , 06:58 AM

THAKID

Carpal 'Tunnel

Join Date: Mar 2009 Posts: 9,873

Subscribed


01-29-2012 , 10:02 PM

poloboy4

old hand

Join Date: Jul 2009 Posts: 1,718

-get data
-run regressions
-create formulas
-????
-profit


01-29-2012 , 10:25 PM

gfxhjk

Carpal 'Tunnel

Join Date: Apr 2005 Posts: 10,757

get data
run regressions
make horrible bets relative to reality but good relative to your regression
-????
-insurmountable debt


01-29-2012 , 10:57 PM

THAKID

Carpal 'Tunnel

Join Date: Mar 2009 Posts: 9,873

U are forced to bet real $ while creating your formula ?


01-29-2012 , 11:00 PM

gfxhjk

Carpal 'Tunnel

Join Date: Apr 2005 Posts: 10,757

prolly easier to just wait for thousands of trials to pass to make sure its ok


01-30-2012 , 03:18 AM

-ev?

newbie

Join Date: Sep 2011 Posts: 18

I'm trying to figure out whether the market for the last 5 WTA tennis seasons has become sharper overall, in order to find out whether backtesting relative to closing odds is justified over the whole period or only for the last 1-2 years.

I have roughly 2500 games per season.

What statistical test would you use for this?

For every game in a year I took the Pinnacle vig-free line on the winner, converted it to a probability, squared its distance from 1, then summed and divided by the number of games, essentially calculating the variance from theoretical expected values.

E.g. if the Pinnacle vig-free line is 1.5 decimal, the term is (1-0.666)^2.

2011 0.1772005046
2010 0.1784442168
2009 0.1414215339
2008 0.1786097325
2007 0.1677804943

mean: 0.1686912964

Over the mentioned sample size of n~2500 per season, this should be enough to reasonably assume that WTA betting hasn't gotten sharper over the last 5 years, correct?
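
To make the metric above concrete, here is a small sketch of the calculation as described: the average of (1 - 1/odds)^2 over the vig-free decimal odds of each match winner. The function name and the sample odds are hypothetical, not values from the data set.

```python
def mean_squared_distance(winner_odds):
    """Average of (1 - 1/odds)^2 over the vig-free decimal odds of each match winner."""
    if not winner_odds:
        raise ValueError("need at least one match")
    return sum((1 - 1 / odds) ** 2 for odds in winner_odds) / len(winner_odds)


# Hypothetical vig-free closing odds on the winners of four matches
sample_odds = [1.5, 2.0, 1.25, 3.0]
print(mean_squared_distance(sample_odds))
```

Since the outcome is always 1 for the winner's side, this is the same kind of quantity as a Brier score on the winner's implied probability: lower values mean the line put more probability on the eventual winner.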

Last edited by -ev?; 01-30-2012 at 03:39 AM.


01-30-2012 , 11:53 AM

TomG

Pooh-Bah

Join Date: Jun 2004 Posts: 4,294

Quote:

Originally Posted by -ev?

you should probably first take the square root of that variance-like metric so it's in the same units as the mean. then compare it to the mean. but overall that seems like a sensible approach to measuring it.


01-30-2012 , 12:03 PM

#10

TomG

Pooh-Bah

Join Date: Jun 2004 Posts: 4,294

actually, it's probably just better to take the squared difference between the opening no-vig decimal odds and the closing no-vig decimal odds.
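
A rough sketch of that open-versus-close comparison is below. It assumes two-way decimal odds for both players, strips the vig by normalizing the implied probabilities (one common method, not necessarily the one intended here), and measures the move on the first-listed player only. All function names and sample odds are hypothetical.

```python
def no_vig_odds(odds_a, odds_b):
    """Fair decimal odds for both sides, assuming the vig is spread proportionally."""
    raw_a, raw_b = 1 / odds_a, 1 / odds_b
    overround = raw_a + raw_b          # sums to more than 1 when there is vig
    return overround / raw_a, overround / raw_b


def mean_squared_line_move(matches):
    """matches: list of ((open_a, open_b), (close_a, close_b)) decimal odds pairs."""
    if not matches:
        raise ValueError("need at least one match")
    total = 0.0
    for opening, closing in matches:
        open_a, _ = no_vig_odds(*opening)
        close_a, _ = no_vig_odds(*closing)
        total += (open_a - close_a) ** 2   # move on the first-listed player
    return total / len(matches)


# Hypothetical opening and closing prices for two matches
sample_matches = [((1.60, 2.45), (1.50, 2.70)), ((1.90, 1.95), (1.92, 1.93))]
print(mean_squared_line_move(sample_matches))
```

Computed per season, a shrinking value would suggest openers are landing closer to the close, which is one way to read "sharper".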


01-30-2012 , 01:10 PM

#11

illfuuptonight

banned

Join Date: Jul 2010 Posts: 241

Quote:

Originally Posted by illfuuptonight

, he said, hopeful that rsigley and/or TomG could help him out with the OP question.


01-30-2012 , 01:35 PM

#12

TomG

Pooh-Bah

Join Date: Jun 2004 Posts: 4,294

dunno maybe rsigley would want to answer. i'm not really sure. total bases still uses frequentist reasoning

but there's a lot of advanced stat/math textbooks on the subject if you're willing to learn the info is there


01-30-2012 , 07:35 PM

#13

Chipp

journeyman

Join Date: Nov 2004 Posts: 352

-ev?,

Consider log scoring and limits.
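
For the log scoring part, a minimal sketch, assuming the score for each match is the natural log of the vig-free probability the line gave to the eventual winner, averaged over matches (so values closer to 0 are better). The function name and sample odds are hypothetical, and limits are not modeled here.

```python
import math


def mean_log_score(winner_odds):
    """Average log of the vig-free implied probability on each match winner."""
    if not winner_odds:
        raise ValueError("need at least one match")
    return sum(math.log(1 / odds) for odds in winner_odds) / len(winner_odds)


# Hypothetical vig-free closing odds on the winners of four matches
sample_odds = [1.5, 2.0, 1.25, 3.0]
print(mean_log_score(sample_odds))
```

Compared with the squared-distance metric, the log score punishes confident misses much harder, because log(p) drops steeply as p approaches 0.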


01-31-2012 , 04:06 PM

#14

NegativeZero

veteran

Join Date: Jul 2004 Posts: 3,358

Quote:

Originally Posted by -ev?

This could be explained by other factors, such as increased distribution of near-pick events, which would increase the variance.
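
The near-pick effect can be checked with a quick calculation. Assuming the line is perfectly calibrated (an idealization), a match where the favorite wins with probability p contributes p*(1-p)^2 + (1-p)*p^2 to the squared-distance metric in expectation, which simplifies to p*(1-p) and peaks at p = 0.5. The function name below is hypothetical.

```python
def expected_squared_distance(p_fav):
    """Expected (1 - p_winner)^2 for one match, assuming a perfectly calibrated line."""
    p_dog = 1 - p_fav
    # favorite wins with prob p_fav, underdog wins with prob p_dog
    return p_fav * (1 - p_fav) ** 2 + p_dog * (1 - p_dog) ** 2


for p in (0.5, 0.6, 0.7, 0.8, 0.9):
    print(p, round(expected_squared_distance(p), 4))
```

So a season with more near-pick matches would show a higher value of the metric even if the market were exactly as sharp as before.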


01-31-2012 , 05:42 PM

#15

caspermatic

adept

Join Date: Nov 2007 Posts: 810

If I wanted to learn how to:

-get data
-run regressions
-create formulas
-????
-profit

Where would I start?

For what it's worth, the reason I'm interested isn't even close to what the OP is doing.


01-31-2012 , 09:12 PM

#16

-ev?

newbie

Join Date: Sep 2011 Posts: 18

Negative,

These are the avg odds on the favorite for the years:
avg smaller odds
2011 1.4416368078
2010 1.4468241578
2009 1.3874011706
2008 1.4382330472
2007 1.4656253219

This is the distance to 2 for the favorite squared

avg (2-smaller odds)^2
2011 0.3728082248
2010 0.3654913837
2009 0.4282047519
2008 0.3769435009
2007 0.4215296897

Looks like games are balanced out.

I also mistakenly used the short version for vig free; here are the updated values, with the square root:

            Variance      Std
2011        0.1961877702  0.4429308864
2010        0.1935183146  0.4399071659
2009        0.1948075505  0.4413700834
2008        0.1918060137  0.4379566345
2007        0.1801504604  0.4244413509
arith mean: 0.1912552205  0.4373273608

Over PM someone asked for the data source, http://www.tennis-data.co.uk/alldata.php.

The data comes without Pinnacle opening odds. I will use the log scoring as proposed tomorrow and post the results, but so far it looks like WTA data is valid for backtesting purposes over the last 5 years.


02-02-2012 , 12:32 AM

#17

gfxhjk

Carpal 'Tunnel

Join Date: Apr 2005 Posts: 10,757

do you really think regression can explain a complex model like sports?

