Cliff Notes
- Can you build a simple predictive model that is worth anything with a basic knowledge of statistics?
TL;DR
Hi all -- I am very new to, but very interested in, sports betting and would like to build a basic predictive model as part of my betting process. However, I'm wondering whether it's even possible to build a model with any value given my basic math knowledge.
I am an accountant by trade. The highest math courses I've ever taken are business statistics and business calculus in college, and that was 15 years ago at this point (damn I got old quick).
I was, perhaps overly optimistically, hoping to develop a simple betting model using just a basic knowledge of statistics. I have no delusions about ever developing something as complex as Emmanuel Perry's SALAD model, which he uses to predict the outcome of NHL games -- but when I read his blog post describing the model, I was intimidated:
Quote:
Salad is an ensemble of 11 unique sub-models: 4 bagged logistic regression models, 1 XGBoost gradient-boosted trees model, 2 neural networks, 1 bagged naive Bayes model, 2 CatBoost gradient-boosted trees models and 1 random forest using fuzzy logic. Different feature sets were used among these sub-models in an effort to further increase diversity within the ensemble. A bagged logistic regression model was used for a stacking algorithm. My approach in building these ensemble components was loosely to optimize the mean score of the various models while minimizing collinearity. In plain terms, each of the sub-models should perform well on its own and no two sub-models should be overly alike in their output.
Can I build a useful model with just a basic understanding of stats, or do you need an applied statistics M.S. to make something you can rely on at all? I'm not looking to do this professionally, just as a challenging hobby, but given the jargon above, I doubt whether I can build a model worth anything.