Two Plus Two Publishing LLC

Programming Discussions about computer programming

Old 04-09-2017, 03:57 PM   #51
Priptonite
Carpal 'Tunnel
Join Date: Jan 2009
Location: Seattle
Posts: 12,471
Re: Data Science As A Career?

Tackled the ridge regression section in my textbook this morning and it cleared up quite a few things lol.

Quote:
Originally Posted by Sholar View Post
Yes. One way to think of it is in terms of the "bias-variance" tradeoff (which you probably haven't encountered yet...); another is as a Bayesian prior that the predictors are useless (and so your prior is that they take a zero coefficient).

We've dug into the bias-variance tradeoff a bit. My understanding is that variance refers to how much your model would change given a different training set. A very flexible method (KNN w/ K = 1) will have high variance, while something like linear regression will have low variance. Bias refers to error introduced by model assumptions (e.g. a linear relationship between response and predictors). Here, flexible methods result in low bias while more constrained methods have higher bias.
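A quick way to check that intuition is a small simulation: refit each method on many fresh training sets and look at how much the prediction at one point moves around. The sketch below is only an illustration under made-up assumptions. The sine-shaped "true" function, the noise level, the sample sizes and every function name are hypothetical and not from the class or the textbook. It uses plain Python so it runs without extra libraries.

```python
import math
import random
import statistics

def true_f(x):
    # Assumed "true" relationship; deliberately nonlinear so a straight line is biased.
    return math.sin(x)

def make_training_set(rng, n=50, noise_sd=0.5):
    # Draw one hypothetical training set: x uniform on [0, 2*pi], y = f(x) + noise.
    xs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys

def predict_1nn(xs, ys, x0):
    # KNN with K = 1: return the response of the single closest training point.
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x0))
    return ys[nearest]

def predict_linear(xs, ys, x0):
    # Simple least-squares line fitted to the training set, evaluated at x0.
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar + slope * (x0 - x_bar)

def bias_and_variance(predict, x0, n_sets=500, seed=0):
    # Refit on many independent training sets and summarise the predictions at x0.
    rng = random.Random(seed)
    preds = [predict(*make_training_set(rng), x0) for _ in range(n_sets)]
    bias = statistics.mean(preds) - true_f(x0)
    variance = statistics.pvariance(preds)
    return bias, variance

x0 = math.pi / 2
knn_bias, knn_var = bias_and_variance(predict_1nn, x0)
lin_bias, lin_var = bias_and_variance(predict_linear, x0)
print(f"1-NN:   bias={knn_bias:+.3f}  variance={knn_var:.3f}")
print(f"linear: bias={lin_bias:+.3f}  variance={lin_var:.3f}")
```

Under these assumptions the 1-NN prediction should jump around much more from one training set to the next (high variance), while the straight line is stable but consistently misses the peak of the curve (high bias).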

This doesn't seem like a good intuition...it's more that they reduce error "more" than they contribute to a penalty...

...and this seems even more strange. How are they getting "inflated"?

Yeah... so I was not understanding ridge regression very well. For whatever reason I wasn't really thinking in terms of setting lambda and then calculating the set of coefficients that minimizes the objective function. I was tunnel-visioned on the penalty term and thinking, "I understand I want to penalize large coefficients to reduce over-fitting, but why do I want to let my small, insignificant coefficients stick around more easily?" Somehow the thought didn't occur to me: "they're already small and insignificant, you don't need to penalize coefficients that aren't contributing to your model in the first place." I was thinking about the MSE/penalty trade-off, but for those coefficients there's no real trade-off lol.
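To make the "fix lambda, then find the coefficients that minimize the objective" step concrete, here is a minimal sketch for two predictors. It is an illustration only. The data are made up (one predictor with a true coefficient of 3, one that is pure noise), there is no intercept because the simulated data are centred around zero, and `ridge_two_predictors` is a hypothetical helper, not a library function. In practice you would standardize the predictors and use a library.

```python
import random

def ridge_two_predictors(x1, x2, y, lam):
    # Minimise sum((y - b1*x1 - b2*x2)^2) + lam * (b1^2 + b2^2).
    # Closed form: solve (X'X + lam*I) b = X'y, written out for the 2x2 case.
    a = sum(u * u for u in x1) + lam
    b = sum(u * v for u, v in zip(x1, x2))
    d = sum(v * v for v in x2) + lam
    e = sum(u * t for u, t in zip(x1, y))
    f = sum(v * t for v, t in zip(x2, y))
    det = a * d - b * b
    return ((d * e - b * f) / det, (a * f - b * e) / det)

# Hypothetical data: x1 truly matters (coefficient 3), x2 is pure noise (coefficient 0).
rng = random.Random(0)
n = 200
x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [3.0 * u + rng.gauss(0.0, 1.0) for u in x1]

# For each fixed lambda, compute the minimising coefficients and each one's penalty share.
lambdas = [0, 10, 100, 1000]
fits = {lam: ridge_two_predictors(x1, x2, y, lam) for lam in lambdas}
for lam, (b1, b2) in fits.items():
    print(f"lambda={lam:>4}: b1={b1:+.3f} b2={b2:+.3f} "
          f"penalty from b1={lam * b1 * b1:.2f}, from b2={lam * b2 * b2:.2f}")
```

With lambda = 0 this is ordinary least squares. As lambda grows, the large coefficient gets pulled toward zero because it dominates the penalty, while the useless coefficient is already near zero and contributes almost nothing to the penalty at any lambda.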
Thanks for the comments, Sholar. I think it's really helpful for me to engage in conversation about these things. It's really easy to sit in class and nod along thinking you understand something, but when you're forced to explain it yourself, you realize that's not the case.
Old 04-10-2017, 10:40 AM   #52
cannabusto
Pooh-Bah
 
Join Date: Jan 2013
Posts: 3,509
Re: Data Science As A Career?

I think you're understanding it better now. Just to be clear, variance is error caused by the model being too sensitive to tiny changes in training data.

Can you explain what the issue is with overfitting? What happens when a model is overfit? This is key to understanding why to use methods like ridge and LASSO instead of OLS regression.
Old 04-13-2017, 11:33 PM   #53
Priptonite
Carpal 'Tunnel
Join Date: Jan 2009
Location: Seattle
Posts: 12,471
Re: Data Science As A Career?

Quote:
Originally Posted by cannabusto View Post
I think you're understanding it better now. Just to be clear, variance is error caused by the model being too sensitive to tiny changes in training data.

Can you explain what the issue is with overfitting? What happens when a model is overfit? This is key to understanding why to use methods like ridge and LASSO instead of OLS regression.

The issue with overfitting is that your model starts to capture the variance in your training set. For example, let's say the true relationship between your variables is y = x + norm(0,1). We know that a linear model is appropriate, but fitting a 5th-order polynomial to your training set will probably produce a "better" fit. But when you then go to your test set, your 5th-order polynomial might perform worse than a standard linear model if either the training or test set had a significant outlier, for example.
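The example in this post can be tried directly. The sketch below simulates y = x + norm(0,1), fits a straight line and a 5th-order polynomial to small training sets, and averages training and test error over many repeats. The x range of [-1, 1], the sample sizes, the number of repeats and the helper names are assumptions made for illustration. The fit uses hand-rolled normal equations in plain Python so that nothing needs to be installed.

```python
import random

def fit_polynomial(xs, ys, degree):
    # Least-squares polynomial fit via the normal equations (A'A) c = A'y,
    # solved with Gaussian elimination and partial pivoting.
    m = degree + 1
    # Augmented matrix [A'A | A'y], where A[i][j] = xs[i] ** j.
    aug = [[sum(x ** (r + c) for x in xs) for c in range(m)]
           + [sum(y * x ** r for x, y in zip(xs, ys))] for r in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    coeffs = [0.0] * m
    for r in reversed(range(m)):
        tail = sum(aug[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = (aug[r][m] - tail) / aug[r][r]
    return coeffs  # coeffs[j] multiplies x ** j

def mse(coeffs, xs, ys):
    # Mean squared error of the fitted polynomial on the given points.
    preds = [sum(c * x ** j for j, c in enumerate(coeffs)) for x in xs]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)

def sample(rng, n):
    # The relationship from the post: y = x + norm(0, 1).
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys

rng = random.Random(0)
totals = {1: [0.0, 0.0], 5: [0.0, 0.0]}  # degree -> [train MSE sum, test MSE sum]
n_repeats = 200
for _ in range(n_repeats):
    train_x, train_y = sample(rng, 15)
    test_x, test_y = sample(rng, 200)
    for degree in totals:
        coeffs = fit_polynomial(train_x, train_y, degree)
        totals[degree][0] += mse(coeffs, train_x, train_y)
        totals[degree][1] += mse(coeffs, test_x, test_y)

avg = {d: (tr / n_repeats, te / n_repeats) for d, (tr, te) in totals.items()}
for d, (tr, te) in avg.items():
    print(f"degree {d}: avg train MSE={tr:.2f}  avg test MSE={te:.2f}")
```

On average the 5th-order fit should show the lower training error and the higher test error, which is the overfitting pattern described above: the extra flexibility is spent fitting the noise in each particular training set.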
Old 04-14-2017, 09:29 AM   #54
cannabusto
Pooh-Bah
 
Join Date: Jan 2013
Posts: 3,509
Re: Data Science As A Career?

That's a very good explanation, although a test set outlier would affect the prediction accuracy of either model. It's all about avoiding modeling the noise in the training set.

Old 04-14-2017, 03:59 PM   #55
Priptonite
Carpal 'Tunnel
Join Date: Jan 2009
Location: Seattle
Posts: 12,471
Re: Data Science As A Career?

Quote:
Originally Posted by cannabusto View Post
That's a very good explanation, although a test set outlier would affect the prediction accuracy of either model. It's all about avoiding modeling the noise in the training set.

Good point lol
Old 06-14-2017, 01:46 PM   #56
Priptonite
Carpal 'Tunnel
Join Date: Jan 2009
Location: Seattle
Posts: 12,471
Re: Data Science As A Career?

Finished up my machine learning class last week. Thought it was a great class and really enjoyed it. I am, however, kind of conflicted on how I feel about certain aspects of ML.

In the last two to three weeks we started talking about neural nets, random forests, bagging, boosting, and ensembles. Caveat: we didn't cover them in depth, so I'm sure I'm missing a lot of the nuance, but sometimes it felt a little like "throw **** at a wall and see what sticks." Very early on in the class we talked about inference vs prediction, and I guess I'm more drawn to inference problems? When I think about what I want to be able to do as a data scientist, it's to have someone bring me a problem and be able to tell them what's happening in their data, not just build a neural net that's uninterpretable.

Am I way off base here?
Old 06-14-2017, 02:03 PM   #57
cannabusto
Pooh-Bah
 
Join Date: Jan 2013
Posts: 3,509
Re: Data Science As A Career?

If the stakeholder cares about finding out what drives the outcome, then you would never fit a RF.

If the stakeholder doesn't care about that but instead cares about predicting that outcome with great accuracy, then maybe you would.
Old 06-14-2017, 02:13 PM   #58
Priptonite
Carpal 'Tunnel
Join Date: Jan 2009
Location: Seattle
Posts: 12,471
Re: Data Science As A Career?

Right, it's all about your customer's needs. I guess my question is really about the data science job market and what the majority of companies are looking for.
Old 06-14-2017, 03:12 PM   #59
cannabusto
Pooh-Bah
 
Join Date: Jan 2013
Posts: 3,509
Re: Data Science As A Career?

I have no idea about that. I would guess inference often matters though.
Old 06-14-2017, 07:59 PM   #60
zoltan
Carpal 'Tunnel
Join Date: Jul 2007
Location: Off my lawn you little punk!
Posts: 19,694
Re: Data Science As A Career?

In my industry, clients don't like black boxes. One of our big selling points is that our models (and to a degree, their development) are pretty transparent. FWIW, we tend to go down the random forest / logistic path most of the time.

All times are GMT -4.