Data Science As A Career?

06-01-2016 , 07:36 PM
Just accepted admission to University of Washington's Data Science Masters program, starting in September. Really excited to get started. Can provide updates/feedback if anyone is interested.
06-11-2016 , 02:25 AM
congrats! please do update.
06-12-2016 , 02:56 PM
Quote:
Originally Posted by Priptonite
Just accepted admission to University of Washington's Data Science Masters program, starting in September. Really excited to get started. Can provide updates/feedback if anyone is interested.
Saw several talks by UW data scientists (staff) recently (Higher Ed conference). Good work being done there.
06-22-2016 , 01:02 AM
Quote:
Originally Posted by Priptonite
Just accepted admission to University of Washington's Data Science Masters program, starting in September. Really excited to get started. Can provide updates/feedback if anyone is interested.

id be interested to hear how that goes for you too. good luck!
06-24-2016 , 02:10 PM
Quote:
Originally Posted by Priptonite
Just accepted admission to University of Washington's Data Science Masters program, starting in September. Really excited to get started. Can provide updates/feedback if anyone is interested.
subscribed
10-03-2016 , 01:11 PM
Alright, so my master's program officially started this past Wednesday. Here is the program website. They offer both part-time (1 course/quarter) and full-time study (2 courses/quarter). I'll be doing the part-time track. My course progression looks like this:

Intro to Stats and Probability
Applied Stats and Experimental Design
Statistical Machine Learning
Data Visualization and Exploratory Analytics
Data Management
Software Design for Data Science
Scalable Data Systems and Algorithms
Human-Centered Data Science
Capstone Project

My stats background isn't super strong so the first two classes will be a welcome review for me, even if they might be somewhat dull. I'm really excited to get to machine learning and data viz, though.

Classes are 5 hours long, which is ****ing brutal. Our stats professor is bringing in a bunch of guest lecturers (local data science professionals, university researchers) to break up the monotony, but it's still a crazy amount of time to be in a classroom. We also have a lab session for the last hour of class where we work in R. The first one was extremely basic; I'm hoping that we're asked to do a lot of R in homeworks as we progress. One of my big concerns going into the program was/is that my programming isn't up to snuff. I took an intro Java class in college but otherwise just have a handful of Coursera classes in R and Python.

I don't think this quarter will have much in the way of interesting projects, so I'm not sure I'll have much content to post. I can do a "who am I, how did I get here?" post if people are interested.
10-03-2016 , 08:00 PM
Quote:
Originally Posted by Priptonite
I can do a "who am I, how did I get here?" post if people are interested.
I'd be interested. Best of luck w/ everything!
10-04-2016 , 01:48 AM
@Priptonite : You should do a separate thread so that those of us interested in following your progress can follow more closely imo.
03-15-2017 , 01:47 AM
haven't forgotten about updating progress! just wrapping up the 2nd quarter, will post something soon
03-15-2017 , 11:23 AM
Great, interested to hear about it. And feel free to pm me any time about your coursework. May be able to help.
03-15-2017 , 04:09 PM
Alright, so at this stage I've finished the first two classes:

Intro to Stats and Probability
Applied Stats and Experimental Design

I think these were a pretty logical place to start for the program and as I mentioned previously it'd been a while since I'd taken stats classes, so I was glad to have these on the course schedule. I think some other programs might assume you have more stats knowledge up front and not start from the bottom up.

I don't think there's too much to say about the content of the courses. Basic probability theory, common distributions, moments, MGFs, hypothesis testing, linear regression, GLMs, robust methods, bootstrapping/jackknifing. We did very little programming in the first course, but all of our homeworks were exclusively in R for the 2nd. I really love rmarkdown now, which is a package I knew nothing about.

Overall I have mixed feelings. It's funny being an adult (does 26 count as an adult?) and going back to school vs being a college student. Neither class had any tests, everything was just homework based. As a college student I would've loved that. Knowing that I'm going to try to build a career on this degree in ~2 years though... There were a lot of times throughout the quarters where I felt that if you'd asked me a question on what we learned 3 weeks ago I'd have to go to a book/online to get you the answer. Not having to study old material and only having class once per week really reduced how much information I retained. Obviously that can be remedied by me having good study habits regardless of what's asked of me; I'm working on that.

Another thing I mentioned in an earlier post was concern with programming skill. Thus far it hasn't been a problem, but we did pretty simple R. Our professor took the approach of, "I'm not going to teach you about packages because I want you to learn the basic way to do things." Additionally, the course isn't designed to teach you R, just what you need to know to solve the HW problems. I was a bit bummed that we didn't spend more time learning R, but a classmate posted this R for Data Science online book and I've been working through that. I don't know that we're going to get into much Python in the program, which is a bit concerning.

That being said, the mental approach I'm taking is that what I really need to get out of this program is the fundamental stats knowledge, knowledge of how to approach a data science problem, and then an understanding of SOME tools that I can work with. No program will teach you all of SAS, R, Python, etc. (I assume lol). It's just daunting thinking of applying to jobs at the end and being able to only work in R.

Looking ahead class-wise, I think the next class is the big one, Statistical Machine Learning. I'm really psyched for it and am going to try to develop a habit of studying beyond what's asked of me for homework. One of my concerns with the program is that after ML I dive into the "data" track. Visualization, databases, software design. I'm worried that during that year I'm going to forget all of my stats from this year.

Looking ahead in general, I think I'm very aware that I need to be doing things on the side. Next year for example, I might need some personal projects to keep my stats sharp. Additionally, I don't particularly know that I'm going to have a project portfolio to show potential employers at the end of all of this (I'll have at least the 1 capstone project), and my current resume is lacking in data analytics. I'm planning on starting some sort of blog or at least a github page and developing a project portfolio. I figure after the ML class I can jump into some Kaggle competitions or something. I've also got an idea for a baseball analytics project I'd like to do. Hoping to get that spun up in the next month or so. I'm really struggling with finding enough free time to tackle these personal initiatives though.

I'm feeling pretty good, overall. This is the stuff that makes me tick and while the 5 hour classes can be brutal I also find myself really engaged and excited. My worry at this point is if I'm going to know enough and have enough experience at the end of the program to get a real data science job. That being said, I'm only 2 classes in and am probably making a mountain out of a molehill.

Happy to answer any questions, and still planning on writing some sort of 'how did I get here?' post.
03-16-2017 , 10:20 AM
very nice writeup, ty
03-16-2017 , 10:37 AM
Perhaps do your side projects in python if you're worried about that.

And I feel you on having to look everything up. I still look up everything before I do it, just to be sure. You have class 0 times a week once you graduate.
03-17-2017 , 05:02 AM
I last worked two years ago in this field at a major bank. There was an absolute critical shortage of people who knew stats, could code in SQL/R, and had decent presentation and communication skills in English. These folks were needed in nearly every department of the bank in some shape or fashion. There wasn't even a question of finding people with a good finance or banking background in addition to these other skills as they were nearly all making way more than this job paid, and this was a six figure salary job already.

It was so bad that we were hiring people who were good at one of these three and passable in another with the idea of training them up or having someone else do their work in the third category. I was always really shocked at the stats PhDs that didn't know SQL at all and weren't particularly good at R or SAS as we gave most people the option to use whatever they wanted including Stata and stuff.

This was two years ago and I have no idea what things are like now; I imagine it is a little tougher but not much.

Last edited by The IRS; 03-17-2017 at 05:07 AM.
03-17-2017 , 02:38 PM
'The IRS', what are you some kind of spook?
03-17-2017 , 09:18 PM
Quote:
Originally Posted by The IRS
I last worked two years ago in this field at a major bank. There was an absolute critical shortage of people who knew stats, could code in SQL/R, and had decent presentation and communication skills in English. These folks were needed in nearly every department of the bank in some shape or fashion. There wasn't even a question of finding people with a good finance or banking background in addition to these other skills as they were nearly all making way more than this job paid, and this was a six figure salary job already.

It was so bad that we were hiring people who were good at one of these three and passable in another with the idea of training them up or having someone else do their work in the third category. I was always really shocked at the stats PhDs that didn't know SQL at all and weren't particularly good at R or SAS as we gave most people the option to use whatever they wanted including Stata and stuff.

This was two years ago and I have no idea what things are like now; I imagine it is a little tougher but not much.
but sql can be learned in like 2 days. or just look it up as you go. or am I misunderstanding what sql is?
03-17-2017 , 09:32 PM
If you think you can master it in 2 days, then probably you misunderstand it.
03-20-2017 , 04:07 PM
learning and being able to use is not the same as mastering. still, sql seems pretty straightforward. at least the way I use it.
03-20-2017 , 09:06 PM
Quote:
Originally Posted by Victor
learning and being able to use is not the same as mastering. still, sql seems pretty straightforward. at least the way I use it.
I agree that you can probably understand the basics in 2 days. But there are many people I know who use SQL fairly frequently that don't really understand it. And I have seen many professionals make very egregious mistakes in query design, and in table design.

For example, I once did a contract job where I was fixing the billing system for a power company. They were in real trouble because their system was taking up to a week to calculate their bills. There were some queries where for a single customer, calculating a bill could take hours. A few would essentially never finish. They were billing these customers by hand.

The schema (DDL) was complete trash - it was implemented in a way that was difficult to query.

And to top it off, every sub-report of the bill was required to be in one query, which led to some severe gymnastics to get everything in one place, resulting in queries that would literally take hours to complete.

When I was done, the system could bill about 1 customer per second.

This is how bad you can **** up SQL: you can make a very high dollar product that is 3 orders of magnitude slower than it should be.
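To make the point about query design concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the billing system described above; the `customers` and `meter_readings` tables, their columns, and the flat rate are all made up for illustration. It only shows that the same bill totals can come from one query per customer or from a single aggregate query, and the two shapes tend to scale very differently on a real database server.

```python
import sqlite3

# Hypothetical schema and data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE meter_readings (
        customer_id INTEGER REFERENCES customers(id),
        kwh REAL,
        rate REAL
    );
    CREATE INDEX idx_readings_customer ON meter_readings(customer_id);
""")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, f"customer {i}") for i in range(1, 101)],
)
conn.executemany(
    "INSERT INTO meter_readings VALUES (?, ?, ?)",
    [(c, 10.0 * d, 0.12) for c in range(1, 101) for d in range(1, 31)],
)


def bills_row_by_row(conn):
    """One query per customer, with the arithmetic done in application code."""
    bills = {}
    for (cid,) in conn.execute("SELECT id FROM customers").fetchall():
        total = 0.0
        readings = conn.execute(
            "SELECT kwh, rate FROM meter_readings WHERE customer_id = ?",
            (cid,),
        )
        for kwh, rate in readings:
            total += kwh * rate
        bills[cid] = total
    return bills


def bills_set_based(conn):
    """A single aggregate query; the database does the grouping and summing."""
    rows = conn.execute("""
        SELECT customer_id, SUM(kwh * rate)
        FROM meter_readings
        GROUP BY customer_id
    """)
    return dict(rows)
```

Both functions return a dict mapping customer id to bill total. On an in-memory toy like this the difference is small; the gap shows up once there is network latency per query and millions of rows.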
03-22-2017 , 10:07 AM
ok that is absurd.

and while that is obv a problem with implementing sql, I think there should be other controls in place to avoid such issues. like, I just can not imagine releasing something that awful from dev/qa into production.
03-22-2017 , 12:33 PM
Heh, this was way worse than "released to production"

It was a very high dollar billing system sold to, I assume, big government and corporate clients.

This stuff is, or at least was, completely standard. Everything is ****. I never ran out of things to do as a travelling fix-it programmer.
03-22-2017 , 05:39 PM
Quote:
Originally Posted by RustyBrooks
Heh, this was way worse than "released to production"

It was a very high dollar billing system sold to, I assume, big government and corporate clients.

This stuff is, or at least was, completely standard. Everything is ****. I never ran out of things to do as a travelling fix-it programmer.
+1. You will be surprised what even billion dollar companies will ship out the door for software.

Talking as a business consumer of said software not as a developer that has seen it happen.
03-22-2017 , 06:14 PM
And on the other end, if i want to change a single word it requires 3 signoffs and a few weeks.
04-08-2017 , 07:18 PM
2 weeks into the machine learning class and **** is getting real lol. Week 1 was an intro: supervised vs unsupervised learning, regression vs classification problems, a few examples, etc.

Then this week we jump right into defining loss functions, ridge regression, gradient descent w/ backtracking line search, and cross validation. Still wrapping my head around ridge regression and how the penalizing results in a better model.

I think it's basically a protection against overfitting? So you're minimizing your squared error + penalization term. Your high influence predictors will lower your squared error, but up your penalty. I guess what I'm missing is the low influence predictors. They give high squared error but low penalties. Seems weird that you'd want to "inflate" these.
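For anyone following along, the pieces mentioned above (squared error + penalty, gradient descent, backtracking line search) can be sketched in a few lines of plain Python. This is only a rough sketch under my own assumptions, not the course's code: the objective is mean squared error plus `lam` times the squared norm of the coefficients, there is no intercept, the data is simulated, and the `alpha` and `shrink` values are arbitrary choices.

```python
import random


def predict(beta, row):
    return sum(b * x for b, x in zip(beta, row))


def ridge_objective(beta, X, y, lam):
    """Mean squared error plus lam * ||beta||^2 (no intercept, for simplicity)."""
    mse = sum((yi - predict(beta, xi)) ** 2 for xi, yi in zip(X, y)) / len(y)
    return mse + lam * sum(b * b for b in beta)


def ridge_gradient(beta, X, y, lam):
    """Gradient of ridge_objective with respect to beta."""
    n = len(y)
    grad = [2.0 * lam * b for b in beta]
    for xi, yi in zip(X, y):
        resid = yi - predict(beta, xi)
        for j, x in enumerate(xi):
            grad[j] -= 2.0 * resid * x / n
    return grad


def fit_ridge(X, y, lam, alpha=0.5, shrink=0.8, tol=1e-8, max_iter=10000):
    """Gradient descent with a backtracking line search."""
    beta = [0.0] * len(X[0])
    for _ in range(max_iter):
        grad = ridge_gradient(beta, X, y, lam)
        grad_sq = sum(g * g for g in grad)
        if grad_sq < tol:
            break
        f0 = ridge_objective(beta, X, y, lam)
        step = 1.0
        # Shrink the step until the sufficient-decrease condition holds.
        while True:
            candidate = [b - step * g for b, g in zip(beta, grad)]
            if ridge_objective(candidate, X, y, lam) <= f0 - alpha * step * grad_sq:
                break
            step *= shrink
        beta = candidate
    return beta


# Simulated data: one strong predictor and one weak one (made-up coefficients).
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(200)]
y = [3.0 * x1 + 0.1 * x2 + random.gauss(0, 0.5) for x1, x2 in X]

beta_ols = fit_ridge(X, y, lam=0.0)    # no penalty: ordinary least squares
beta_ridge = fit_ridge(X, y, lam=1.0)  # penalized fit
```

Comparing `beta_ols` and `beta_ridge` shows the penalized coefficient vector has the smaller norm, which is the shrinkage the penalty is there to produce.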
04-08-2017 , 09:35 PM
Quote:
Originally Posted by Priptonite
Still wrapping my head around ridge regression and how the penalizing results in a better model.

I think it's basically a protection against overfitting?
Yes. One way to think of it is in terms of the "bias-variance" tradeoff (which you probably haven't encountered yet...); another is as a Bayesian prior that the predictors are useless (and so your prior is that they take a zero coefficient).

Quote:
So you're minimizing your squared error + penalization term. Your high influence predictors will lower your squared error, but up your penalty.
This doesn't seem like a good intuition...it's more that they reduce error "more" than they contribute to a penalty...

Quote:
I guess what I'm missing is the low influence predictors. They give high squared error but low penalties. Seems weird that you'd want to "inflate" these.
...and this seems even more strange. How are they getting "inflated"?
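
A simplified worked case may help with the intuition. Assume the ridge objective is written as $\|y - X\beta\|^2 + \lambda\|\beta\|^2$ and, purely for illustration, that the predictors are orthonormal ($X^\top X = I$). Then the ridge solution is just the least-squares solution scaled down:

```latex
\hat{\beta}^{\text{ridge}}
  = (X^\top X + \lambda I)^{-1} X^\top y
  = \frac{1}{1+\lambda}\, X^\top y
  = \frac{\hat{\beta}^{\text{OLS}}}{1+\lambda}
```

In this special case every coefficient, strong or weak, is pulled toward zero by the same factor and nothing gets inflated. With correlated predictors the picture is messier, but the overall coefficient norm still shrinks as $\lambda$ grows. The same estimate is also the posterior mode if you assume Gaussian noise with variance $\sigma^2$ and a prior $\beta \sim N(0, (\sigma^2/\lambda) I)$, which is the "prior centered on zero" reading mentioned above.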