Saturday, July 4, 2015

Resources for Getting Started With Data Science

Many of my MBA students who pursue jobs in product management, on growth teams, or as founders want to build data analytics and data science skills.

Peter Jamieson, a former student of mine who is now a data scientist at Pixability, a Boston-based video ad analytics startup, shared the following suggestions for getting started with data science:

Here's a list of some of the resources that have been helpful for me as I've gotten up to speed in data science. 

My go-to tool for working with data these days is Python. Installing Python along with all of the scientific libraries you need can be tough when you are just getting started. Fortunately, Anaconda and Enthought both offer free distributions that are nearly plug-and-play. 

IPython is a tool, included in those downloads, that (among other features) lets you write and run snippets of code in your browser. Getting comfortable with launching and using IPython notebooks makes it easier to follow a number of online courses and to share code. 
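Once a distribution is installed, a quick way to confirm that things are in place is to check which libraries Python can find. The snippet below is a minimal sketch that can be pasted into a notebook cell or run as a script. The package list is an assumption for illustration (it is not part of Peter's list), and `check_packages` is a hypothetical helper name; adjust both to whatever your course or project needs.

```python
import importlib.util

# Assumed checklist of commonly used data science packages (illustrative only).
PACKAGES = ["numpy", "scipy", "pandas", "matplotlib", "sklearn", "IPython"]


def check_packages(names):
    """Map each top-level package name to True if Python can locate it."""
    # find_spec returns None when a top-level package is not installed,
    # and it does not import the package itself.
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages(PACKAGES).items():
        status = "installed" if found else "MISSING"
        print(f"{name:12s} {status}")
```

Anything reported as missing can usually be added with the distribution's own package manager.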

There is an incredible amount of content out there and it can be hard to sift through it all. I've picked some highlights, some of which are focused on business understanding, some on implementation. 

Books:
  • Data Science for Business:  Some math but no programming; a good resource for getting started that provides business use cases.
  • DataSmart:  Implements popular data science models in Excel and contains an intro to R. I'd recommend this for people without a programming background who are just starting out.
  • Here's a list of more advanced machine learning titles from Quora -- very technical, not for the faint of heart.
  • Other (free) books covering everything from coding to managing data science teams.

Blogs:

Online Classes:

Lectures/PowerPoints:

Other Lists -- places to look if you can't find what you want above:

Thanks to Peter for sharing this list. Readers: if you have items to add, please use the comments section.

Addendum, Oct. 26, 2015: LearnDataSci.com has compiled a list of free ebooks about data science (thanks, Brendan!).