Category: learning

Become a Data Science Professional

Become a Data Science Professional

Amit Ness gathered an impressive list of learning resources for becoming a data scientist.

It’s great to see that he shares them publicly on his github so that others may follow along.

But beware, this learning guideline covers a multi-year process.

Amit’s personal motto seems to be “Becoming better at data science every day“.

Completing the hyperlinked list below will take you several hundreds days at the least!

Learning Philosophy:

Index

Bayesian Statistics using R, Python, and Stan

Bayesian Statistics using R, Python, and Stan

For a year now, this course on Bayesian statistics has been on my to-do list. So without further ado, I decided to share it with you already.

Richard McElreath is an evolutionary ecologist who is famous in the stats community for his work on Bayesian statistics.

At the Max Planck Institute for Evolutionary Anthropology, Richard teaches Bayesian statistics, and he was kind enough to put his whole course on Statistical Rethinking: Bayesian statistics using R & Stan open access online.

You can find the video lectures here on Youtube, and the slides are linked to here:

Richard also wrote a book that accompanies this course:

For more information abou the book, click here.

For the Python version of the code examples, click here.

Google’s Responsible AI Practices

Google’s Responsible AI Practices

As a company that uses a lot of automation, optimization, and machine learning in their day-to-day business, Google is set on developing AI in a socially responsible way.

Fortunately for us, Google decided to share their principles and best practices for us to read.

Google’s Objectives for AI applications

The details behind the seven objectives below you can find here.

  1. Be socially beneficial.
  2. Avoid creating or reinforcing unfair bias.
  3. Be built and tested for safety.
  4. Be accountable to people.
  5. Incorporate privacy design principles.
  6. Uphold high standards of scientific excellence.
  7. Be made available for uses that accord with these principles.

Moreover, there are several AI technologies that Google will not build:

Google’s best practices for Responsible AI

For the details behind these six best practices, read more here.

  1. Use a Human-centered approach (see also here)
  2. Identify multiple metrics to assess training and monitoring
  3. When possible, directly examine your raw data
  4. Understand the limitations of your dataset and model
  5. Test, Test, Test,
  6. Continue to monitor and update the system after deployment
Learn to style HTML using CSS — Tutorials by Mozilla

Learn to style HTML using CSS — Tutorials by Mozilla

Cascading Stylesheets — or CSS — is the first technology you should start learning after HTML. While HTML is used to define the structure and semantics of your content, CSS is used to style it and lay it out. For example, you can use CSS to alter the font, color, size, and spacing of your content, split it into multiple columns, or add animations and other decorative features.

https://developer.mozilla.org/en-US/docs/Learn/CSS

I was personally encoutered CSS in multiple stages of my Data Science career:

  • When I started using (R) markdown (see here, or here), I could present my data science projects as HTML pages, styled through CSS.
  • When I got more acustomed to building web applications (e.g., Shiny) on top of my data science models, I had to use CSS to build more beautiful dashboard layouts.
  • When I was scraping data from Ebay, Amazon, WordPress, and Goodreads, my prior experiences with CSS & HTML helped greatly to identify and interpret the elements when you look under the hood of a webpage (try pressing CTRL + SHIFT + C).

I know others agree with me when I say that the small investment in learning the basics behind HTML & CSS pay off big time:

I read that Mozilla offers some great tutorials for those interested in learning more about “the web”, so here are some quicklinks to their free tutorials:

Screenshot via developer.mozilla.org/en-US/docs/Learn/CSS/CSS_layout/Introduction
David Robinson’s R Programming Screencasts

David Robinson’s R Programming Screencasts

David Robinson (aka drob) is one of the best known R programmers.

Since a couple of years David has been sharing his knowledge through streaming screencasts of him programming. It’s basically part of R’s #tidytuesday movement.

Alex Cookson decided to do us all a favor and annotate all these screencasts into a nice overview.

https://docs.google.com/spreadsheets/d/1pjj_G9ncJZPGTYPkR1BYwzA6bhJoeTfY2fJeGKSbOKM/edit#gid=444382177

Here you can search for video material of David using a specific function or method. There are already over a thousand linked fragments!

Very useful if you want to learn how to visualize data using ggplot2 or plotly, how to work with factors in forcats, or how to tidy data using tidyr and dplyr.

For instance, you could search for specific R functions and packages you want to learn about:

Thanks David for sharing your knowledge, and thanks Alex for maintaining this overview!

An ABC of Artificial Intelligence Concepts

An ABC of Artificial Intelligence Concepts

Yet another great resource by one of the teams at Google in collaboration with Oxford:

An ABC of Artificial Intelligence-related concepts!

The G is for GANs: Generative Adverserial Networks.

Want to know what GANs are all about?

Just read along with Google’s laymen explanation! Here’s an excerpt:

The P is for Predictions.

Currently the ABC is only available in English, but other language translations come available soon.

Check it out yourself!