Welcome to my repository of data science, machine learning, and statistics resources. Software-specific materials are largely listed under their respective overviews: R Resources & Python Resources. I also host a list of SQL Resources and datasets for practicing programming. If you have any additions, please comment or contact me!
LAST UPDATED: 21-05-2018
Courses:
- Udacity: Introduction to Descriptive Statistics
- Khan Academy: Linear Algebra
- edX: Introduction to Probability: Part 1 – the Fundamentals (MIT)
- edX: Introduction to Computer Science with Python (MIT)
- YouTube: Artificial Intelligence 2010/2015 (MIT)
- Coursera: Data Science Specialization (Johns Hopkins)
- Coursera: Machine Learning (Stanford)
- Coursera: Applied Data Science with Python (Michigan)
- Coursera: Applied Machine Learning in Python (Michigan)
- Introduction to Statistical Learning (Hastie & Tibshirani, 2014)
- Statistical Computing for Scientists and Engineers (Notre Dame, 2017)
- Survey Data Collection and Analytics Specialization @Coursera
- UC Business Analytics R Programming Guide (Cincinnati)
Video:
- Deep Learning Book – Accompanying YouTube Videos
- 3Blue1Brown YouTube Channel
- Deep Learning Demystified – Brandon Rohrer YouTube Channel
Books:
- Machine Learning, Neural and Statistical Classification (Michie, Spiegelhalter, & Taylor, 1994)
- Introduction to Machine Learning (Nilsson, 1998)
- Elements of Statistical Learning (Hastie, Tibshirani, & Friedman, 2001)
- Information Theory, Inference, and Learning Algorithms (MacKay, 2003)
- Data Mining Techniques for Marketing, Sales and CRM (Berry & Linoff, 2004)
- Data Mining: Practical Machine Learning Tools and Techniques (Witten & Frank, 2005)
- Gaussian Processes for Machine Learning (Rasmussen & Williams, 2006)
- Inductive Logic Programming and Its Application to the Temporal Expression Chunking Problem (Poveda & Borràs, 2007)
- Introduction to Machine Learning (Shashua, 2008)
- Modeling with Data (Klemens, 2009)
- Mining of Massive Datasets (Leskovec, Rajaraman, & Ullman, 2010)
- An Introduction to Data Science (Stanton, 2012)
- Think Bayes: Bayesian Statistics Made Simple (Downey, 2012)
- Machine Learning with R (Lantz, 2013)
- Introduction to Statistical Thought (Lavine, 2013)
- Introduction to Data Technologies (Murrell, 2013)
- Introduction to Statistical Learning (James, Witten, Hastie, & Tibshirani, 2013)
- Think Stats: Exploratory Data Analysis in Python (Downey, 2014)
- Understanding Machine Learning: From Theory to Algorithms (Shalev-Shwartz & Ben-David, 2014)
- Big Data, Data Mining, and Machine Learning (Dean, 2014)
- Mining the Social Web: Data Mining Facebook, Twitter, LinkedIn, Google+, GitHub and More (Russell, 2014)
- Data Mining and Analysis: Fundamental Concepts and Algorithms (Zaki & Meira Jr., 2014)
- Regression Models for Data Science in R (Caffo, 2015)
- OpenIntro Statistics (Diez, Barr, & Cetinkaya-Rundel, 2015)
- Bayesian Reasoning and Machine Learning (Barber, 2016)
- Deep Learning (Goodfellow, Bengio, & Courville, 2016)
- R Programming for Data Science (Peng, 2016)
- Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference (Davidson-Pilon, 2016)
- Introduction to Empirical Bayes (Robinson, ???)
- Mining Massive Datasets (Leskovec, Rajaraman, & Ullman, ???) @Stanford
- A Programmer’s Guide to Data Mining: The Ancient Art of the Numerati (Zacharski, ???)
- Data Science Live Book (Casas, 2017)
- Statistical Foundations of Machine Learning (Bontempi & Taieb, 2017)
- Machine Learning – Complete Guide (Wikipedia, ???)
- Foundations of Data Science (Adhikari & DeNero, ???) @Berkeley
- Foundations of Data Science* (Blum, Hopcroft, & Kannan, 2017) @Cornell
- Computer Age Statistical Inference (Efron & Hastie, 2017)
- A Course in Machine Learning (Daumé III, 2017)
- R for Data Science (Grolemund & Wickham, 2017)
- Machine Learning God (Stavrev, 2017)
- Overview of survey books and courses via freerangestatistics.info
- Feature Engineering and Selection (Kuhn & Johnson, 2018)
Sentiment Lexicons:
- Sentiment Lexicons for 81 Languages: Sentiment Polarity Lexicons (Positive vs. Negative)
- SentiWordNet: Sentiment WordNet Project
- Thai Sentiment Analysis Toolkit: Positive, negative and swear words in Thai
- German Sentiment Analysis Toolkit: 3468 German words sorted by sentiment
- VADER Lexicon: Lexicon used by the VADER sentiment algorithm
- Opinion Lexicon: For Sentiment Analysis
- VerbNet: VerbNet Lexicon, Version 2.1
Cheatsheets:
- Google Developers Machine Learning Glossary
- Neural Network Architectures Cheatsheet by Asimovinstitute.org
- Neural Network Cell Inner Processes Cheatsheet by Asimovinstitute.org
- Neural Network Full Process Cheatsheet by Asimovinstitute.org
- TensorFlow Cheatsheet by Altoros.com
- Statistics & Probability Cheatsheet
- Statistics Cheatsheet by MIT
- Linear Algebra Cheatsheet by minireference.com
- Machine Learning Algorithms Cheatsheet by Scikit-Learn.org
- Machine Learning Algorithms Cheatsheet by Microsoft Azure
- Machine Learning Equations & Tricks Cheatsheet by github.com/soulmachine
- Supervised Learning Python Implementations by github.com/rcompton
- Machine Learning Algorithms R Implementation by Ajitesh Kumar
Other:
- Google Fonts – huge collection of text fonts
- Checkmycolours.com – check whether your colours have enough contrast
- Vischeck.com – check whether your images are colorblind-friendly
- Coblis – Color Blind Simulation
- Color Oracle – color blind simulation
- Chrome color enhancer – customizable color filter for website browsing