The book covers everything from the basic foundations to advanced theory and algorithms. I copied the table of contents below. It’s fairly math-heavy, but well explained with visual examples and pseudo-code.
Moreover, the book contains multiple exercises for you to internalize the knowledge and skills.
As an added bonus, the professors teach a number of machine learning courses, and you can access their lecture slides and materials for free via the book’s website.
Machine learning is one of the fastest growing areas of computer science, with far-reaching applications. The aim of this textbook is to introduce machine learning, and the algorithmic paradigms it offers, in a principled way. The book provides a theoretical account of the fundamentals underlying machine learning and the mathematical derivations that transform these principles into practical algorithms. Following a presentation of the basics, the book covers a wide array of central topics unaddressed by previous textbooks. These include a discussion of the computational complexity of learning and the concepts of convexity and stability; important algorithmic paradigms including stochastic gradient descent, neural networks, and structured output learning; and emerging theoretical concepts such as the PAC-Bayes approach and compression-based bounds. Designed for advanced undergraduates or beginning graduates, the text makes the fundamentals and algorithms of machine learning accessible to students and non-expert readers in statistics, computer science, mathematics and engineering.
The Open Source Society University offers a complete education in computer science using online materials.
According to their GitHub page, the curriculum is suited for people with the discipline, will, and good habits to obtain this education largely on their own, but who’d still like support from a worldwide community of fellow learners.
Intro CS: for students to try out CS and see if it’s right for them
Core CS: corresponds roughly to the first three years of a computer science curriculum, taking classes that all majors would be required to take
Advanced CS: corresponds roughly to the final year of a computer science curriculum, taking electives according to the student’s interests
Final Project: a project for students to validate, consolidate, and display their knowledge, to be evaluated by their peers worldwide
Pro CS: graduate-level specializations students can elect to take after completing the above curriculum if they want to maximize their chances of getting a good job
It is possible to finish Core CS within about 2 years if you plan carefully and devote roughly 18-22 hours/week to your studies. Courses in Core CS should be taken linearly if possible, but since a perfectly linear progression is rarely possible, each class’s prerequisites are specified so that you can design a logical but non-linear progression based on the class schedules and your own life plans.
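To make the idea of a "logical but non-linear progression" concrete, here is a minimal sketch of how you could derive a valid study order from a list of prerequisites. The course names and their prerequisites below are made up for illustration and are not taken from the actual curriculum. The sketch assumes Python 3.9 or later, which ships the standard-library graphlib module.

```python
from graphlib import TopologicalSorter

# Hypothetical courses mapped to the set of courses they require.
# These names are illustrative only, not the real curriculum.
PREREQUISITES = {
    "intro_programming": set(),
    "discrete_math": set(),
    "data_structures": {"intro_programming"},
    "algorithms": {"data_structures", "discrete_math"},
    "operating_systems": {"data_structures"},
}


def study_order(prerequisites):
    """Return one order in which every course comes after its prerequisites."""
    return list(TopologicalSorter(prerequisites).static_order())


if __name__ == "__main__":
    for position, course in enumerate(study_order(PREREQUISITES), start=1):
        print(position, course)
```

Several orders can be valid at once, which is exactly the flexibility the curriculum describes: any order works as long as no course appears before its prerequisites.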
Both in science and business, we often experience difficulties collecting enough data to test our hypotheses, either because target groups are small or hard to access, or because data collection entails prohibitive costs.
Such obstacles may result in data sets that are too small for the complexity of the statistical model needed to answer the questions we’re really interested in.
This unique book provides guidelines and tools for implementing solutions to issues that arise in small sample studies. Each chapter illustrates statistical methods that allow researchers and analysts to apply the optimal statistical model for their research question when the sample is too small.
This book will enable anyone working with data to test their hypotheses even when the statistical model required for answering their questions is too complex for the sample sizes they can collect. The covered statistical models range from the estimation of a population mean to models with latent variables and nested observations, and solutions include both classical and Bayesian methods. All proposed solutions are described in steps researchers can implement with their own data and are accompanied by annotated syntax in R.
The “world wide web” hosts millions of datasets, on nearly any topic you can think of. Google’s Dataset Search has indexed almost 25 million of these datasets, giving you a single entry point to search for datasets online. After a year of testing, Dataset Search is now officially out of beta.
Following the beta, Dataset Search now includes filters based on the type of dataset you want (e.g., tables, images, text) and on whether the dataset is open source or open access. For datasets about geographic areas, you can see a map. The quality of dataset descriptions has improved greatly, and the tool now has a mobile version.
Because I wrote about Project Euler and CodinGame before, someone recommended CodeWars to me. CodeWars offers free online learning exercises to develop your programming skills through fun daily challenges.
In line with Project Euler, you are tasked with solving increasingly complex programming challenges. At CodeWars, these little problems you need to solve with code are called kata.
Kata take a test-driven development approach: the programs you write need to pass the tests of the developer who made the kata in the first place. Only then are you awarded honour, and only then can you earn ranks and progress to the more complex kata.
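To give a feel for the test-driven idea, here is a minimal sketch in Python. The problem, the function names, and the test cases are all made up for illustration. This is not CodeWars' actual test framework, just plain Python that mimics the workflow: the kata author supplies the tests, and your solution only counts once it passes all of them.

```python
def multiples_sum(limit):
    """Your solution: sum of all natural numbers below `limit` divisible by 3 or 5."""
    return sum(n for n in range(limit) if n % 3 == 0 or n % 5 == 0)


# Hypothetical tests written by the kata author: (input, expected output).
KATA_TESTS = [
    (10, 23),  # 3 + 5 + 6 + 9
    (16, 60),  # 3 + 5 + 6 + 9 + 10 + 12 + 15
    (0, 0),    # nothing below zero to add up
]


def passes_all_tests(solution, tests=KATA_TESTS):
    """Return True only if the solution gives the expected output for every test."""
    return all(solution(argument) == expected for argument, expected in tests)


if __name__ == "__main__":
    print("Kata solved:", passes_all_tests(multiples_sum))
```

The nice part of this setup is that you never have to guess whether you are done: either every test passes or you go back to your code.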
Sounds fun, right? I’m definitely going to check this out, as they support a wide range of programming languages, each with many kata to solve!