Category: reading

Understanding Machine Learning (free e-book)

Understanding Machine Learning (free e-book)

Shai Shalev-Shwartz and Shai Ben-David of the Hebrew University of Jerusalem made their machine learning book free to download.

The book covers the basic foundations up to advanced theory and algorithms. I copied the table of contents below. It’s kind of math heavy, but well explained with visual examples and pseudo-code.

Moreover, the book contains multiple exercises for you to internalize the knowledge and skills.

As an added bonus, the professors teach a number of machine learning courses, the lecture slides and materials of which you can also access for free via the book’s website.

Machine learning is one of the fastest growing areas of computer science, with far-reaching applications. The aim of this textbook is to introduce machine learning, and the algorithmic paradigms it offers, in a principled way. The book provides a theoretical account of the fundamentals underlying machine learning and the mathematical derivations that transform these principles into practical algorithms. Following a presentation of the basics, the book covers a wide array of central topics unaddressed by previous textbooks. These include a discussion of the computational complexity of learning and the concepts of convexity and stability; important algorithmic paradigms including stochastic gradient descent, neural networks, and structured output learning; and emerging theoretical concepts such as the PAC-Bayes approach and compression-based bounds. Designed for advanced undergraduates or beginning graduates, the text makes the fundamentals and algorithms of machine learning accessible to students and non-expert readers in statistics, computer science, mathematics and engineering.

About the book

If you want to reward the professors for their efforts, please do buy a hardcopy version of book.

Table of contents

Part I: Foundations

  • A gentle start
  • A formal learning model
  • Learning via uniform convergence
  • The bias-complexity trade-off
  • The VC-dimension
  • Non-uniform learnability
  • The runtime of learning

Part II: From Theory to Algorithms

  • Linear predictors
  • Boosting
  • Model selection and validation
  • Convex learning problems
  • Regularization and stability
  • Stochastic gradient descent
  • Support vector machines
  • Kernel methods
  • Multiclass, ranking, and complex prediction problems
  • Decision trees
  • Nearest neighbor
  • Neural networks

Part III: Additional Learning Models

  • Online learning
  • Clustering
  • Dimensionality reduction
  • Generative models
  • Feature selection and generation

Part IV: Advanced Theory

  • Rademacher complexities
  • Covering numbers
  • Proof of the fundamental theorem of learning theory
  • Multiclass learnability
  • Compression bounds
  • PAC-Bayes

Appendices

  • Technical lemmas
  • Measure concentration
  • Linear algebra
JavaScript for R — ebook

JavaScript for R — ebook

The R programming language has seen the integration of many languages; C, C++, Python, to name a few, can be seamlessly embedded into R so one can conveniently call code written in other languages from the R console. Little known to many, R works just as well with JavaScript—this book delves into the various ways both languages can work together.

https://book.javascript-for-r.com/

John Coene is an well-known R and JavaScript developer. He recently wrote a book on JavaScript for R users, of which he published an online version free to access here.

The book is definitely worth your while if you want to better learn how to develop front-end applications (in JavaScript) on top of your statistical R programs. Think of better understanding, and building, yourself Shiny modules or advanced data visualizations integrated right into webpages.

A nice step on your development path towards becoming a full stack developer by combining R and JavaScript!

Yet most R developers are not familiar with one of web browsers’ core technology: JavaScript. This book aims to remedy that by revealing how much JavaScript can greatly enhance various stages of data science pipelines from the analysis to the communication of results.

https://book.javascript-for-r.com/

Want to learn more about JavaScript in general, then I recommend this book:

Book: What we know about people in the workplace?

Book: What we know about people in the workplace?

My former colleague at Tilburg University, dr. Brigitte Kroon, summarizes decades of scientific evidence in the field of human resource mangement in her new bookEvidence-based HRM.

She published it open access, so everyone can access it for free.

Brigitte explains what science can (and can not) tell us about the most effective ways to organize and treat people in the workplace. She was able to nicely distill the practical insights from the theoretical frameworks and perspectives.

Read the rest yourself!

Human Resource Management is about managing the labor side of organizations. As labor resides in people, managing labor involves managing people. Because people can think and act in response to management, effective management of people involves a good understanding of psychology, sociology, laws, and economics. Any person in a managerial position should therefore have some basic understanding of human resource management. However, since not every organization is the same, and because the challenges that organizations face are different, there is no ‘one best practice suits all’ recipe for doing HRM. Hence, organizations need people who know where to find the best HRM interventions for the issues that they face.

Brigitte Kroon, Evidence-based HRM
https://www.openpresstiu.org/catalog
Big Book of R

Big Book of R

Oscar Baruffa collected over 100 books on the R programming language, in his Big Book of R.

Many of these books you will also find in my R resources list.

Yet, Oscar sectioned them in neat categories, such as:

Oscar could still use some help maintaining this resource, so have a look at its github to contribute.

Free Springer Books during COVID19

Free Springer Books during COVID19

Update: Unfortunately, Springer removed the free access to its books.

Book publisher Springer just released over 400 book titles that can be downloaded free of charge following the corona-virus outbreak.

Here’s the full overview: https://link.springer.com/search?facet-content-type=%22Book%22&package=mat-covid19_textbooks&facet-language=%22En%22&sortOrder=newestFirst&showAll=true

Most of these books will normally set you back about $50 to $150, so this is a great deal!

There are many titles on computer science, programming, business, psychology, and here are some specific titles that might interest my readership:

Note that I only got to page 8 of 21, so there are many more free interesting titles out there!

Join 1,337 other followers

Solutions to working with small sample sizes

Solutions to working with small sample sizes

Both in science and business, we often experience difficulties collecting enough data to test our hypotheses, either because target groups are small or hard to access, or because data collection entails prohibitive costs.

Such obstacles may result in data sets that are too small for the complexity of the statistical model needed to answer the questions we’re really interested in.

Several scholars teamed up and wrote this open access book: Small Sample Size Solutions.

This unique book provides guidelines and tools for implementing solutions to issues that arise in small sample studies. Each chapter illustrates statistical methods that allow researchers and analysts to apply the optimal statistical model for their research question when the sample is too small.

This book will enable anyone working with data to test their hypotheses even when the statistical model required for answering their questions are too complex for the sample sizes they can collect. The covered statistical models range from the estimation of a population mean to models with latent variables and nested observations, and solutions include both classical and Bayesian methods. All proposed solutions are described in steps researchers can implement with their own data and are accompanied with annotated syntax in R.

You can access the book for free here!