Tag: free

Solutions to working with small sample sizes


Both in science and business, we often experience difficulties collecting enough data to test our hypotheses, either because target groups are small or hard to access, or because data collection entails prohibitive costs.

Such obstacles may result in data sets that are too small for the complexity of the statistical model needed to answer the questions we’re really interested in.

Several scholars teamed up and wrote this open access book: Small Sample Size Solutions.

This unique book provides guidelines and tools for implementing solutions to issues that arise in small sample studies. Each chapter illustrates statistical methods that allow researchers and analysts to apply the optimal statistical model for their research question when the sample is too small.

This book will enable anyone working with data to test their hypotheses even when the statistical model required for answering their questions is too complex for the sample sizes they can collect. The covered statistical models range from the estimation of a population mean to models with latent variables and nested observations, and solutions include both classical and Bayesian methods. All proposed solutions are described in steps researchers can implement with their own data and are accompanied by annotated syntax in R.

You can access the book for free here!

Google’s Dataset Search: Direct access to 25 million interesting datasets


I used to keep a repository of links to interesting datasets for learning data science. I can now retire that page, as Google has launched its new service, Dataset Search.

The “world wide web” hosts millions of datasets, on nearly any topic you can think of. Google’s Dataset Search has indexed almost 25 million of these datasets, giving you a single entry point to search for datasets online. After a year of testing, Dataset Search is now officially out of beta.

After the testing period, Dataset Search now includes filters based on the type of dataset you want (e.g., tables, images, text) and on whether the dataset is open source/access. For datasets about geographic areas, you can see the map. The quality of dataset descriptions has improved greatly, and the tool now has a mobile version.

Anyone who publishes data can make their datasets discoverable in Dataset Search by describing the properties of their dataset using a special schema on their own web page.
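To make that a bit more concrete, here is a minimal Python sketch of what such a description could look like. It assumes the "special schema" is the schema.org Dataset vocabulary embedded in a page as JSON-LD. The helper names (`dataset_jsonld`, `as_script_tag`) and all example values are hypothetical and only serve as illustration.

```python
import json


def dataset_jsonld(name, description, url, license_url, keywords):
    """Build a schema.org Dataset description as a plain dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "license": license_url,
        "keywords": list(keywords),
    }


def as_script_tag(markup):
    """Wrap the markup in the script tag you would paste into a web page."""
    body = json.dumps(markup, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical dataset: every value below is made up for illustration.
example = dataset_jsonld(
    name="Example survey responses",
    description="Anonymised responses to a fictional employee survey.",
    url="https://example.com/datasets/survey",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    keywords=["survey", "example"],
)

print(as_script_tag(example))
```

The printed snippet is what would go into the HTML of the page that hosts the dataset. Which properties a search engine actually requires is something to check in its own documentation.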

CodeWars: Learn programming through test-driven development


Since I wrote about Project Euler and CodinGame before, someone recommended CodeWars to me. CodeWars offers free online learning exercises to develop your programming skills through fun daily challenges.

In line with Project Euler, you are tasked with solving increasingly complex programming challenges. At CodeWars, these little problems you need to solve with code are called kata.

Kata take a test-driven development approach: the programs you write need to pass the tests of the developer who made the kata in the first place. Only then are you awarded honour, so you can earn your ranks and progress to the more complex kata.
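To give a feel for that workflow, here is a small self-made sketch in Python. It is not an actual CodeWars kata and does not use the CodeWars test framework; the task (summing all multiples of 3 or 5 below a limit, in the spirit of the first Project Euler problem), the function name `sum_of_multiples`, and the plain `assert`-based tests are my own assumptions.

```python
def sum_of_multiples(limit):
    """Return the sum of all non-negative integers below `limit`
    that are divisible by 3 or by 5."""
    return sum(n for n in range(limit) if n % 3 == 0 or n % 5 == 0)


def run_tests():
    """Play the role of the kata author's tests: the solution only
    counts as solved if every case passes."""
    cases = [
        (0, 0),          # nothing below 0
        (4, 3),          # only 3
        (10, 23),        # 3 + 5 + 6 + 9
        (1000, 233168),  # the well-known Project Euler answer
    ]
    for limit, expected in cases:
        actual = sum_of_multiples(limit)
        assert actual == expected, f"limit={limit}: expected {expected}, got {actual}"
    return len(cases)


passed = run_tests()
print(f"All {passed} tests passed")
```

In a real kata the tests come first and are written by someone else, so your job is only the function body.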

Sounds fun, right? I’m definitely going to check this out, as they support a wide range of programming languages, each with many kata to solve!

Python, Ruby, C++, Java, JavaScript and many other mainstream programming languages are already supported, but CodeWars is also still developing kata for more niche or upcoming languages like R, Lua, Kotlin, and Scala.

Finland’s free online AI crash course


Finland developed a crash course on AI to educate its citizens. The course was arguably a great local success, with over 50 thousand Finns (about 1% of the population) taking it.

Now, as a gift to the European Union, Finland has opened up the course for the rest of Europe and the world to enjoy.

All pictures are screenshots taken from the website

The course is even being translated into several local languages. At the time of writing, five Northern European languages are already supported, but additional translation efforts are still in progress.

Elements of AI takes six weeks and functions as a crash course and beginner introduction to the field of AI.

Online Workshop Tidy Data Science in R, by Jake Thompson


Here’s a website hosting a five-day hands-on workshop based on the book “R for Data Science”.

The workshop was originally offered as part of the Stats Camp: Summer Statistical Institute in Lawrence, KS, and hosted by the Center for Research Methods and Data Analysis and the Achievement and Assessment Institute at the University of Kansas. It is designed for those who want to learn practical applications of R for data analysis.

You can download the Workshop files, but I suggest you do so via the original workshop webpage.

This workshop is designed for those who want to learn how to use R to analyze data. The material is based on Hadley Wickham and Garrett Grolemund’s R for Data Science. We’ll talk about how to conduct a complete data analysis from data import to final reporting in R using a suite of packages known as the tidyverse. The two goals of this workshop are: 1) learn how to use R to answer questions about our data; and 2) write code that is human readable and reproducible. We will also talk about how to share our code and analyses with others.

You should take this workshop if you are new to R, or to the tidyverse, and want to learn how to take advantage of this ecosystem to do data analysis. You’ll get the most from the workshop if you are primarily interested in applying pre-existing R packages and functions to your own data. We will give minimal tutorials on how to write your own functions; however, the main focus will be on using existing tools, rather than building our own.

About this workshop


Object-Oriented Programming with Java


Now that I’m slowly familiarizing myself with the world of Python, I am much more often confronted with classes and object-oriented programming (OOP). While R has its own OOP paradigms (yes, multiple, obviously, it’s R after all), I have never experienced the need to create my own classes. However, in other languages, like Python, Ruby, or Java, OOP is a much more essential part of developers’ and programmers’ skillsets.
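For readers who, like me, come from R and rarely write their own classes, here is a minimal Python sketch of what this looks like. The class names (`Dataset`, `LabeledDataset`) and their methods are hypothetical and made up purely for illustration. They show the two ideas you run into first: bundling data with behaviour, and extending a class through inheritance.

```python
class Dataset:
    """A tiny container that bundles data (rows) with behaviour (methods)."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = list(rows)

    def __len__(self):
        # Lets the built-in len() work on Dataset objects.
        return len(self.rows)

    def describe(self):
        return f"{self.name}: {len(self)} rows"


class LabeledDataset(Dataset):
    """A subclass that inherits everything from Dataset and adds labels."""

    def __init__(self, name, rows, labels):
        super().__init__(name, rows)
        self.labels = list(labels)

    def describe(self):
        # Reuse the parent's description and extend it.
        n_classes = len(set(self.labels))
        return f"{super().describe()}, {n_classes} classes"


plain = Dataset("measurements", [1.2, 3.4, 5.6])
labeled = LabeledDataset("flowers", [1, 2, 3], ["a", "b", "a"])

print(plain.describe())
print(labeled.describe())
```

The same concepts carry over to Java, only with more explicit type declarations.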

Now, I personally won’t start learning Java anytime soon. Hence, I am just sharing this pearl of a resource with a wider audience right now. This MOOC by the University of Helsinki has been in my inbox for quite a while: Object-Oriented Programming with Java. If you understand Finnish, you can even take the 2019 Finnish version of the course.

During this course you will learn all the basics of computer programming, algorithms and object-oriented programming using the Java programming language. The course includes comprehensive course materials and plenty of programming exercises, each tested using our automatic testing service Test My Code.

Part 1 of the course will teach you all the basics of the Java language, and Part 2 continues with some more advanced topics.

While I have not taken the course myself yet, I have read a lot of good reviews about it. Moreover, what better way to learn a new language than by diving deep into it with a specialized topic like OOP? And it’s free! And taught by trained academics! What are you still doing here? Start learning!

Free Python Tutorials & Courses, by CourseDuck


CourseDuck founder Michael Kuhlman was nice enough to point me to their overview of curated (and free!) Python courses and tutorials.

This overview is curated in the sense that all resources are rated by CourseDuck’s users. These ratings seem quite reliable; at least, I personally enjoyed their top three resources at some point over the past few years.

Note that all these courses, as well as the curated overview, come free of charge! A great resource for aspiring data scientists or upcoming Pythonistas!