Cover image via Hacker Noon.
Norm Matloff is a professor of Computer Science at the University of California, Davis. He recently updated his viewpoint on whether R or Python is the better language for Data Science. While I normally hate those opinionated comparisons, Norm’s outline of the two languages’ (dis)advantages is actually quite balanced and well-informed.
I strongly recommend that you read Norm’s original piece here.
I mostly agree with Norm, although his post reads as if he has a (slight) bias in favor of R. In his original post, Norm discusses many different programming topics and explains in detail why he considers each one a big win, a slight edge, or a tie between the two programming languages.
In the table below, I’ve tried to summarize Norm’s opinions by converting his words to 0-100 scores per topic for a quicker overview: a huge win became 100-0, a big win 80-20, a win 70-30, an edge 60-40, and a tie 50-50.
Topic | Python | R |
Elegance | 100 | |
Learning curve | 100 | |
Data Science libraries | 40 | 60 |
Machine Learning | 60 | 40 |
Statistical correctness | 20 | 80 |
Parallel computing | 50 | 50 |
C/C++ interface | 40 | 60 |
Object orientation, metaprogramming | 40 | 60 |
Language unity | 100 | |
Linked data structures | 70 | 30 |
Online help | 20 | 80 |
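For the curious, the conversion from Norm’s wording to scores can be sketched in a few lines of Python. This is only a minimal illustration of the mapping described above: the names `VERDICT_SCORES` and `to_scores` are my own and do not come from Norm’s post.

```python
# Hypothetical helper illustrating the verdict-to-score conversion.
# The names below are illustrative assumptions, not taken from Norm's post.
VERDICT_SCORES = {
    "huge win": (100, 0),
    "big win": (80, 20),
    "win": (70, 30),
    "edge": (60, 40),
    "tie": (50, 50),
}


def to_scores(verdict, winner=None):
    """Return a {"Python": x, "R": y} score pair for a verdict."""
    high, low = VERDICT_SCORES[verdict]
    if verdict == "tie":
        # A tie has no winner; both languages get the same score.
        return {"Python": high, "R": low}
    if winner not in ("Python", "R"):
        raise ValueError("winner must be 'Python' or 'R' for a non-tie verdict")
    # Start both languages at the losing score, then raise the winner.
    scores = {"Python": low, "R": low}
    scores[winner] = high
    return scores


# Two rows from the table above.
print(to_scores("big win", "R"))  # Statistical correctness
print(to_scores("tie"))           # Parallel computing
```

Passing the winner explicitly keeps the mapping itself symmetric, so the same five verdicts cover both languages.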
I personally started my career with R, so that’s definitely my favorite programming language. However, I think that Python is more convenient and faster for certain tasks, and closer to more mainstream programming languages, which is why I’m currently learning it alongside R.
If you want to learn R, I recommend following my quick 6-step guide to learning R programming. Alternatively, Norm points to his quick tutorial on R for non-programmers, and a tutorial on Python, for learners with a programming background.
Happy learning!
PS. This tweet by John summarizes the whole discussion quite well.