The following was reposted from minimaxir.com. QUICK INTRODUCTION TO GGPLOT2: ggplot2 uses a more concise setup toward creating charts as opposed to the more declarative style of Python’s matplotlib and base R. It also includes a few example datasets for practicing ggplot2 functionality; for example, the mpg dataset describes the performance of popular models of cars … Continue reading Short ggplot2 tutorial by MiniMaxir
Reposted from Variance Explained with minor modifications. This post follows an earlier post on the same topic. A year ago today, I wrote up a blog post, “Text analysis of Trump’s tweets confirms he writes only the (angrier) Android half”. My analysis, shown below, concludes that the Android and iPhone tweets are clearly from different people, posting … Continue reading Variance Explained: Text Mining Trump’s Twitter – Part 2
Reposted from Variance Explained with minor modifications. Note: this post was written in 2016; a follow-up was posted in 2017. This weekend I saw a hypothesis about Donald Trump’s Twitter account that simply begged to be investigated with data. From Todd Vaziri (@tvaziri): Every non-hyperbolic tweet is from iPhone (his staff). Every hyperbolic tweet is from … Continue reading Variance Explained: Text Mining Trump’s Twitter – Part 1: Trump is Angrier on Android
The Washington Post is known for the lovely visualizations accompanying its stories. In a recent post, they visualized how far you can get from the downtown areas of various cities. They compared all the major U.S. cities and examined different departure times. Unfortunately, I cannot copy the visualizations' text here, but … Continue reading Leaving town at rush hour? Here’s how far you’re likely to get from America’s largest cities.
Reposted from Kasia Kulma's GitHub with minor modifications. Have you ever wondered whether the most active/popular R-twitterers are virtual friends? 🙂 And by friends here I simply mean mutual followers on Twitter. In this post, I score and pick the top 30 #rstats Twitter users and analyse their Twitter network. You’ll see a lot of applications of the rtweet and ggraph packages, as … Continue reading Networks Among #rstats Twitterers
It’s easy to think that disasters as devastating as Typhoon Yolanda – the super typhoon that claimed over 7,000 lives in 2013 – only happen once in a lifetime. However, the Philippines has been hit several more times over the past century. Thinking.Machin.es provides an interactive history of almost every storm, earthquake, flood, volcanic eruption, landslide, … Continue reading 114 Years of Philippine Disasters, Visualized.
This blog explains t-SNE (t-Distributed Stochastic Neighbor Embedding) by a story of programmers joining forces with musicians to create the ultimate drum machine (if you are here just for the fun, you may start playing right away). Kyle McDonald, Manny Tan, and Yotam Mann experienced difficulties in pinpointing to what extent sounds are similar (ding, dong) … Continue reading t-SNE, the Ultimate Drum Machine and more
It has been twenty years since the first Harry Potter novel, The Sorcerer's/Philosopher’s Stone, was published. To honour the series, I decided to start a text analysis and visualization project, which my other half wittily dubbed Harry Plotter. In several blog posts, I intend to demonstrate how Hadley Wickham’s tidyverse and packages that build on its principles, such as tidytext (free book), have taken programming in R … Continue reading Harry Plotter: Celebrating the 20 year anniversary with tidytext and the tidyverse in R
Shazam is a mobile app that can be asked to identify a song by making it "listen" to a piece of music. Due to its immense popularity, the organization's name quickly turned into a verb used in regular conversation ("Do you know this song? Let's Shazam it."). A successful identification is referred to as a Shazam recognition. Shazam users can opt in … Continue reading Geographical maps using Shazam Recognitions