Text Mining: Shirin’s Twitter Feed

Text mining and analytics, natural language processing, and topic modelling have definitely become sort of an obsession of mine. I am just amazed by the insights one can retrieve from textual information, and with the ever-increasing amounts of unstructured data on the internet, recreational analysts are coming up with the most amazing text mining…

tidyverse 101: Simplifying life for useRs

Hadley Wickham’s tidyverse has improved the workflow of analysts and data scientists, making coding errors less likely and code more transparent. You’ve got to love the figure below, representing a simplified workflow of the average analysis project. The tidyverse provides assistance in each of the stages. Various packages provide functionality to perform analytical tasks more effectively, in…

tidyverse: Example: Trump Approval Rate

For those of you unfamiliar with the tidyverse, it is a collection of R packages that share common philosophies and are designed to work together. Most, if not all, were created by R-god Hadley Wickham, one of the leads at RStudio. I was introduced to tidyverse packages such as ggplot2 and dplyr in my second…

Animated GIFs in R

Sometimes, it can be of interest to examine how two variables correlate over time. For example, how people in a social network (e.g., an organization) behave or move over the course of time. However, it can be hard to display multi-dimensional data in a single plot. Instead of including time as an additional dimension and…

Time Series Analysis 101

A time series can be considered an ordered sequence of values of a variable at equally spaced time intervals. To model such data, one can use time series analysis (TSA). TSA accounts for the fact that data points taken over time may have an internal structure (such as autocorrelation, trend, or seasonal variation) that should…