The repository consists of tools for multiple languages (R, Python, Matlab, Java) and resources in the form of:
Books & Academic Papers
Online Courses and Videos
Algorithms and Applications
Open-source and Commercial Libraries/Toolkits
Key Conferences & Journals
Outlier Detection (also known as Anomaly Detection) is an exciting yet challenging field that aims to identify outlying objects that deviate from the general data distribution. Outlier detection has proven critical in many fields, such as credit card fraud analytics, network intrusion detection, and mechanical unit defect detection.
Last year witnessed the creation of many novel types of data visualization. I have already discussed some of the lesser-known ones, jokingly referred to as xenographics.
Two new visualization formats seem to stick around though. And as always, it was not long before someone created special R packages for them. Get ready to meet waffleplots and swarmplots!
Waffleplots, also called square pie charts, are very useful for communicating parts of a whole for categorical quantities. Bob Rudis, scholar and R developer among many other things, did us all a favor and created the R waffle package.
First, we need to install and load the waffle package.
install.packages("waffle") # install waffle package
library(waffle) # load in package
I will use the famous iris data to demonstrate both plots.
Since waffleplots work with frequencies, I will specifically use the iris$Species data stored as a frequency table.
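A minimal sketch of that step, assuming the waffle package is installed. The object names species_counts and parts, and the rows and title values, are illustrative choices rather than anything taken from the original post.

```r
library(waffle)

# Frequency table of the species column (iris has 50 flowers per species)
species_counts <- table(iris$Species)

# Convert the table into a named numeric vector of parts for waffle()
parts <- setNames(as.vector(species_counts), names(species_counts))

# One square per flower, arranged in 5 rows
# (rows and title are illustrative assumptions)
waffle(parts, rows = 5, title = "Iris species")
```

With 150 flowers and 5 rows, the squares fill 30 columns, one color per species.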
Some examples hosted on the GitHub page also use the iris dataset, so you can have a look at those. However, I made new visuals because I prefer theme_light. Hence, I first install the ggbeeswarm package along with ggplot2, and then set the default theme to theme_light.
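The setup described above might look roughly like this. It is a sketch that assumes both packages are available from CRAN; the install line is commented out so the script can be re-run without reinstalling.

```r
# Install once if needed (assumes a configured CRAN mirror)
# install.packages(c("ggplot2", "ggbeeswarm"))

library(ggplot2)     # plotting framework
library(ggbeeswarm)  # adds the beeswarm-style geoms to ggplot2

# Make theme_light the default for every plot created afterwards
theme_set(theme_light())
```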
The second function in the ggbeeswarm package is geom_quasirandom, an alternative to the original geom_jitter. Basically, it’s a convenient tool to offset points within categories to reduce overplotting.
ggplot(iris, aes(Species, Sepal.Length, col = Species)) + geom_quasirandom()
Instead of the quasirandom offset, the geom allows for many other methods, including a smiley face pattern : )
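As a sketch of how to switch the offset method: to my knowledge, geom_quasirandom takes a method argument and "smiley" is one of its accepted values. The object name p_smiley is just an illustrative choice.

```r
library(ggplot2)
library(ggbeeswarm)

# Same plot as before, but with points offset in a smiley face pattern
# instead of the default quasirandom offset
p_smiley <- ggplot(iris, aes(Species, Sepal.Length, col = Species)) +
  geom_quasirandom(method = "smiley")

p_smiley
```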
There is also an earlier package on CRAN, called beeswarm, but it doesn’t seem to be maintained anymore. Moreover, its syntax more or less resembles R’s base::plot, whereas I personally have a strong preference for ggplot2.