Category: learning

What to consider when choosing colors for data visualization, by DataWrapper.de

What to consider when choosing colors for data visualization, by DataWrapper.de

Lisa Charlotte Rost of DataWrapper often writes about data visualization and lately she has focused on the (im)proper use of color in visualization. In this recent blog, she gives a bunch of great tips and best practices, some of which I copied below:

color in data vis advice
Gradient colors can be great to show a pattern but, for categorical data, it is often easier to highlight the most important values with colored bars, positions (like in a dot plot) or even areas. [https://blog.datawrapper.de/colors/]
color in data vis advice
If you need more than seven colors in a chart, consider using another chart type or to group categories together. [https://blog.datawrapper.de/colors/]
color in data vis advice
Consider using the same color for the same variables, but do differentiate between categories, even across graphics. [https://blog.datawrapper.de/colors/]
color in data vis advice
Using grey for less important elements in your chart makes your highlight colors (which should be reserved for your most important data points) stick out even more.  [https://blog.datawrapper.de/colors/]
color in data vis advice
Consider color-blind people. There are many different types of color blindness: Use an online tool or Datawrapper’s automatic colorblind-check. [https://blog.datawrapper.de/colors/]
 You can find additional useful tips in the original DataWrapper blog.

Transitioning from Excel to R: Dictionary of common functions

Transitioning from Excel to R: Dictionary of common functions

Alyssa Columbus published maintains this GitHub repository with a great tutorial on how to move from Excel to R. Very useful for beginning useRs, the repository’s tutorial includes a translation table between Excel and R functions:

Excel Formula R Function Type
ABS abs Arithmetic
ADDRESS assign Essentials
AND &,&&,all Boolean
AVERAGE, AVG, AVERAGEIF mean Arithmetic
BETADIST pbeta Statistics
BETAINV qbeta Statistics
BINOMDIST pbinom when cumulative,dbinom when not Statistics
CEILING ceiling Arithmetic
CELL str has the same idea Essentials
CHIDIST, CHISQDIST pchisq Statistics
CHIINV, CHISQINV qchisq Statistics
CHITEST chisq.test Statistics
CHOOSE switch Essentials
CLEAN gsub Text
COLS, COLUMNS ncol Essentials
COLUMN col,:,seq Essentials
COMBIN choose Essentals
CONCATENATE paste Text
CONFIDENCE -qnorm(alpha/2)*std/sqrt(n) Statistics
CORREL cor Statistics
COUNT, COUNTIF length Arithmetic
COVAR cov Statistics
CRITBINOM qbinom Statistics
DELTA identical Boolean
EXACT == Boolean
EXP exp Arithmetic
EXPONDIST pexp when cumulative,dexp when not Statistics
FACT factorial Arithmetic
FACTDOUBLE dfactorial in the phangorn package Arithmetic
FDIST pf Statistics
FIND regexpr,grepl,grep Text
FINV qf Statistics
FISHER atanh Arithmetic
FISHERINV tanh Arithmetic
FIXED format,sprintf,formatC Essentials
FLOOR floor Arithmetic
FORECAST predict on an lm object Statistics
FREQUENCY cut,table Arithmetic
FTEST var.test Statistics
GAMMADIST pgamma if last argument T,dgamma if last arg. F Statistics
GAMMAINV qgamma Statistics
GAMMALN lgamma Statistics
GAUSS pnorm(x) - 0.5 Statistics
GCD gcd Arithmetic
GEOMEAN exp(mean(log(x))) Arithmetic
GESTEP >= Boolean
HARMEAN harmonic.mean in the psych package Arithmetic
HLOOKUP match,merge Essentials
HYPGEOMDIST dhyper Statistics
IF if,ifelse Essentials
IFERROR try,tryCatch Essentials
INDEX x[y,z] Essentials
INDIRECT get Essentials
INT as.integer(not for negative numbers),floor Arithmetic
INTERCEPT first element of coef of an lm object Statistics
ISLOGICAL is.logical Boolean
ISNA is.na Boolean
ISNUMBER is.numeric Boolean
ISTEXT is.character Boolean
KURT kurtosis in the moments package Statistics
LARGE sort Statistics
LCM scm in the schoolmath package Arithmetic
LEFT substr Text
LEN, LENGTH nchar Text
LINEST lm Statistics
LN, LOG log Arithmetic
LOG10 log10 Arithmetic
LOGINV qlnorm Statistics
LOGNORMDIST plnorm Statistics
LOWER tolower Text
MATCH match,which Essentials
MAX max (sometimes pmax) Arithmetic
MDETERM det Arithmetic
MEDIAN median Arithmetic
MID substr Text
MIN min (sometimes pmin) Arithmetic
MINVERSE solve Arithmetic
MMULT %*% Arithmetic
MOD %% Arithmetic
MODE as.numeric(names(which.max(table(x)))) Arithmetic
MUNIT diag Arithmetic
N as.numeric Arithmetic
NEGBINOMDIST dnbinom Statistics
NORMDIST, NORMSDIST pnorm when cumulative,dnorm when not Statistics
NORMINV, NORMSINV qnorm Statistics
NOT ! Boolean
NOW date,Sys.time Essentials
OR ` ,
PEARSON cor Statistics
PERCENTILE quantile Statistics
PERCENTRANK ecdf Statistics
PERMUT function(n,k) {choose(n,k)*factorial(k)} Arithmetic
PERMUTATIONA n^k Arithmetic
PHI dnorm Statistics
POISSON ppois when cumulatic,dpois when not Statistics
POWER ^ Arithmetic
PROB ecdf Statistics
PRODUCT prod Arithmetic
PROPER toupper Text
QUARTILE quantile Arithmetic
QUOTIENT %/% Arithmetic
RAND runif Arithmetic
RANDBETWEEN sample Arithmetic
RANK rank Essentials
REPLACE sub,gsub Text
REPT rep and paste or paste0 Text
RIGHT substring Text
ROUND round Arithmetic
ROUNDDOWN floor Arithmetic
ROUNDUP ceiling Arithmetic
ROW row,:,seq Essentials
ROWS nrow Essentials
RSQ summary of lm object Statistics
SEARCH regexpr,grep Text
SIGN sign Arithmetic
SKEW skewness in the moments package Statistics
SLOPE in coef of lm object Statistics
SMALL sort Arithmetic
SQRT sqrt Arithmetic
STANDARDIZE scale Statitics
STD, STDEV sd Arithmetic
STEYX predict on an lm object Statistics
STRING format,sprintf,formatC Text
SUBSTITUTE sub,gsub,paste Essentials
SUM, SUMIF sum Arithmetic
SUMPRODUCT crossprod Arithmetic
TDIST pt Statistics
TEXT format,sprintf,formatC Text
TINV abs(qt(x/2,data)) Statistics
TODAY Sys.Date Essentials
TRANSPOSE t Arithmetic
TREND fitted of an lm object Statistics
TRIM sub Essentials
TRIMMEAN mean(x,trim=tr/2) Arithmetic
TRUNC trunc Essentials
TTEST t.test Statistics
TYPE typeof,mode,class Essentials
UPPER toupper Text
VALUE as.numeric Arithmetic
VAR var Essentials
VLOOKUP match,merge Essentials
WEEKDAY weekdays Essentials
WEIBULL pweibull when cumulative,dweibull when not Statistics
ZTEST pnorm Statistics

 

 

(Time Series) Forecasting: Principles & Practice (in R)

(Time Series) Forecasting: Principles & Practice (in R)

I stumbled across this open access book by Rob Hyndman, the god of time series, and George Athanasopoulos, a colleague statistician / econometrician at Monash University in Melbourne Australia.

Hyndman and Athanasopoulos provide a comprehensive introduction to forecasting methods, accessible and relevant among others for business professionals without any formal training in the area. All R examples in the book assume work build on the fpp2 R package. fpp2 includes all datasets referred to in the book and depends on other R packages including forecast and ggplot2.

Some examples of the analyses you can expect to recreate, ignore the agricultural topic for now ; )

Monthly milk production per cow.
One of the example analysis you will recreate by following the book (Figure 3.3)
Forecasts of egg prices using a random walk with drift applied to the logged data.
You will be forecasting price data using different analyses and adjustments (Figure 3.4)

I highly recommend this book to any professionals or students looking to learn more about forecasting and time series modelling. There is also a DataCamp course based on this book. If you got value out of this free book, be sure to buy a hardcopy as well.

R tips and tricks

R tips and tricks

Below are a dozen of very specific R tips and tricks. Some are valuable, useful, or boost your productivity. Others are just geeky funny. 

More general helpful R packages and resources can be found in this list.

If you have additions, please comment below or contact me!

Completely new to R? → Start here!

Table of Contents


Join 385 other subscribers

RStudio

Many more shortkeys available here online, and in your RStudio under Tools → Keyboard Shortcuts Help.

General

Disclaimer: This page contains one or more links to Amazon.
Any purchases made through those links provide us with a small commission that helps to host this blog.

Useful base functions

Back to Table of Contents

R Markdown

Data manipulation

Data visualization

Back to Table of Contents

Fun

Easter eggs

Join 385 other subscribers

Back to Table of Contents

Interactive Explanation of Network and Graph Principles

Interactive Explanation of Network and Graph Principles

Why do groups of people act smart, dumb, kind, or cruel? People behave in strange ways, particularly when they are able to influence one another. Both good and bad things can happen when people interact and behave in network structures. On the bright side, you must be familiar with the wisdom of the crowd, where the aggregated knowledge of a group is more valuable than its sum? Ensemble algorithms – like random forest analysis – rely on this positive principle.

On the dark side, are you familiar with the phenomenon called the tragedy of the commons, where shared resource-systems collapse because individuals behave in their self-interest? Or psychological phenomena such as groupthink, where groups of people make irrational decisions due to social issues? The recent spread of fake news and misinformation is also stimulated by network interactions. In these cases, we could speak of the madness of the crowd.

Nicky Case made a great interactive walkthrough explaining why and when networks of people become wise or mad. You are tasked to change and simulate network interactions while Nicky explains concepts such as (complex) contagion, the majority illusion paradox, bonding and bridging, and small world networks. In the references, Nicky provides links to scientific papers explaining these concepts in more detail. I highly suggest you check out her website here.

 

example.png
Screenshot of one of the explanations/simulations Nicky offers.

 

 

Predictive HR Analytics

Predictive HR Analytics

Tilburg University has set up a masterclass Predictive HR Analytics. In 3 days, the Professional Learning program will teach you all you need to know to implement predictive analytics and take HR to the next level. More information can be found here.

What makes this program unique?

  • The masterclass Predictive HR Analytics goes beyond HR analytics and focuses on transformational people predictions. You learn how to embed predictive HR analytics into your HR Strategy and how to use your findings to convince others.
  • The masterclass is developed at the prestigious Human Resources department at Tilburg University, which has obtained international recognition with its high-quality academic research in the HRM field.
  • The mix of professors in conjunction with leading HR professionals leads to a strong academic program with a practical approach.
  • Your peer participants will make sure that the class opens up a high-quality network of HR specialists. The diversity of leading companies from different sectors in the classroom creates new insights for all the participants.
  • The program is like a 3-day pressure cooker. By combining online and offline components, we can create more in-depth discussions in the classroom.
  • You will experience a high impact on your daily practice, since the program is focused on direct implementation.

Your profile

This course is ideal for anyone in HR seeking to become more adept in using quantitative data for decision making. Typical participants are (future) HR analysts, HR managers, HR business partners, HR consultants and (financial) business analysts with a strong link on people resources. Participants are from various sectors, such as financial services, healthcare institutions, government agencies and business services.