Tag: Excel

# Create a publication-ready correlation matrix, with significance levels, in R

TLDR; You can use the `corrtable` package (see CRAN or Github)!

In most (observational) research papers you read, you will probably run into a correlation matrix. Often it looks something like this:

In Social Sciences, like Psychology, researchers like to denote the statistical significance levels of the correlation coefficients, often using asterisks (i.e., *). Then the table will look more like this:

Regardless of my personal preferences and opinions, I had to make many of these tables for the scientific (non-)publications of my Ph.D..

I remember that, when I first started using R, I found it quite difficult to generate these correlation matrices automatically.

Yes, there is the `cor` function, but it does not include significance levels.

Then there the (in)famous `Hmisc` package, with its `rcorr` function. But this tool provides a whole new range of issues.

What’s this `storage.mode`, and what are we trying to coerce again?

Soon you figure out that `Hmisc::rcorr` only takes in matrices (thus with only numeric values). Hurray, now you can run a correlation analysis on your dataframe, you think…

Yet, the output is all but publication-ready!

You wanted one correlation matrix, but now you have two… Double the trouble?

[UPDATED] To spare future scholars the struggle of the early day R programming, Laura Lambert and I created an R package `corrtable`, which includes the helpful function `correlation_matrix`.

This `correlation_matrix` takes in a dataframe, selects only the numeric (and boolean/logical) columns, calculates the correlation coefficients and p-values, and outputs a fully formatted publication-ready correlation matrix!

You can specify many formatting options in `correlation_matrix`.

For instance, you can use only 2 decimals. You can focus on the lower triangle (as the lower and upper triangle values are identical). And you can drop the diagonal values:

Or maybe you are interested in a different type of correlation coefficients, and not so much in significance levels:

For other formatting options, do have a look at the source code on github.

Now, to make matters even easier, the package includes a second function (`save_correlation_matrix`) to directly save any created correlation matrices:

Once you open your new correlation matrix file in Excel, it is immediately ready to be copy-pasted into Word!

If you are looking for ways to visualize your correlations do have a look at the packages `corrr`, `corrplot`, or `ppsr`.

I hope this package is of help to you!

Do reach out if you get to use them in any of your research papers!

Sign up to keep up to date on the latest R, Data Science & Tech content:

# Data Visualization Tools & Resources

There’s this amazing overview of helpful dataviz resources atwww.visualisingdata.com/resources!

Browse through hundreds of helpful data visualization tools, programs, and services. All neatly organized by Andy Kirk in categories: data handling, applications, programming, web-based, qualitative, mapping, specialist, and colour. What a great repository!

Looking for expert books on data visualization?
Have a look at these recommendations!

# Transitioning from Excel to R: Dictionary of common functions

Alyssa Columbus published maintains this GitHub repository with a great tutorial on how to move from Excel to R. Very useful for beginning useRs, the repository’s tutorial includes a translation table between Excel and R functions:

Excel Formula R Function Type
ABS `abs` Arithmetic
ADDRESS `assign` Essentials
AND `&`,`&&`,`all` Boolean
AVERAGE, AVG, AVERAGEIF `mean` Arithmetic
BETADIST `pbeta` Statistics
BETAINV `qbeta` Statistics
BINOMDIST `pbinom` when cumulative,`dbinom` when not Statistics
CEILING `ceiling` Arithmetic
CELL `str` has the same idea Essentials
CHIDIST, CHISQDIST `pchisq` Statistics
CHIINV, CHISQINV `qchisq` Statistics
CHITEST `chisq.test` Statistics
CHOOSE `switch` Essentials
CLEAN `gsub` Text
COLS, COLUMNS `ncol` Essentials
COLUMN `col`,`:`,`seq` Essentials
COMBIN `choose` Essentals
CONCATENATE `paste` Text
CONFIDENCE `-qnorm(alpha/2)*std/sqrt(n)` Statistics
CORREL `cor` Statistics
COUNT, COUNTIF `length` Arithmetic
COVAR `cov` Statistics
CRITBINOM `qbinom` Statistics
DELTA `identical` Boolean
EXACT `==` Boolean
EXP `exp` Arithmetic
EXPONDIST `pexp` when cumulative,`dexp` when not Statistics
FACT `factorial` Arithmetic
FACTDOUBLE `dfactorial` in the `phangorn` package Arithmetic
FDIST `pf` Statistics
FIND `regexpr`,`grepl`,`grep` Text
FINV `qf` Statistics
FISHER `atanh` Arithmetic
FISHERINV `tanh` Arithmetic
FIXED `format`,`sprintf`,`formatC` Essentials
FLOOR `floor` Arithmetic
FORECAST `predict` on an `lm` object Statistics
FREQUENCY `cut`,`table` Arithmetic
FTEST `var.test` Statistics
GAMMADIST `pgamma` if last argument T,`dgamma` if last arg. F Statistics
GAMMAINV `qgamma` Statistics
GAMMALN `lgamma` Statistics
GAUSS `pnorm(x) - 0.5` Statistics
GCD `gcd` Arithmetic
GEOMEAN `exp(mean(log(x)))` Arithmetic
GESTEP `>=` Boolean
HARMEAN `harmonic.mean` in the `psych` package Arithmetic
HLOOKUP `match`,`merge` Essentials
HYPGEOMDIST `dhyper` Statistics
IF `if`,`ifelse` Essentials
IFERROR `try`,`tryCatch` Essentials
INDEX `x[y,z]` Essentials
INDIRECT `get` Essentials
INT `as.integer`(not for negative numbers),`floor` Arithmetic
INTERCEPT first element of `coef` of an `lm` object Statistics
ISLOGICAL `is.logical` Boolean
ISNA `is.na` Boolean
ISNUMBER `is.numeric` Boolean
ISTEXT `is.character` Boolean
KURT `kurtosis` in the `moments` package Statistics
LARGE `sort` Statistics
LCM `scm` in the `schoolmath` package Arithmetic
LEFT `substr` Text
LEN, LENGTH `nchar` Text
LINEST `lm` Statistics
LN, LOG `log` Arithmetic
LOG10 `log10` Arithmetic
LOGINV `qlnorm` Statistics
LOGNORMDIST `plnorm` Statistics
LOWER `tolower` Text
MATCH `match`,`which` Essentials
MAX `max` (sometimes `pmax`) Arithmetic
MDETERM `det` Arithmetic
MEDIAN `median` Arithmetic
MID `substr` Text
MIN `min` (sometimes `pmin`) Arithmetic
MINVERSE `solve` Arithmetic
MMULT `%*%` Arithmetic
MOD `%%` Arithmetic
MODE `as.numeric(names(which.max(table(x))))` Arithmetic
MUNIT `diag` Arithmetic
N `as.numeric` Arithmetic
NEGBINOMDIST `dnbinom` Statistics
NORMDIST, NORMSDIST `pnorm` when cumulative,`dnorm` when not Statistics
NORMINV, NORMSINV `qnorm` Statistics
NOT `!` Boolean
NOW `date`,`Sys.time` Essentials
OR ` `,`
PEARSON `cor` Statistics
PERCENTILE `quantile` Statistics
PERCENTRANK `ecdf` Statistics
PERMUT `function(n,k) {choose(n,k)*factorial(k)}` Arithmetic
PERMUTATIONA `n^k` Arithmetic
PHI `dnorm` Statistics
POISSON `ppois` when cumulatic,`dpois` when not Statistics
POWER `^` Arithmetic
PROB `ecdf` Statistics
PRODUCT `prod` Arithmetic
PROPER `toupper` Text
QUARTILE `quantile` Arithmetic
QUOTIENT `%/%` Arithmetic
RAND `runif` Arithmetic
RANDBETWEEN `sample` Arithmetic
RANK `rank` Essentials
REPLACE `sub`,`gsub` Text
REPT `rep` and `paste` or `paste0` Text
RIGHT `substring` Text
ROUND `round` Arithmetic
ROUNDDOWN `floor` Arithmetic
ROUNDUP `ceiling` Arithmetic
ROW `row`,`:`,`seq` Essentials
ROWS `nrow` Essentials
RSQ `summary` of `lm` object Statistics
SEARCH `regexpr`,`grep` Text
SIGN `sign` Arithmetic
SKEW `skewness` in the `moments` package Statistics
SLOPE in `coef` of `lm` object Statistics
SMALL `sort` Arithmetic
SQRT `sqrt` Arithmetic
STANDARDIZE `scale` Statitics
STD, STDEV `sd` Arithmetic
STEYX `predict` on an `lm` object Statistics
STRING `format`,`sprintf`,`formatC` Text
SUBSTITUTE `sub`,`gsub`,`paste` Essentials
SUM, SUMIF `sum` Arithmetic
SUMPRODUCT `crossprod` Arithmetic
TDIST `pt` Statistics
TEXT `format`,`sprintf`,`formatC` Text
TINV `abs(qt(x/2,data))` Statistics
TODAY `Sys.Date` Essentials
TRANSPOSE `t` Arithmetic
TREND `fitted` of an `lm` object Statistics
TRIM `sub` Essentials
TRIMMEAN `mean(x,trim=tr/2)` Arithmetic
TRUNC `trunc` Essentials
TTEST `t.test` Statistics
TYPE `typeof`,`mode`,`class` Essentials
UPPER `toupper` Text
VALUE `as.numeric` Arithmetic
VAR `var` Essentials
VLOOKUP `match`,`merge` Essentials
WEEKDAY `weekdays` Essentials
WEIBULL `pweibull` when cumulative,`dweibull` when not Statistics
ZTEST `pnorm` Statistics