Tag: datajournalism

How Do I…? R Code Snippets by Sharon Machlis

Sharon Machlis is the author of Practical R for Mass Communication and Journalism. While writing the book, she obviously wrote a lot of R code. Now, Sharon has been nice enough to share with us all 195 tips and tricks she came across along the way, via this handy table.

Sharon’s list contains many neat tricks, some of which are lesser-known base functions, while others are features of more niche packages. Here are the ones I am definitely adding to my R tricks overview and want to highlight here as well:

Categorize values into intervals with cut()
Convert numbers that came in as strings with commas to R numbers with readr::parse_number(mydf$mycol)
Create a searchable, sortable HTML table in 1 line of code with DT::datatable(mydf, filter = 'top')
Display a fraction between 0 and 1 as a percentage with scales::percent(myfraction)
Generate a vector of 1:length(myvec) with seq_along(myvec)
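
To give a feel for a few of these, here is a minimal sketch on made-up data. The variable names and example values are illustrative assumptions, not taken from Sharon's table, and the two package-based calls only run if readr and scales happen to be installed:

```r
# Bin numeric values into labelled intervals with base R's cut().
# Intervals are right-closed by default: (0,18], (18,40], (40,Inf].
ages <- c(3, 17, 25, 41, 68)  # made-up example values
age_group <- cut(ages,
                 breaks = c(0, 18, 40, Inf),
                 labels = c("child", "young adult", "older adult"))

# seq_along() is the safe way to index a vector: for an empty vector it
# returns integer(0), whereas 1:length(x) counts down from 1 to 0.
empty <- character(0)
safe_index  <- seq_along(empty)   # length 0
risky_index <- 1:length(empty)    # c(1, 0), length 2

# The remaining helpers live in add-on packages, so guard on availability.
if (requireNamespace("readr", quietly = TRUE)) {
  # Strings with thousands separators become plain numerics
  sales <- readr::parse_number(c("1,234", "56,789.5"))
}
if (requireNamespace("scales", quietly = TRUE)) {
  # A fraction between 0 and 1 becomes a percentage label
  share <- scales::percent(0.256, accuracy = 0.1)
}
```

The seq_along() case is the one that tends to bite in loops: `for (i in 1:length(x))` runs twice on an empty vector, while `for (i in seq_along(x))` does not run at all.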

I've posted a searchable table of 190+ #rstats tasks with code snippets. Hope some of you find it useful: https://t.co/FwCGO4qWOj pic.twitter.com/IYpQbCongz
— Sharon Machlis (@sharon000) December 24, 2018

And as if one list was not enough, scrolling through her Twitter feed, I found another R tips and tricks list by Sharon:

Up to 35 Do More With R 🎥episodes! Check out the searchable table of all my #rstats screencasts @infoworld https://t.co/Qq051K0gTV pic.twitter.com/TanqfZxviF
— Sharon Machlis (@sharon000) October 3, 2019

Data Visualization Style Guide Repositories

Amy Cesal put together (1) this great overview of style guides for data visualization practice. Moreover, in the original tweet, Amy refers to other great repositories such as (2) this PolicyViz one and (3) this humongous one by Adele.

Spreadsheet of #dataviz style guides: https://t.co/lLQUT5Qwi0

And a form for adding more: https://t.co/i14hb0fZOO pic.twitter.com/qJ2vhcl7QV
— Amy Cesal (@AmyCesal) June 21, 2019

Amy’s list includes many references to the best practices used by some of the leading data journalism companies, such as the BBC, or professional data companies like Salesforce and IBM.

As I’m worried that this great repository may not stand the test of time at its current Google Docs location, here are the base URLs once more:

URL of guidelines	Company name
https://sunlightfoundation.com/2014/03/12/datavizguide	Sunlight Foundation
https://cfpb.github.io/design-manual/data-visualization/data-visualization.html	Consumer Financial Protection Bureau
https://knightcenter.utexas.edu/mooc/file/tdmn_graphics.pdf	Dallas Morning News
https://urbaninstitute.github.io/graphics-styleguide/	The Urban Institute
http://code.minnpost.com/minnpost-styles/	MinnPost
https://public.tableau.com/profile/bbc.audiences#!/vizhome/BBCAudiencesTableauStyleGuide/Hello	BBC Audiences
https://www.ibm.com/design/v1/language/experience/data-visualization/	IBM
https://style.ons.gov.uk/category/data-visualisation/	Office for National Statistics
https://www.ibcs.com/standards	International Business Communication Standards (IBCS®)
https://data.london.gov.uk/blog/city-intelligence-data-design-guidelines/	London City Intelligence
https://www.bbc.co.uk/gel/guidelines/how-to-design-infographics	BBC
https://polaris.shopify.com/design/data-visualizations	Shopify
https://ux.opower.com/opattern/how-to-charts.html	Opower
https://www.consults-iot.com	Consults-IoT.Com LLP
https://ux.mailchimp.com/patterns/data	MailChimp
https://material.io/design/communication/data-visualization.html	Google Material Design
https://lightningdesignsystem.com/guidelines/charts/	Salesforce
https://github.com/glosophy/CatoDataVizGuidelines/blob/master/PocketStyleBook.pdf	Cato Institute
https://bbc.github.io/rcookbook/	BBC
https://docs.microsoft.com/en-us/office/dev/add-ins/design/data-visualization-guidelines	Microsoft
https://www.interaction-design.org/literature/book/the-encyclopedia-of-human-computer-interaction-2nd-ed/data-visualization-for-human-perception	ACI

If you have any resources or style guides to contribute to Amy’s list, you can do so via this link.

A Categorical Spatial Interpolation Tutorial in R

Timo Grossenbacher works as a reporter/coder for SRF Data, the data journalism unit of Swiss Radio and TV. He analyzes and visualizes data and investigates data-driven stories. On his website, he hosts a growing list of cool projects. One of his recent blog posts covers categorical spatial interpolation in R. The end result of that post looks amazing:

This map was built with data Timo crowdsourced for one of his projects. With this data, Timo took the following steps, which are covered in his tutorial:

Read in the data, first the geometries (Germany’s political boundaries), then the point data on which the interpolation will be based.
Preprocess the data (simplify geometries, convert CSV point data into an sf object, reproject the geodata into the ETRS CRS, clip the point data to Germany, so data outside of Germany is discarded).
Then, a regular grid (a raster without “data”) is created. Each grid point in this raster will later be interpolated from the point data.
Run the spatial interpolation with the kknn package. Since this is quite computationally and memory intensive, the resulting raster is split up into 20 batches, and each batch is computed by a single CPU core in parallel.
Visualize the resulting raster with ggplot2.
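
The core of the steps above (grid, batches, k-nearest-neighbour interpolation, plot) can be sketched roughly as follows. This is not Timo's code: it uses synthetic points in a unit square instead of the German survey data, skips the sf preprocessing entirely, and uses 4 batches processed sequentially rather than 20 in parallel. The values of k and the kernel are arbitrary assumptions, and the interpolation only runs if the kknn package is installed:

```r
set.seed(42)

# Synthetic stand-in for the survey points: coordinates plus a categorical answer
pts <- data.frame(x = runif(200), y = runif(200))
pts$answer <- factor(ifelse(pts$x + rnorm(200, sd = 0.1) > 0.5,
                            "variant_a", "variant_b"))

# Regular grid of locations whose category will be interpolated
grid <- expand.grid(x = seq(0, 1, length.out = 30),
                    y = seq(0, 1, length.out = 30))

# Split the grid rows into batches so each one can be predicted separately
n_batches <- 4
batch_id <- sort(rep_len(seq_len(n_batches), nrow(grid)))
batches <- split(grid, batch_id)

if (requireNamespace("kknn", quietly = TRUE)) {
  # Predict the most likely category for every grid point in one batch
  predict_batch <- function(batch) {
    fit <- kknn::kknn(answer ~ x + y, train = pts, test = batch,
                      k = 15, kernel = "gaussian")
    data.frame(batch, answer = fit$fitted.values)
  }
  interpolated <- do.call(rbind, lapply(batches, predict_batch))

  if (requireNamespace("ggplot2", quietly = TRUE)) {
    # One coloured cell per grid point, coloured by predicted category
    p <- ggplot2::ggplot(interpolated,
                         ggplot2::aes(x = x, y = y, fill = answer)) +
      ggplot2::geom_raster()
  }
}
```

Because each batch is predicted independently, the `lapply()` call is the natural place to swap in a parallel equivalent such as `parallel::mclapply()` to spread the batches over several cores.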

All code for the above process can be accessed on Timo’s GitHub. The georeferenced points underlying the interpolation look like the below, where each point represents the location of a person who selected a certain pronunciation in an online survey. More details on the crowdsourced pronunciation project can be found here.