Lars Albertsson, a former software engineer at Spotify and Google and currently a freelance data engineer via mapflat, maintains this list of data engineering resources. It includes many links to videos and courses about data pipelines, batch processing, Kafka, NoSQL, Clojure, Scala, Parquet, Luigi, Storm, Spark, Hadoop, Cassandra, and other tools I am not too familiar with. It looks like a great curated overview for beginners.
Reddit user LucasCu90 used the R package twitteR to retrieve all tweets sent with #Irma and a geocode within a 25-mile radius of central Miami between Saturday, September 9 and Sunday, September 10, 2017 (the period of Irma’s approach and initial landfall on the Florida Keys and the mainland). From the 29,000 tweets he collected, Lucas extracted the 600 most common words and overlaid them on a map of Florida, sizing each word relative to its frequency in the data. The result is quite nice!
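For readers curious how the word-counting step might look, here is a minimal sketch in Python. It is not Lucas's R code: the stop-word list, the function names `top_words` and `font_sizes`, the size range, and the linear scaling are all assumptions made for illustration, and collecting the tweets and drawing the map are left out entirely.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "and", "to", "of", "in", "is", "rt"}


def top_words(tweets, n=600):
    """Return the n most common words across all tweets as (word, count) pairs."""
    counts = Counter()
    for text in tweets:
        # Lowercase the tweet and keep runs of letters, '#' and apostrophes as words.
        words = re.findall(r"[a-z#']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


def font_sizes(word_counts, min_size=8.0, max_size=48.0):
    """Map each word to a font size that grows linearly with its count.

    The most frequent word gets max_size; this scaling rule is an assumption.
    """
    if not word_counts:
        return {}
    top = max(count for _, count in word_counts)
    return {
        word: min_size + (max_size - min_size) * count / top
        for word, count in word_counts
    }


if __name__ == "__main__":
    # Made-up sample tweets, only to show how the two functions fit together.
    sample = [
        "Stay safe Miami #Irma",
        "Power is out in Miami #Irma",
        "#Irma winds picking up",
    ]
    for word, size in font_sizes(top_words(sample, n=5)).items():
        print(f"{word}: {size:.1f}")
```

With the real data set, the resulting word-to-size mapping would then be handed to whatever plotting tool places the words on the map.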
It’s easy to think that disasters as devastating as Typhoon Yolanda, the super typhoon that claimed over 7,000 lives in 2013, happen only once in a lifetime. However, the Philippines has been hit several more times over the past century.
Thinking.Machin.es provides an interactive history of almost every storm, earthquake, flood, volcanic eruption, landslide, drought, epidemic, or wildfire that caused at least 10 deaths in the Philippines between 1901 and 2015. The data comes from the rich Emergency Events Database (EM-DAT) of the Centre for Research on the Epidemiology of Disasters (CRED) in Belgium. Their interactive visualization is astonishing; just look at the following screenshot: