Suppose you operate a warehouse where workers work 11-hour shifts. In order to meet your productivity KPIs, a significant number of them need to take painkillers multiple times per shift. Do you…
1. Decrease or change the KPI (goals)
2. Make shifts shorter
3. Increase the number or duration of breaks
4. Increase the medical staff
5. Install vending machines to dispense painkillers more efficiently
Nobody in their right mind would take option 5… Right?
Yet, this is precisely what Amazon did according to Emily Guendelsberger in her insanely interesting and relevant book “On the Clock” (note the paradoxical link to Amazon’s webshop here).
Emily went undercover as an employee at several organizations to experience blue-collar jobs first-hand. In her book, she discusses how tech and data have changed low-wage jobs in ways that are simply dehumanizing.
These days, with sensors, timers, and smart nudging, employees are constantly monitored and pushed to keep working hard, sometimes at the cost of their own health and well-being.
I really enjoyed the book, despite the harsh picture it sketches of low-wage jobs and malicious working conditions these days. The book poses several dilemmas and asks multiple reflective questions that made me re-evaluate and re-appreciate my own job. Truly an interesting read!
Some quotes from the book to get you excited:
“As more and more skill is stripped out of a job, the cost of turnover falls; eventually, training an ever-churning influx of new unskilled workers becomes less expensive than incentivizing people to stay by improving the experience of work or paying more.”
Emily Guendelsberger, On the Clock
“Q: Your customer-service representatives handle roughly sixty calls in an eight-hour shift, with a half-hour lunch and two fifteen-minute breaks. By the end of the day, a problematic number of them are so exhausted by these interactions that their ability to focus, read basic conversational cues, and maintain a peppy demeanor is negatively affected. Do you:
A. Increase staffing so you can scale back the number of calls each rep takes per shift — clearly, workers are at their cognitive limits
B. Allow workers to take a few minutes to decompress after difficult calls
C. Increase the number or duration of breaks
D. Decrease the number of objectives workers have for each call so they aren’t as mentally and emotionally taxing
E. Install a program that badgers workers with corrective pop-ups telling them that they sound tired.
Seriously—what kind of fucking sociopath goes with E?”
The following are my summary and take-aways from Janelle Shane’s 2019 book You look like a thing and I love you. Most of what follows consists of excerpts from Janelle’s book, combined or rewritten by me. For the sake of copyright, just consider everything Janelle’s : )
You look like a thing and I love you is about AI. More specifically, the book is about what AI can and cannot do, and about how and why AI often fails in miserably hilarious ways.
Janelle has spent her time doing fun experiments with AI. In this book, she shares those experiments along with many real-life examples of AIs in practice. While explaining the technical details behind these AIs in an accessible though technically correct way, she informs the reader where, how, and why AIs fail.
Janelle took AIs out of their comfort zone and it produced some hilariously weird results. She proposes five principles of AI Weirdness:
The danger of AI is not that it’s too smart, but that it’s not smart enough
AI has the approximate brainpower of a worm
AI does not really understand the problem you want it to solve
But: AI will do exactly what you tell it to. Or at least it will try its best.
And AI will take the path of least resistance
Definitions: What is (not) AI?
If it seems like AI is everywhere, it’s partly because Artificial Intelligence means lots of things, depending on whether you’re reading science fiction or selling a new app or doing academic research.
To spot an AI in the wild, it’s important to know the difference between machine learning algorithms (what Janelle calls AI in her book) and traditional, rules-based programs.
To solve a problem with a rules-based program, you have to know every step required to complete the program’s task and how to describe each one of those steps. But a machine learning algorithm figures out the rules for itself via trial and error, gauging its success on goals the programmer has specified. As the AI tries to reach this goal, it can discover rules and correlations that the programmer didn’t even know existed. This is what makes AIs attractive problem solvers and is particularly handy if the rules are really complicated or just plain mysterious.
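To make the contrast concrete, here is a minimal Python sketch. It is not taken from Janelle’s book: the sunshine data, the function names, and the random-guessing strategy are all made up for illustration. The first function is a rule the programmer spells out by hand. The second only receives a goal (accuracy on labelled examples) and finds a workable rule through trial and error.

```python
import random

# Hypothetical labelled examples: (hours of sunshine, did people buy ice cream?)
EXAMPLES = [(1, False), (2, False), (3, False), (4, False),
            (6, True), (7, True), (8, True), (9, True)]


def rules_based(hours):
    """Rules-based: the programmer knows the rule and writes it down."""
    return hours > 5


def accuracy(threshold, examples):
    """The goal the programmer specifies: share of examples predicted correctly."""
    hits = sum((hours > threshold) == label for hours, label in examples)
    return hits / len(examples)


def learn_threshold(examples, tries=200, seed=0):
    """Trial and error: guess thresholds at random and keep the best-scoring one."""
    rng = random.Random(seed)
    best_threshold, best_score = 0.0, accuracy(0.0, examples)
    for _ in range(tries):
        guess = rng.uniform(0, 10)
        score = accuracy(guess, examples)
        if score > best_score:
            best_threshold, best_score = guess, score
    return best_threshold, best_score


learned_threshold, learned_score = learn_threshold(EXAMPLES)
```

Real machine learning algorithms search far more cleverly than random guessing, but the division of labour is the same: the programmer defines what success looks like, and the algorithm works out the rule.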
Sometimes an AI’s brilliant problem-solving rules actually rely on mistaken assumptions: rules that served it well in training but fail miserably when it encounters the real world. Such errors are common in complex AIs, and the consequences of these mistakes can be serious.
It’s often not easy to tell when AIs make mistakes. Since we don’t write the rules, AIs come up with their own, and they don’t write them down or explain them the way a human would.
The difference between successful AI problem solving and failure usually has a lot to do with the suitability of the task for an AI solution. And there are plenty of tasks for which AI solutions are more efficient than human solutions. But there are also plenty of cases where things go miserably wrong.
Janelle proposes four signs of “AI Doom”, contexts where machine learning will not produce the desired results:
The problem is too hard, broad, or complex
The problem is not what we thought it was
There are sneaky shortcuts to solving the problem
The AI tried to solve the problem by learning from flawed data
Programming an AI is almost more like teaching a child than programming a computer.
Explaining how AI works
In her book, Janelle takes us through many example problems which she or others tried to solve using AIs. These example problems are increasingly hilarious, but I assure you that they are technically and didactically sound:
Managing a cockroach farm
Riding a bicycle
Rating sandwich deliciousness
Tossing a sandwich into a wall
Guiding people through a hallway
Answering questions regarding photos
The amazing thing is these ridiculous example problems actually serve a purpose. They are used to explain different algorithms and their applications, strengths, and limitations! Janelle covers a wide variety of algorithms in such a way that anyone new to machine learning would understand, while people with some experience will still be amused.
Janelle talks about artificial neural networks, random forests, and Markov chains. Moreover, she explains how activation functions, recurrence and long short-term memory, evolutionary algorithms, and gradient descent work. And all in understandable though technically correct language.
Janelle herself seems particularly fond of generative algorithms. She elaborates on having deployed recurrent neural nets, generative adversarial networks, and Markov chains for a wide variety of generative tasks. In the book, Janelle explains what went well and what went wrong when coming up with new and original…
Janelle’s book is littered with examples of failing AI. As a matter of fact, the whole book seems like an ode to how machine learning can and will inevitably fail. Particularly in the later chapters, Janelle covers many limitations of and issues with AI in much detail:
I have yet to come across a book that explains AI in this much detail and in a manner as accessible and entertaining as Janelle Shane does in You look like a thing and I love you. Janelle makes machine learning and AI understandable for a wide public without skipping the deeper technical details. Taking a critical stance, she provides a good overview of the strengths and weaknesses of AI, and a realistic outlook for the future to come. This book is not looking for sensation or hype, although reading it will be a most amusing experience for the more technical as well as the lay reader.
I highly recommend you reward yourself with a copy!
Aleszu started his analysis on only the French wines, with a simple word count per region:
Next, he applied TF-IDF to surface the words that are most characteristic of specific French wine regions: words used often in combination with that specific region, but not in relation to other regions.
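For readers unfamiliar with TF-IDF, the idea can be sketched in a few lines of Python. This is not Aleszu’s script (his analysis was done in R). The three mini “reviews” and the function name are invented for illustration, and the sketch assumes one common definition: term frequency as count divided by document length, and inverse document frequency as the natural log of the number of documents divided by the number of documents containing the word.

```python
import math
from collections import Counter

# Hypothetical mini-corpus: one "document" of review words per region.
REGION_REVIEWS = {
    "Bordeaux": "cassis tannins oak fruit cassis",
    "Champagne": "bubbles citrus toast fruit bubbles",
    "Alsace": "petrol citrus floral fruit",
}


def tf_idf(docs):
    """Return {doc: {word: score}} with tf = count / length and idf = ln(N / df)."""
    tokenized = {name: text.split() for name, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each word occur?
    doc_freq = Counter()
    for words in tokenized.values():
        doc_freq.update(set(words))

    scores = {}
    for name, words in tokenized.items():
        counts = Counter(words)
        scores[name] = {
            word: (count / len(words)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
    return scores


scores = tf_idf(REGION_REVIEWS)
# The highest-scoring word per region is the one most characteristic of it.
top_words = {region: max(ws, key=ws.get) for region, ws in scores.items()}
```

Note how “fruit”, which appears in every region, gets a score of zero everywhere: a word that all documents share says nothing about any one of them.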
The data also contained some price information, which Aleszu mapped onto France with ggplot2 and the maps package to demonstrate which French wine regions are generally more costly.
On the full dataset, Aleszu also demonstrated that there is a strong relationship between price and points, meaning that, in general, more expensive wines seem to get better reviews:
You can find the full script and more details in the original blog.
What is happening in the Netherlands in the field of HR analytics? This new book shows what several Dutch organizations have actually undertaken in recent years. The various authors, among whom I may count myself, offer a glimpse into the practice of substantiating HR decisions with a range of data sources and analysis techniques. They do not declare HR analytics sacred, but anyone who wants to add value to the business as an HR professional can benefit a lot from it. The credo is: know what you have to do, be alert to the pitfalls, and regard HR analytics as a seventh sense alongside your other senses. With this extra sense, you as an HR professional can perceive more sharply what the real HR problem is, and what the solution might be.
The book ‘HR Analytics’ is for the modern HR professional who is curious about what analytics can contribute to his or her professionalism. The examples and stories from practice offer various lessons and insights that help with a more analytical approach to the HR policy themes of recruitment, careers, terms of employment, training and education, or engagement. It is a nudge in the direction of HR analytics as an addition to the HR profession. Not as a replacement.
Wiemer Renkema, reviewer at managementboeken.nl, has since read the book and summarizes its contents nicely:
The ten chapters of the book cover the most important HR analytics, such as those for recruitment, career development, employee satisfaction, and compensation. Readers can determine the relevance of each topic for themselves and look specifically for the information that matters to them. For each topic, the authors address all the core questions, which gives the book a clear and easily readable structure.
You don’t need much stamina to read HR analytics. Een 7e zintuig voor de moderne HR-professional. What a practical, complete, and well-written book this is!
Sentiment analysis is a topic I cover regularly, for instance, with regard to Harry Plotter, Stranger Things, or Facebook. Usually I stick to the three sentiment dictionaries (i.e., lexicons) included in the tidytext R package (Bing, NRC, and AFINN) but there are many more one could use. Heck, I’ve even tried building one myself using a synonym/antonym network (unsuccessful, though a nice challenge). Two lexicons that did become famous are SentiWordNet, accessible via the lexicon R package, and the Loughran lexicon, designed specifically for the analysis of shareholder reports.
Josh Yazman did the world a favor and compared the quality of the five lexicons mentioned above. He observed their validity in relation to the millions of restaurant reviews in the Yelp dataset. This dataset includes both textual reviews and 1 to 5 star ratings. Here’s a summary of Josh’s findings, including two visualizations (read Josh’s full blog + details here):
NRC overestimates the positive sentiment.
AFINN also provides overly positive estimates, but to a lesser extent.
Loughran seems unreliable altogether (on Yelp data).
Bing estimates are accurate as long as texts are long enough (e.g., 200+ words).
SentiWordNet’s estimates are mostly valid and precise, also on shorter texts, but may include minor outliers.
On a more detailed level, David Robinson demonstrated how to uncover performance errors or quality issues in lexicons, in his 2016 blog on the AFINN lexicon. Using only the most common words (i.e., used in 200+ reviews for at least 10 businesses) of the same Yelp dataset, David visualized the inconsistencies between the AFINN sentiment lexicon and the Yelp ratings in two very smart and appealing ways:
As the figure above shows, David found a strong positive correlation between the sentiment score assigned to words in the AFINN lexicon and the way they are used in Yelp reviews. However, there are some exceptions: words that did not have the same meaning in the lexicon and the observed data. Examples of words that seem to cause errors are die and bomb (both negative AFINN scores but used in positive Yelp reviews) or, the other way around, joke and honor (positive AFINN scores but negative meanings on Yelp).
With the graph above, it is easy to see which words cause inaccuracies. Blue words should be in the upper section of this visual while reds should be closer to the bottom. If this is not the case, a word likely has a different meaning in the lexicon relative to how it’s used on Yelp. These lexicon-data differences become increasingly important as words are located closer to the right side of the graph, which means they more frequently screw up your sentiment estimates. For instance, fine, joke, fuck and hope cause much overestimation of positive sentiment, while fresh does not receive the positive score its usage warrants and die causes many negative errors.
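The logic behind this kind of check can be sketched in plain Python. This is not David’s code (his analysis was done in R). The tiny lexicon, the reviews, and the function names are all invented, and treating 3 stars as the neutral midpoint is an assumption. The sketch computes the average star rating of the reviews each lexicon word appears in, then flags words whose lexicon sign disagrees with that average.

```python
from collections import defaultdict

# Hypothetical AFINN-style scores (negative = bad, positive = good).
LEXICON = {"die": -3, "joke": 2, "great": 3, "awful": -3}

# Hypothetical reviews: (text, star rating from 1 to 5).
REVIEWS = [
    ("the tacos are to die for", 5),
    ("great burgers i would die for", 5),
    ("the service was a joke", 1),
    ("awful food and the portion is a joke", 2),
    ("great coffee", 4),
    ("awful wait", 1),
]


def average_rating_per_word(reviews, lexicon):
    """Average star rating of the reviews in which each lexicon word appears."""
    ratings = defaultdict(list)
    for text, stars in reviews:
        for word in set(text.split()):  # count each word once per review
            if word in lexicon:
                ratings[word].append(stars)
    return {word: sum(r) / len(r) for word, r in ratings.items()}


def mismatched_words(reviews, lexicon, neutral=3):
    """Words whose lexicon sign disagrees with how their reviews are rated."""
    averages = average_rating_per_word(reviews, lexicon)
    return sorted(
        word for word, avg in averages.items()
        if (lexicon[word] > 0) != (avg > neutral)
    )


flagged = mismatched_words(REVIEWS, LEXICON)
```

On this toy data, die and joke are flagged while great and awful are not. On a real dataset you would also want a minimum frequency per word, as David applied, before trusting the averages.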
TL;DR: Sentiment lexicons vary in terms of their quality/performance. If your texts are fairly short (a few hundred words), you might be best off using Bing (tidytext). In other cases, opt for SentiWordNet (lexicon), which considers a broader vocabulary. If possible, try to evaluate inaccuracies, outliers, and/or prediction errors via data visualizations.