The Science of Storytelling: Why Talks Go Viral (Part 1)

One week before I embarked on a data science bootcamp that would disrupt my career trajectory, I did something much more terrifying. I gave a toast at my parents’ 50th wedding anniversary. I don’t mind speaking in public – I love it, in fact – but I love speaking about subjects that I vaguely know something about. Marriage is not one of them.

Not to mention I had no time to prepare. Minutes before the toast, I was literally talking to a mirror in the restroom because I had no script and didn’t know what to say. After being embarrassingly interrupted in the restroom by a man who wanted to pee, though, I knew that I needed to suck it up, stand in front of the crowd of 100 guests, and say something.

And you know what? It went beautifully. My parents were beaming, and numerous people complimented me afterward. Giving that anniversary toast is one of my favorite recent memories.

That’s why I love public speaking. It’s impossible to predict which talks go well and which fall flat. I’ve always been fascinated with why certain talks go better than others, so I decided to construct a natural language processing project to answer that question – what makes certain talks stand out from the crowd?

What does a “story” look like?

A story isn't exactly something you measure with a ruler. Part 1 of this blog describes how I tackled the challenge of quantifying the "story arc" of a talk from blobs of text, in preparation for modeling how stories predict a talk's success.

(Aside - I also wanted to blog about this after endlessly hearing the cliché that data scientists need to "tell a story" with their data. Nobody believes in data storytelling more than me. It’s one of the most important skills for any scientist to have. But I think the cliché needs more substance on how to tell a story. End of soapbox.)

Experts believe one key component of storytelling is understanding the “shape” of a story. Writer Kurt Vonnegut gave a legendary talk on how memorable stories are characterized by a “rags to riches” rise and fall of emotion. In 2016, Vonnegut’s ideas were backed up by computer scientists who studied text from 1,737 works of fiction from Project Gutenberg and identified 6 clusters of story shapes with similar rises and falls in sentiment.

This idea has been studied in written text, but is the same thing true for speaking? That is what I sought to answer using transcripts from over 2,300 TED Talks.

When this project began, I was eager to model speakers' storytelling, but I had no clue how to do that. That’s what I love about natural language processing – you learn to embrace the chaos and devise a plan to turn abstract, unstructured information into something concrete you can measure. To turn text from TED Talks into categories of story arcs, I followed a 3-step process that you can see detailed on GitHub:

1. Use a sliding window to divide transcripts into overlapping blocks of text
2. Measure the sentiment of each individual block, producing an array of sentiment values
3. Identify clusters of story arcs using k-Shape, a clustering algorithm that is particularly suited to capturing the shapes of time series
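
To make the first two steps concrete, here is a minimal sketch rather than the actual project code. The window size, the step size, and the tiny word lists are assumptions chosen only so the example runs on its own; the real project presumably relied on a proper sentiment tool instead of a toy lexicon.

```python
from typing import Callable, List

# Toy lexicon, purely illustrative (an assumption, not the project's data).
POSITIVE = {"love", "joy", "hope", "beautiful", "great"}
NEGATIVE = {"fear", "shame", "pain", "terrible", "lost"}


def sliding_windows(words: List[str], size: int, step: int) -> List[List[str]]:
    """Split a word list into overlapping blocks of `size` words, advancing `step` words each time."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    if len(words) <= size:
        return [words]
    return [words[i:i + size] for i in range(0, len(words) - size + 1, step)]


def toy_sentiment(block: List[str]) -> float:
    """Score a block in [-1, 1]: (positive hits - negative hits) / block length."""
    if not block:
        return 0.0
    pos = sum(w in POSITIVE for w in block)
    neg = sum(w in NEGATIVE for w in block)
    return (pos - neg) / len(block)


def sentiment_arc(
    text: str,
    size: int = 4,
    step: int = 2,
    scorer: Callable[[List[str]], float] = toy_sentiment,
) -> List[float]:
    """Turn a transcript into an array of sentiment values, one per overlapping block."""
    words = [w.strip(".,!?;:\"'").lower() for w in text.split()]
    words = [w for w in words if w]
    return [scorer(block) for block in sliding_windows(words, size, step)]


arc = sentiment_arc("I felt fear and shame and pain then hope and love and joy")
print(arc)  # sentiment drifts from negative to positive across the blocks
```

The `scorer` argument is there so a real sentiment model can be swapped in, for example `lambda block: TextBlob(" ".join(block)).sentiment.polarity` with TextBlob installed. For the third step, one option is the `KShape` class in the tslearn library, applied to arcs resampled to a common length; whether the project used that particular implementation is not stated here.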

The best way to illustrate this is by comparing two TED speakers who are both legendary … though only one is legendary for their talk.

Al Gore and Brené Brown

Al Gore

Brené Brown

I don’t need to explain who Al Gore is. Brené Brown may not be as famous, but her talk, “The Power of Vulnerability,” earned the 4th-most views of any TED Talk in history. It also launched her career: before then she was a relatively unknown professor, and now she’s a 4-time New York Times bestselling author. In short, she epitomizes why people scratch and claw to give TED Talks.

The alternating graphs underneath them represent the shapes that I extracted from transcripts of their respective TED Talks, using the steps listed above.

Truthfully, “shape” might be a charitable word to describe Gore’s talk. The sentiment analysis revealed a narrow range of emotion with no arc whatsoever (the flat line is actually a 3rd-degree polynomial that had 2 degrees to spare). Brown, on the other hand, represents everything that Vonnegut described. There is a subtle but sizable rise and fall of emotion when she speaks, with much higher peaks and deeper valleys in her graph. It closely matches what the 2016 study authors described as the “Cinderella” arc with its “rise-fall-rise” shape.
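
For readers curious about the polynomial remark, here is a rough sketch of fitting a 3rd-degree curve to an array of sentiment values. It assumes ordinary least squares with the talk's timeline rescaled to run from 0 to 1; the function names and sample numbers are made up for illustration and are not taken from the project.

```python
from typing import List


def fit_polynomial(ys: List[float], degree: int = 3) -> List[float]:
    """Least-squares fit; returns c[0], c[1], ... for c[0] + c[1]*x + ..., with x scaled to [0, 1]."""
    n = len(ys)
    if n <= degree:
        raise ValueError("need more points than the polynomial degree")
    xs = [i / (n - 1) for i in range(n)]
    m = degree + 1
    # Normal equations: (X^T X) c = X^T y, where X[i][j] = xs[i] ** j.
    a = [[sum(x ** (r + c) for x in xs) for c in range(m)] for r in range(m)]
    b = [sum((x ** r) * y for x, y in zip(xs, ys)) for r in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for r in range(m - 1, -1, -1):
        tail = sum(a[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = (b[r] - tail) / a[r][r]
    return coeffs


def evaluate(coeffs: List[float], x: float) -> float:
    """Evaluate the fitted polynomial at position x (0 = start of talk, 1 = end)."""
    return sum(c * x ** j for j, c in enumerate(coeffs))


flat_arc = [0.1] * 10  # hypothetical talk with no emotional movement
flat_coeffs = fit_polynomial(flat_arc)
print([round(c, 6) for c in flat_coeffs])  # higher-order terms come out near zero
```

When the sentiment barely moves, everything beyond the constant term shrinks toward zero, so the cubic collapses into a nearly flat line. An arc with real peaks and valleys keeps sizable higher-order coefficients.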

Don’t get me wrong – Al Gore is no slouch as a speaker. His 2006 talk, “Averting the Climate Crisis,” earned 3.2 million views, well above the TED median of 1.1 million. But he had the benefits of time, fame, and a topic that millions of people are passionate about. Brené Brown had less time and little fame, but a classic storytelling style that 32 million people have enjoyed.

Now, you might be thinking, “Cute story arcs are great – but so what?” It’s one thing to extract story shapes from a transcript; it’s another to prove Vonnegut was right when he said, “The shape of the curve is what matters.” Does the rise and fall of sentiment predict how many views a speaker gets?

That’s what my next post will dive into. For now, I'll say that range and variation in emotion were consistent predictors of the number of views talks get, but they were hardly the only predictors. Using NLTK, TextBlob, gensim, and other Python tools, I was able to extract and engineer several other features that make the most successful TED talks stand out.
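
As a small preview, here is one plausible way to turn an array of sentiment values into "range" and "variation" numbers. The exact definitions are assumptions (max minus min, and population standard deviation), and the two sample arcs are invented for illustration.

```python
from statistics import pstdev
from typing import Dict, List


def arc_features(arc: List[float]) -> Dict[str, float]:
    """Summarize a sentiment arc by its range (max - min) and its standard deviation."""
    if not arc:
        raise ValueError("arc must contain at least one sentiment value")
    return {"range": max(arc) - min(arc), "std": pstdev(arc)}


flat = arc_features([0.1, 0.1, 0.1])    # hypothetical monotone talk
varied = arc_features([-0.5, 0.0, 0.5])  # hypothetical talk with emotional swings
print(flat, varied)
```

A flat arc scores zero on both measures, while an arc with swings scores higher on each, which is the kind of contrast a model could use as a feature.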

(To be continued ...)

(Image Sources: Flickr/urban_data, Flickr/Center for American Progress Action Fund and Flickr/TED Conference, under a Creative Commons license; images were cropped slightly)
