Narrative Machines

This week’s podcast was a real Story Nerd extravaganza.

Andy Reagan, a PhD candidate at The University of Vermont’s Computational Story Lab, was our guest.

A few weeks ago, I spoke about how the CSL had done some deep data mining into about 2000 English language novels from Project Gutenberg. What they found confirmed Kurt Vonnegut’s 1946 Master’s Thesis for the Anthropology Department at The University of Chicago. Vonnegut’s Thesis put forth that there are essentially six story arcs baked into the human narrative pie. [The CSL is not the sole diver into this world either. Matthew L. Jockers at The University of Nebraska is doing similar work. He’s publishing a book about his findings, The Bestseller Code, on September 20.]

We are fortunate today that Vonnegut’s work was rejected by the U of C (I worked for a few years at the University of Chicago myself and can attest to their academic rigor) and that he had to fall back on his backup profession…fiction writing…to make his living.

Anyway, Andy was kind enough to listen to my long-winded questioning and gave Tim and me some serious food for thought.  While the technology is in its infancy (the hedonometer machine is capable of only analyzing word values and is not nuanced enough to understand literary context nor is it conditioned to understand the fundamental structural units of Story), the quantitative/qualitative mind meld it could lead to astonishes me.

The Story Grid methodology is very much dedicated to bringing these two disciplines together…quantitative applied mathematical reasoning with qualitative cultural analysis that (hopefully forever) only human beings can accurately assess.  In order to tell a story well, one needs a unique Weltanschauung (worldview) and so far, machines are incapable of telling themselves stories, so their having a worldview beyond pure mathematical reasoning is not presently possible.

From my vantage point, I see the CSL’s and Jockers’ work as confirmation of the essential assumptions inherent in The Story Grid method, that stories concern the shift of values from positive to negative along a narrative timeline.

If pressed, I would say that the CSL and Jockers graphs show the global sequential movement of a longform story’s arc…that is, a man in a hole sequence followed by a Cinderella, capped off with a tragedy, or some other combination of the six fundamental movements. Don’t get me started about how these six arcs relate to The Hero’s Journey…except to say that I see them as sequential guideposts to move a character/characters along Campbell’s mono-myth progression.

These graphs are X-rays that show the bones of the story…but they do not, as of yet, have the resolution to show the muscles, joints, organs, neurons etc. of the story in any practical detail. That is, they show that there’s a man in a hole arc, but do not show what the elements of the man in a hole arc are.  They do not delineate the units of story…from beat to scene to sequence to act to subplot to global external and/or global internal genre.

Where Story Grid differs is in its ability to map value from a particular human point of view (a story editor/writer).  Story Grid methods teach how an individual human being (not necessarily a machine…but I have little doubt that over time a program could be developed to mimic a particular editorial point of view…say mine or, with enough interview data, an editorial genius like Robert Gottlieb) uses his/her personal Weltanschauung to make their own value judgements about the particular unit of story.

Story Grid Methodology is a combo plate technology that uses quantitative tools to create uniquely qualitative arguments.  What I take away from The Silence of the Lambs is absolutely valid and my Story Grid for the book reasonable and insightful.  But if Robert Gottlieb were to map out The Silence of the Lambs, his grid may differ from mine in the unit by unit details…but globally, it would have a similar graphic representation, akin to the CSL and Jockers graphs.

What Story Grid Methodology allows for is individual opinion and individual Weltanschauungs…so Gottlieb and I could examine why we labeled a particular value for a particular scene differently and in so doing, we’d reveal how our views of the world differed.

By the way, this is exactly the editorial process that Steven Pressfield and I undergo with each of his books.  We may not both create story grids and swap them, but we do understand each of his scenes/chapters as fundamental units of story and then debate which ones are working and which ones aren’t until we come to a mutual understanding of what must be done to make the book better.

Will artificially intelligent machines one day acquire unique Weltanschauungs?

My guess is that it is theoretically possible (some believe inevitable and perhaps I do too…still haven’t wrapped my mind around this entirely).  My gut though is that the machines will reflect the souls/geniuses of the Dr. Frankensteins who create them, more so than becoming unique sentient beings. That is, The Story Grid Machine would reflect my particular Weltanschauung while other narrative analytical machines would reflect their human source material. The Gottlieb Machine would reflect Gottlieb, The Howard Machine would reflect editor Gerry Howard, The Arthur Machine would reflect Reagan Arthur etc.

But what of AI learning?

If I were to create a primitive Story Grid Machine with a learning system pre-programmed inside (as I’m constantly tweaking and reconsidering my own assumptions about what I know about Story structure so would the machine), would I recognize that same machine as my own…generations in the future? Or would the machine be unrecognizable to me?

This may all sound Brave New Worldish or Philip K. Dickian, but these are extraordinarily important times.

Remember that we all live narrative…we can’t help it…it’s embedded in human DNA as our primary software.  And the best Storytellers win in the external world because they are capable of manipulating and co-opting mass internal worlds. Elon Musk, in addition to his other gifts, knows how to tell a story.  But so did Joseph Goebbels.  To all things, lightness and darkness…

That is, great Storytellers know the six internal arcs and use them to move us to change our worldviews. Propaganda is the most potent weapon on the planet. So teaching machines how story works will give them the tools to re-engineer too.

What would happen if a machine became the uber-Storyteller? Hmmm…light and dark will result no doubt. For some reason, when I picture that fateful day when a machine pens a blockbuster novel, the line from the Bhagavad Gita that Robert Oppenheimer tied to July 16, 1945 comes to mind: “Now I am become Death, the destroyer of worlds.”

Today’s rise of “Digital Humanities” inside academic ivory towers is extraordinarily exciting as it brings together the two cultures that C.P. Snow so brilliantly argued were dangerously incompatible. But it has its dark side too.

Will the lack of global understanding of Story on the part of the mathematician/coders (its supremacy as changer of internal worlds) matched with the naivete of humanities specialists when it concerns the inherent structural forms of creation and their foundations in pure science (form misunderstood as the hobgoblin of creativity) create a perfect digital storm for artificially intelligent digital beings?

As you can tell, I could go on and on about this. With equal parts joy and despair, I feel a strong pull to head back into the lab to be a part of this new frontier.

In the meantime, you can listen to the podcast by clicking the play button or read the transcript that follows:


[0:00:00.6] TG: Hello and welcome to the Story Grid Podcast. This is a show dedicated to helping you become a better writer. I’m your host Tim Grahl and I am a struggling writer trying to figure out how to tell a story that works. Shawn Coyne is joining me soon, he is the creator of Story Grid, the author of the book Story Grid and he is an editor with over 25 years’ experience. Now, a few weeks ago, Shawn mentioned this Hedonometer, this research into stories and how it’s being applied to stories and how we see the same six story arcs over and over.


So we were talking about this some more and decided to invite one of the researchers on the show. This episode is a bit of a departure from us but it becomes really interesting as we talk about not only what this research shows but the future implications of it. So I think you’re really going to enjoy it, it was a lot of fun to talk to this researcher and dive into the work that they’re doing.


Let’s jump in and get started.


[0:01:00.6] TG: So in this episode, we’re diving deep into the research that Shawn mentioned last week with the Hedonometer, and we have on now with us Andy Reagan, who is a part of this. So Andy, I just want you to tell us a little bit about who you are and what you do and specifically the work you’re doing with this hedonometer research?


[0:01:24.1] AR: Yeah, great. My name is Andy, as Tim said, I am a PhD candidate at the University of Vermont up here in Burlington, Vermont and I work with the Computational Story Lab research group. Stories have been in our framing for a while. The leaders of the group are Chris Danforth and Peter Dodds, professors here at UVM and we’re affiliated with the Complex Systems Center here. Complex Systems are studying all sorts of behaviors and data systems that are out in the real world.


So we, in particular, have studied the hedonometer. We developed this tool which allows us to measure happiness and we’ve applied it to a lot of different areas. If you go to our website for example, the first thing you’ll see is a time series of Twitter, there’s a lot of interesting patterns that you can pick up from there. Most recently we’ve gone and looked at stories which I think you guys heard about last week and yeah, we’re going to dive more into that.


[0:02:15.4] SC: Well Andy, first of all, thank you so much for being here, this is Shawn, and needless to say when I read about all of your research I was just so over the moon because one of the things that I’ve been doing as an editor in the major publishing houses and in my personal career over 25 years, is analyzing stories in a very similar — without the data. I have no ability to use a computer or know about algorithms. So what I’ve been going on is through my own intuitive sense of how stories are moving positively or negatively from the beginning to the end.


So one of the things that really kind of fascinates me is sort of this evolution of this entire realm called, I think it’s called the digital humanities which seems to be sort of the beginning of a means to analyze literature and art through intense machine algorithms and attaching specific kinds of sentiments to language itself. Is that correct? Is that what sentiment analysis is all about?


[0:03:23.4] AR: Right, if you go all the way back to the original natural language processing, people were by hand making concordances. Now that computing power has increased, I can run books through on my laptop. The digital humanities has kind of come about where we’re going beyond just counting words, which was a concordance. You’d go through and count the words and then real literary scholars could take that data and use it to better understand the text. Now what we are able to do is we’re able to go beyond counting words and we’re starting to investigate some of the theories of literature using natural language processing on its own.


[0:04:02.6] SC: I get it. So the other thing that struck me, hit me on top of the head was this notion of these six core story arcs and when I say arc, what I’m talking about is the emotional movement of the lead character throughout the entire story or the small cast of characters in a multi character story and how you guys, I think there’s even more research being done by, I think there’s somebody at the University of Nebraska named Matthew Jockers who is doing similar work to you.


He’s using sentiment analysis and I think he’s even coming out with a book in September called The Bestseller Code. Which essentially is going from word to word to word but in larger chunks so you’re not exactly parsing every exact paragraph but you’re sort of taking pictures of individual stories and then tracking those, is that correct?


[0:04:57.3] AR: Right, you hit on a bunch of different things. So we came up with six stories, there’s been a lot of theories over time that have proposed different kinds of framing for how narratives work and those have enumerated different numbers of stories. You guys talk a lot about Campbell I think and he came up with the monomyth. That was the idea that there was one story and variations on that story, you create some differences but if you went all the way, you can get to these core elements of every story.


There have been different theories over time. So one of those is three plots, right? This is Foster-Harris back in 1959 saying that there are three basic patterns at play, kind of all from a central pattern of conflict, and even more recently, Christopher Booker published a big volume representing three decades of work and he boiled it down to six core… Yeah, seven. He says the seven basic plots, but I’m not sure comedy is a plot.


[0:05:50.1] SC: Yeah, I had a problem with that too but this is nerd talk, so we’ll leave that for another time, but keep going.


[0:05:57.4] AR: Right, you can keep going, there’s people that have found 20 and 36, and now we’re beginning to take these theories where people read lots of books. I assume Christopher Booker read hundreds of books and was able to whittle this down. Now, with the computer, we can read many thousands, and in our analysis, we’re able to look at all of Project Gutenberg, which is a collection of 50,000 eBooks. We only look at a subset of those because we’re actually analyzing fiction, we get up to 4,000 books, but we’re able to answer these original questions using the data.


[0:06:29.7] SC: All right, let’s just try and delineate, I’m going to try and have a conversation with you about some of my theories and how they merge and ebb with what the six central plots are.


[0:06:41.4] TG: Can I cut in? I’m going to cut in. I know you’re so excited Shawn, let me cut in.


[0:06:45.2] SC: I know.


[0:06:47.8] TG: So Andy, I want to back up and when you say that you measured the happiness, what does that mean? How do you go about doing that?


[0:06:56.4] AR: Yeah, that was the second part of what Shawn just asked, was about the methodology and some of the work that Jockers has done. There are some differences in methodology, and also in what we think the methodology is doing, that I think are definitely worth highlighting. What we’re doing, we’re looking at individual words. You open a novel and if you were to look at the happiness of individual words out of context, we have scores for individual words, we have scores for the 10,000 most used words in English, which will cover just about all of the words, 98% coverage of a text.


So we go into the book and we look at those individual words and replace them with their average score; each word we had scored by 50 people on its happiness. This is a particular sentiment dictionary we use, there are others, and we had a crowdsourced methodology where 50 people rate each word, and so we go through a book and we replace each word with its score.


What you end up with is a book that’s just a bunch of numbers. For us to begin to get at the overall sentiment of a piece of text, we’ll average the scores of those words. What we’re doing, and a particular choice we made, was to go through and take big chunks of text, 10,000 words at a time, and average the sentiment of all the words in those 10,000 word chunks. We take that 10,000 word window and we nudge it a little bit, we take a few off at the start, add a few more on and generate another average score.


We do that throughout the book until we have 200 points. We move it just enough so that we end up with 200 sentiment points throughout the book. There are other ways you go about doing this and in particular, the way that Jockers goes about doing this is measuring individual sentences. So what he’ll do is take kind of the same approach but do it to a sentence, so you get a score for a sentence, then you’ll get a score for the next sentence and then what he’ll do, because the sentiment, those sentences actually end up being really noisy. You get a happy sentence, sad sentence. Happy sentence, sad sentence.


It is taking a rolling average and then from that doing some more analysis. So originally I think he was doing that and then taking that noisy time series, which was kind of up-down, up-down, then doing the Fourier transform, but that doesn’t work as well because it’s very noisy and you get some unintended effects. From there, taking the average kind of smooths out the contributions of each word.
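The sliding window Andy describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Computational Story Lab’s code: the function name `sentiment_arc`, the tiny score dictionary, and the choice to drop unscored words before windowing are all made up for the example. The defaults follow the numbers from the conversation (a 10,000 word window, 200 points).

```python
from typing import Dict, List


def sentiment_arc(words: List[str], scores: Dict[str, float],
                  window: int = 10_000, points: int = 200) -> List[float]:
    """Average word scores over a sliding window, sampled at `points` positions.

    Assumption: words missing from `scores` are dropped before windowing,
    so the window counts scored words only.
    """
    # Replace each known word with its score: the book becomes "a bunch of numbers".
    values = [scores[w.lower()] for w in words if w.lower() in scores]
    if points < 1 or window < 1 or len(values) < window:
        raise ValueError("need at least `window` scored words and points >= 1")

    # Prefix sums make every window average a single subtraction.
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)

    # Nudge the window forward just enough to end up with `points` samples.
    last_start = len(values) - window
    arc = []
    for i in range(points):
        start = round(i * last_start / (points - 1)) if points > 1 else 0
        arc.append((prefix[start + window] - prefix[start]) / window)
    return arc


# Toy usage with made-up scores (not values from any real sentiment dictionary).
toy_scores = {"love": 8.0, "death": 2.0}
toy_book = ["love"] * 6 + ["death"] * 6
print(sentiment_arc(toy_book, toy_scores, window=4, points=3))  # [8.0, 5.0, 2.0]
```

The toy book starts happy and ends sad, so the three window averages fall from 8.0 to 2.0. A real run would load a full sentiment dictionary and a whole novel, and would keep the large window so that single words like "death" cannot swing a point on their own.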


[0:09:12.6] TG: You’re basically just walking through the book and taking like big chunks at a time and saying… and so you just have like certain words, this group of words, if these show up, that weights to happiness, and if this group of words shows up, it weights to sadness, and that’s kind of where you get these values, valleys and troughs?


[0:09:33.0] AR: Yeah, the happiest three words in our set are happiness — the happiest word is laughter and then love, and then happiness. So if we’re seeing a lot of mentions of love and happiness, that’s going to bring the score for that section of the book up. Saddest words we see include terrorist and death. If you start saying a lot of words like that, it’s going to just bring that average score down.


I think it’s kind of important to emphasize too, these methods are getting better at rating individual sentences but they’re not really good at that individual sentence level. So if you see the word death in a sentence, it’s going to rate that sentence negatively regardless of what else was going on around it, right? It could just be you’re analyzing Harry Potter and you saw a Death Eater. That may not mean that something negative was happening, it could have been going away, but at an individual sentence scale it’s hard because you’re missing that context. So we’ll kind of smooth that out by looking at a lot of text.


[0:10:28.8] SC: Yeah, I was reading that Jockers has gotten some criticisms for that very thing, I think a woman named Anne Swafford. Yeah, I think that’s a very valid point. Now I just want to — I’ve got a million things that I could talk about in terms of the five commandments of storytelling that I talked about a lot in story grid and how they could be applied to your system but I wanted to sort of go Philip K. Dick for a minute here. Think about what’s the end game here? Where does the analysis of story lead in your opinion? I know you can’t say one way or the other.


Storytelling is such, I talk about this all the time, it’s such an innately human characteristic and many people, myself included, believe in the concept that Carl Jung sort of came up with, the inner genius, not Carl Jung but he took the Greeks’ and the Romans’ sense of each person being sort of born with an inner genius which has to sort of come out and be realized for that person to pursue their own personal destiny. Campbell talks about this too.


So how do we get excited about sort of analyzing stories from an algorithmic computational data mining point of view? Is there going to be a moment when we could pre-program some kind of robot that would play with the language in a way that would give us these up and down curves of positive and negative emotional valence? Do you think that there would be a time and I don’t know that Kurt Vonnegut ever believed that this was true. I thought what he was always saying was that it’s only a matter of time where the computers will be able to prove my theory and they did.


I think he was absolutely correct and he was saying back in 1994 whenever it was in that lecture that’s on YouTube that it’s only a matter of time before data mining could bring this together, and I think you guys have really shown that that’s true. But my question, back to my question is, do you think there’s a time in the future or near future where some sort of robotic algorithm could create a story that would be similar to The Da Vinci Code or any of the other sort of pulp, fun stories that are around today?


[0:12:52.0] AR: You hit on a whole bunch of things. I think that where we are now is further than we were. We started talking about concordances and counting words, and the methods that we use for natural language processing and sentiment analysis have gotten a lot better, right? We’re now able to go into each book and look at the happiness of the words and do a rolling average and on top of that, do analysis to see which are the most common arcs. We were able to I think answer the question of Vonnegut pretty well but I think that the methods are still pretty simplistic.


So like we just described, we’re looking at words and so I think one of the important distinctions is talking about plot. Booker talks about the seven plots. We’re not really measuring plot. What we’re measuring is kind of the emotional experience, which is created by the plot. You’ve got the plot as a sequence of events and then you’ve got the structure, which is how those are presented, and on top of that is kind of where you get this emotional experience that’s created.


So I think that it’s important to note that what we’re really measuring using these sentiment methods is not the plot that’s underlying the story. That said, I do think that for computers to understand us and to be more human, like you said, we’re innate storytellers. So artificial intelligence has gotten better too, right? You can ask Siri a lot of complex questions and that has improved through these natural language tools, and I do think that one of the next leaps forward will be computers understanding stories in the way that we do.


Because Patrick Winston at MIT put forth this Strong Story Hypothesis and his idea was that what really separates our mental abilities is the fact that we use stories to generalize and to understand the world. That leap from, this is happy, this is sad, to building a more complex story is going to be a big step forward in artificial intelligence and our understanding.


[0:14:47.1] TG: Well I’m looking at it kind of from the other direction of if we have this kind of six basic stories and they all kind of follow these arcs, I’m coming from a standpoint of I’m trying to learn how to write a great story. I’m struggling through how to arrange everything in a way so that it does, to me the first thing that pops in my head with this is at the end of, when I finish my novel, I should be able to run it through the hedonometer and my book should fall in one of those graphs and if they don’t, I can probably pin point some places I need to go back and fix my story. Do you think that’s feasible?


[0:15:30.1] AR: Yeah, and actually one of the things that we’re working on is the ability, we’re working on a web app and a desktop tool for people to download and run their own story through. So you’ll get out that sentiment arc. If you see that things are moving wildly from positive to negative at the beginning, you know that you’re not conforming to one of those story arcs.


[0:15:50.7] SC: Well I think Andy, if I could, I think you really made a great point here that sort of got glossed over. Your point was, the hedonometer and the research that is currently very convincing in my opinion has delineated story, emotional story arcs. They have not delineated the ups and downs and curves like if you were to take a look at my story grid for The Silence of the Lambs and you were to compress all the points down, it would look similar to something akin to man in hole, Cinderella, tragic, which as you guys had pointed out in your paper, seems to be the most popular sequence of emotional arcs in a long form novel.


Now what that means is that, and this is really what excited me when I read this stuff is that the pinpointing of the emotional arcs for me, represents sequences in a novel. Now I know a lot of people haven’t read The Story Grid but there are units of story that I talk about in The Story Grid and to me, the story, the six story arcs are representative of a way to construct your sequences and Tim, I was trying to get into this in our last conversation and it was — it got very muddy very quickly and I fear it’s getting muddy now.


But my point is that you, the Computational Story Lab, are not in the business as of today of analyzing a crime story. They’re not going to tell you and they’re not going to be able to explain to you the rise and falls of a particular external genre that we all know and love. What they will be able to tell you is whether or not your emotional arc is working in a way that will engage the reader’s interest in a very fluid way.


So what they are, very exciting to me, is being able to take a look at whether or not your narrative drive will drive people to continue to read. So what you have been able to pinpoint are the emotional arcs that get people turning pages. The man in the hole story is the very simple thing where somebody is neither positive nor negative in their life, maybe a little bit more positive because nobody likes anybody who is depressed as Vonnegut said.


He’s walking down the street, he falls in a hole and he has to get out of the hole. So we want to know as a reader or as a viewer, is that guy going to be able to get out of the hole? And what are the progressive complications that he must undergo in order to ultimately go from a fall to a rise? So these six story arcs are wonderful tools when you are considering the internal movements of your protagonist and I love the idea of people being able to plug their manuscripts into the hedonometer because they are going to see how their novel moves sequentially.


At the beginning, they may see man in a hole, it starts positively, dips below the negative and then it rises just a little bit above where the character started at the beginning and then we get another man in the hole and he falls again and he rises a little bit more, that’s called progressively complicating the story, and then the Cinderella, which is a rise, a fall and a rise, and then it ends with a fall.


If you look at The Silence of the Lambs, that’s kind of what you’re going to find and I found when I was reading all your research and everything and Jockers’ research too and I don’t think we should discount his work in any way. I think he’s trying to go down into nano, going down to the word and I think probably the next step would be for the science to understand the principles of the five commandments of storytelling, which Tim and I talk about all the time, which are you know, the inciting incident, is it positive or negative? Progressive complications, are those positive or negative progressive complications in a particular scene?


So I think if you were able to have a combo plate between… the other thing that I want to ask you in a second Andy is how you got into this to begin with. If you were to have a combo plate between the mind of an editor and the mind of a writer or storyteller or screenwriter or a radio personality and a mathematician and for those people to bring the peanut butter and the chocolate together, you could come up with something really fascinating. Back to my question about how you got into this predicament yourself Andy, what drove you to get into applied mathematics and how did you get into the storytelling element of it?


[0:21:04.6] AR: We’re interested in story as kind of a founding principle because all the reasons that we’ve talked about, right? Stories are, they’re fundamental and I think that a goal of science, a great achievement will be a rich understanding of the landscape of human stories. I think that that basic drive and the fact that, the way in which we use stories and how important they are too was really what drove us to get into this.


[0:21:27.8] SC: What about you personally Andy? Where did you grow up? Did you always love math? Were you afraid of stories? Because I started out as a scientist and then I flipped over to storytelling.


[0:21:41.1] AR: Yeah, so I’ve always been a math guy, I had not been much of a writer or professional storyteller at all. So I went to Virginia Tech for my undergrad and I studied pure math and then I came up here to work on my master’s and actually began studying weather and climate models. So I looked at data assimilation algorithms which are useful to combine observations and weather models to make good forecasts.


Just that basic training in linear algebra and mathematics makes a lot of these algorithms, which process language, possible. So the hedonometer project was already going when I got to the Story Lab and I came on board to kind of pursue this next goal, which was to get the stories. Yeah.


[0:22:23.7] SC: So what is the next step for the story lab? Are you continuing on this path? Are you trying to get a little bit more granular? What’s the next thing?


[0:22:33.7] AR: Yeah, I think that there’s a lot of ways that we can directly improve the work that we’ve done and we’ve already improved it a little bit just from some of the feedback that I’ve gotten. So in particular, we’ve even improved the corpus that we’re looking at, the corpus of books. I think that going down to the sentence is a really great goal. Right now the dictionary-based sentiment methods aren’t that good at it, but I think that getting at that granular level is really what we want to do.


And we want to do it in a way, like you said, that respects these commandments of storytelling and understands that there’s more than just positivity and negativity and that positive and negative language interacts with the events and characters that are in a story in different ways. I think what’s going to be really important for us is to get out characters. So who is the main character and how are they being talked about? There’s been some work to extract different character types across lots of books in the digital humanities and what we’re going to try to do is, or what I think is going to be interesting is connecting characters and events.


So you’ve got the first thing that happens in a story, which is something that happens around the main character. But then, how that event connects to the next event and what characters are involved in the progression I think is going to be an important thing to understand. So setting that up to kind of fit in to a framework that is more in tune with the way that we understand stories. I think it’s going to be a big next step. You might want to ask the computer a question, here’s a sequence of events and you can think of a simple story like think of Macbeth, right?


So you want to say, “Is this a story of revenge?” You can see that in the beginning, Macbeth killed Macduff, right? How did that progress into what ultimately happened? There’s some hidden parts which are in the sentiment about how, which people are affected by certain actions. So if you can take the events that happened there and then connect them in a way that you can draw causal inferences so what things in a book fit into our societal norms of what’s appropriate action?


When would it be appropriate to take, you know, do an action as a revenge? Or if you’re reading a book in a different culture, that doesn’t happen, that’s not part of the cultural narrative. I think getting in a model that can start to answer those questions and kind of framing the analysis around characters and the events that happen between them will be really big step forward.


[0:24:54.2] TG: Is there anything in this, because one of the things that I’m learning working with Shawn is just how many times my thought, the first thought I have for a scene or a sequence is cliché, I’ve seen it a hundred times. I was just talking with another writer this weekend and he described this opening scene and I’m like, “Dude, I’ve seen that eight times in a movie in the last two years.” You know?


It’s so funny because people are starting to ask my advice because I’ve been talking to Shawn so much and I’m like, “This is not as good as the real thing.” But I’m kind of coaching him through it, and so I wonder if there is something in there too, like, would we be able to run 20 books from the genre we’re trying to write in and see what’s happening at each of these places as a way to just avoid common clichés that show up over and over?


[0:25:51.8] AR: Right, so if everyone begins with a man in a hole story type, that tends to work but it also might be something that you want to avoid because everyone’s just seen it before.


[0:26:03.7] SC: Well, it’s not exactly that, it’s the way the man gets in the hole. A man getting in the hole can be done and Tim and I talked about this in terms of stranger knocks on the door, that is a setup for a man in a hole problem, right? A stranger knocks on the door and says, “Hey, you won the lottery but the problem is that in order to collect the lottery, you have to go do this horribly difficult thing.”


So what I think is really interesting here is the concept of taking this to the next level and also, it’s not only the micro look at an individual’s story, it’s, as Tim says, sort of identifying plot-point clichés that have been worked on over and over in particular. One of the things that I was fascinated by is that I saw, it was either your research or Jockers’ research, where they did a global analysis of all of 19th century literature to find out what the cultural shift and the cultural sort of norms of that time period were. As an acquiring editor when I was at the major publishing houses, this is sort of what editors try and do. What they try and do is anticipate the next big thing.


One of the ways in which you do that, for example, in the early 1990’s, there was the rise of what you call, the legal thriller with The Firm by John Grisham and the legal thriller became a hugely successful and wonderfully profitable genre of book that to me, I thought to myself, “Why is the legal thriller becoming so popular right now? What’s going on in the culture that is making people want to fantasize about the actions in the legal justice system?” Those are the kinds of questions that I think, you having the ability to analyze 4,000 words within I’m sure it only took… how long does it take to run a story for the hedonometer?


[0:28:14.2] AR: Yeah, a few seconds.


[0:28:15.9] SC: Oh my god.


[0:28:16.2] AR: Not long.


[0:28:16.6] TG: Wow, that’s crazy.


[0:28:18.4] SC: You know how long it took me to run The Silence of the Lambs? A good two years. But wow, a few seconds? That’s funny, wow.


[0:28:27.0] AR: Yeah, we were just looking for words, it’s simple. I think doing more complex dives into it will take more time.
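
To make “just looking for words” concrete, here is a minimal sketch in Python of word-level sentiment scoring over a sliding window. It is an illustration under stated assumptions, not the Computational Story Lab’s actual code: the tiny lexicon, its scores, the window sizes, and every function name are invented for this example. A real analysis would use a large rated word list and far wider windows.

```python
import re

# Hypothetical mini-lexicon: word -> happiness score on a 1-9 scale.
# These scores are invented for illustration only.
TOY_SCORES = {
    "love": 8.4, "happy": 8.3, "win": 7.5, "friend": 7.6,
    "lost": 2.8, "kill": 1.6, "fear": 2.3, "prison": 2.0,
}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def window_score(words, scores):
    """Average the scores of the rated words; None if no word is rated."""
    rated = [scores[w] for w in words if w in scores]
    return sum(rated) / len(rated) if rated else None

def emotional_arc(text, scores=TOY_SCORES, window=6, step=3):
    """Slide a fixed-size window over the text and score each position."""
    words = tokenize(text)
    last_start = max(len(words) - window, 0)
    return [
        window_score(words[start:start + window], scores)
        for start in range(0, last_start + 1, step)
    ]

story = ("A friend helps him win and he is happy. "
         "Then he is lost in fear and sent to prison.")
arc = emotional_arc(story)
for position, score in enumerate(arc):
    print(position, score)
```

Because the method only counts rated words, it runs in a single pass over the text, which is consistent with a whole novel taking seconds. It also shows the limitation discussed above: the score knows nothing about who the words are about or what happened.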


[0:28:34.8] TG: So what do you see, because we asked a little bit about artificial intelligence, but I’m more interested in, like, what is this going to do for writers to have these kinds of tools? Now with what you have now and then over the next five and 10 years, how is this going to help writers tell better stories?


[0:28:54.6] AR: That’s almost a question that I would want to ask you. We’ve been able to get these things out and I want to know, how would you use this in your work? You could take a story that you have and run it through and see how it compares to, like you said, other things in your genre. You talked about kind of these cliché ways to get into different story patterns. You can see, “Okay, I want the story pattern but I want to get into it in a different way than what’s done before.” I don’t know if you guys use the word tropes, but this is kind of just another word used to frame the common elements that make those twists.


So yeah, I think that you could use it to compare your work to what’s coming out now and what’s popular as well. I think this fits into what Jockers is looking at now. So he is looking at bestsellers in the New York Times and saying, “Okay, what makes these books different than other books in terms of their emotional arc?”


[0:29:48.6] SC: Yeah, things that come to mind for me would be like, one is if you’ve run all of the thousands of books through this, could I separate them into different genres where I want to see the 20 legal thrillers that you’ve run through and then I could see those patterns laid out on top of each other, and what would be really cool is to like dive into the five scenes that are around that peak in each book.


Because right now, the idea of the work that goes into, if I want to write in a certain genre and I want to make sure I’m not hitting all the clichés, I’ve got to go back and pull out the 10 books I’ve read, 10 more books I haven’t read, read them all from start to finish and really I should be mapping them out, scene by scene so I can kind of build my own arc to make sure that I’m following the same path but not doing it in the same way and not doing it in a cliché way. It would be really cool to, at a couple of clicks of a button, to be able to dive in and compare 10 or 20 books at the same time.
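
The comparison described here, laying the arcs of several books on top of each other and finding where each one peaks, can be sketched roughly as follows. This is a hedged Python illustration, not an existing tool: it assumes each book has already been reduced to a list of window scores, and the function names and sample numbers are made up. Books differ in length, so each arc is first stretched onto the same number of points.

```python
def resample(arc, points=100):
    """Linearly interpolate an arc onto `points` evenly spaced positions.

    Assumes `points` is at least 2 and `arc` is a non-empty list of numbers.
    """
    if len(arc) == 1:
        return [arc[0]] * points
    out = []
    for i in range(points):
        pos = i * (len(arc) - 1) / (points - 1)  # position in the original arc
        lo = int(pos)
        hi = min(lo + 1, len(arc) - 1)
        frac = pos - lo
        out.append(arc[lo] * (1 - frac) + arc[hi] * frac)
    return out

def mean_arc(arcs, points=100):
    """Average several arcs point by point after resampling them."""
    resampled = [resample(a, points) for a in arcs]
    return [sum(column) / len(column) for column in zip(*resampled)]

def extreme_positions(arc):
    """Return (peak, trough) as fractions of the book, from 0.0 to 1.0.

    Assumes the arc has at least two points.
    """
    last = len(arc) - 1
    peak = max(range(len(arc)), key=arc.__getitem__)
    trough = min(range(len(arc)), key=arc.__getitem__)
    return peak / last, trough / last

# Invented window scores for two hypothetical books of different lengths.
book_a = [5, 6, 7, 3, 6]
book_b = [5, 5.5, 6, 6.5, 7, 4, 3, 5, 6]

genre_arc = mean_arc([book_a, book_b], points=5)
peak, trough = extreme_positions(genre_arc)
print(genre_arc, peak, trough)
```

Multiplying a peak or trough fraction by a manuscript’s word count gives an approximate word position, which is the kind of pointer a writer would need to go find the scenes around that moment.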


[0:30:55.5] AR: Yeah, I think you could do that, and then on top of that some light annotation will help you kind of understand, “Okay, well these are words people are using, this is what is going on.” I think you said that there’s these different characteristic patterns, and talking about genre, I think what would be interesting is, you know, legal thrillers became a big deal. Are they following just the same patterns that other stories were before them but just with a new genre? So what is it that makes a legal thriller that genre? What would be the most predictive feature? Maybe the arcs are the same but it’s just the characters and the language that are different because of the setting.


[0:31:30.4] SC: That’s absolutely true. Sort of the way I delineate it Andy is that you have external genre and you have an internal genre. The six story arcs track the global sort of emotional valence of the reading experience and usually what that extrapolates to is the movement of the lead character, or in the case of a multi character story, the movement of the group of characters.


[0:31:59.7] AR: Right, and that’s a core assumption of what we’re doing. We’re assuming that the language in the book is mostly about the main characters.


[0:32:04.3] SC: Exactly, exactly. So if you were to plug in, say, The Firm, Presumed Innocent, can’t think of any other legal thrillers off the top of my — Robert Tanenbaum’s work, and you would have come up with the top five, one of the other things that I talk about are conventions and obligatory scenes and they sort of are similar to what you’re talking about when you say “tropes.”


To be able to see the emotional movements of those four, five, or 10 or 20 legal thrillers is going to give you the sensibility of, “Oh that’s when I really,” — I think they would all sort of map out quite similarly, and you would be able to pinpoint, knowing your genre’s conventions and obligatory scenes in a way that I always recommend people to do. You would be able to pinpoint that moment: “Oh wait, the 35,000th word in The Firm is very close to the 36,000th word,” and as Tim said, the hardworking practical writer today would be able to go to their manuscript, find this scene and say, “There’s the all is lost moment. That is the way John Grisham solved that all is lost moment for my lead character, by using an external circumstance like wife finds out husband’s had a secret affair.”


That is a way of using your research to track the emotional arcs of the stories that you are most, I mean if we were to open up some kind of story grid software and we had the hedonometer as a tool, we would recommend, “Hey, first thing you want to do is pop your top 10 of your genre into the hedonometer, find out their emotional arcs and then go scene by scene and discover what’s in common, what’s not in common, what ways can I innovate? If somebody has used the same kind of scene as a cliché to propel the plot forward, how can I change it?” That kind of thing.


[0:34:19.7] TG: Now I was going to ask, is there — if I did have a book I want to run through it, how would I even get it in there because there’s DRM on my Kindle titles and that kind of thing. Is there a way right now to get a book, like take any book and run it through?


[0:34:37.5] AR: Yes, we use Gutenberg, Project Gutenberg, which is all post copyright. There’s no DRM that we have to break to analyze it. You can analyze the books and look at them yourself, but if you break the DRM to analyze them, and of course you can do this with software like calibre, you don’t want to share that, right? You’re using it for your internal use, right? That’s not going to be something that I would give you.


If you want to send your story on to me in just the full text form, I could run it through and see what the arcs are so far. But yeah, I think you’re going to need the full text to do that. If you own the book, you can use it for your own internal use. It does get into some HathiTrust copyright concerns if you’re using this work to share.


[0:35:23.8] SC: That makes sense. Well Tim, I think it might be a fun project, knock on wood and however many months to throw yours through there and see what it looks like, you know? See if all the advice that I’m giving you is actually worthwhile or if I’m just talking out of the side of my…


[0:35:45.1] TG: It’s interesting to see, I’m trying to think — so Hugh Howey wrote this short story one time, I think it was called The Plagiarist, and basically, they had created all of these artificial intelligence worlds where all these people were living in this AI world, didn’t know they were an AI, like The Matrix or whatever. What people would do is they would mine these artificial intelligence places to find books and works of art that they would then republish in their own world and make a bunch of money on.


Because it was legal to plagiarize computers. Then of course, well, I won’t ruin the ending, but it’s always been kind of stuck in my head, this idea that you could create this AI that’s creating these works of art that you could then steal and use yourself. But I just think of these things as the first step towards doing these things that have been manual for so long that we then can use a computer to go faster on. We’ve even had this discussion; of course using a word processor is much faster than writing out your book longhand.


But at the same time we’ve talked about how Malcolm Gladwell, when he does an interview, he’ll transcribe it himself and then print it off and then go through it by hand and do all of this stuff by hand that he could do automatically or pay somebody $20 to do, he does it himself. It’s just interesting, as these new tools come online, what pieces are going to help us write better but faster and which ones, if we try to use them, will actually pull out the hard work that it takes to create an actual great piece of writing?


[0:37:38.6] AR: I think that what you touched on is really important because what is it that really makes a story compelling, right? There’s the arc and there is having characters that people are drawn to. I’m not the expert here. I think, knowing what those pieces are and being able to extract those to create a new story would be the most important part for automatically generating stories. So I would say, “Okay, here is 30 John Grisham novels, can I just make another one? Based on the way he uses language and what else is going to be important?”


It’s easy right now to train like a recurrent neural network, one of these language models, to say, “Okay, here’s a bunch of text, give me a bunch of text that looks exactly like it,” and it would do that. The story would make no sense; the sentences wouldn’t even be related.
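
The “give me text that looks like this text” idea can be illustrated with something far simpler than a neural network. The Python sketch below swaps in a word-level bigram (Markov chain) generator as a stand-in; it is not the kind of model mentioned above, and the sample sentence and function names are invented. It picks each next word only from words that followed the current word in the source, so the output is locally plausible and globally meaningless.

```python
import random
from collections import defaultdict

def build_bigram_model(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    model = defaultdict(list)
    for current, following in zip(words, words[1:]):
        model[current].append(following)
    return model

def generate(model, start, length=12, seed=0):
    """Walk the model from `start`, choosing each next word at random."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    output = [start]
    while len(output) < length:
        choices = model.get(output[-1])
        if not choices:  # dead end: this word never had a successor
            break
        output.append(rng.choice(choices))
    return " ".join(output)

# Invented sample text; a real experiment would use whole novels.
sample = "the man fell in the hole and the man climbed out of the hole"
model = build_bigram_model(sample)
print(generate(model, "the", length=8, seed=1))
```

Nothing in this generator tracks characters, events, or an arc, which is why text produced this way reads like the original author’s vocabulary without telling a story.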


[0:38:25.8] TG: It must be like a Trump speech.


[0:38:32.0] AR: They call it a word salad, yeah.


[0:38:36.7] SC: It’s kind of funny that you brought that up Tim, but we have to remember that storytelling is the way we change people’s minds, right? Say you’re the CEO of a major manufacturing company and you need to sell a product. I could foresee somebody who is writing the speeches for somebody like that using the hedonometer, sounds like The Terminator now, using the machine to track the emotional valences from that speech so that they can see whether or not they’re hitting like some of our great public speakers. Like Tony Robbins, there’s a guy who is using these emotional arcs, these six primal storytelling arcs, in a way that motivates people to change their lives.


What I’m thinking of as another fantastical use of this machine would be for businesses, right? Sentiment analysis itself is not just confined to looking at novels. There’s big businesses that are going through the World Wide Web and blog pages and giving reports to large corporations about how their particular brand is trending, right?


I would suspect that there are, right? Yeah, I mean, that is how we’re convincing people to buy things, and so the scary sort of big brother-esque element of this thing is using a machine to manipulate the consumer base. We laugh about the old subliminal seduction of the 1950s advertising age, but this is literally manipulating people to purchase products, services, et cetera. It’s almost in a way brainwashing by using — I mean there’s a reason why Aristotle, or was it Plato, who warned:


The first thing you want to do if you want to control society is kill all the storytellers because they’re the ones who can change people’s minds in a way that can change the power structure. This is kind of fun to talk about but also everything has a shadow element to it too. Have you guys thought about that in any way or is it, I suspect it’s more fun to just sort of dive into the science and let these sort of esoteric ethical considerations sort themselves out.


[0:41:18.2] AR: Yeah, we always tend to err on the side of doing the science and publishing openly. That’s what we’ve done so far and it’s been a lot of fun. You mentioned kind of the propaganda side of this, and there was a DARPA grant a few years ago called Narrative Networks, and they wanted to understand exactly this. Narratives are useful for influencing people, and how can we measure those? I think they have people hooked up in an MRI machine watching movies but I’m not exactly sure.


You also mentioned marketers, right? In business. These are the great storytellers, they understand what drives people, and the other interesting thing is you mentioned Trump, right? The marketing and the storytelling come together; his slogan is “Make America Great Again.” We’re talking about the story arcs, there’s a story arc even in that. So my advisor Peter was the one who elaborated this one for me. There’s the whole man in the hole story right in there, right?


And there’s kind of a notion of actions. “Make America great again” is saying something about the past, that things were great; saying something about the present, that things aren’t great; and it says something about the future: they’re going to get better again. This is a classic man in the hole story that’s encompassed in just four words. As far as the story goes, it’s really well done.


[0:42:34.2] SC: I would absolutely agree with that and it feeds into an emotional element. I talk about the beginning hook, the middle build and the ending payoff of a particular story, and the fact that you just delineated that exact same thing in this political slogan, see — the way I would err would be to educate people, right? To educate them into the forms. If more people understood the structural form of storytelling, the better prepared they would be not to become susceptible to propaganda.


Somebody who knows that somebody’s a fraud, they can recognize each other, a fraud recognizes another fraud. If you’re capable of understanding the story principles and saying, “That’s an interesting slogan, make America great again, what does that exactly mean? What is he trying to sell me using that language?” And I think we all walk through our lives in so many hazes and digital hazes that we stop. Now everybody’s talking about the power of video and how video is really going to be the next iteration of the World Wide Web and how language and everything is really taking a back seat.


I think that’s ridiculous because language is where it begins, language is where the imagery comes from. So to understand storytelling and understanding that all story begins with language and with internal processes and understanding the arcs, the more people will understand these six arcs that are very simple and primal, the better they’ll be able to recognize them when they’ve been given a load of bullshit, you know?


Because when somebody’s manipulating you and you can sense it but you’re not quite sure, I know the way I can pinpoint it is I know the story they’re telling. They’re mimicking a story that they know about me that they’ve read from some source online in order to get me to buy something. It’s like a fortune teller: you go into a fortune teller, they look at your suit, they look at your fingernails, they look at your haircut, and they’re brilliant storytellers and they can look at you and say, “Oh, I can see that you’ve lost somebody dear in your life recently.”


You go, “Oh my god, how did they know that?” That is how propaganda and everybody work. I think your work is really important because it educates people to these primal forces that we really need to understand, and the storytelling arcs are all about that.


[0:45:17.4] AR: Right, I agree, yeah.


[0:45:19.6] TG: Andy, tell us a little bit about what’s in the near future with your work and where people can go to make sure they’re getting updates on everything that you’re doing?


[0:45:29.7] AR: Yeah, so we have a website, which I think I mentioned in the beginning, and if you go there, you’ll see the Twitter time series, and if you poke around on the top part, you can make it down to look at the time series for books. We also have some movie scripts up there. We’re working on developing a tool to allow people to put in their own works, and maybe you could put in, on your own, some books that you’re trying to be in the same genre as and see where you line up.


We’re working on building out a desktop and a web version of that tool. We have a link for an email list out there, you can sign up for that. That we’ll send out once we have something working.


[0:46:08.1] TG: Okay, I’ll put all of that in the show notes as well so people can link straight from there at our website. Well this is great, thanks for coming on and talking about this. Shawn, did you have anything you wanted to ask before we wrapped up?


[0:46:20.9] SC: No, I don’t think so. I think again, Andy, I can’t thank you enough for being here, I’m really excited about the work you’re doing and I think it’s going to be very applicable for all amateur and professional writers from here on out.


[0:46:32.4] AR: Yeah, thanks a lot, and I’ll happily take the advice from you guys on how to do that. Tim, I’m sorry if I took away from time that you guys are working on your story as well.




[0:46:42.7] TG: Thanks for listening to this episode of the Story Grid Podcast. To find the links to the Hedonometer and everything we discussed in this episode, you can find that at Also at that URL is all our past episodes and all the past show notes. If you need to refer back to anything, it’s all there. If you want to keep up with everything story grid related, you can do that by going to and signing up for the email newsletter.


If you want to reach out to us, provide us feedback, ask questions for future episodes, you can do all of that on Twitter @storygrid and, as always, if you could leave us a rating or a review on iTunes, that would be fantastic. So thanks for listening and we will see you next week.

7 comments on “Narrative Machines”

  1. mlibdoyle says:

    I remember enough from my graduate research methods class to be impressed with what Andy and the team are doing. I was glad to hear the conversation come around to the potential applicability of story analysis for writers, particularly conventions and obligatory scenes. While there is no substitute for taking the time to analyze one’s genre, it would be great to have evidence-based confirmation that one’s analysis is on target. So…you guys create a software program for that — a marriage of science and art — and I’ll be there, putting my money on the table! As always, thanks!

  2. I still find it weird that Grisham’s best work (imho) is his first book.
    When I think of reading him, I think: A Time to Kill, The Firm, and perhaps The Rainmaker as good books.
    His next best work, again, imho, is A Painted House.
    Then, when he wrote The Innocent Man, I thought, awesome, he’s done grinding out the same boring crap over and over and he’s going to actually try to make the world a better place….
    But no, I guess the lure of easy money… Not that he needs anymore.

    I guess my point is that we can come up with formulas, but the real magic of something insanely beautiful (even if it’s dark) will probably elude computing and analysis for a long time. Maybe forever.

    I don’t know… A computer can be taught to think, I imagine a computer will become sentient at some point (god like) but a computer will never have hormones and emotions. Without love, hate, a desire to marry, murder, or seek revenge, or fear, or lust, or desire, what do you have?

    I just read Blake Snyder’s Save the Cat yesterday, and I liked what he said about “primal.”

    That’s the essence, I think, to good story, and maybe something beyond a computer’s capacity, the primal drives we have as humans (which are flesh and bones and hormonally based).

    Love and hate.

    A computer can find these words, sure, but can a computer understand what they mean? Impossible, at least until a computer can get a body (which, heck, might happen, Heinlein had self aware computer beings that were downloaded into cloned human bodies in his stories).

    Well, that’s my two cents.

    One of the things I’m thinking these days: unless you’ve got access to a real professional (of which there is a short supply) any feedback you get that is not based upon factual errors & continuity & grammar, is, at best, risky to even listen to.

    I don’t know…

    >bangs head against wall….

    Maybe someone will clone Shawn.

    1. Tina M Goodman says:

      A Time to Kill was great! He did set out to write bestsellers. I didn’t read Painted House or Inno. Man. Maybe I should.

    2. Tina M Goodman says:

      I don’t have a problem with makhines telling stories. They would most likely get better with input. It would be handy. They kould teakh preskool and entertain our khildren, basikly take over our future generations… (Sorry, my letter see doesn’t work on this board.)

  3. NewspaperMan says:

    Dear Mr.Coyne! Thanks for sharing the podcast. If you gentlemen could have just let the PhD kid (AR) have more airspace and let him explain more about the project. Out of this 8000 odd words transcript – the kid gets to speak about 3400 words only. 🙂 🙂 🙂

  4. Ruth Nolan says:

    The more I study science the more I believe in God. – Einstein –

  5. Tim says:

    Word salad – insult to lettuce :))))

    The meter needs to also be selectable for the parameter to measure – such that ALL of The StoryGrid ‘values’ in there – such that edit process or editors can focus on the qualitative with the quantitative metrics are done for them … …

    Great show – missed a couple before this – so will catch up
