It can be tricky to identify what will be popular in the years to come, but that doesn’t stop big data from trying. Cultural anomalies are particularly difficult to forecast, yet one algorithm is attempting to predict which modern novel will become the next big bestseller. Jodie Archer, author of an upcoming book titled The Bestseller Code: Anatomy of the Blockbuster Novel, takes a critical look at what makes something popular among consumers, particularly in literature.

A computer algorithm, affectionately referred to as the “bestseller-ometer,” examines a huge amount of literature for the qualities that make bestselling fiction. According to The Atlantic, the algorithm is capable of identifying a bestseller upwards of 80 percent of the time. That figure comes from running it against a list of novels from the past 30 years and checking whether it picks out the New York Times bestsellers. This is one of the ways in which data-driven initiatives are attempting to better understand how people respond to ideas, and it could change the way that publishers identify potential bestsellers.
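To make a figure like “80 percent of the time” concrete, here is a minimal Python sketch of how such a hit rate could be computed. This is not the authors’ code: the function name `hit_rate` and the ten labels below are made up purely for illustration, and the real evaluation may be scored differently.

```python
def hit_rate(predictions, actual):
    """Return the fraction of novels whose predicted label matches reality."""
    if len(predictions) != len(actual):
        raise ValueError("predictions and actual must be the same length")
    if not predictions:
        raise ValueError("need at least one novel to score")
    correct = sum(p == a for p, a in zip(predictions, actual))
    return correct / len(predictions)


# Invented labels for ten hypothetical novels (True = bestseller).
predicted = [True, True, False, False, True, False, True, False, True, False]
actual = [True, False, False, False, True, False, True, True, True, False]

rate = hit_rate(predicted, actual)
print(f"Correct on {rate:.0%} of the novels")
```

In this toy data the algorithm misses twice: once by calling a flop a hit, and once by overlooking a real bestseller. A single percentage hides which kind of mistake is more common, which matters to a publisher deciding what to print.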

Like most good ideas, this concept was born from a question that needed to be answered: “Why do we all read the same book?” It’s a valid question to ask, as people value different traits in literature. The same folks who like to read literary fiction may find a guilty pleasure in young adult novellas. A book could be savaged by critics yet be wildly popular among the masses, as we’ve seen with several novels based on vampires.

Archer, along with English professor Matthew L. Jockers, built the algorithm with the intention of discovering what makes readers flock to a particular story. The Bestseller Code documents how they taught an algorithm to closely analyze a novel’s text for the key factors that make fiction popular. These generally include themes, allusions, word choice, and other literary features. Among the most common traits found within bestsellers are:

  • Authoritative voice
  • Colloquial (everyday) language
  • Action-oriented characters
  • Cohesion
  • Intimacy
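As a rough illustration of how a trait like colloquial language might be turned into a number, here is a minimal Python sketch that measures how often a passage uses contractions. This is a hypothetical stand-in chosen for simplicity, not the method described in The Bestseller Code; the function name `contraction_rate` and the sample sentences are assumptions.

```python
import re

# Hypothetical proxy for "colloquial language": the share of words that are
# contractions. Note that 's also matches possessives, so this is only rough.
CONTRACTION = re.compile(r"\b\w+'(?:s|t|re|ve|ll|d|m)\b", re.IGNORECASE)
WORD = re.compile(r"\b\w+(?:'\w+)?\b")


def contraction_rate(text):
    """Return contractions divided by total words (0.0 for empty text)."""
    text = text.replace("\u2019", "'")  # treat curly apostrophes as straight
    words = WORD.findall(text)
    if not words:
        return 0.0
    return len(CONTRACTION.findall(text)) / len(words)


casual = "I don't think it's over."
formal = "I do not think it is over."
print(contraction_rate(casual), contraction_rate(formal))
```

A real system would presumably combine many signals like this one across thousands of novels, and a single surface measure says nothing about harder-to-pin-down traits such as intimacy or cohesion.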

Another major theme the book touches on is the zeitgeist, which can be defined as the ideals and beliefs of a particular time. In other words, what’s contemporary and popular with the public plays a major role in how successful any novel is. That makes future bestsellers hard to predict, since it’s difficult to guess where society will stand years from now. Plus, it would make sense for a human, who can understand and interpret emotion, to be the one to decide whether or not a book is worth labeling as a bestseller. After all, the computer isn’t the one reading the book; hundreds of thousands of people around the world are.

Whether big data has actually found a recipe for the next bestseller hasn’t been said, so it’s worth considering that the entire concept of predicting human behavior with big data could turn out to be irrelevant. After all, humans can be difficult to predict, as they often act in ways that defy reason or logic. While technology has provided great ways to improve operations and home in on a consumer base, it’s still crucial to remember that people are people.

What are your thoughts on big data? Let us know in the comments.