The Martian - Tracking its Progress into Best-seller Orbit

By now, most people will be familiar with the movie “The Martian”, starring Matt Damon and directed by Ridley Scott. Many people, especially those reading a blog connected to Indie publishing, will also be familiar with the story of Andy Weir and how he came to write this blockbuster novel:
·         He is a computer professional who always had an interest in science and science fiction.
·         He first published the book in serial form on his blog.  As it became more popular, he eventually self-published it on Amazon, if only to cut down on demands on his time created by the story’s blog format.
·         That took off, and eventually a Trad publisher took him on and "really" published the book. 
·         Then, Hollywood liked the story, and bought the movie rights.  NASA also became involved, at least peripherally, as it fit into their plans to maintain interest in human exploration of the solar system.
·         Ridley Scott did a great job with the movie, which has now become a major hit, and Andy Weir is rolling in dough (I hope).

It so happens that I have been tracking the “Amazon Top 100” books of the last couple of years. The Martian was in that list in 2014 and has burned up the charts since then. See the chart below for details.

The Martian was named #53 by Amazon in its list of the top books published in 2014. That’s a sort of average for the year, and it only includes books that were published in 2014. So, it wasn’t 53rd throughout the year in terms of its rank in the entire Amazon book population. But, from research that I have done, that placing probably indicates that it averaged a rank of around 100 or so during the year (it was published in February of 2014, according to the Amazon metadata). Throughout the first half of 2015 it remained in the 100 to 150 range, then shot into the top 25 and eventually the top 10. In October, it was in 3rd place overall.
The second chart focuses on 2015. It is also somewhat more accurate than the first bar chart: my tracking was only approximately monthly, so evenly spaced bars distort the timing a little. The second chart represents the time factor more accurately, as its scale is linear in time.

I have included some text boxes, showing some of the major PR campaign events (as outlined on the wiki page for The Martian), and how the sales ranking correlated with them, namely:
·         Early June had a viral marketing campaign launched by 20th Century Fox. As you can see, the book’s ranking shot up at about that time.
·         In early August, NASA and science popularizer and astrophysicist Neil deGrasse Tyson had special events or broadcasts related to The Martian. Sales also seemed to respond to those events.
·         Lastly, the movie came out in early October and the book hit #3 at about that time. I should note that I haven’t seen the movie yet, but I have read the book. A work colleague reports that the movie is excellent, though, and her judgement is beyond reproach :)
·         Here’s a wiki link for all that:
Next, we can look at the trend in the number of reviews that the book received, a metric much watched by writers and publishers.  As you can see, the cumulative number of reviews has increased through the year, and has accelerated as the year has gone by. That’s pretty clear from visual inspection and reinforced by the quadratic term in the “best fit” polynomial that Excel drew for us.

Again, a second version of the graph does a better job with the time trend, as its axis takes the gap in days between recordings of the number of reviews into account. As you can see, the fit between the actual data and the quadratic “best fit” line is very close. That always makes a data analyst’s heart jump with a tiny jolt of pleasure :).
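For readers who want to try this kind of fit outside a spreadsheet, here is a minimal sketch of a quadratic least-squares fit where the x-axis is the actual number of days since the first recording, so unevenly spaced observations are handled properly. The dates and review counts are made-up placeholder values (deliberately generated from a known quadratic so the result is easy to check), not my tracking data, and the function names are my own invention.

```python
from datetime import date

def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c)."""
    # Normal equations (A^T A) p = A^T y, with columns [x^2, x, 1].
    rows = [[x * x, x, 1.0] for x in xs]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    # Solve the 3x3 system by Gaussian elimination with partial pivoting.
    m = [ata[i] + [aty[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= factor * m[col][k]
    # Back substitution.
    p = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(m[i][j] * p[j] for j in range(i + 1, 3))
        p[i] = (m[i][3] - tail) / m[i][i]
    return tuple(p)

# PLACEHOLDER observations (date, cumulative review count), unevenly spaced.
# These are illustrative values only, not real data for The Martian.
observations = [
    (date(2015, 1, 1), 5000),
    (date(2015, 1, 31), 5345),
    (date(2015, 3, 12), 5945),
    (date(2015, 4, 11), 6500),
    (date(2015, 5, 31), 7625),
    (date(2015, 7, 20), 9000),
    (date(2015, 9, 28), 11345),
]

start = observations[0][0]
xs = [(d - start).days for d, _ in observations]  # days since first recording
ys = [count for _, count in observations]

a, b, c = fit_quadratic(xs, ys)
print(f"reviews ~ {a:.4f}*days^2 + {b:.3f}*days + {c:.1f}")
```

A positive coefficient on the squared term is what "accelerating" means here: the number of new reviews per day is itself growing over time.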

We can also try to impute sales from this data.  Those charts are below.

The first graph shows sales per day, estimated on the days that the sales ranks were recorded. The sales are estimated from a power law that relates Amazon ranks to Amazon sales, developed from data supplied by the data analyst and writer who goes by the nom de plume “Data Guy” on the internet. He is also affiliated with noted Indie writer Hugh Howey; both are widely considered to be very astute data analysts and writers. His table represents the combined efforts of a community of writers who correlated their ranks and sales, so it is a crowdsourced result. Experience and research have shown that the results of such endeavors are usually quite accurate.
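To make the power-law idea concrete, here is a minimal sketch that fits sales = a * rank^(-b) to a rank-versus-sales table by ordinary linear regression in log-log space, then uses the fitted curve to turn a rank into a sales-per-day estimate. The calibration points are placeholders I made up so the arithmetic is easy to follow; they are not Data Guy’s table, and the fitted constants should not be read as real Amazon figures.

```python
import math

def fit_power_law(points):
    """Fit sales = a * rank**(-b) from (rank, sales_per_day) pairs.

    Taking logs gives log(sales) = log(a) - b*log(rank), a straight line,
    so an ordinary least-squares line fit recovers a and b.
    """
    xs = [math.log(rank) for rank, _ in points]
    ys = [math.log(sales) for _, sales in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

def sales_per_day(rank, a, b):
    """Estimated daily sales for a given overall sales rank."""
    return a * rank ** (-b)

# PLACEHOLDER calibration table: (sales rank, copies sold per day).
# Illustrative values only, not the crowdsourced table used in this post.
calibration = [(10, 5000.0), (100, 500.0), (1000, 50.0), (10000, 5.0)]

a, b = fit_power_law(calibration)
for rank in (3, 25, 100, 150):
    print(rank, round(sales_per_day(rank, a, b)))
```

The practical consequence of a power law is that moving from rank 100 to rank 10 matters far more, in copies sold, than moving from rank 1,000 to rank 910.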

The second graph gives the book’s cumulative Amazon sales. For each period between two recordings, I averaged the estimated daily sales at the start and end of the period and multiplied by the number of days in that period, then summed the periods. As you can see, that smoothed the graph out nicely, which is generally a sign that your imputed numbers correspond fairly well with the underlying reality. Bear in mind, this is just Amazon sales; the book is also available in bricks-and-mortar book stores, so overall sales are undoubtedly much higher, given its best-seller status.
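That averaging scheme is the trapezoid rule. A minimal sketch, with made-up day offsets and daily sales estimates standing in for the real ones (the function name is hypothetical):

```python
def cumulative_sales(days, daily_sales):
    """Running total of sales using the trapezoid rule.

    days        -- day offsets of each recording (increasing)
    daily_sales -- estimated sales per day at each recording
    Returns a list of cumulative totals, one per recording, starting at 0.
    """
    totals = [0.0]
    for i in range(1, len(days)):
        span = days[i] - days[i - 1]                        # days in this period
        average = (daily_sales[i] + daily_sales[i - 1]) / 2  # mean of the two ends
        totals.append(totals[-1] + average * span)
    return totals

# PLACEHOLDER inputs, for illustration only.
days = [0, 30, 60, 100]
daily = [100, 200, 200, 400]

print(cumulative_sales(days, daily))
```

With these inputs the three periods contribute 150 x 30, 200 x 30 and 300 x 40 copies, so the running totals are 0, 4,500, 10,500 and 22,500.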

We can also attempt to estimate sales from the number of reviews that the book has received. Other crowdsourced efforts have estimated that something around one percent of books sold on Amazon are reviewed, though that can vary from half a percent or less to several percent. It seems to depend on things such as genre, ranking, writer fan base and so forth.
The graph below makes the assumption that 1.5% of all copies of the book sold were reviewed. This matches the Data Guy power-law results quite nicely in the latter part of the year, but overestimates sales in the earlier part of the year. But, all things considered, the agreement is pretty good, in my opinion. I suspect that as the book gained in popularity, its tendency to be reviewed changed. When it was still something of a cult hit among SF fans, it probably had a higher reviews-to-sales ratio; after all, SF fans are a literate, voluble bunch who like to share their thoughts, knowledge and opinions. Once the book hit a more mainstream audience, it probably had a lower reviews-to-sales ratio; best-seller readers are probably somewhat less likely to write reviews.
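The review-based estimate is simple arithmetic: divide the number of new reviews in a period by the assumed review rate to get copies sold, then divide by the number of days to get sales per day. Here is a minimal sketch under that assumption. The 1.5% default comes from the text above; the review counts are placeholders, not my recorded numbers, and the function name is hypothetical.

```python
def daily_sales_from_reviews(days, reviews, review_rate=0.015):
    """Estimate average sales per day in each period from cumulative reviews.

    days        -- day offsets of each recording (increasing)
    reviews     -- cumulative review count at each recording
    review_rate -- assumed fraction of copies sold that get reviewed
    """
    estimates = []
    for i in range(1, len(days)):
        new_reviews = reviews[i] - reviews[i - 1]
        copies_sold = new_reviews / review_rate   # implied sales in the period
        estimates.append(copies_sold / (days[i] - days[i - 1]))
    return estimates

# PLACEHOLDER inputs, for illustration only.
days = [0, 30, 60]
reviews = [3000, 3450, 4350]

for rate in (0.005, 0.015, 0.03):
    rounded = [round(x) for x in daily_sales_from_reviews(days, reviews, rate)]
    print(f"review rate {rate:.1%}: {rounded} copies/day")
```

Running it across a range of rates shows how sensitive the estimate is: halving the assumed review rate doubles the implied sales, which is why a shift in reviewing habits between SF fans and mainstream readers would show up as a mismatch with the rank-based numbers.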

One other interesting effect in this graph is the bump in sales per day estimated from reviews in the early February data.  One wonders if this is based on reviews of copies of the book that were bought for Christmas, or purchased after a Kindle was gifted to the reviewer for Christmas.  That’s why I have labelled this “Christmas Kindle Effect?” in the text box on the graph.

As a matter of interest, I bought the old 1960s movie “Robinson Crusoe on Mars” to compare the stories. That movie is a lot more thoughtful and scientifically accurate (based on what was known at the time) than the title would indicate. Time willing, I will do a compare-and-contrast blog post on that soon.

