One Year with Harper Lee’s To Kill a Mockingbird and Go Set a Watchman, Part 2
As most people must have heard by now, Harper Lee died
a little while ago (Feb 19, 2016). Earlier,
I published a blog following her Amazon sales rank and imputed sales over the
past year (Feb 2015 to Feb 2016), noting how sales corresponded to some key
events over the year. This companion
blog looks at how the number of Amazon reviews corresponded to those key
events. It also performs some analysis
on the relationship between the Sales Rank and the number of reviews, for these
two books, To Kill a Mockingbird and Go Set a Watchman.
As a reminder of that blog, and for context, the graph
below shows how the sales rank of To Kill a Mockingbird (TKAM) and Go Set a
Watchman (GSAW) varied over the time span from early February 2015 to late
February 2016, a period of a little over a year.
1 – Number of Amazon Reviews and Key Events over the Year
The graph below shows the total number of reviews
recorded on the Amazon site, by date for the period from early Feb 2015 to
early Feb 2016. The same key events are
outlined, as was done for the sales rank graph.
The first key event was the announcement that a new
book by Harper Lee was in the works, in early February 2015. As you can see, the slope of the curve for
TKAM reviews increased (the blue line gets steeper), when that announcement was
made, indicating that interest was piqued, as reflected by people’s propensity
to leave a review.
The next key event was the pre-release of GSAW in late
May 2015, followed by publication in early July 2015. The pre-release of GSAW didn’t do much, if
anything, for the review numbers of TKAM.
However, they did pick up with the release of the new book (again, the
slope of the blue line increases).
Naturally, once GSAW was released, the number of reviews shot up very
quickly, along with sales, of course.
The rapid increase in reviews would seem to indicate that there was a
lot of latent interest in the new book.
Reviews for GSAW began trailing off at about the
beginning of October, as indicated by the diminishing slope of the red
line. An inflection point happened
sometime in October, with the line bending back down. The slope of the blue TKAM line also
diminished about this time, though the effect is rather slight.
The next major event happened in December, when GSAW
won its category in the Goodreads Book of the Year (2015) rankings. That, and Christmas, seems to have turned the
line back upwards, with an inflection point some time in January. The pace of reviews for TKAM didn’t appear to
change much, if at all.
Then, of course, we come to Ms. Lee’s death. A funny
thing happens almost immediately - Amazon takes away about 1700 reviews,
overnight, on Feb 21, 2016. That’s why
the blue line takes a sudden plunge, a discontinuity.
One wonders just what happened here. Many Amazon authors have had the experience
of having reviews taken away by Amazon, especially we Indies with modest
sales. The explanation for this is
generally that the reviewer had some kind of family or commercial relationship
with the writer or the publisher.
Presumably the same thing must be at work here. Since Harper Lee probably didn’t have 1700
“bogus” reviews from her family and friends, it is natural to assume that this
must relate to the publisher. Had the
publisher salted in all these reviews?
Or is some other explanation at work?
I suppose that we will never know.
Anyway, after that the TKAM line resumes, and the rate
of reviews seems to increase modestly.
GSAW, on the other hand, doesn’t seem to be much affected by the
writer’s death.
In the last blog, I noted that death did seem to be a
good career move, in terms of sales. But
the effect was not long lasting. Both
books are now in the 300-400 rank range.
It probably won’t be long before they reach their baseline level,
somewhere in the 800 to 1000 rank range.
The graph below gives a day by day count of the number
of reviews for each book, rather than a running total, along with the key
events during the year. It can also be
correlated with the comments in the text above.
This format makes some things clearer, but others more obscure (hidden
by the day to day noise of the time series).
By the way, I cut off the data before the big TKAM recalculation of
reviews, as it distracted from the other aspects of the graph, given the scale
of that one day change.
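For anyone who wants to reproduce the day by day graph, here is a minimal sketch of how a running total can be turned into daily counts, and how the series can be cut off before a large one-day drop. The function names, the `max_drop` threshold and the sample numbers are all made up for illustration; they are not the actual review data.

```python
def daily_counts(running_totals):
    """Convert a running total of reviews into day-by-day counts."""
    return [later - earlier
            for earlier, later in zip(running_totals, running_totals[1:])]


def cut_before_drop(running_totals, max_drop=100):
    """Keep only the days before the first one-day drop larger than max_drop."""
    for i in range(1, len(running_totals)):
        if running_totals[i - 1] - running_totals[i] > max_drop:
            return list(running_totals[:i])
    return list(running_totals)


# Hypothetical running totals for six consecutive days (not the real data).
totals = [5000, 5012, 5030, 5041, 3341, 3350]

print(daily_counts(totals))                   # [12, 18, 11, -1700, 9]
print(daily_counts(cut_before_drop(totals)))  # [12, 18, 11]
```

A single large negative value like the -1700 above would dominate the vertical scale of a daily graph, which is why trimming the series first keeps the rest of the pattern readable.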
2 – Sales Rank versus Number of Reviews
As a data analyst, I am always interested in exploring
relationships among variables. In this
case, I will look at just how sales rank and number of reviews were related,
for these two books during the time period in question.
The first graph shows the average sales rank during a
given ten day period for TKAM, versus the number of reviews that the book received
during that same ten day period. As you
can see, there does seem to be a definite relationship - a lower sales rank
(more sales) corresponds to a higher review rate (more reviews). This is as one would expect. You need sales to get reviews, but reviews
can also trigger sales, due to the “social proof” that people tend to assume
from the mere presence of reviews.
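The ten day grouping can be sketched roughly as follows: average the daily sales rank within each period, and add up the reviews received in that same period. This is only an illustration of the idea; the function name and the sample numbers are hypothetical, and any leftover days that do not fill a complete period are simply dropped here, which is my assumption.

```python
from statistics import mean


def ten_day_periods(daily_ranks, daily_reviews, period=10):
    """Return one (average sales rank, total reviews) pair per full period."""
    n = min(len(daily_ranks), len(daily_reviews))
    points = []
    # Step through the data one full period at a time; a trailing
    # partial period is ignored.
    for start in range(0, n - period + 1, period):
        ranks = daily_ranks[start:start + period]
        reviews = daily_reviews[start:start + period]
        points.append((mean(ranks), sum(reviews)))
    return points


# Hypothetical daily data covering 25 days (not the real data).
daily_ranks = [100] * 10 + [300] * 10 + [500] * 5
daily_reviews = [4] * 10 + [2] * 10 + [1] * 5

for avg_rank, review_count in ten_day_periods(daily_ranks, daily_reviews):
    print(avg_rank, review_count)
```

Each pair becomes one point on the scatter graph, with average sales rank on the x-axis and reviews per period on the y-axis.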
I used Excel’s trend-line option to fit a few
different functional forms to the relationship. The best fit was given by an exponential
function. Basically, that implies that
the slope of the relationship is steepest when the sales rank is low, and
flattens out with increasing rank.
I should note that removing the outlier at approximately x=100, y=45 only improves the
model R-square a bit, increasing it from 0.742 to 0.767. An R-square of 1.00 implies a perfect
fit, while an R-square of 0.00 implies no relationship. (R-square runs from 0.00 to 1.00; it is the
correlation coefficient, r, that can reach -1.00 for a perfect negative relationship.) So, this is a pretty decent fit.
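For readers without Excel, an exponential trend line can be approximated by fitting a straight line to the natural log of the review counts. The sketch below does that with ordinary least squares and reports R-square on the log-transformed values. The function names and sample numbers are hypothetical, and scoring the fit on the log scale is my assumption about how to mirror a spreadsheet trend line, so the figures it gives may not match Excel’s exactly.

```python
import math


def linear_fit(xs, ys):
    """Least-squares line ys = intercept + slope * xs, with its R-square."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return intercept, slope, 1 - ss_res / ss_tot


def exponential_fit(ranks, reviews):
    """Fit reviews = a * exp(b * rank); every review count must be above zero."""
    log_reviews = [math.log(y) for y in reviews]
    intercept, slope, r_square = linear_fit(ranks, log_reviews)
    return math.exp(intercept), slope, r_square


# Hypothetical ten day averages (not the real data).
ranks = [100, 200, 300, 400, 500]
reviews = [45, 20, 12, 9, 3]

a, b, r_square = exponential_fit(ranks, reviews)
print(f"reviews = {a:.1f} * exp({b:.4f} * rank), R-square = {r_square:.3f}")
```

A negative `b` means that reviews fall away as the rank number rises, with the steepest change at the low (best-selling) end of the rank scale.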
We can now go on to look at whether the fit gets
better or worse, if we compare sales rank at period T with number of reviews at period
T+1 (using ten day period averages). In
other words, we are testing how strongly sales predict later reviews. When we do that we see that the fit gets
worse, with the R-square dropping from .742 to .511, for the exponential
functional form.
I then tried the other alternative - testing sales
rank at period T against number of reviews in period T-1. In other words, that tests how strongly
reviews predict sales. In this case, the R-square was .567, which is greater
than the previous case, but less than the case where sales rank and number of
reviews are drawn from the same time period.
So, it would seem that the relationship between sales
rank and reviews is:
· strongest when the two are close together in time,
· next strongest when reviews lead sales rank,
· weakest when sales rank leads reviews.
Naturally it
would be best to do a multiple regression to pin this down further, but as a
first level qualitative result it is still useful.
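The three comparisons can be sketched as a small lag test: pair the sales rank in period T with the reviews in period T, T+1 or T-1, then compute R-square for each pairing. This is a rough illustration only. The function names and sample numbers are hypothetical, and R-square is taken from a straight-line fit of log reviews against rank, which is my stand-in for the exponential trend line.

```python
import math


def r_squared(xs, ys):
    """R-square of a straight-line fit of ys on xs (squared correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


def lagged_pairs(ranks, reviews, lag):
    """Pair the rank in period T with the reviews in period T + lag."""
    xs, ys = [], []
    for t in range(len(ranks)):
        if 0 <= t + lag < len(reviews):
            xs.append(ranks[t])
            ys.append(reviews[t + lag])
    return xs, ys


def lag_comparison(ranks, reviews):
    """R-square for the same-period, T+1 and T-1 pairings."""
    cases = (("same period", 0),
             ("sales rank leads reviews", 1),    # rank at T, reviews at T+1
             ("reviews lead sales rank", -1))    # rank at T, reviews at T-1
    results = {}
    for label, lag in cases:
        xs, ys = lagged_pairs(ranks, reviews, lag)
        results[label] = r_squared(xs, [math.log(y) for y in ys])
    return results


# Hypothetical ten day averages (not the real data).
ranks = [120, 150, 300, 450, 600, 800]
reviews = [60, 48, 22, 12, 8, 5]

for label, value in lag_comparison(ranks, reviews).items():
    print(f"{label}: R-square = {value:.3f}")
```

Note that each lagged pairing loses one data point at the end of the series, so the three R-square values are not based on exactly the same periods.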
The results
were substantially similar, when looking at sales rank and reviews for GSAW,
though a logarithmic function proved to have the best fit:
· strongest when the two are close together in time (R-square=.731),
· weaker when reviews lead sales rank (R-square=.650),
· weaker when sales rank leads reviews (R-square=.686).
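A logarithmic trend line has the form reviews = a + b * ln(rank), so it can be approximated by fitting a straight line against the natural log of the sales rank. The sketch below is a minimal version of that idea; the function name and sample numbers are hypothetical, and it is not claimed to reproduce Excel’s output exactly.

```python
import math


def logarithmic_fit(ranks, reviews):
    """Fit reviews = a + b * ln(rank) by least squares; ranks must be above zero."""
    xs = [math.log(r) for r in ranks]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(reviews) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, reviews))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, reviews))
    ss_tot = sum((y - mean_y) ** 2 for y in reviews)
    return a, b, 1 - ss_res / ss_tot


# Hypothetical ten day averages (not the real data).
ranks = [5, 20, 80, 200, 400]
reviews = [900, 600, 350, 180, 90]

a, b, r_square = logarithmic_fit(ranks, reviews)
print(f"reviews = {a:.1f} + {b:.1f} * ln(rank), R-square = {r_square:.3f}")
```

Unlike the exponential form, the R-square here is measured on the review counts themselves rather than on their logs, so values from the two forms are not strictly comparable.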
So, to sum up:
· The key events in the year (announcement of new book, publishing of new book, award to new book, author’s death) tended to correspond to increases in sales and reviews for both books, “To Kill a Mockingbird” and “Go Set a Watchman”.
· There were some unusual re-jiggings of reviews by Amazon, especially for “To Kill a Mockingbird”, where about 1700 reviews were pulled shortly after Harper Lee’s death.
· Sales rank and reviews were related, in a non-linear fashion. The best fit relationship was found when both variables were drawn from the same ten day time period.
================================================================
Finally, of course, I should remind you that you can
buy one of our Dodecahedron Books titles.
Since Harper Lee wrote about the social and racial complexities of the American
experience, I will offer up “On the Road with Bronco Billy”, a travelogue and
cultural study of late 20th century America, as seen from the cab of
a big rig. It also includes some
observations on race and class in America, though not with so fine a literary
touch as Harper Lee’s books. :)
On the Road with Bronco Billy - A Trucking Journal
Kindle Edition
Amazon U.S. http://www.amazon.com/gp/product/B00X2IRHSK
Amazon U.K. http://www.amazon.co.uk/gp/product/B00X2IRHSK
What follows
is an account of a ten day journey through western North America during a
working trip, delivering lumber from Edmonton Alberta to Dallas Texas, and
returning with oilfield equipment. The writer had the opportunity to accompany
a friend who is a professional truck driver, which he eagerly accepted. He
works as a statistician for the University of Alberta, and is therefore
generally confined to desk, chair, and computer. The chance to see the world
from the cab of a truck, and be immersed in the truck driving culture was
intriguing. In early May 1997 they hit the road.
Some time has passed since this journal was written and many things have changed since the late 1990’s. That renders the journey as not just a geographical one, but also a historical account, which I think only increases its interest.
We were fortunate to have an eventful trip - a mechanical breakdown, a near miss from a tornado, and a large-scale flood were among these events. But even without these turns of fate, the drama of the landscape, the close-up view of the trucking lifestyle, and the opportunity to observe the cultural habits of a wide swath of western North America would have been sufficient to fill up an interesting journal.
The travelogue is about 20,000 words, about 60 to 90 minutes of reading, at typical reading speeds.