As regular readers of this blog know, I have kept data on the Amazon Top
100 list of books for the years 2013 and 2014, and have written a
number of blogs in which I analyzed that data. It will soon be time
to update that database with the most popular new books of 2015. But
before doing that, I thought it would be interesting to see just how
the Top 100 of 2013 and 2014 did during the year 2015. How did their
rankings change? How did their review numbers change? Which books
held their rankings the best, by such important factors as genre,
early ranking, writer sex, writer age, price and so on?
There are several ways to look at such data. The first and most obvious is
via descriptive statistical analysis – i.e. just looking at
measures such as average rank by category. Beyond that, more
advanced techniques, such as logistic regression, can be used to
determine the independent effect of each of these categories. That
can help to predict the types of books that best hold their rankings
and reviews (and therefore, sales) over time.
This blog will focus on the descriptive statistics, by looking at the
average rank of these books during each month in 2015. The graph
above is an example. For the 2013 and 2014 Top 100 lists, the
average rank by month in 2015 is given by the height of the bars.
Numerically higher average ranks are, of course, not desirable. In these graphs,
lower numbers are better, just like in golf.
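For anyone who wants to reproduce this kind of summary, here is a minimal sketch of the "average rank by month" calculation. The record layout and the numbers in `observations` are made-up placeholders, not values from my dataset, and the function name is just for illustration.

```python
from collections import defaultdict
from statistics import mean

# Placeholder records, NOT values from the blog's dataset:
# (top_100_year, month_of_2015, sales_rank)
observations = [
    (2013, 1, 5000),
    (2013, 1, 7000),
    (2014, 1, 1500),
    (2014, 1, 2500),
    (2013, 2, 6500),
    (2014, 2, 3000),
]


def average_rank_by_group(rows):
    """Return {(top_100_year, month): average sales rank}."""
    ranks = defaultdict(list)
    for year, month, rank in rows:
        ranks[(year, month)].append(rank)
    # Sort the keys so the output reads year by year, month by month.
    return {key: mean(values) for key, values in sorted(ranks.items())}


averages = average_rank_by_group(observations)
for (year, month), avg in averages.items():
    print(f"Top 100 of {year}, month {month}: average rank {avg}")
```

Each bar in the graphs corresponds to one such group average. Swapping the year for another attribute (genre, publisher type and so on) gives the other breakdowns.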
The books in the combined 2013 and 2014 Amazon Top 100 lists fell from an
average rank of about 4000 in early 2015 to about 9000 by
mid-December. One could fit a functional form to the data, but by eye
the decline appears to be quasi-linear, or perhaps a gently sloped
power law, over the 12-month period.
It is hard to be sure what
that represents in terms of reduced sales or income, however, as the
relationship between rank and sales is not linear. Furthermore, due
to differences in pricing, the relationship between units sold and
money earned is not straightforward either.
We can try to estimate the
drop in sales via reviews. It turns out that this set of books
“earned” about 3.3 reviews per day per book in the early part of
2015 versus about 1.2 per day in the last period. If we assume that
a relatively constant percentage of purchasers review the books they buy,
that would indicate that sales of these books declined to a bit
over one-third of their early 2015 level by the end of 2015. Note
that in their “Top 100” year, these books averaged about 9.4
reviews per book per day. So, by the end of 2015, they were probably
selling about one-eighth as many books as they were during their
initial publishing year. This is shown in the graph below, along
with data for each of the two Top 100 years, and best fit exponential
decay curves.
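The arithmetic behind those two fractions can be checked with a short sketch. It uses only the three review rates quoted above and rests on the same assumption, that reviews are a constant fraction of purchases. The function and variable names are mine, chosen for illustration.

```python
# Review rates quoted in the text (reviews per book per day).
TOP_100_YEAR_RATE = 9.4   # during each book's "Top 100" year
EARLY_2015_RATE = 3.3     # early part of 2015
LATE_2015_RATE = 1.2      # last period of 2015


def sales_ratio(later_rate: float, earlier_rate: float) -> float:
    """Estimate later sales as a fraction of earlier sales.

    Assumes reviews are a constant fraction of purchases, so the
    ratio of review rates stands in for the ratio of sales rates.
    """
    return later_rate / earlier_rate


within_2015 = sales_ratio(LATE_2015_RATE, EARLY_2015_RATE)
since_top_100_year = sales_ratio(LATE_2015_RATE, TOP_100_YEAR_RATE)

print(f"Late 2015 vs early 2015: {within_2015:.2f}")
print(f"Late 2015 vs Top 100 year: {since_top_100_year:.2f}")
```

The first ratio comes out a little above one-third and the second close to one-eighth, which is where the figures in the paragraph above come from.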
At any rate, this particular
blog is more interested in comparing how well different categories of
books did over time, rather than estimating sales figures (though I
might try that in a later blog).
Overall Ranks in 2015, by “Top 100” Year
As you can see, the more
recent books from 2014 held their ranks during 2015 better than the
books from the 2013 Top 100 list. In both cases, though, the average
rank of the books drifted upwards, throughout the year. The 2013
books started 2015 with an average rank of about 6000 at the end of
January, and finished at about 12000 in mid-December. The 2014 books
started the year at about rank 2000 on average, and ended at about
6000. Remember that these books were in the Top 100 lists in their
respective years. However, the Top 100 lists were constructed
relative to books published that year, and the ranks in 2015 were
against all years, so the declines look worse than they really were.
Ranks in 2015, by Rank Quartile in Top 100 Year
The second set of graphs
shows how book ranks changed in 2015, based on each book's initial
ranking in the Top 100 year. The first category, labelled 1,
represents books that were in the first quartile of ranks in their
initial year (i.e. the top 25%). The group of books labelled 2 was
in the second quartile, and so on.
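As a rough sketch, the quartile labels can be derived from a book's position in its Top 100 list. This assumes positions run from 1 (best) to 100 and that the quartiles are equal-sized blocks of 25. The function name and the `list_size` parameter are hypothetical.

```python
def rank_quartile(list_position: int, list_size: int = 100) -> int:
    """Map a position in the Top 100 list to a quartile label, 1 to 4.

    Quartile 1 holds the best-ranked quarter of the list.
    """
    if not 1 <= list_position <= list_size:
        raise ValueError(f"position must be between 1 and {list_size}")
    # Integer arithmetic avoids floating point edge cases at the boundaries.
    return (list_position - 1) * 4 // list_size + 1


for position in (1, 25, 26, 50, 51, 75, 76, 100):
    print(position, rank_quartile(position))
```

Positions 1 to 25 land in quartile 1, 26 to 50 in quartile 2, and so on.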
It is clear that books that
were in the top quartile in their publishing year managed to hold
their rank the best, and books in the bottom quartile did the worst,
in that regard. That's not too surprising. There wasn't much
difference in the two middle quartiles, though. So, it appears that
the readers don't distinguish between books in the middle ranks all
that much.
Looking at the data by year,
the pattern repeats itself, or at least approximately so. Books that
were in Quartile 1 during their publication year held their ranks the
best, while those in Quartile 4 did the worst. Quartiles 2 and 3
swapped places between the 2013 and 2014 lists, however.
Ranks in 2015, by Sex of Writer
The third group of graphs
gives book ranks during 2015, by the gender of the writer. It
appears that gender didn't make much difference at the start of the
year – both female and male writers were averaging about rank 4000.
But as the year went on, books by females lost ground in the
rankings more quickly than books by males, so that there was a
substantial difference by year end.
As we will see later on,
much of that is probably a reflection of the genre that the sexes
tend to write in. Romance books lost their rankings more quickly than
other genres, and since women are more likely than men to write in the
Romance genre, their rankings suffered accordingly as 2015 progressed.
In this case, breaking out
the data by year did reveal some differences. In the 2013 Top 100
books, there was little difference between males and females in the 2015
ranks. However, the 2014 Top 100 books indicated an advantage for
male writers. With this amount of data, we can't tell whether the
male-female difference is real, but short lived, or whether it is a
quirk of the datasets.
Ranks in 2015, by Educational Status of Writer
The graph of Rank by Writer
Education is a bit counter-intuitive. At the start of the year,
books by writers with graduate degrees held their rankings the best,
followed by those with some university, then high school, then
Bachelor's degree and Unknown. By the end of the year, it was
writers with “some university” who held their rankings the best,
though.
If we collapse these
categories into “No degree or unknown status” versus “Has a
degree”, things change somewhat. I collapsed those categories in
that fashion on the assumption that writers who weren't keen on
disclosing their educational status probably didn't have university
degrees. But that could be wrong.
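The re-categorization amounts to a simple lookup. Here is a minimal sketch, assuming the five education labels named above. The exact label strings, and the choice to put "Some university" on the no-degree side, are my assumptions for illustration.

```python
HAS_DEGREE = "Has a degree"
NO_DEGREE_OR_UNKNOWN = "No degree or unknown status"

# Assumed label spellings; the dataset's own codes may differ.
EDUCATION_GROUPS = {
    "Graduate degree": HAS_DEGREE,
    "Bachelor's degree": HAS_DEGREE,
    "Some university": NO_DEGREE_OR_UNKNOWN,
    "High school": NO_DEGREE_OR_UNKNOWN,
    "Unknown": NO_DEGREE_OR_UNKNOWN,
}


def collapse_education(level: str) -> str:
    """Collapse a detailed education level into the two-way grouping.

    Unrecognized labels are treated like "Unknown", which mirrors the
    assumption that undisclosed status means no degree.
    """
    return EDUCATION_GROUPS.get(level, NO_DEGREE_OR_UNKNOWN)


for level in EDUCATION_GROUPS:
    print(f"{level} -> {collapse_education(level)}")
```

Changing where "Unknown" maps is a one-line edit, which makes it easy to test how sensitive the comparison is to that assumption.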
Using this
re-categorization, the degree holders did somewhat better than the
non-degree holders, though the difference was not all that great.
Basically, they did better in the middle months of the year, but
about the same at the beginning and end of the year. It seems fair
to say that there is no clear trend evident. Breaking out the data
by year (not shown) also shows no clear trend – in 2013 non-degreed
writers seemed to do slightly better, while in 2014 the reverse was
true.
Looking at the subject that
the writer studied and/or worked in (besides writing), we see that
writers from the traditional fields of English/History/Journalism and Law were
most successful at holding their ranks through 2015.
Ranks in 2015, by Age Range of Writer
In this case, a clear trend
was evident, in favour of older, more established writers. Generally
speaking, the older the writer, the better the books held their
rankings. This was probably a reflection of the older writers'
longer tenure, and thus more established reputation with readers.
The exception was the first
age group, which did somewhat better than the second. I should note
that the difficulty that writers in the 35-44 age group had in
holding their rank was probably related to genre – this tends to be
the age group that writes a lot of Romance books, which don't hold
their rank as well as other genres.
Looking at the data by year
(not shown here) revealed a similar trend in both years, whereby
older writers held their ranks better than younger writers.
Ranks in 2015, by Publisher Type
This graph also shows a very
clear trend. Books published by Indie writer/publishers started off
2015 with much higher (worse) ranks, and lost ground from that point. Books
published by the Big 5 publishers (BPH on the graph) did better,
though not great. It was books that were published by the smaller
traditional publishers that performed best, in terms of holding their
ranks and starting off 2015 at a fairly desirable rank.
This again was at least
partially a reflection of genre, since Indies are largely found in
the Romance genre. However, the extra marketing push of traditional
publishing might also be playing a role.
Looking at the data by Top
100 year shows that this effect was very similar for both sets of
books. For the Indie books in the 2013 dataset, though, we see that
the 2015 average ranks showed no clear trend during the year –
they more or less stabilized in the 10,000 to 15,000 range,
though with a fair bit of variance.
Ranks in 2015, by Publishing Month
This graph was very
interesting, though probably no surprise to anyone with experience in
the traditional publishing industry. Clearly, you want to be
published in the 11th month, November. Those books
started off with very good ranks and held their position. Books
published in September also did fairly well. But books published in
October were clobbered. That appears to be the no-man's-land of
publishing, at least in this dataset.
I imagine that the November
effect is related to the most popular and established writers being
published in that month, timed carefully to benefit from Christmas
gift book buying. It would appear that October books are too far
from Christmas to hit that sweet spot. As for September, it seems
likely that this is a “return-to-school” effect. March also seemed to
be a good month, perhaps a “nearing-end-of-term” effect.
This result held true for
both the 2013 and 2014 Top 100 lists (graph not shown).
Ranks in 2015, by Original Price Range
This graph shows how well
books held their rank in 2015, by the price range that they were
originally published at. Those ranges were Low = under $4, Moderate =
$4 to $7.99, High = $8 and up.
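Those three price ranges translate directly into a small binning function. This is a sketch using the cut-offs stated above. The function name is hypothetical, and treating prices between $7.99 and $8.00 as Moderate is my assumption.

```python
def price_band(original_price: float) -> str:
    """Assign a book to the Low / Moderate / High price range.

    Cut-offs follow the text: Low is under $4, Moderate is $4 to $7.99,
    High is $8 and up.
    """
    if original_price < 0:
        raise ValueError("price cannot be negative")
    if original_price < 4.00:
        return "Low"
    if original_price < 8.00:
        return "Moderate"
    return "High"


for price in (0.99, 3.99, 4.00, 7.99, 8.00, 12.99):
    print(f"${price:.2f}: {price_band(price)}")
```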
As you can see, the high
priced books started 2015 at a lower (better) rank, on average, and held that
rank better than the other groups. The moderately priced books
were next, though by the end of the year there was little difference
between them and the high priced books. Low priced books entered the
year with the least desirable rankings, and got worse from there.
This effect was also similar
for the two years.
Ranks in 2015, by Ebook vs Pbook Price in 2015
During 2015, traditional
publishers began increasing book prices, and notably often priced
ebooks higher than pbooks (print books). This is an effort to
maintain the print book market, and the print book stores that sell
those books. Traditional publishers can thereby use their advantage
in getting into the big print book stores as a selling point to both
readers and writers.
The graph below shows how
that worked out, in terms of holding rankings during 2015. One can
see that books where the ebook was priced higher than the pbook lost
ground in the latter part of 2015, about when that pricing strategy
took hold. So, it appears to have hurt those books. The other books,
where the ebooks were priced lower than the pbooks, held their
rankings better. The “NA” books are those that were only
available as ebooks.
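The three groups in this comparison can be expressed as a short classification rule. This is a sketch only. The group labels and the function name are hypothetical, and the text does not say how books with identical ebook and pbook prices were handled, so placing them in the "ebook lower" group here is purely an assumption.

```python
from typing import Optional


def price_group(ebook_price: float, pbook_price: Optional[float]) -> str:
    """Classify a book by how its ebook price compares to its print price.

    pbook_price is None when the book was only available as an ebook.
    """
    if pbook_price is None:
        return "NA"
    if ebook_price > pbook_price:
        return "ebook higher"
    # Assumption: ties are grouped with the lower-priced ebooks.
    return "ebook lower"


print(price_group(12.99, 9.99))   # ebook priced above the print edition
print(price_group(4.99, 14.99))   # ebook priced below the print edition
print(price_group(2.99, None))    # ebook-only title
```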
The graph with data split
out by Top 100 year shows that this effect was more pronounced in the 2014
set of books, so recency seemed to play a role. Note the big jump in ranks for
the 2014 books whose ebook was priced higher than the pbook,
beginning in Sept 2015.
Ranks in 2015, by Genre
This graph was also very
interesting. It is clear that Romance books had the shortest shelf
life. They tended to start 2015 at the highest (worst) ranks and lost rank
from there. Next were the “Other” books, a mix of hard-to-categorize
fiction and non-fiction. Thriller/Suspense/Crime started
off at about the same point as Literary Fiction, but lost more ground
as the year progressed. Interestingly, it was Science Fiction and
Fantasy that started the year with the lowest ranks and lost the
least ground as the year progressed.
The graph with 2013 and 2014
books broken out separately shows that this trend was very similar
across both years, though in the 2014 set, Science Fiction held its
rank better than Literary Fiction.
To summarize the ability of
books to hold their sales rank over time:
- Newer books did better (2014 publishing vs 2013).
- Books that were originally better ranked did better.
- Books by male writers did better, but the effect was fairly small.
- Books by writers with university degrees did better, but the effect was small.
- Books by older writers did better.
- Books by traditional publishers did better.
- Books published in November did much better, books published in October did much worse.
- Books that were originally high priced did better.
- Pricing a book's ebook version higher than its pbook version seemed to hurt its ranking.
- Romances had the shortest shelf life, Literary Fiction and Science Fiction/Fantasy the longest.
In later blogs, I intend to
look at how reviews held up, and also do some multivariate analysis,
to see what the most important predictors of a long shelf life were.
=========================================================
After all these stats, you
might want to read something less quantitative. So, try a road trip
through North America in an 18 wheeler, with “On the Road with
Bronco Billy”:
Amazon U.S.: http://www.amazon.com/gp/product/B00X2IRHSK
Amazon U.K.: http://www.amazon.co.uk/gp/product/B00X2IRHSK
Amazon Canada: http://www.amazon.ca/gp/product/B00X2IRHSK
Or even better, try a
spaceship and planet-side road trip (escaping from slavers), with our
gal Kati of Terra:
Amazon U.S.: http://www.amazon.com/gp/product/B00811WVXO
Amazon U.K.: http://www.amazon.co.uk/gp/product/B00811WVXO
Amazon Canada: http://www.amazon.ca/gp/product/B00811WVXO