Book Statistics Corner, Part 1 – Book Blurbs vs. Reader Reviews

There is a lot of information that a writer or publisher can mine from Amazon’s “Also bought” list (that line of books at the bottom of the screen that says “People who bought this book also bought these books”, sometimes referred to as the alsobot).  It’s all publicly available to anyone who knows a thing or two about data analysis and is willing to put in the time to do some tedious cutting and pasting.  I am talking about aggregate statistical analysis here, nothing that in any way violates anyone’s reasonable expectation of privacy.  Examining a book’s alsobots can help writers or publishers understand the market that they are already reaching, and perhaps shape their books to better serve that market.

As an example, one can use publicly available web-based software to do simple text analytics that report on the complexity of a piece of writing.  Some common and easy-to-understand measures are word count, average sentence length, and the proportion of long or hard words (i.e. words with more syllables or more letters).  In this example, I took a sample of about 20 books in the alsobot of Kati of Terra Book 1 (a Dodecahedron Books publication), with an average of about 20 reviews each.  I analysed their blurbs and also analysed the reader reviews of those books.  The results show that as the complexity of the book blurb goes up (as measured by the percentage of "hard words" with three or more syllables), the complexity of the corresponding reviews also goes up.  For those who have a background in statistics, the relationship is fairly strong (R-Square=.43, which corresponds to a correlation of about .66).  This general result holds true for a number of the other measures of writing complexity noted above.
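For readers who would like to try this kind of analysis on their own alsobots, here is a rough sketch in Python of the measures described above.  It is not the web-based software used for the analysis in this post.  The function names are made up for illustration, and the syllable counter is a crude vowel-group heuristic that will miscount some words, so treat the numbers as approximate.

```python
import re


def count_syllables(word):
    """Estimate syllables by counting runs of consecutive vowels (rough heuristic)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # A trailing silent "e" (as in "complete") usually adds no syllable.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def complexity_metrics(text):
    """Return word count, average sentence length and percent of hard words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not words or not sentences:
        return {"word_count": 0, "avg_sentence_length": 0.0, "pct_hard_words": 0.0}
    # "Hard" words are those estimated to have three or more syllables.
    hard = [w for w in words if count_syllables(w) >= 3]
    return {
        "word_count": len(words),
        "avg_sentence_length": len(words) / len(sentences),
        "pct_hard_words": 100.0 * len(hard) / len(words),
    }


def r_squared(xs, ys):
    """R-squared of a simple linear fit of ys on xs (the squared correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)
```

To use it, one would run complexity_metrics on each book's blurb and on the combined text of its reviews, collect the two pct_hard_words values for every book in the sample, and pass the two lists to r_squared.  With only 20 or so books, a single unusual blurb can move the result noticeably.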

So, what’s likely behind this?  I would hypothesize that readers judge the likely writing style of a book by the writing style of its blurb.  Those who prefer a more complicated writing style will purchase and read the books with more complicated blurbs.  Eventually, when they review those books, they will write in a more complex style themselves, as that’s what they are comfortable with.  The same goes for readers at other points along this dimension.

What does this mean for the writer or publisher?  I suppose one could take one of several lessons from it.  One strategy might be to attempt to maximize your market by adapting the blurb style to the widest possible audience.  Of course, going after too wide an audience might just mean you don’t appeal to those most likely to enjoy the book.  Alternatively, one could try to target a very specific market, say readers who love complex prose, by writing a very erudite blurb.  But if the blurb style seriously misrepresents the book style, that will just lead to reader disappointment and possibly bad reviews and returns.

Or one could simply look at these results and conclude that readers will seek out and find the books with which they are most comfortable, and not worry too much about the whole matter.  Just be Zen about it: write in the style that is your natural voice, and wait for readers who prefer that voice to find you.

