01 January, 2011
Assessing the quality of Google's N-gram viewer data
Here is one way to do quality control on the data from Google's N-gram viewer: look at the relative popularity of the number words two, three, four, and so on. I would expect popularity to fall as the numbers increase, but the fall should itself decelerate, so that the gap between two and three is bigger than the gap between five and six. That is, indeed, what we find. By that measure, the data look fairly stable.
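For anyone who wants to run the same check on numbers read off the viewer, here is a minimal sketch in Python. It assumes you already have the relative frequencies as a list ordered two, three, four, and so on. The function names are my own, and nothing here calls Google's service.

```python
from typing import Sequence


def successive_drops(freqs: Sequence[float]) -> list[float]:
    """Return the drop in frequency from each number word to the next."""
    return [a - b for a, b in zip(freqs, freqs[1:])]


def is_decelerating_decline(freqs: Sequence[float]) -> bool:
    """True if frequency falls at every step and each drop is smaller than the one before."""
    drops = successive_drops(freqs)
    # Popularity should fall as the numbers increase.
    falling = all(d > 0 for d in drops)
    # The fall should itself shrink: (two - three) > (three - four) > ...
    decelerating = all(later < earlier for earlier, later in zip(drops, drops[1:]))
    return falling and decelerating
```

Pass in the frequencies for a single year, or an average over a range of years, and a result of `True` means the series passes both parts of the test described above. The strict inequalities are a design choice: a flat stretch or equal gaps count as a failure.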