11 January, 2011

Body bits


I'm glad tooth has been overtaken by penis, but brains have recently become wayyyy over-rated.

Wheat in war





Looks to me like wheat underwent a change during WW2 that cotton and copper did not. Steel did its own thing. Why?

Periodicities in books?


What standard of evidence would we require if we were to claim we had identified a periodicity in these book-based data? We don't see annual cycles any more, of course. What about the frequency of Locke in this philosophical triumvirate?

And if we were to make such a claim, what would it mean? Lawfulness (of a "physical" kind) in the abstract world of texts. We would look for corroboration, where possible, but that is hard for a single proper noun like Locke (though there is a related vocabulary, of course). You could use corpus-counts to identify the subset of terms that best triangulates the proper noun in question, and look for similar periods in their aggregated data. You would have to think geometrically in conceptual space.
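On the standard-of-evidence question, here is a minimal sketch of one possible test, not a settled method. It computes a plain periodogram and asks how often a shuffled copy of the series produces an equally dominant peak. The function names and the synthetic series are my own inventions for illustration. Note that shuffling destroys all autocorrelation, so this is a lenient null; a real claim about Locke would need a stricter one.

```python
import cmath
import math
import random

def periodogram(series):
    """Map each candidate period (in samples) to its spectral power."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    power = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power[n / k] = abs(coeff) ** 2 / n
    return power

def peak_share(series):
    """Return (period, share of total power) for the strongest component."""
    power = periodogram(series)
    total = sum(power.values())
    period = max(power, key=power.get)
    share = power[period] / total if total else 0.0
    return period, share

def permutation_p_value(series, trials=100, seed=0):
    """Fraction of shuffled series whose peak is at least as dominant."""
    rng = random.Random(seed)
    observed = peak_share(series)[1]
    shuffled = list(series)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if peak_share(shuffled)[1] >= observed:
            hits += 1
    return (hits + 1) / (trials + 1)

# Synthetic yearly counts: a 12-sample cycle plus noise (made-up data).
noise = random.Random(1)
counts = [math.sin(2 * math.pi * t / 12) + noise.gauss(0, 0.5) for t in range(96)]
print(peak_share(counts), permutation_p_value(counts))
```

The same test could be run on the aggregated counts of the related vocabulary, to see whether the period found for the proper noun turns up there too.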

The sexualities



Metrosexual doesn't quite make it, but homosexuality seems to be grabbing our attention. My take on this is that because heterosexual is the default assumption, the label is only needed as a contrast to homosexual.

09 January, 2011

02 January, 2011

Physical vs Mental


Without comment.

Colors and Polarity

Here's a puzzle. If we look at colors, we see, as expected, that black and white are more common than the primaries, and the primaries are more common than any other color terms. But something unexpected has happened roughly since the early sixties: black and white together exhibited a rise in prevalence that is independent of the color terms. Has our thinking become more polarized?

(Note: there is an unfortunate Stroop effect here. The curves for black and white are not in black and white. Red, blue, green and yellow are drawn in their own colors. Brown is a light green. Sorry).

01 January, 2011

Assessing the quality of Google's N-gram viewer data

Here is one way to think of quality control for the data from Google's N-gram viewer: look at the relative popularity of the number words two, three, four, etc. I would expect popularity to fall as the numbers increase, and that fall should itself decelerate, so that the drop from two to three should be bigger than the drop from five to six. That is, indeed, what we find. The data look fairly stable by that measure.
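That check is easy to state in code. A minimal sketch, assuming we had one relative frequency per number word for a given year; the function name and the figures below are invented for illustration, not read off the viewer.

```python
def is_decelerating_decline(freqs):
    """True if the values fall at every step and each drop is smaller than the last."""
    drops = [a - b for a, b in zip(freqs, freqs[1:])]
    falling = all(d > 0 for d in drops)
    decelerating = all(d1 > d2 for d1, d2 in zip(drops, drops[1:]))
    return falling and decelerating

# Hypothetical counts per million words for "two", "three", "four", "five", "six".
example = [100, 60, 40, 30, 25]
print(is_decelerating_decline(example))
```

Running it year by year would show whether the pattern holds across the whole corpus or only in the well-sampled decades.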

God vs the Devil

God vs the devil. Oddly, the data for God reported in the Science article looked very different. It will take a while before anyone has much confidence in these figures. God is actually doing quite well, and has overtaken the devil in popularity.

The N-gram viewer is released.


By now, you will be well aware that Google released the n-gram viewer, looking at 1- to 5-grams (groups of 1 to 5 words) in a corpus containing about 4% of all books published from 1800 to today. There is so much. The Culturomics paper just hit Science. Let's play with it a bit. Here is hell vs heaven.

11 August, 2009



So why does Descartes take a holiday in Summer, while the Greeks are still looked at? The fall off in interest in Descartes has to be a function of the end of University classes. But how come the Greeks don't exhibit a similar drop?

26 March, 2009

Another tool for meme tracking

We can now follow Wikipedia traffic:

http://wikirank.com

Cool! And results can be embedded, thus:



Shame you are restricted to 30, 60 or 90 days. Annual trends are hard to find there....

18 December, 2008

Our Collective Nature

I've been neglecting this little corner of the world. Nobody looks here either, and that's kind of nice. This started off as an odd quirk, apart from my normal thought patterns, my work, even my play. I was just interested in what you could see using Google Trends. As I saw stuff, it blended into my life well. In fact, I am now committed to recognizing the collective fabric of our being, in a way that fully acknowledges our subjectivity and learns to accord it its proper place. The individual is both our misery and our joy. When we see ourselves as continuous with biology, we appear in a new light, and our private worlds become both more and less hermetically sealed off. In fact, if you remove the troubling concept of the bottleneck of the ego there, we become both. We see how a common fabric is wrought from individual points of view. We learn not to place blame on individuals.

31 July, 2008

Periodicities

I had started to get bored with this. I kept seeing the annual cycle. But then it dawned on me. That's who we are. We are driven by the rhythms around us. You can chart the natural world in our behaviour. This is weak evidence of the term environment. All you can account for is behaviour, no matter how hard you try. The psychophysicists have tried hard, by God. Trying to bridge the Observable-Experienced gap. By report. Tell me what you heard. That's a very direct question. Almost rude. Except we have no cultural reference frame. Who would be ashamed to tell friends that they had a frequency deficit with a notch at 1600 Hz? Disabilities define us. It's not intimate, though it tries to be.

15 March, 2008

Mind the gap......



Gapminder's Trendalyzer software is owned by Google too. It too is a meme tool. I found it at TED.

02 March, 2008

Noise background



This is essentially noise, as the search terms are semantically unstructured (more or less content-free). What is the structure to this background noise? Does it describe us? Should we include the many annual cycle-shapes in there?

Not all 'x's are equal....



... and 'x' is both searched for and reported, while 'xxx' is only searched for.

21 February, 2008

Chomsky and academia

Piaget is more searched for than Vygotsky, no surprise there, and it's all students doing the searching. But look how Chomsky is not so popular any more, and his search data do not show the academic year! He's only known now for his politics.

10 February, 2008

US vs UK

... or rather grey vs gray. For many terms, the volume of an Americanized spelling will simply swamp the results for the corresponding UK term (try 'honor' vs 'honour'), but grey and gray seem comparable. But one version has an annual peak in November. Why?

09 February, 2008

Question: liar

What's up with 'liar'? There does not seem to be a news peak to match the huge jump in early 2007.

Presents

This is just funny!



Here's another one. You can see the seasons in snot. I think I'll call the book that:)

A different annual cycle

Here is the agricultural year spelled out plainly. Quite a contrast to the academic year.... (But what's with that spring hump for 'harvest' in 2004?)

Anti-correlations

Correlations are interesting, but all too easy to find. Anti-correlations would be far more interesting. We could look for them automatically if we had the data....
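If the data were available, the automatic search might look something like this. A minimal sketch only: it ranks every pair of terms by Pearson correlation and returns the most negative ones. The function names are mine, and the term labels and numbers are placeholders. Series that share a long-term trend would need detrending first, or the trend will swamp everything else.

```python
from itertools import combinations
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def most_anticorrelated(series_by_term, top=3):
    """Return up to `top` tuples (r, term_a, term_b), most negative r first."""
    scored = [(pearson(series_by_term[a], series_by_term[b]), a, b)
              for a, b in combinations(sorted(series_by_term), 2)]
    scored.sort()
    return scored[:top]

# Placeholder series; real input would be weekly search volumes per term.
trends = {"a": [1, 2, 3, 4], "b": [4, 3, 2, 1], "c": [1, 2, 3, 5]}
for r, first, second in most_anticorrelated(trends):
    print(f"{first} vs {second}: r = {r:.2f}")
```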

But common sense would also be required:

Reincarnation

'Reincarnation' experiences a surge of interest in August three years in a row, and then not at all in 2007. Why, why, why? Perhaps if we had the data we might dig a little deeper here....

Structural change

Look how 'hydroponics' changes between 2004 and 2005. I wonder why....

Global warming (sigh)

This one would be interesting to model. Notice that as the meme becomes amplified in 2007, it retains its academic-year structure. That's perhaps a little surprising. Wouldn't its prevalence in the media suggest that it is escaping from the double hump of the academic year? Apparently not....

Anyone for maths?

Here are two terms that are almost guaranteed to reveal the academic year effect. 'Algebra' and 'Calculus'. The co-variation is very strong indeed. You can even pick out the little blip around Thanksgiving! With terms like these, one could model the effect of the academic year, and hence factor it out of other analyses. At least one could if one had the data.....
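Here is a rough sketch of what "factoring it out" could mean, assuming weekly series of equal length. The reference terms are averaged into a template of the academic year, the template is fitted to some other term by least squares, and the fitted part is subtracted. All names and numbers are hypothetical, and a single scaled template is certainly too crude for real data.

```python
def seasonal_template(reference_series, period):
    """Average the mean-centred reference series by position within the cycle."""
    sums = [0.0] * period
    counts = [0] * period
    for series in reference_series:
        mean = sum(series) / len(series)
        for t, x in enumerate(series):
            sums[t % period] += x - mean
            counts[t % period] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]

def factor_out(target, template):
    """Fit target = a + beta * template by least squares; return (beta, residual)."""
    period = len(template)
    shape = [template[t % period] for t in range(len(target))]
    shape_mean = sum(shape) / len(shape)
    target_mean = sum(target) / len(target)
    centred = [s - shape_mean for s in shape]
    denom = sum(c * c for c in centred)
    beta = (sum(c * (x - target_mean) for c, x in zip(centred, target)) / denom
            if denom else 0.0)
    residual = [x - beta * c for x, c in zip(target, centred)]
    return beta, residual

# Made-up 4-step "year", repeated three times, standing in for 52 weeks.
cycle = [2, 0, -2, 0]
algebra = [10 + cycle[t % 4] for t in range(12)]
calculus = [20 + 2 * cycle[t % 4] for t in range(12)]
other_term = [5 + 3 * cycle[t % 4] for t in range(12)]

template = seasonal_template([algebra, calculus], period=4)
beta, residual = factor_out(other_term, template)
print(beta, residual)
```

Whatever is left in the residual is the part of the term's behaviour that the academic year does not explain, which is where the interesting stories should be.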

Of Hangovers and Software

I noted before that most terms exhibit either a drop or a rise over Christmas. My expectation is that the number of associated news stories would do likewise. Here, for example, is 'hangover':



But 'software' is different. Searches rise, while news stories fall. Why?

08 February, 2008

Question: 'prison'

Why does 'prison' suddenly appear to become a topic of academic interest (if I am reading the annual trend correctly)?

Wordnet and Trends

Google Trends seems to offer a way to test the coherence of synsets in WordNet. One could quantify the similarity in trends for supposedly related words. (If we had the data......)
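One candidate measure, sketched under the assumption that we had a trend series for each word in a synset: the mean pairwise correlation between the members. The function names are mine, the synset below is typed in by hand rather than pulled from WordNet, and the numbers are invented.

```python
from itertools import combinations
from statistics import mean, pstdev

def zscores(series):
    """Standardise a series to zero mean and unit (population) variance."""
    m, s = mean(series), pstdev(series)
    if s == 0:
        raise ValueError("constant series has no defined correlation")
    return [(x - m) / s for x in series]

def synset_coherence(trends):
    """Mean pairwise Pearson correlation between the trend series of a synset."""
    z = {term: zscores(series) for term, series in trends.items()}
    pairs = list(combinations(sorted(z), 2))
    if not pairs:
        raise ValueError("need at least two terms")
    # The mean product of two z-scored series is their Pearson correlation.
    return mean(mean(a * b for a, b in zip(z[p], z[q])) for p, q in pairs)

# Hand-typed synset with made-up weekly volumes.
synset = {
    "car": [10, 12, 11, 15, 14],
    "automobile": [3, 4, 4, 6, 5],
    "motorcar": [1, 1, 2, 2, 2],
}
print(round(synset_coherence(synset), 2))
```

A score near 1 would suggest the words rise and fall together. Comparing it against the score for random word sets of the same size would tell you whether the synset is doing any work.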

Christmas

Christmas is a funny time. Most trends change around that time. Some terms experience a drop in frequency, others a rise. It seems that those connected with well-being and relaxation get a boost, while the problematical, the practical and the mundane drop off. It would be very nice to quantify this (if we had the data, please Google....):

For example, 'whisky' and 'vodka' go up, but 'heroin' goes down....
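One way to put a number on that, if the data were to hand: the ratio of a term's average volume in the days around Christmas to its average in the surrounding weeks. This is a sketch under assumptions of my own (weekly data points, a 7-day holiday window, a 6-week baseline on either side), and the figures are invented.

```python
from datetime import date, timedelta

def christmas_effect(weekly, window_days=7, baseline_days=42):
    """Return {year: holiday mean / baseline mean} for (date, value) pairs.

    The holiday window is within `window_days` of 25 December; the baseline
    is everything further out than that but within `baseline_days`.
    """
    effects = {}
    for year in sorted({d.year for d, _ in weekly}):
        xmas = date(year, 12, 25)
        holiday = [v for d, v in weekly if abs((d - xmas).days) <= window_days]
        baseline = [v for d, v in weekly
                    if window_days < abs((d - xmas).days) <= baseline_days]
        if holiday and baseline and sum(baseline) > 0:
            effects[year] = (sum(holiday) / len(holiday)) / (sum(baseline) / len(baseline))
    return effects

# Made-up weekly volumes for one term: flat at 10, doubling around Christmas.
start = date(2007, 11, 4)
weeks = [start + timedelta(days=7 * i) for i in range(16)]
volumes = [(d, 20 if abs((d - date(2007, 12, 25)).days) <= 7 else 10) for d in weeks]
print(christmas_effect(volumes))
```

A ratio above 1 is a Christmas boost, below 1 a Christmas slump. Sorting a large vocabulary by this ratio would test the well-being versus mundane hunch directly.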



'Masturbation' and 'abortion' display opposing trends at Christmas:



Sometimes, of course, it is the present-potential:



Oddly, 'birthday' takes a big hit right around this time:

Detrending

If only we had the raw data (a plea that will recur throughout this exercise), we could see how many ways there are to detrend these data. Simplistic detrending of a time series looks for a periodic component with a 12-month cycle or a 7-day cycle. Months are less robust, because their lengths vary. But these data suggest that one could be much more refined. Many searches reveal a flurry of activity in Spring and Autumn/Winter, with a big dip in the Summer. This is, among other things, the academic year, and students do a lot of searches. Compare 'Jung' and 'Freud', for example:




You can see that Freud is taught in Universities, while Jung has gone out of fashion.

There might be other odd forms of recurrence. 'Breakfast' has a peak at weekends, but also in July.
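For reference, the simplistic kind of detrending looks roughly like this. It is a sketch of textbook additive decomposition, not anything Google provides: estimate the slow trend with a centred moving average, average what is left by position in the cycle, and subtract that seasonal shape. The period would be 7 for daily data with a weekly cycle or 12 for monthly data; the toy series below uses 4 to keep it short.

```python
def moving_average(series, period):
    """Centred moving average; None where the window does not fit."""
    n = len(series)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        if period % 2:
            trend[t] = sum(window) / period
        else:
            # Even period: give the two end points half weight each.
            trend[t] = (sum(window) - 0.5 * (window[0] + window[-1])) / period
    return trend

def seasonal_component(series, period):
    """Mean deviation from the trend at each position in the cycle."""
    trend = moving_average(series, period)
    buckets = [[] for _ in range(period)]
    for t, (x, m) in enumerate(zip(series, trend)):
        if m is not None:
            buckets[t % period].append(x - m)
    means = [sum(b) / len(b) if b else 0.0 for b in buckets]
    offset = sum(means) / period
    return [m - offset for m in means]

def deseasonalize(series, period):
    """Subtract the estimated seasonal shape from every point."""
    seasonal = seasonal_component(series, period)
    return [x - seasonal[t % period] for t, x in enumerate(series)]

# Toy series: a rising line plus a repeating 4-step pattern (made-up numbers).
pattern = [3, -1, -3, 1]
toy = [10 + 0.5 * t + pattern[t % 4] for t in range(24)]
print(seasonal_component(toy, 4))
print(deseasonalize(toy, 4)[:6])
```

The refinement hinted at above would replace the fixed 12-month or 7-day shape with something like an academic-year template, since a July peak for 'breakfast' and a weekend peak are two different cycles layered on top of each other.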


The start of Real Meme

This blog is just a place for me to keep notes on meme hunting. Initially, I will make some observations about Google Trends, which has me all a-twitter. Perhaps it will go other places later.