Learning about world cultures via Google Autocomplete

Out of curiosity, I was looking at how a browser interacts with the Google Instant backend. While inspecting the HTTP exchanges in Firebug, I first asked myself why they’re encoding HTML and JS with \xYY escape sequences, then why the very same JS functions are sent back and forth on every request, and later I stumbled upon the google.com/s?q=QUERY JSONP service.

Give it a query, and it’ll return the suggested related phrases used to build the menu under the search input when suggestions and/or Instant are enabled (I didn’t dig much into the other parameters).

Anyway, what’s interesting is that, of course, the suggestions are customized on a per-country basis. To show the differences explicitly, let’s ask the service the simplest query possible, a:

For Italy you’ll get:

$ curl 'http://www.google.it/s?q=a'
["alice","","1"],["alitalia","","2"],["alice mail","","3"],
["apple","","4"],["agenzia delle entrate","","5"],

Hmm, let’s strip out the JSONP wrapper and the extra parameters:

$ curl -s 'http://www.google.it/s?q=a' | ruby -rjson -ne 'puts JSON($_[19..-2])[1].map(&:first).join(", ")'
ansa, alice, alitalia, alice mail, apple, agenzia delle entrate, audi, aci, autoscout, atm
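The one-liner above relies on a hard-coded offset of 19 characters to skip the JSONP prefix. As a minimal sketch of the same idea without the magic number, the snippet below looks for the first "(" and the last ")" instead. The method names and the sample body are made up for illustration, and the callback name in the sample is only a guess that happens to be 19 characters long; the real response may well look different.

```ruby
require "json"

# Strip a JSONP wrapper such as callback(...) and return the parsed payload.
# Assumption: the payload sits between the first "(" and the last ")".
def unwrap_jsonp(body)
  start  = body.index("(")
  finish = body.rindex(")")
  if start.nil? || finish.nil? || finish < start
    raise ArgumentError, "not a JSONP response"
  end
  JSON.parse(body[(start + 1)...finish])
end

# Return only the suggested phrases.
# Assumption: the payload looks like [query, [[phrase, "", rank], ...], ...].
def suggestions(body)
  unwrap_jsonp(body)[1].map(&:first)
end

# Hypothetical response body, shaped after the output shown earlier.
sample = 'window.google.ac.h(["a",[["alice","","1"],["alitalia","","2"],["alice mail","","3"]],{}])'
puts suggestions(sample).join(", ")
```

Searching for the parentheses keeps the parsing working if the callback name changes length, which the fixed offset would not survive.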

For the US you’ll get:

amazon, aol, att, apple, american airlines, abc, ask.com, amtrak, addicting games, aim

For the UK you’ll get:

argos, amazon, asda, asos, autotrader, aa route planner, aol, apple, amazon uk, aqa

For Ireland you’ll get:

aer lingus, aib, argos, amazon.co.uk, argos.ie, asos, aa route planner, amazon, aldi, aib internet banking

Lastly, because I’ve been there lately and it has been a profound experience, Cuba:

asus, antonio maceo, amor, amigos, ain, antivirus, avira, alba, aduana, as

I’m sure @nhaima is smiling at these words, because hell yeah, over there they really google antivirus software (avira is one of them) a lot, because it’s a world without the Internet, thus without free software: you’re condemned to using Windows stuff, and you get what you pay for. Antonio Maceo was a hero of the 19th-century revolution, and he’s in the hearts of the Cuban people. Amor, Amigos! :-)

Anyway, it looks like simple queries like this really give an insight into what a population thinks and/or needs, because they’re surely generated from search trends, and thus are the “most searched words”. Am I discovering hot water? Maybe, but it was fun to rediscover it. Just make sure not to hammer the /s service with too many requests, because they’ll be handled by the same cluster of machines anyway, so you’ll get banned quickly (I was :-p).
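For anyone repeating the experiment across several countries, a pause between requests is the obvious courtesy. Here is a small sketch of such a loop: each_politely is a made-up helper, the delay value is arbitrary, and the hosts other than www.google.it are assumptions. It only builds the URLs, and I make no claim about what rate the service actually tolerates.

```ruby
# Call the block once per host, pausing between consecutive calls.
# `pause` is injectable so the loop can be exercised without really sleeping.
def each_politely(hosts, delay: 2.0, pause: ->(seconds) { sleep(seconds) })
  hosts.each_with_index.map do |host, i|
    pause.call(delay) if i > 0   # no pause before the very first call
    yield host
  end
end

# Usage sketch: build the URLs only; the actual fetching is left out on purpose.
# Hosts other than www.google.it are assumptions.
hosts = %w[www.google.it www.google.com www.google.co.uk]
urls = each_politely(hosts, delay: 0.1) { |host| "http://#{host}/s?q=a" }
puts urls
```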
