Learning about world cultures via Google Autocomplete
Out of curiosity, I was looking at how a browser interacts with the Google Instant
backend. While inspecting the HTTP exchanges in Firebug, I first wondered
why they encode HTML and JS with \xYY escape sequences, then why the
very same JS functions are sent back and forth on every request, and later I
stumbled upon the google.com/s?q=QUERY JSONP service.
Give it a query, and it returns the suggested related phrases used to build the menu under the search input when suggestions and/or Instant are enabled (I didn’t dig too much into all the other parameters).
Anyway, what’s interesting is that, of course, the suggestions are customized
on a per-country basis. To show the differences explicitly, let’s ask the
service the simplest query possible, a:
For Italy you’ll get:
$ curl http://www.google.it/s?q=a
window.google.ac.h(["a",[["ansa","","0"],
["alice","","1"],["alitalia","","2"],["alice mail","","3"],
["apple","","4"],["agenzia delle entrate","","5"],
["audi","","6"],["aci","","7"],["autoscout","","8"],
["atm","","9"]],"","","","","",{}])
Hmm, let’s strip out the JSONP wrapper and the extra parameters:
$ curl -s http://www.google.it/s?q=a | ruby -rjson -ne 'puts JSON($_[19..-2])[1].map(&:first).join(", ")'
ansa, alice, alitalia, alice mail, apple, agenzia delle entrate, audi, aci, autoscout, atm
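The one-liner above relies on the callback prefix being exactly 19 characters long. As a minimal sketch of the same idea that doesn't depend on that length, the payload can be pulled out with a regex instead. The `suggestions` helper name is made up for illustration, and the sample string is a shortened copy of the Italian response shown earlier:

```ruby
require "json"

# Hypothetical helper: extract the suggestion strings from a JSONP response
# such as  window.google.ac.h(["a",[["ansa","","0"], ...],"","","","","",{}])
# Everything between the first "(" and the last ")" is treated as JSON,
# so the callback name can change without breaking the slicing.
def suggestions(jsonp)
  payload = jsonp[/\((.*)\)/m, 1]
  raise ArgumentError, "not a JSONP response" if payload.nil?
  # Element 1 of the payload is the list of [phrase, "", rank] triples.
  JSON.parse(payload)[1].map(&:first)
end

sample = 'window.google.ac.h(["a",[["ansa","","0"],["alice","","1"],["alitalia","","2"]],"","","","","",{}])'
puts suggestions(sample).join(", ")
# prints: ansa, alice, alitalia
```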
For the US you’ll get:
amazon, aol, att, apple, american airlines, abc, ask.com, amtrak, addicting games, aim
UK:
argos, amazon, asda, asos, autotrader, aa route planner, aol, apple, amazon uk, aqa
Ireland:
aer lingus, aib, argos, amazon.co.uk, argos.ie, asos, aa route planner, amazon, aldi, aib internet banking
Lastly, because I was there recently and it was a profound experience, Cuba:
asus, antonio maceo, amor, amigos, ain, antivirus, avira, alba, aduana, as
I’m sure @nhaima is smiling at these words, because hell yeah, over there they really do google antivirus software a lot (Avira is one of them): it’s a world without the Internet, thus without free software, so you’re condemned to using Windows stuff, and you get what you pay for. Antonio Maceo was a hero of the 19th century revolution, and he’s in the heart of the Cuban people. Amor, Amigos! :-)
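To make the per-country differences a bit more tangible, here is a small sketch that takes the five lists above (copied verbatim, nothing fetched) and shows which suggestions appear in more than one country. The two-letter keys are just labels I picked for the example:

```ruby
# Suggestion lists copied from the outputs above; keys are arbitrary labels.
lists = {
  "IT" => "ansa, alice, alitalia, alice mail, apple, agenzia delle entrate, audi, aci, autoscout, atm",
  "US" => "amazon, aol, att, apple, american airlines, abc, ask.com, amtrak, addicting games, aim",
  "UK" => "argos, amazon, asda, asos, autotrader, aa route planner, aol, apple, amazon uk, aqa",
  "IE" => "aer lingus, aib, argos, amazon.co.uk, argos.ie, asos, aa route planner, amazon, aldi, aib internet banking",
  "CU" => "asus, antonio maceo, amor, amigos, ain, antivirus, avira, alba, aduana, as"
}.transform_values { |line| line.split(", ") }

# Map each suggestion to the countries whose list contains it.
countries_by_term = Hash.new { |hash, term| hash[term] = [] }
lists.each do |country, terms|
  terms.each { |term| countries_by_term[term] << country }
end

# Keep only the suggestions shared by at least two countries.
shared = countries_by_term.select { |_term, countries| countries.size > 1 }
shared.each { |term, countries| puts "#{term}: #{countries.join(', ')}" }
```

With these lists, only a handful of terms are shared (apple, amazon, aol, argos, asos, aa route planner), and none of them show up in the Cuban list.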
Anyway, it looks like simple queries like this really give an insight into what a population thinks and/or needs, because the suggestions are surely generated from search trends, and thus are the “most searched words”. Am I stating the obvious? Maybe, but it was fun to rediscover it. Just make sure not to hammer the /s service with too many requests: they’ll all be handled by the same cluster of machines anyway, so you’ll get banned quickly (I was :-p).
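If you want to play with several country domains, one simple way to avoid hammering the service is to pause between requests. This is only a sketch: `each_politely` is a made-up helper, the delay is a guess (I don't know the actual limits), and the host list beyond www.google.it is an assumption. The block here just builds the URLs; a real run would put the HTTP call inside it.

```ruby
# Hypothetical helper: visit each host in turn, sleeping between calls so
# the requests are spaced out. The block does the actual work per host.
def each_politely(hosts, delay: 2.0)
  hosts.each_with_index.map do |host, index|
    sleep(delay) if index > 0 # no need to wait before the first call
    yield host
  end
end

# Assumed hosts, for illustration only.
hosts = %w[www.google.it www.google.com www.google.co.uk]
urls = each_politely(hosts, delay: 0.1) { |host| "http://#{host}/s?q=a" }
puts urls
```

To actually fetch, the block could call something like `Net::HTTP.get(URI(url))` from Ruby's standard library, with a longer delay than the one used here.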