Mini-mashup - a quick stab at mapping IP addresses

(First post! Something simple to get started with...)

Wheel re-invention, we all understand, is a bad thing. But don't let a mantra like that stop you! Sometimes re-inventing a wheel is just the best way to really understand how wheels work, and who knows, you may invent a better wheel.

With that in mind I had a play around with visualising the route my home PC takes when talking to the host of this blog, Laughing Squid. Web developers have long since mashed up utilities such as traceroute with the Google Maps API to show the route taken from your PC to another IP address, but I was curious to know the details. Obtaining and displaying this kind of data on the command line is the kind of thing you might hope would be trivial, and thankfully it doesn't seem too hard to get a rough and ready result, but the emphasis is definitely on the 'rough'.

First off, the bulk of the work is done by the common-or-garden *nix utility 'traceroute'. In a nutshell, it reports basic details of each machine your packets hop through on the way to a specified destination.


$ traceroute laughingsquid.com

traceroute to laughingsquid.com (72.32.93.164), 30 hops max, 40 byte packets
1 192.168.1.1 (192.168.1.1) 0.594 ms 0.899 ms 1.199 ms
2 10.128.192.1 (10.128.192.1) 9.130 ms 9.376 ms 15.346 ms
...
15 aggr115a.dfw1.rackspace.net (72.3.129.109) 123.069 ms 123.392 ms 127.058 ms
16 octopus.laughingsquid.net (72.32.93.164) 127.960 ms 123.888 ms 121.828 ms

Extremely helpful, but very boring. A helpful article in Linux Journal notes that it's possible to look up an IP address with a simple web query to the public NetGeo database, although the location data is now really out of date and so has to be taken with a big pinch of salt. You can try it out in your web browser:

http://netgeo.caida.org/perl/netgeo.cgi?target=my_ip_address

Returning a webpage isn't the most helpful form for a one-liner, but it's got a basic structure so you can do some rough and ready parsing to pull out the good bits. Using wget (“web get”) with the -O - option (print the output to standard out) you can look up the geographical info of an IP address in a one-liner:


$ function geolookup { wget -q -O - "http://netgeo.caida.org/perl/netgeo.cgi?target=$1" | grep -E '^[A-Z]+:' | sed 's/<br>//'; }

$ geolookup 207.46.193.254 # lookup microsoft

TARGET: 207.46.193.254
NAME: MICROSOFT-GLOBAL-NET
NUMBER: 207.46.0.0 - 207.46.255.255
CITY: REDMOND
STATE: WASHINGTON
COUNTRY: US
LAT: 47.67
LONG: -122.12
NIC: ARIN
RATING:
STATUS: OK

Very handy! Incidentally, the latitude and longitude are exactly what you'd need if you were going to plug this into the Google Maps API.
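
As a rough sketch of how you might pull just the coordinates out of that output: the helper below reads "KEY: value" lines on standard input and prints the latitude and longitude as one comma-separated pair. The name latlong is made up for this example, and it assumes the output looks like the sample above (no trailing whitespace or carriage returns after the values).

```shell
# latlong: hypothetical helper, not part of the original one-liners.
# Reads NetGeo-style "KEY: value" lines on stdin and prints "LAT,LONG".
latlong() {
    awk -F': *' '
        $1 == "LAT"  { lat = $2 }   # remember the latitude value
        $1 == "LONG" { lon = $2 }   # remember the longitude value
        END { if (lat != "" && lon != "") print lat "," lon }
    '
}
```

You would use it as a filter on the end of the lookup, e.g. geolookup 207.46.193.254 | latlong. If either field is missing it prints nothing, which seems a safer default than printing half a coordinate.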

One fly in the ointment is that the NetGeo database only accepts IP addresses, not domain names. As a hack (and I'm sure there's probably a cleaner way to do this) you can make use of 'ping' to resolve domain names to IP addresses for you. (I find this easier than parsing the output of 'host', which is the other, better, alternative).


$ function iplookup { ping -c 1 "$1" | grep 'PING' | sed 's/[()]//g' | awk '{print $3}'; }
$ iplookup www.microsoft.com
207.46.193.254
$ geolookup $(iplookup www.microsoft.com)
TARGET: 207.46.193.254
NAME: MICROSOFT-GLOBAL-NET
..
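
For comparison, here is a minimal sketch of the 'host' route mentioned above. It assumes host prints lines of the form "NAME has address A.B.C.D" for IPv4 results, and it simply takes the first one. The function names first_ipv4 and hostlookup are invented for this example.

```shell
# first_ipv4: hypothetical helper. Reads `host`-style output on stdin and
# prints the first IPv4 address. Alias lines and "has IPv6 address" lines
# do not contain " has address " and so are skipped.
first_ipv4() {
    awk '/ has address / { print $NF; exit }'
}

# hostlookup: hypothetical alternative to iplookup, built on `host`
# instead of `ping`.
hostlookup() {
    host "$1" | first_ipv4
}
```

It would slot in exactly where iplookup does, e.g. geolookup $(hostlookup www.microsoft.com). One small advantage over the ping hack is that it doesn't need the target to answer pings, only to resolve.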

In order to feed the output of traceroute through geolookup we need to scrape the IP addresses out of each line. The function below drops the header line and any hop that timed out (shown as '*'), then prints the IP address of each remaining hop:

$ function iproute { traceroute "$1" | grep -Ev 'traceroute|\*' | sed 's/[()]//g' | awk '{ print $3 }'; }
192.168.1.1
10.128.192.1
...
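
Dropping every line that contains a '*' also throws away hops where only one or two of the three probes timed out. A slightly more forgiving sketch is to take the first address in parentheses on each hop line instead. The name hop_ips is made up for this example, and it assumes traceroute's usual "hostname (a.b.c.d)" layout as in the output shown earlier.

```shell
# hop_ips: hypothetical filter. Reads traceroute output on stdin and prints
# the first parenthesised IPv4 address on each hop line.
hop_ips() {
    awk '
        /^traceroute/ { next }    # skip the header line
        # Hops where every probe timed out have no "(a.b.c.d)" and are skipped.
        match($0, /\([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\)/) {
            # Strip the surrounding parentheses from the matched text.
            print substr($0, RSTART + 1, RLENGTH - 2)
        }
    '
}
```

Usage would be along the lines of traceroute laughingsquid.com | hop_ips. Note this changes the behaviour slightly from iproute: partially timed-out hops are kept rather than dropped.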

It's then possible to loop over the output of iproute, feed it into geolookup, and strip off the info we want from that. Here, I'm just reporting the second word in any line that starts with CITY.


$ for i in $(iproute www.microsoft.com); do geolookup $i | grep CITY | awk '{print $2}'; done

MARINA
MARINA
AMSTERDAM
AMSTERDAM
FARNBOROUGH
AMSTERDAM
MUNSTER
AMSTERDAM
REDMOND
REDMOND
REDMOND
REDMOND
REDMOND
REDMOND
REDMOND

Hmm. Sort of there. The approach is OK, but the content is a bit crap.

The MARINA entries are the city result returned for local network addresses (192.168.*) and would need special casing to handle properly. More serious is that you don't have to try this out on many addresses before you find problems with the NetGeo database and/or uninteresting traceroute results, although I can't say I wasn't warned: the NetGeo front page states all over it that it's unreliable!
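
One way the special casing could look, as a rough sketch: filter out private addresses before they ever reach geolookup. The ranges below are the RFC 1918 private blocks (10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16), which is a little broader than just 192.168.*. The function names are invented for this example.

```shell
# is_private_ip: hypothetical helper. Succeeds (exit status 0) if $1 falls
# in one of the RFC 1918 private ranges.
is_private_ip() {
    case "$1" in
        10.*|192.168.*) return 0 ;;
        172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;   # 172.16 to 172.31
        *) return 1 ;;
    esac
}

# skip_private: hypothetical filter. Copies stdin to stdout, one address
# per line, dropping any private addresses.
skip_private() {
    while IFS= read -r ip; do
        is_private_ip "$ip" || printf '%s\n' "$ip"
    done
}
```

It would sit between the two existing steps, e.g. for i in $(iproute www.microsoft.com | skip_private); do geolookup $i | grep CITY | awk '{print $2}'; done. This only hides the bogus entries; it does nothing about the accuracy of the database itself.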