Learning by doing: Writing your own traceroute in 8 easy steps
Posted in Networking on July 29th, 2010 by Leonid Grinberg – 30 Comments
Anyone who administers even a moderately sized network knows that when problems arise, diagnosing and fixing them can be extremely difficult. They’re usually non-deterministic and difficult to reproduce, and very similar symptoms (e.g. a slow or unreliable connection) can be caused by any number of problems: congestion, a broken router, a bad physical link, etc.
One very useful weapon in a system administrator’s arsenal for dealing with network issues is traceroute (or tracert, if you use Windows). This is a neat little program that will print out the path that packets take to get from the local machine to a destination — that is, the sequence of routers that the packets go through.
Using traceroute is pretty straightforward. On a UNIX-like system, you can do something like the following:
$ traceroute google.com
traceroute to google.com (173.194.33.104), 30 hops max, 60 byte packets
1 router.lan (192.168.1.1) 0.595 ms 1.276 ms 1.519 ms
2 70.162.48.1 (70.162.48.1) 13.669 ms 17.583 ms 18.242 ms
3 ge-2-20-ur01.cambridge.ma.boston.comcast.net (68.87.36.225) 18.710 ms 19.192 ms 19.640 ms
4 be-51-ar01.needham.ma.boston.comcast.net (68.85.162.157) 20.642 ms 21.160 ms 21.571 ms
5 pos-2-4-0-0-cr01.newyork.ny.ibone.comcast.net (68.86.90.61) 28.870 ms 29.788 ms 30.437 ms
6 pos-0-3-0-0-pe01.111eighthave.ny.ibone.comcast.net (68.86.86.190) 30.911 ms 17.377 ms 15.442 ms
7 as15169-3.111eighthave.ny.ibone.comcast.net (75.149.230.194) 40.081 ms 41.018 ms 39.229 ms
8 72.14.238.232 (72.14.238.232) 20.139 ms 21.629 ms 20.965 ms
9 216.239.48.24 (216.239.48.24) 25.771 ms 26.196 ms 26.633 ms
10 173.194.33.104 (173.194.33.104) 23.856 ms 24.820 ms 27.722 ms
Pretty nifty. But how does it work? After all, when a packet leaves your network, you can’t monitor it anymore. So when it hits all those routers, the only way you can know about that is if one of them tells you about it.
The secret behind traceroute is a field called “Time To Live” (TTL) that is contained in the headers of the packets sent via the Internet Protocol. When a host receives a packet, it checks if the packet’s TTL is greater than 1 before sending it on down the chain. If it is, it decrements the field. Otherwise, it drops the packet and sends an ICMP TIME_EXCEEDED packet to the sender. This packet, like all IP packets, contains the address of its sender, i.e. the intermediate host.
traceroute works by sending consecutive requests to the same destination with increasing TTL fields. Most of these attempts result in messages from intermediate hosts saying that the packet was dropped. The IP addresses of these intermediate hosts are then printed on the screen (generally with an attempt made at determining the hostname) as they arrive, terminating when the maximum number of hosts have been hit (on my machine’s traceroute the default maximum is 30, but this is configurable), or when the intended destination has been reached.
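To make the TTL mechanism concrete before touching any sockets, here is a small simulation of the idea. It is only a sketch: the `probe` and `simulated_traceroute` names and the list-of-addresses model of a path are made up for illustration, no packets are sent, and it is written for Python 3.

```python
def probe(path, ttl):
    """Simulate one probe sent with the given TTL.

    `path` is the ordered list of hop addresses, ending with the destination.
    Returns (responder, reached_destination).
    """
    if not path:
        raise ValueError("path must not be empty")
    last_index = len(path) - 1
    for index, hop in enumerate(path):
        if index == last_index:
            return hop, True    # the destination itself answers
        if ttl <= 1:
            return hop, False   # this router drops the packet and reports back
        ttl -= 1                # otherwise it decrements the TTL and forwards
    raise AssertionError("unreachable")


def simulated_traceroute(path, max_hops=30):
    """Send probes with TTL 1, 2, 3, ... and record who answers each one."""
    hops = []
    for ttl in range(1, max_hops + 1):
        responder, reached = probe(path, ttl)
        hops.append((ttl, responder))
        if reached:
            break
    return hops


example_path = ["192.168.1.1", "70.162.48.1", "173.194.33.104"]
for ttl, responder in simulated_traceroute(example_path):
    print("%d\t%s" % (ttl, responder))
```

Each probe gets exactly one hop further than the previous one, which is why the replies arrive in path order.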
The rest of this post will walk through implementing a very primitive version of traceroute in Python. The real traceroute is of course more complicated than what we will create, with many configurable features and modes. Still, our version will implement the basic functionality, and at the end, we’ll have a really nice and short Python script that will do just fine for performing a simple traceroute.
So let’s begin. Our algorithm, at a high level, is an infinite loop whose body creates a connection, prints out information about it, and then breaks out of the loop if a certain condition has been reached. So we can start with the following skeletal code:
#!/usr/bin/python

def main(dest):
    while True:
        # ... open connections ...
        # ... print data ...
        # ... break if useful ...
        pass

if __name__ == "__main__":
    main('google.com')
Step 1: Turn a hostname into an IP address.
The socket module provides a gethostbyname() method that attempts to resolve a domain name into an IP address:
#!/usr/bin/python
import socket

def main(dest_name):
    dest_addr = socket.gethostbyname(dest_name)
    while True:
        # ... open connections ...
        # ... print data ...
        # ... break if useful ...
        pass

if __name__ == "__main__":
    main('google.com')
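Name resolution can fail (a typo in the hostname, no DNS), and the script above would then die with a traceback. A minimal sketch of a friendlier wrapper follows; the `resolve` helper is hypothetical, not part of the post's script, and is written for Python 3, where lookup failures surface as subclasses of `OSError`.

```python
import socket


def resolve(dest_name):
    """Return the IPv4 address for dest_name, or None if the lookup fails."""
    try:
        return socket.gethostbyname(dest_name)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return None


# A dotted-quad string is returned unchanged, without any DNS query.
print(resolve("127.0.0.1"))
```

Returning None keeps the decision about how to report the failure with the caller.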
Step 2: Create sockets for the connections.
We’ll need two sockets for our connections — one for receiving data and one for sending. We have a lot of choices for what kind of probes to send; let’s use UDP probes, which require a datagram socket (SOCK_DGRAM). The routers along our traceroute path are going to send back ICMP packets, so for those we need a raw socket (SOCK_RAW).
#!/usr/bin/python
import socket

def main(dest_name):
    dest_addr = socket.gethostbyname(dest_name)
    icmp = socket.getprotobyname('icmp')
    udp = socket.getprotobyname('udp')
    while True:
        recv_socket = socket.socket(socket.AF_INET, socket.SOCK_RAW, icmp)
        send_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, udp)
        # ... print data ...
        # ... break if useful ...

if __name__ == "__main__":
    main('google.com')
Step 3: Set the TTL field on the packets.
We’ll simply use a counter which begins at 1 and which we increment with each iteration of the loop. We set the TTL using the setsockopt() method of the socket object:
#!/usr/bin/python
import socket

def main(dest_name):
    dest_addr = socket.gethostbyname(dest_name)
    icmp = socket.getprotobyname('icmp')
    udp = socket.getprotobyname('udp')
    ttl = 1
    while True:
        recv_socket = socket.socket(socket.AF_INET, socket.SOCK_RAW, icmp)
        send_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, udp)
        send_socket.setsockopt(socket.SOL_IP, socket.IP_TTL, ttl)
        ttl += 1
        # ... print data ...
        # ... break if useful ...

if __name__ == "__main__":
    main('google.com')
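Setting the TTL does not need root, so it is easy to check on its own. The sketch below sets the option on a UDP socket and reads it back with getsockopt(). The `make_probe_socket` helper is a made-up name, the code targets Python 3, and it uses socket.IPPROTO_IP as the option level because socket.SOL_IP is not defined on every platform (the two have the same value on Linux).

```python
import socket


def make_probe_socket(ttl):
    """Create a UDP socket whose outgoing packets carry the given TTL."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
    return sock


sock = make_probe_socket(5)
try:
    # Read the option back to confirm the kernel accepted it.
    current_ttl = sock.getsockopt(socket.IPPROTO_IP, socket.IP_TTL)
finally:
    sock.close()
print(current_ttl)
```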
Step 4: Bind the sockets and send some packets.
Now that our sockets are all set up, we can put them to work! We first tell the receiving socket to listen to connections from all hosts on a specific port (most implementations of traceroute use ports from 33434 to 33534 so we will use 33434 as a default). We do this using the bind() method of the receiving socket object, by specifying the port and an empty string for the hostname. We can then use the sendto() method of the sending socket object to send to the destination host (on the same port). The first argument of the sendto() method is the data to send; in our case, we don’t actually have anything specific we want to send, so we can just give the empty string:
#!/usr/bin/python
import socket

def main(dest_name):
    dest_addr = socket.gethostbyname(dest_name)
    port = 33434
    icmp = socket.getprotobyname('icmp')
    udp = socket.getprotobyname('udp')
    ttl = 1
    while True:
        recv_socket = socket.socket(socket.AF_INET, socket.SOCK_RAW, icmp)
        send_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, udp)
        send_socket.setsockopt(socket.SOL_IP, socket.IP_TTL, ttl)
        recv_socket.bind(("", port))
        send_socket.sendto("", (dest_name, port))
        ttl += 1
        # ... print data ...
        # ... break if useful ...

if __name__ == "__main__":
    main('google.com')
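The bind() and sendto() calls can be tried without root by pointing two ordinary UDP sockets at each other over the loopback interface. This is only an illustration of the calls, not a traceroute: the loopback address and the kernel-chosen port are assumptions of the demo. It is written for Python 3, where the payload has to be a bytes object (b"") rather than the str "" used in the Python 2 listing above.

```python
import socket

# Receiver: bind to loopback and let the kernel pick a free port (port 0).
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
port = receiver.getsockname()[1]

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
try:
    # An empty datagram is legal in UDP; only the headers travel.
    sender.sendto(b"", ("127.0.0.1", port))
    # recvfrom() is used here only to confirm that the datagram arrived.
    data, (addr, src_port) = receiver.recvfrom(512)
finally:
    sender.close()
    receiver.close()

print(repr(data), addr)
```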
Step 5: Get the intermediate hosts’ IP addresses.
Next, we need to actually get our data from the receiving socket. For this, we can use the recvfrom() method of the object, whose return value is a tuple containing the packet data and the sender’s address. In our case, we only care about the latter. Note that the address is itself actually a tuple containing both the IP address and the port, but we only care about the former. recvfrom() takes a single argument, the blocksize to read — let’s go with 512.
It’s worth noting that some administrators disable receiving ICMP ECHO requests, pretty much specifically to prevent the use of utilities like traceroute, since the detailed layout of a network can be sensitive information (another common reason to disable them is the ping utility, which can be used for denial-of-service attacks). It is therefore completely possible that we’ll get a timeout error, which will result in an exception. Thus, we’ll wrap this call in a try/except block. Traditionally, traceroute prints asterisks when it can’t get the address of a host. We’ll do the same once we print out results.
#!/usr/bin/python
import socket

def main(dest_name):
    dest_addr = socket.gethostbyname(dest_name)
    port = 33434
    icmp = socket.getprotobyname('icmp')
    udp = socket.getprotobyname('udp')
    ttl = 1
    while True:
        recv_socket = socket.socket(socket.AF_INET, socket.SOCK_RAW, icmp)
        send_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, udp)
        send_socket.setsockopt(socket.SOL_IP, socket.IP_TTL, ttl)
        recv_socket.bind(("", port))
        send_socket.sendto("", (dest_name, port))
        curr_addr = None
        try:
            _, curr_addr = recv_socket.recvfrom(512)
            curr_addr = curr_addr[0]
        except socket.error:
            pass
        finally:
            send_socket.close()
            recv_socket.close()
        ttl += 1
        # ... print data ...
        # ... break if useful ...

if __name__ == "__main__":
    main('google.com')
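One caveat: a socket blocks forever in recvfrom() unless a timeout has been set on it, so a silent hop would stall the loop above rather than raise. A minimal sketch of a bounded wait follows. The `wait_for_reply` helper and the 0.2 second value are assumptions for illustration, the code targets Python 3, and it is demonstrated on a plain UDP socket so it runs without root; the same settimeout() call applies to the raw ICMP socket.

```python
import socket


def wait_for_reply(sock, timeout_seconds):
    """Return the sender's IP address, or None if nothing arrives in time."""
    sock.settimeout(timeout_seconds)
    try:
        _, (addr, _port) = sock.recvfrom(512)
        return addr
    except socket.timeout:
        return None


# Nobody sends to this socket, so the wait expires and we get None back.
quiet_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
quiet_socket.bind(("127.0.0.1", 0))
try:
    reply_addr = wait_for_reply(quiet_socket, 0.2)
finally:
    quiet_socket.close()
print(reply_addr)
```

A None result maps directly onto the asterisk that traceroute prints for a hop that did not answer.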
Step 6: Turn the IP addresses into hostnames and print the data.
To match traceroute‘s behavior, we want to try to display the hostname along with the IP address. The socket module provides the gethostbyaddr() method for reverse DNS resolution. The resolution can fail and result in an exception, in which case we’ll want to catch it and make the hostname the same as the address. Once we get the hostname, we have all the information we need to print our data:
#!/usr/bin/python
import socket

def main(dest_name):
    dest_addr = socket.gethostbyname(dest_name)
    port = 33434
    icmp = socket.getprotobyname('icmp')
    udp = socket.getprotobyname('udp')
    ttl = 1
    while True:
        recv_socket = socket.socket(socket.AF_INET, socket.SOCK_RAW, icmp)
        send_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, udp)
        send_socket.setsockopt(socket.SOL_IP, socket.IP_TTL, ttl)
        recv_socket.bind(("", port))
        send_socket.sendto("", (dest_name, port))
        curr_addr = None
        curr_name = None
        try:
            _, curr_addr = recv_socket.recvfrom(512)
            curr_addr = curr_addr[0]
            try:
                curr_name = socket.gethostbyaddr(curr_addr)[0]
            except socket.error:
                curr_name = curr_addr
        except socket.error:
            pass
        finally:
            send_socket.close()
            recv_socket.close()
        if curr_addr is not None:
            curr_host = "%s (%s)" % (curr_name, curr_addr)
        else:
            curr_host = "*"
        print "%d\t%s" % (ttl, curr_host)
        ttl += 1
        # ... break if useful ...

if __name__ == "__main__":
    main('google.com')
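The lookup and formatting logic can also be pulled out into two small functions, which makes the output format easy to check without any network access. This is a sketch in Python 3: `lookup_name` and `format_hop` are hypothetical helper names, not functions from the post's script.

```python
import socket


def lookup_name(addr):
    """Reverse-resolve addr, falling back to the address itself on failure."""
    try:
        return socket.gethostbyaddr(addr)[0]
    except OSError:
        # socket.herror and socket.gaierror are subclasses of OSError.
        return addr


def format_hop(ttl, addr, name=None):
    """Build one output line; a hop that never answered becomes '*'."""
    if addr is None:
        return "%d\t*" % ttl
    return "%d\t%s (%s)" % (ttl, name or addr, addr)


print(format_hop(1, "192.168.1.1", "router.lan"))
print(format_hop(2, "70.162.48.1"))
print(format_hop(3, None))
```

In the loop, the two would be combined as format_hop(ttl, curr_addr, lookup_name(curr_addr)) whenever curr_addr is not None.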
Step 7: End the loop.
There are two conditions for exiting our loop: either we have reached our destination (that is, curr_addr is equal to dest_addr)[1] or we have exceeded some maximum number of hops. We will set our maximum at 30:
#!/usr/bin/python
import socket

def main(dest_name):
    dest_addr = socket.gethostbyname(dest_name)
    port = 33434
    max_hops = 30
    icmp = socket.getprotobyname('icmp')
    udp = socket.getprotobyname('udp')
    ttl = 1
    while True:
        recv_socket = socket.socket(socket.AF_INET, socket.SOCK_RAW, icmp)
        send_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, udp)
        send_socket.setsockopt(socket.SOL_IP, socket.IP_TTL, ttl)
        recv_socket.bind(("", port))
        send_socket.sendto("", (dest_name, port))
        curr_addr = None
        curr_name = None
        try:
            _, curr_addr = recv_socket.recvfrom(512)
            curr_addr = curr_addr[0]
            try:
                curr_name = socket.gethostbyaddr(curr_addr)[0]
            except socket.error:
                curr_name = curr_addr
        except socket.error:
            pass
        finally:
            send_socket.close()
            recv_socket.close()
        if curr_addr is not None:
            curr_host = "%s (%s)" % (curr_name, curr_addr)
        else:
            curr_host = "*"
        print "%d\t%s" % (ttl, curr_host)
        ttl += 1
        if curr_addr == dest_addr or ttl > max_hops:
            break

if __name__ == "__main__":
    main('google.com')
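The address comparison is a heuristic. A stricter check looks at the ICMP message itself: routers along the way answer with type 11 (time exceeded), while the destination answers a UDP probe to a closed port with type 3, code 3 (port unreachable). The sketch below shows how those two fields could be read from the bytes returned by recvfrom(). It assumes, as is the case for IPv4 raw sockets on Linux, that the data starts with the IP header; the helper names are made up and the sample packets are hand-built for illustration. It is written for Python 3.

```python
import struct

ICMP_DEST_UNREACHABLE = 3
ICMP_PORT_UNREACHABLE_CODE = 3
ICMP_TIME_EXCEEDED = 11


def icmp_type_and_code(packet):
    """Extract the ICMP type and code from raw IPv4 packet bytes."""
    # The low nibble of the first byte is the IP header length in 32-bit words.
    header_len = (packet[0] & 0x0F) * 4
    icmp_type, icmp_code = struct.unpack_from("!BB", packet, header_len)
    return icmp_type, icmp_code


def is_final_hop(packet):
    """True if the reply is 'port unreachable', i.e. the destination answered."""
    return icmp_type_and_code(packet) == (
        ICMP_DEST_UNREACHABLE, ICMP_PORT_UNREACHABLE_CODE)


# Hand-built samples: a 20-byte IP header (0x45 = IPv4, 5 words) plus ICMP bytes.
time_exceeded = bytes([0x45]) + bytes(19) + bytes([11, 0, 0, 0])
port_unreachable = bytes([0x45]) + bytes(19) + bytes([3, 3, 0, 0])
print(is_final_hop(time_exceeded), is_final_hop(port_unreachable))
```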
Step 8: Run the code!
We’re done! Let’s save this to a file and run it! Because raw sockets require root privileges, traceroute is typically setuid. For our purposes, we can just run the script as root:
$ sudo python poor-mans-traceroute.py
[sudo] password for leonidg:
1 router.lan (192.168.1.1)
2 70.162.48.1 (70.162.48.1)
3 ge-2-20-ur01.cambridge.ma.boston.comcast.net (68.87.36.225)
4 be-51-ar01.needham.ma.boston.comcast.net (68.85.162.157)
5 pos-2-4-0-0-cr01.newyork.ny.ibone.comcast.net (68.86.90.61)
6 pos-0-3-0-0-pe01.111eighthave.ny.ibone.comcast.net (68.86.86.190)
7 as15169-3.111eighthave.ny.ibone.comcast.net (75.149.230.194)
8 72.14.238.232 (72.14.238.232)
9 216.239.48.24 (216.239.48.24)
10 173.194.33.104 (173.194.33.104)
Hurrah! The data matches the real traceroute‘s perfectly.
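If the script is run without sufficient privileges, it fails at the first socket.socket(..., SOCK_RAW, ...) call with "Operation not permitted". A small up-front check can turn that into a clearer message. This is a sketch in Python 3, and `can_open_raw_icmp_socket` is a hypothetical helper rather than part of the script above.

```python
import socket


def can_open_raw_icmp_socket():
    """Return True if this process is allowed to create a raw ICMP socket."""
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_RAW,
                             socket.IPPROTO_ICMP)
    except OSError:
        # For an unprivileged user this is normally PermissionError (EPERM).
        return False
    sock.close()
    return True


if not can_open_raw_icmp_socket():
    print("raw sockets are not permitted; try running with sudo")
```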
Of course, there are many improvements that we could make. As I mentioned, the real traceroute has a whole slew of other features, which you can learn about by reading the manpage. In the meantime, I wrote a slightly more complete version of the above code that allows configuring the port and max number of hops, as well as specifying the destination host. You can download it at my github repository.
Alright folks, what UNIX utility should we write next? strace, anyone?[2]
1 This is not quite how the real traceroute works. Rather than checking the IP addresses of the hosts and stopping when the destination address matches, it stops when it receives an ICMP “port unreachable” message, which means that the host has been reached. For our purposes, though, this simple address heuristic is good enough.
2 Ksplice blogger Nelson took up a DIY strace on his personal blog, Made of Bugs.

Great execution!
I love the way you show the flow of coding, going from the design in clearly visible steps to the finished code.
Showing only the new codeline is neat!
Thanks! I confess that I was inspired by my friend Evan (http://blog.ksplice.com/2010/07/building-filesystems-the-way-you-build-web-apps/) when writing this.
Great article, very clear and well organized.
I challenge you to write strace. You will find it some orders of magnitude more difficult.
Love it. I’ve always wondered how traceroute works. And yes yes please, I’d love to see strace explained like this as well. Thanks!
Thanks! I’m glad this was helpful/interesting.
As for strace, Jing Jong is right — implementing it is definitely much harder. I’ll see if I can con another Ksplice blogger into doing it!
There’s also tracepath which is similar to traceroute but doesn’t require root privs. Any idea how it works?
Hi, I’ve tried both poor-mans-traceroute and traceroute itself (as you have them in github) and none worked.
I’ve tried in OSx, with both python 2.6 and 2.5:
silverbullet:python wu$ python traceroute.py 212.89.9.184
Traceback (most recent call last):
  File "traceroute.py", line 71, in <module>
    max_hops=int(options.max_hops)))
  File "traceroute.py", line 24, in main
    recv_socket, send_socket = create_sockets(ttl)
  File "traceroute.py", line 15, in create_sockets
    recv_socket = socket.socket(socket.AF_INET, socket.SOCK_RAW, icmp)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/socket.py", line 159, in __init__
    _sock = _realsocket(family, type, proto)
socket.error: (1, 'Operation not permitted')
silverbullet:python wu$ python2.5 traceroute.py 212.89.9.184
Traceback (most recent call last):
  File "traceroute.py", line 71, in <module>
    max_hops=int(options.max_hops)))
  File "traceroute.py", line 24, in main
    recv_socket, send_socket = create_sockets(ttl)
  File "traceroute.py", line 15, in create_sockets
    recv_socket = socket.socket(socket.AF_INET, socket.SOCK_RAW, icmp)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/socket.py", line 159, in __init__
    _sock = _realsocket(family, type, proto)
socket.error: (1, 'Operation not permitted')
silverbullet:python wu$ python poor-mans-traceroute.py
Traceback (most recent call last):
  File "poor-mans-traceroute.py", line 44, in <module>
    main('google.com')
  File "poor-mans-traceroute.py", line 13, in main
    recv_socket = socket.socket(socket.AF_INET, socket.SOCK_RAW, icmp)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/socket.py", line 159, in __init__
    _sock = _realsocket(family, type, proto)
socket.error: (1, 'Operation not permitted')
And I’ve tried in FreeBSD 8 with python 2.6 too, same result:
[nidhogg] ~> python traceroute.py google.com
Traceback (most recent call last):
  File "traceroute.py", line 71, in <module>
    max_hops=int(options.max_hops)))
  File "traceroute.py", line 24, in main
    recv_socket, send_socket = create_sockets(ttl)
  File "traceroute.py", line 15, in create_sockets
    recv_socket = socket.socket(socket.AF_INET, socket.SOCK_RAW, icmp)
  File "/usr/local/lib/python2.6/socket.py", line 182, in __init__
    _sock = _realsocket(family, type, proto)
socket.error: [Errno 1] Operation not permitted
Any idea? Am I the only one with that problem?
Oops, forget the previous comment, I forgot to use sudo!
Very nice, going to give that a try. Thanks for sharing
Why did you choose not to use ‘dest_addr’ in the send_socket.sendto function call? I know it was used as a loop termination condition, but if it was only used there, then there was no need to add that code until step 7.
dest_addr = socket.gethostbyname(dest_name)
…
send_socket.sendto(“”, (dest_name, port))
Should be
send_socket.sendto(“”, (dest_name, port))
IIRC, Python will do a hostname lookup for each sendto call. Sure, it’ll cache, but the point remains.
Sorry…should be:
send_socket.sendto(“”, (dest_addr, port))
Very nice work!
And a very educational post!
I don’t normally comment on blogs (can’t be bothered most of the time), but this was a nice, brief, yet interesting article not compounded by extraneous opinions or waffle. I especially loved the way you introduced new code.
Loved it!
Brilliant! Great article. Also shows how powerful python’s “batteries included” design philosophy can be.
Great post
This is great article, both from the perspective of understanding network behavior, but also using Python’s raw socket calls. Thanks!
I’ve made some modifications to this script so it behaves a little more like traceroute: I’ve added the default 5-second timeout for when certain hosts do not respond. This prevents it from getting stuck indefinitely on a particular hop. Like traceroute, this code also tries three times for each TTL (unless it gets a response), and it prints an asterisk for each failure.
You can see the modified code here: http://gist.github.com/502451
Traceback (most recent call last):
  File "C:\Users\admin\Desktop\python\script1.py", line 44, in <module>
    main('google.com')
  File "C:\Users\admin\Desktop\python\script1.py", line 17, in main
    send_socket.sendto("", dest_addr, port))
TypeError: sendto() takes exactly 3 arguments (2 given)
——-
Using python 3.1.2 and I am under administrator mode in command prompt. Is there a way around this?
noobstar: It’s hard to debug since I don’t see what you are running, but judging from:
File "C:\Users\admin\Desktop\python\script1.py", line 17, in main send_socket.sendto("", dest_addr, port))
I am guessing that you want
send_socket.sendto("", (dest_addr, port))
instead (i.e. put parentheses around the last two arguments).
Hi,
Sorry I must’ve pasted it weirdly, but the parentheses are there.
Traceback (most recent call last):
  File "C:\Users\admin\Desktop\python\script1.py", line 44, in <module>
    main('google.com')
  File "C:\Users\admin\Desktop\python\script1.py", line 17, in main
    send_socket.sendto("", (dest_addr,port))
TypeError: sendto() takes exactly 3 arguments (2 given)
—–
After trying everything, I’ve copied and pasted the original source from the github repository. I am running Windows 7, Python 3.1.2, cmd prompt with administrator enabled. I’ve also tried putting an “extra” argument in the form of 0, “” and []. It gives me the same form of error but the last line says: TypeError: must be bytes or buffer, not str.
If you need more details about my setup, just give another post.
Posted my implementation of this in Ruby :
http://artiom.posterous.com/ruby-and-python-are-brothers
Wow, awesome article, thanks. Brings back memories of networking class at uni (most of which I’ve gradually forgotten over the years).
An article on poor man’s “whois” and “nmap” would be sweet.
Thanks,
Ab
noobstar: you need to install Python 2.71 for this code to work ….. else you need to modify the code to match the Python 3.1 syntax.
Great article, very clear and well organized.
Can you code it in C#?
wow !!
This is Pretty useful =)
Thanks for sharing your article & code
No discussion of traceroute would be complete without a mention of “Tracer T”: http://www.youtube.com/watch?v=SXmv8quf_xM
Linux version: http://www.youtube.com/watch?v=6WHu1EM8CgY
Great post – I also love the highlighted new code approach.
One interesting aspect of traceroute is the fact that the ICMP packets are allowed back in through your NAT/firewall. Curious huh? My buddy Samy Kamkar has exploited that for NAT traversal — check it out:
http://samy.pl/pwnat/