Very interesting, I ran into urlparse's shortcomings while prototyping a web crawler just last week! I created something similar to identify the domain, subdomain, directories, pages, and FQDN. Note that if you run socket.getfqdn against most cloud servers, you get back some weird string with the IP address embedded in it, e.g. 182.43.210.102.static.cloud-server.com
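For anyone curious what that kind of split looks like, here is a minimal sketch, not the actual crawler code. The helper name `split_url` and the "last two labels are the domain" rule are assumptions made for illustration; that rule is wrong for suffixes like .co.uk, where you would need a public suffix list. It only parses the string and does no DNS lookup, so it sidesteps the getfqdn oddity above.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Hypothetical helper: break a URL into crawler-friendly pieces."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".")

    # Naive assumption: the registered domain is the last two labels.
    # This breaks on multi-part suffixes such as .co.uk.
    domain = ".".join(labels[-2:])
    subdomain = ".".join(labels[:-2])

    # Non-empty path segments, e.g. "/a/b/index.html" -> ["a", "b", "index.html"].
    segments = [s for s in parts.path.split("/") if s]

    # Treat the last segment as a page unless the path ends with a slash.
    if segments and not parts.path.endswith("/"):
        page = segments[-1]
        directories = segments[:-1]
    else:
        page = ""
        directories = segments

    return {
        "host": host,
        "domain": domain,
        "subdomain": subdomain,
        "directories": directories,
        "page": page,
    }


if __name__ == "__main__":
    print(split_url("http://blog.example.com/a/b/index.html?x=1"))
```

Splitting the host by hand is the fragile part; the path handling is plain string work on top of what urlsplit already gives you.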
Let's merge some code!