I use pywhois to determine if a domain name is registered or not. Here is my source code. (all permutations from a.net
to zzz.net
)
<
This is a tricky problem to solve, trickier than most people realize. The reason for that is that some people don't want you to find that out. Most domain registrars apply lots of black magic (i.e. lots of TLD-specific hacks) to get the nice listings they provide, and often they get it wrong. Of course, in the end they will know for sure, since they have EPP access that will hold the authoritative answer (but it's usually done only when you click "order").
Your first method (whois) used to be a good one, and I did this on a large scale back in the 90s when everything was more open. Nowadays, many TLDs protect this information behind captchas and obstructive web interfaces, and whatnot. If nothing else, there will be quotas on the number of queries per IP. (And it may be for good reason too, I used to get ridiculous amounts of spam to email addresses used for registering domains). Also note that spamming their WHOIS databases with queries is usually in breach of their terms of use and you might get rate limited, blocked, or even get an abuse report to your ISP.
Your second method (DNS) is usually a lot quicker (but don't use gethostbyname, use Twisted or some other async DNS for efficiency). You need to figure out how the response for taken and free domains look like for each TLD. Just because a domain doesn't resolve doesn't mean its free (it could just be unused). And conversely, some TLDs have landing pages for all nonexisting domains. In some cases it will be impossible to determine using DNS alone.
So, how do you solve it? Not with ease, I'm afraid. For each TLD, you need to figure out how to make clever use of DNS and whois databases, starting with DNS and resorting to other means in the tricky cases. Make sure not to flood whois databases with queries.
Another option is to get API access to one of the registrars, they might offer programmatic access to domain search.