What Python way would you suggest to check whois database records?

旧巷老猫 提交于 2019-11-29 13:54:54

问题


I'm trying to get a webservice up and running that actually requires to check whois databases. What I'm doing right now is ugly and I'd like to avoid it as much as I can: I call gwhois command and parse its output. Ugly.

I did some search to try to find a pythonic way to do this task. Generally I got quite much nothing - this old discussion list link has a way to check if domain exist. Quite not what I was looking for... But still, it was best anwser Google gave me - everything else is just a bunch of unanwsered questions.

Any of you have succeeded to get some method up and running? I'd very much appreciate some tips, or should I just do it the opensource-way, sit down and code something by myself? :)


回答1:


There's nothing wrong with using a command line utility to do what you want. If you put a nice wrapper around the service, you can implement the internals however you want! For example:

class Whois(object):
    _whois_by_query_cache = {}

    def __init__(self, query):
        """Initializes the instance variables to defaults. See :meth:`lookup`
        for details on how to submit the query."""
        self.query = query
        self.domain = None
        # ... other fields.

    def lookup(self):
        """Submits the `whois` query and stores results internally."""
        # ... implementation

Now, whether or not you roll your own using urllib, wrap around a command line utility (like you're doing), or import a third party library and use that (like you're saying), this interface stays the same.

This approach is generally not considered ugly at all -- sometimes command utilities do what you want and you should be able to leverage them. If speed ends up being a bottleneck, your abstraction makes the process of switching to a native Python implementation transparent to your client code.

Practicality beats purity -- that's what's Pythonic. :)




回答2:


Look at this: http://code.google.com/p/pywhois/

pywhois - Python module for retrieving WHOIS information of domains

Goal: - Create a simple importable Python module which will produce parsed WHOIS data for a given domain. - Able to extract data for all the popular TLDs (com, org, net, ...) - Query a WHOIS server directly instead of going through an intermediate web service like many others do. - Works with Python 2.4+ and no external dependencies

Example:

>>> import pywhois
>>> w = pywhois.whois('google.com')
>>> w.expiration_date
['14-sep-2011']
>>> w.emails
['contact-admin@google.com',
 'dns-admin@google.com',
 'dns-admin@google.com',
 'dns-admin@google.com']
>>> print w
...



回答3:


Found this question in the process of my own search for a python whois library.

Don't know that I agree with cdleary's answer that using a library that wraps a command is always the best way to go - but I can see his reasons why he said this.

Pro: cmd-line whois handles all the hard work (socket calls, parsing, etc)

Con: not portable; module may not work depending on underlying whois command. Slower, since running a command and most likely shell in addition to whois command. Affected if not UNIX (Windows), different UNIX, older UNIX, or older whois command

I am looking for a whois module that can handle whois IP lookups and I am not interested in coding my own whois client.

Here are the modules that I (lightly) tried out and more information about it:

pywhoisapi:

  • Home: http://code.google.com/p/pywhoisapi/
  • Design: REST client accessing ARIN whois REST service
  • Pros: Able to handle IP address lookups
  • Cons: Able to pull information from whois servers of other RIRs?

BulkWhois

  • Home: http://pypi.python.org/pypi/BulkWhois/0.2.1
  • Design: telnet client accessing whois telnet query interface from RIR(?)
  • Pros: Able to handle IP address lookups
  • Cons: Able to pull information from whois servers of other RIRs?

pywhois:

  • Home: http://code.google.com/p/pywhois/
  • Design: REST client accessing RRID whois services
  • Pros: Accessses many RRIDs; has python 3.x branch
  • Cons: does not seem to handle IP address lookups

python-whois:

  • Home: http://code.google.com/p/python-whois/
  • Design: wraps "whois" command
  • Cons: does not seem to handle IP address lookups

whoisclient - fork of python-whois

  • Home: http://gitorious.org/python-whois
  • Design: wraps "whois" command
  • Depends on: IPy.py
  • Cons: does not seem to handle IP address lookups

Update: I ended up using pywhoisapi for the reverse IP lookups that I was doing




回答4:


Here is the whois client re-implemented in Python: http://code.activestate.com/recipes/577364-whois-client/




回答5:


I don't know if gwhois does something special with the server output; however, you can plainly connect to the whois server on port whois (43), send your query, read all the data in the reply and parse them. To make life a little easier, you could use the telnetlib.Telnet class (even if the whois protocol is much simpler than the telnet protocol) instead of plain sockets.

The tricky parts:

  • which whois server will you ask? RIPE, ARIN, APNIC, LACNIC, AFRINIC, JPNIC, VERIO etc LACNIC could be a useful fallback, since they tend to reply with useful data to requests outside of their domain.
  • what are the exact options and arguments for each whois server? some offer help, others don't. In general, plain domain names work without any special options.



回答6:


import socket
socket.gethostbyname_ex('url.com')

if it returns a gaierror you know know it's not registered with any DNS




回答7:


Another way to do it is to use urllib2 module to parse some other page's whois service (many sites like that exist). But that seems like even more of a hack that what you do now, and would give you a dependency on whatever whois site you chose, which is bad.

I hate to say it, but unless you want to re-implement whois in your program (which would be re-inventing the wheel), running whois on the OS and parsing the output (ie what you are doing now) seems like the right way to do it.




回答8:


Parsing another webpage woulnd't be as bad (assuming their html woulnd't be very bad), but it would actually tie me to them - if they're down, I'm down :)

Actually I found some old project on sourceforge: rwhois.py. What scares me a bit is that their last update is from 2003. But, it might seem as a good place to start reimplementation of what I do right now... Well, I felt obligued to post the link to this project anyway, just for further reference.




回答9:


here is a ready-to-use solution that works for me; written for Python 3.1 (when backporting to Py2.x, take special care of the bytes / Unicode text distinctions). your single point of access is the method DRWHO.whois(), which expects a domain name to be passed in; it will then try to resolve the name using the provider configured as DRWHO.whois_providers[ '*' ] (a more complete solution could differentiate providers according to the top level domain). DRWHO.whois() will return a dictionary with a single entry text, which contains the response text sent back by the WHOIS server. Again, a more complete solution would then try and parse the text (which must be done separately for each provider, as there is no standard format) and return a more structured format (e.g., set a flag available which specifies whether or not the domain looks available). have fun!

##########################################################################
import asyncore as                                   _sys_asyncore
from asyncore import loop as                         _sys_asyncore_loop
import socket as                                     _sys_socket



##########################################################################
class _Whois_request( _sys_asyncore.dispatcher_with_send, object ):
  # simple whois requester
  # original code by Frederik Lundh

  #-----------------------------------------------------------------------
  whoisPort = 43

  #-----------------------------------------------------------------------
  def __init__(self, consumer, host, provider ):
    _sys_asyncore.dispatcher_with_send.__init__(self)
    self.consumer = consumer
    self.query    = host
    self.create_socket( _sys_socket.AF_INET, _sys_socket.SOCK_STREAM )
    self.connect( ( provider, self.whoisPort, ) )

  #-----------------------------------------------------------------------
  def handle_connect(self):
    self.send( bytes( '%s\r\n' % ( self.query, ), 'utf-8' ) )

  #-----------------------------------------------------------------------
  def handle_expt(self):
    self.close() # connection failed, shutdown
    self.consumer.abort()

  #-----------------------------------------------------------------------
  def handle_read(self):
    # get data from server
    self.consumer.feed( self.recv( 2048 ) )

  #-----------------------------------------------------------------------
  def handle_close(self):
    self.close()
    self.consumer.close()


##########################################################################
class _Whois_consumer( object ):
  # original code by Frederik Lundh

  #-----------------------------------------------------------------------
  def __init__( self, host, provider, result ):
    self.texts_as_bytes = []
    self.host           = host
    self.provider       = provider
    self.result         = result

  #-----------------------------------------------------------------------
  def feed( self, text ):
    self.texts_as_bytes.append( text.strip() )

  #-----------------------------------------------------------------------
  def abort(self):
    del self.texts_as_bytes[:]
    self.finalize()

  #-----------------------------------------------------------------------
  def close(self):
    self.finalize()

  #-----------------------------------------------------------------------
  def finalize( self ):
    # join bytestrings and decode them (witha a guessed encoding):
    text_as_bytes         = b'\n'.join( self.texts_as_bytes )
    self.result[ 'text' ] = text_as_bytes.decode( 'utf-8' )


##########################################################################
class DRWHO:

  #-----------------------------------------------------------------------
  whois_providers = {
    '~isa':   'DRWHO/whois-providers',
    '*':      'whois.opensrs.net', }

  #-----------------------------------------------------------------------
  def whois( self, domain ):
    R         = {}
    provider  = self._get_whois_provider( '*' )
    self._fetch_whois( provider, domain, R )
    return R

  #-----------------------------------------------------------------------
  def _get_whois_provider( self, top_level_domain ):
    providers = self.whois_providers
    R         = providers.get( top_level_domain, None )
    if R is None:
      R = providers[ '*' ]
    return R

  #-----------------------------------------------------------------------
  def _fetch_whois( self, provider, domain, pod ):
    #.....................................................................
    consumer  = _Whois_consumer(           domain, provider, pod )
    request   = _Whois_request(  consumer, domain, provider )
    #.....................................................................
    _sys_asyncore_loop() # loops until requests have been processed


#=========================================================================
DRWHO = DRWHO()


domain    = 'example.com'
whois     = DRWHO.whois( domain )
print( whois[ 'text' ] )


来源:https://stackoverflow.com/questions/50394/what-python-way-would-you-suggest-to-check-whois-database-records

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!