Parsing SQL with Python

天涯浪人 2020-11-28 22:12

I want to create a SQL interface on top of a non-relational data store. The store itself is non-relational, but it makes sense to access the data in a relational manner.

I am

4 Answers
  • 2020-11-28 22:55

    Of course, it may be best to leverage python-sqlparse on Google Code.

    UPDATE: Now I see that this has been suggested - I concur that this is worthwhile:

  • 2020-11-28 22:57

    TwoLaid's Python SQL Parser works very well for my purposes. It's written in C and needs to be compiled. It is robust. It parses out individual elements of each clause.

    https://github.com/TwoLaid/python-sqlparser

    I'm using it to parse the column names out of queries so they can be used as report headers. Here is an example.

    import sqlparser
    
    def get_query_columns(sql):
        """Return a list of column headers from the SELECT clause of the given SQL."""
        columns = []
        parser = sqlparser.Parser()

        # The parser does not like newlines, so flatten the query to one line.
        sql2 = sql.replace('\n', ' ')

        # A non-zero result from check_syntax() is treated as a syntax error.
        if parser.check_syntax(sql2) != 0:
            raise ValueError('get_query_columns: SQL invalid.')

        stmt = parser.get_statement(0)
        root = stmt.get_root()
        qcolumns = root.__dict__['resultColumnList']
        for qcolumn in qcolumns.list:
            if qcolumn.aliasClause:
                # Prefer the explicit alias (e.g. "... as jim").
                columns.append(qcolumn.aliasClause.get_text())
            else:
                # No alias: use the column text without the table alias.
                columns.append(qcolumn.get_text().split('.')[-1])

        return columns
    
    sql = '''
    SELECT 
       a.a,
       replace(coalesce(a.b, 'x'), 'x', 'y') as jim,
       a.bla as sally  -- some comment
    FROM
       table_a as a
    WHERE
       c > 20
    '''
    
    print(get_query_columns(sql))
    
    # output: ['a', 'jim', 'sally']
    
  • 2020-11-28 22:59

    I have looked into this issue quite extensively. Python-sqlparse is a non-validating parser, which is not really what you need. The examples in ANTLR need a lot of work to convert to a nice AST in Python. The SQL standard grammars are available, but converting them yourself would be a full-time job, and you would likely need only a subset of them (e.g. no joins). You could also try looking at Gadfly (a Python SQL database), but I avoided it because it uses its own parsing tool.

    In my case, I essentially needed only a WHERE clause. I tried Booleano (a boolean expression parser written with pyparsing) but ended up using pyparsing from scratch. The first link in the reddit post of Mark Rushakoff gives a SQL example using it. Whoosh, a full-text search engine, also uses it, but I have not looked at the source to see how.
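
    To make the "only a WHERE clause" case concrete: once a condition has been parsed, applying it to records from a non-relational store can be a small recursive function. This is only a sketch. The tuple-shaped tree, the `matches` name, and the dict records are all assumptions made for illustration, not the output format of pyparsing or of any other library mentioned here.

```python
import operator

# Comparison operators supported by this sketch.
COMPARATORS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}

def matches(node, record):
    """Evaluate a hypothetical condition tree against one record (a dict)."""
    kind = node[0]
    if kind == "and":
        return all(matches(child, record) for child in node[1:])
    if kind == "or":
        return any(matches(child, record) for child in node[1:])
    if kind == "not":
        return not matches(node[1], record)
    # Leaf node: (comparison operator, column name, literal value).
    op, column, literal = node
    if column not in record:
        return False  # assumption: a missing key never matches
    return COMPARATORS[op](record[column], literal)

# Records as they might come back from a schemaless store.
records = [{"name": "ann", "c": 25}, {"name": "bob", "c": 10}, {"name": "cy"}]

# Roughly: WHERE c > 20 AND NOT name = 'bob'
condition = ("and", (">", "c", 20), ("not", ("=", "name", "bob")))
selected = [r for r in records if matches(condition, r)]
print(selected)
```

    Treating a missing key as "no match" is a design choice; a real interface would have to decide how SQL NULL semantics map onto absent fields.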

    Pyparsing is very easy to use, and you can easily customize your grammar so that it is not exactly the same as SQL (you will not need most of the syntax). I did not like PLY, as it relies on some magic based on naming conventions.

    In short, give pyparsing a try: it will most likely be powerful enough to do what you need, and its simple integration with Python (with easy callbacks and error handling) will make the experience pretty painless.
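
    As a rough illustration of how little pyparsing code a WHERE-clause subset needs, here is a minimal sketch. The grammar is made up for this example (simple `column op literal` comparisons combined with AND / OR / NOT) and is not the grammar I actually used. `parse_where` is a hypothetical helper name. It uses the camelCase pyparsing names, which as far as I know work in both the 2.x and 3.x series.

```python
import pyparsing as pp

# Keywords are matched case-insensitively and reported in lower case.
AND, OR, NOT = (pp.CaselessKeyword(k) for k in ("and", "or", "not"))

# A column name, optionally qualified (a.b); keywords are excluded.
identifier = ~(AND | OR | NOT) + pp.Word(pp.alphas + "_", pp.alphanums + "_.")
number = pp.Word(pp.nums).setParseAction(lambda tokens: int(tokens[0]))
string = pp.QuotedString("'")  # quotes are stripped from the result
comparison_op = pp.oneOf("<= >= != = < >")

# One comparison such as: c > 20
condition = pp.Group(identifier + comparison_op + (number | string))

# Boolean combinations, listed from highest to lowest precedence.
where_expr = pp.infixNotation(
    condition,
    [
        (NOT, 1, pp.opAssoc.RIGHT),
        (AND, 2, pp.opAssoc.LEFT),
        (OR, 2, pp.opAssoc.LEFT),
    ],
)

def parse_where(text):
    """Parse a WHERE-clause body into nested lists; raises on bad input."""
    return where_expr.parseString(text, parseAll=True).asList()[0]

print(parse_where("c > 20 AND name = 'x'"))

try:
    parse_where("c >")
except pp.ParseBaseException as exc:
    print("rejected:", exc)
```

    Unlike a non-validating parser, this rejects input that does not fit the grammar, which is usually what you want when the query will be executed against your own store.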

  • 2020-11-28 23:18

    This reddit post suggests python-sqlparse as an existing implementation, among a couple of other links.
