I\'m on a W8 machine, where I use Python (Anaconda distribution) to connect to Impala in our Hadoop cluster using the
Try this to get tables for kerberized cluster. In my case CDH-5.14.2-1.
Make sure you have a valid ticket before running this code.
with python
2.7
having below packages.
thrift-0.9.3
thriftpy-0.3.8
thrift_sasl-0.3.0
impyla==0.14.2.2
Working Code
from impala.dbapi import connect
from impala.util import as_pandas
# 21000 is impala daemon port.
conn = connect(host='yourHost', port=21050, auth_mechanism='GSSAPI')
cursor = conn.cursor()
cursor.execute("SHOW TABLES")
# After running .execute(), Impala will store the result sets on the server
# until it is fetched. Use the method .fetchall() to pull the entire result
# set over the network (you should only do it if you know dataset is small)
tables = cursor.fetchall()
print("Displaying list of tables")
# the result is a list of tuples
for t in tables:
# we know that each row in SHOW TABLES result
# should only contains one table name
print(t[0])
# exit() enable for only one table
print("eol >>>")