Unable to insert data into Snowflake database table using pandas to_sql() method

Submitted by 心已入冬 on 2021-01-04 05:40:29

Question


I have a database, SFOPT_TEST, on my Snowflake instance. The database has two schemas, AUDITS and PARAMS.

The AUDITS schema has a table defined with SQLAlchemy's declarative_base():

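import os

from sqlalchemy import Column, Date, Integer, Sequence, String, TIMESTAMP
from sqlalchemy.ext.declarative import declarative_base

import constants  # project-local module with table-name constants (import path assumed)

Base = declarative_base()


class AccountUsageLoginHistory(Base):
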
    '''
    This model stores the login history (ACCOUNT_USAGE.LOGIN_HISTORY) of the customers' instances.
    '''

    __tablename__ = constants.TABLE_ACCOUNT_USAGE_LOGIN_HISTORY
    __table_args__ = {
        'schema' : os.environ.get('SCHEMA_NAME_AUDITS')
    }

    id = Column(Integer, Sequence('id_login_history'), primary_key=True, autoincrement=True)
    event_id = Column(Integer, nullable=True)
    event_timestamp = Column(TIMESTAMP, nullable=True)
    event_type = Column(String(100), nullable=True)
    user_name = Column(String(100), nullable=True)
    client_ip = Column(String(100), nullable=True)
    reported_client_type = Column(String(100), nullable=True)
    reported_client_version = Column(String(100), nullable=True)
    first_authentication_factor = Column(String(100), nullable=True)
    second_authentication_factor = Column(String(100), nullable=True)
    is_success = Column(String(100), nullable=True)
    error_code = Column(String(200), nullable=True) 
    error_message = Column(String(100), nullable=True)
    related_event_id = Column(Integer, nullable=True)
    event = Column(String(200), nullable=True)
    instance_id = Column(Integer, nullable=True)
    company_id = Column(Integer, nullable=True)
    user_id = Column(Integer, nullable=True)
    date_run = Column(Date, nullable=True)

    def __repr__(self):
        # Return a short debug representation that includes the row id.
        return "<LoginHistory({})>".format(self.id)

This is how the table was created on the instance (the screenshot is not reproduced here):

I have a dataframe with the columns listed below, which needs to be inserted into the table created above:

Index(['event_id', 'event_timestamp', 'event_type', 'user_name', 'client_ip',
       'reported_client_type', 'reported_client_version',
       'first_authentication_factor', 'second_authentication_factor',
       'is_success', 'error_code', 'error_message', 'related_event_id',
       'instance_id', 'user_id', 'event', 'company_id', 'date_run'],
      dtype='object')
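
As a quick sanity check, the model's columns can be compared with the dataframe's columns to see which table columns the dataframe does not supply. This is a minimal sketch: the two lists are copied by hand from the model and the Index output above, and missing_columns is a hypothetical helper, not part of pandas or SQLAlchemy.

```python
# Column names declared on the AccountUsageLoginHistory model above.
table_columns = [
    "id", "event_id", "event_timestamp", "event_type", "user_name",
    "client_ip", "reported_client_type", "reported_client_version",
    "first_authentication_factor", "second_authentication_factor",
    "is_success", "error_code", "error_message", "related_event_id",
    "event", "instance_id", "company_id", "user_id", "date_run",
]

# Column names copied from the dataframe's Index output above.
dataframe_columns = [
    "event_id", "event_timestamp", "event_type", "user_name", "client_ip",
    "reported_client_type", "reported_client_version",
    "first_authentication_factor", "second_authentication_factor",
    "is_success", "error_code", "error_message", "related_event_id",
    "instance_id", "user_id", "event", "company_id", "date_run",
]


def missing_columns(table_cols, frame_cols):
    """Return the table columns the dataframe does not supply, in table order.

    Hypothetical helper for illustration; names are compared case-insensitively.
    """
    supplied = {name.lower() for name in frame_cols}
    return [name for name in table_cols if name.lower() not in supplied]


missing = missing_columns(table_columns, dataframe_columns)
print(missing)  # ['id']
```

Only id is absent from the dataframe, so it is the one column whose value has to come from the table's default.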

To insert the dataframe, I used the to_sql() method as shown below:

dataframe.to_sql(table_name, self.engine, index=False, method=pd_writer, if_exists="append")

This raises the following error:

Traceback (most recent call last):
  File "metadata_collection.py", line 59, in <module>
    y = x.collect_process_dump(sql='SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.LOGIN_HISTORY;', table_name='account_usage_login_history')
  File "metadata_collection.py", line 55, in collect_process_dump
    load_data = self.load_data.dump_data(table_name=table_name, dataframe=associate_df)
  File "/snowflake-backend/snowflake/collect_metadata/load_data.py", line 16, in dump_data
    dataframe.to_sql(table_name, self.engine, index=False, method=pd_writer, if_exists="append")
  File "/usr/local/lib/python3.7/site-packages/pandas/core/generic.py", line 2663, in to_sql
    method=method,
  File "/usr/local/lib/python3.7/site-packages/pandas/io/sql.py", line 521, in to_sql
    method=method,
  File "/usr/local/lib/python3.7/site-packages/pandas/io/sql.py", line 1317, in to_sql
    table.insert(chunksize, method=method)
  File "/usr/local/lib/python3.7/site-packages/pandas/io/sql.py", line 755, in insert
    exec_insert(conn, keys, chunk_iter)
  File "/usr/local/lib/python3.7/site-packages/snowflake/connector/pandas_tools.py", line 168, in pd_writer
    schema=table.schema)
  File "/usr/local/lib/python3.7/site-packages/snowflake/connector/pandas_tools.py", line 135, in write_pandas
    copy_results = cursor.execute(copy_into_sql, _is_internal=True).fetchall()
  File "/usr/local/lib/python3.7/site-packages/snowflake/connector/cursor.py", line 597, in execute
    errvalue)
  File "/usr/local/lib/python3.7/site-packages/snowflake/connector/errors.py", line 124, in errorhandler_wrapper
    cursor.errorhandler(connection, cursor, error_class, error_value)
  File "/usr/local/lib/python3.7/site-packages/snowflake/connector/errors.py", line 89, in default_errorhandler
    done_format_msg=error_value.get('done_format_msg'))
snowflake.connector.errors.ProgrammingError: 100072 (22000): 0198d465-0b4e-b74d-0000-d5e5000b524a: NULL result in a non-nullable column

This error occurs because the id field is the primary key of my Snowflake table and cannot be null. To auto-increment it, I created a Sequence, as shown above in the AccountUsageLoginHistory class. Also, in the screenshot above, the default for id is IDENTITY START 1 INCREMENT 1. All other columns are nullable=True, so the issue is with id only.

Even so, I am not able to insert the data into my table.


Answer 1:


If you are used to MSSQL or Oracle this may seem confusing, but Snowflake does not allow you to omit a column on insert when it has a NOT NULL constraint (this is the only constraint that Snowflake enforces). However, since you are using the sequence to supply default values, you can make the column nullable. Your insert will then succeed, and the ID column will be populated with default values as you expect.

The only caveat is that if a user were to insert into the table in this manner:

INSERT INTO TABLE_ACCOUNT_USAGE_LOGIN_HISTORY(ID, EVENT_ID) 
VALUES(NULL, 2);

the query would successfully add a new row with a null value for ID.
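
The caveat can be reproduced locally. The sketch below uses Python's built-in sqlite3 module as a stand-in, not Snowflake, and a constant DEFAULT in place of the sequence; the table name login_history is made up for illustration. It only shows the general SQL behavior that a column default applies when the column is omitted, while an explicit NULL is stored as NULL in a nullable column.

```python
import sqlite3

# In-memory SQLite database used purely as a local stand-in for Snowflake.
conn = sqlite3.connect(":memory:")

# Nullable id column with a constant default (a sequence would go here in Snowflake).
conn.execute("CREATE TABLE login_history (id INTEGER DEFAULT 1, event_id INTEGER)")

# Omitting id lets the column default apply.
conn.execute("INSERT INTO login_history (event_id) VALUES (1)")

# Passing NULL explicitly bypasses the default and stores NULL.
conn.execute("INSERT INTO login_history (id, event_id) VALUES (NULL, 2)")

rows = conn.execute(
    "SELECT id, event_id FROM login_history ORDER BY event_id"
).fetchall()
print(rows)  # [(1, 1), (None, 2)]

conn.close()
```

If null IDs are unacceptable, inserts that name the ID column explicitly would need to be avoided or validated by the caller.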



Source: https://stackoverflow.com/questions/65230110/unable-to-insert-data-into-snowflake-database-table-using-pandas-to-sql-method
