问题
I have a database SFOPT_TEST
on my Snowflake instance. The database has two schemas AUDITS
and PARAMS
.
The schema AUDITS
has a table created like this using SQLAlchemy declarative_base()
-
class AccountUsageLoginHistory(Base):
'''
This model will store the account parameters of the customers instances.
'''
__tablename__ = constants.TABLE_ACCOUNT_USAGE_LOGIN_HISTORY
__table_args__ = {
'schema' : os.environ.get('SCHEMA_NAME_AUDITS')
}
id = Column(Integer, Sequence('id_login_history'), primary_key=True, autoincrement=True)
event_id = Column(Integer, nullable=True)
event_timestamp = Column(TIMESTAMP, nullable=True)
event_type = Column(String(100), nullable=True)
user_name = Column(String(100), nullable=True)
client_ip = Column(String(100), nullable=True)
reported_client_type = Column(String(100), nullable=True)
reported_client_version = Column(String(100), nullable=True)
first_authentication_factor = Column(String(100), nullable=True)
second_authentication_factor = Column(String(100), nullable=True)
is_success = Column(String(100), nullable=True)
error_code = Column(String(200), nullable=True)
error_message = Column(String(100), nullable=True)
related_event_id = Column(Integer, nullable=True)
event = Column(String(200), nullable=True)
instance_id = Column(Integer, nullable=True)
company_id = Column(Integer, nullable=True)
user_id = Column(Integer, nullable=True)
date_run = Column(Date, nullable=True)
def __repr__(self):
#return the class object.
return "<LoginHistory({})>".format(self.id)
And this is how the table has been created on instance-
I have a dataframe of columns as stated below which needs to be inserted into the table created above -
Index(['event_id', 'event_timestamp', 'event_type', 'user_name', 'client_ip',
'reported_client_type', 'reported_client_version',
'first_authentication_factor', 'second_authentication_factor',
'is_success', 'error_code', 'error_message', 'related_event_id',
'instance_id', 'user_id', 'event', 'company_id', 'date_run'],
dtype='object')
So to insert the dataframe I have used to_sql()
method as shown below -
dataframe.to_sql(table_name, self.engine, index=False, method=pd_writer, if_exists="append")
Which returns me an error -
Traceback (most recent call last):
File "metadata_collection.py", line 59, in <module>
y = x.collect_process_dump(sql='SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.LOGIN_HISTORY;', table_name='account_usage_login_history')
File "metadata_collection.py", line 55, in collect_process_dump
load_data = self.load_data.dump_data(table_name=table_name, dataframe=associate_df)
File "/snowflake-backend/snowflake/collect_metadata/load_data.py", line 16, in dump_data
dataframe.to_sql(table_name, self.engine, index=False, method=pd_writer, if_exists="append")
File "/usr/local/lib/python3.7/site-packages/pandas/core/generic.py", line 2663, in to_sql
method=method,
File "/usr/local/lib/python3.7/site-packages/pandas/io/sql.py", line 521, in to_sql
method=method,
File "/usr/local/lib/python3.7/site-packages/pandas/io/sql.py", line 1317, in to_sql
table.insert(chunksize, method=method)
File "/usr/local/lib/python3.7/site-packages/pandas/io/sql.py", line 755, in insert
exec_insert(conn, keys, chunk_iter)
File "/usr/local/lib/python3.7/site-packages/snowflake/connector/pandas_tools.py", line 168, in pd_writer
schema=table.schema)
File "/usr/local/lib/python3.7/site-packages/snowflake/connector/pandas_tools.py", line 135, in write_pandas
copy_results = cursor.execute(copy_into_sql, _is_internal=True).fetchall()
File "/usr/local/lib/python3.7/site-packages/snowflake/connector/cursor.py", line 597, in execute
errvalue)
File "/usr/local/lib/python3.7/site-packages/snowflake/connector/errors.py", line 124, in errorhandler_wrapper
cursor.errorhandler(connection, cursor, error_class, error_value)
File "/usr/local/lib/python3.7/site-packages/snowflake/connector/errors.py", line 89, in default_errorhandler
done_format_msg=error_value.get('done_format_msg'))
snowflake.connector.errors.ProgrammingError: 100072 (22000): 0198d465-0b4e-b74d-0000-d5e5000b524a: NULL result in a non-nullable column
This error is because I have a field id
as a primary key
in my snowflake table which cannot be null
. To auto increment it I have created a Sequence as shown above in class AccountUsageLoginHistory
. Also, in the screenshot attached above the default for the id
is IDENTITY START 1 INCREMENT 1
. All other columns are nullable=True so the issue is with the id
only.
Still I am not able to insert the data into my table.
回答1:
If you are used to MSSQL or Oracle this may seem confusing, but Snowflake does not allow you to ignore the column on insert when you have a not null constraint (this is the only constraint that Snowflake enforces). However, since you are using the sequence to add default values you can set the column to nullable and your insert will succeed and it will populate the ID column with default values as you expect.
The only caveat is that if a user were to insert into the table in this manner:
INSERT INTO TABLE_ACCOUNT_USAGE_LOGIN_HISTORY(ID, EVENT_ID)
VALUES(NULL, 2);
the query would successfully add a new row with a null value for ID.
来源:https://stackoverflow.com/questions/65230110/unable-to-insert-data-into-snowflake-database-table-using-pandas-to-sql-method