Delete azure sql database rows from azure databricks

梦想与她 提交于 2019-12-23 04:54:09

问题


I have a table in Azure SQL database from which I want to either delete selected rows based on some criteria or entire table from Azure Databricks. Currently I am using the truncate property of JDBC to truncate the entire table without dropping it and then re-write it with new dataframe.

df.write \
     .option('user', jdbcUsername) \
     .option('password', jdbcPassword) \
     .jdbc('<connection_string>', '<table_name>', mode = 'overwrite', properties = {'truncate' : 'true'} )

But going forward I don't want to truncate and overwrite the entire table every time but rather use delete command. I was not able to achieve this using pushdown query either. Any help on this would be greatly appreciated.


回答1:


You can also drop down to scala to do this, as the SQL Server JDBC driver is already installed. EG:

%scala

import java.util.Properties
import java.sql.DriverManager

val jdbcUsername = "xxxxx"
val jdbcPassword = "xxxxxx"
val driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver"

// Create the JDBC URL without passing in the user and password parameters.
val jdbcUrl = s"jdbc:sqlserver://xxxxxx.database.windows.net:1433;database=AdventureWorks;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;"

// Create a Properties() object to hold the parameters.

val connectionProperties = new Properties()

connectionProperties.put("user", s"${jdbcUsername}")
connectionProperties.put("password", s"${jdbcPassword}")
connectionProperties.setProperty("Driver", driverClass)

val connection = DriverManager.getConnection(jdbcUrl, jdbcUsername, jdbcPassword)
val stmt = connection.createStatement()
val sql = "delete from sometable where someColumn > 4"

stmt.execute(sql)
connection.close()



回答2:


Use pyodbc to execute a SQL Statement.

import pyodbc
conn = pyodbc.connect( 'DRIVER={ODBC Driver 17 for SQL Server};'
                       'SERVER=mydatabe.database.azure.net;'
                       'DATABASE=AdventureWorks;UID=jonnyFast;'
                       'PWD=MyPassword')
conn.execute('DELETE TableBlah WHERE 1=2')

It's a bit of a pain to get pyodbc working on Databricks - see details here: https://datathirst.net/blog/2018/10/12/executing-sql-server-stored-procedures-on-databricks-pyspark



来源:https://stackoverflow.com/questions/54443857/delete-azure-sql-database-rows-from-azure-databricks

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!