Cannot Insert into SQL using PySpark, but works in SQL

陌清茗 2021-01-28 06:55

I created the table below in SQL Server using the following:

CREATE TABLE [dbo].[Validation](
    [RuleId] [int] IDENTITY(1,1) NOT NULL,
    [AppId] [varchar](255) N         


        
1 answer
  • 2021-01-28 07:40

    The most straightforward solution here is to use JDBC directly from a Scala cell, for example:

    %scala
    
    import java.util.Properties
    import java.sql.DriverManager
    
    val jdbcUsername = dbutils.secrets.get(scope = "kv", key = "sqluser")
    val jdbcPassword = dbutils.secrets.get(scope = "kv", key = "sqlpassword")
    val driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    
    // Create the JDBC URL without passing in the user and password parameters.
    val jdbcUrl = s"jdbc:sqlserver://xxxx.database.windows.net:1433;database=AdventureWorks;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;"
    
    // Create a Properties() object to hold the parameters. Spark's JDBC reader and
    // writer accept it; the DriverManager call below takes the credentials directly.
    val connectionProperties = new Properties()
    
    connectionProperties.put("user", jdbcUsername)
    connectionProperties.put("password", jdbcPassword)
    connectionProperties.setProperty("Driver", driverClass)
    
    val connection = DriverManager.getConnection(jdbcUrl, jdbcUsername, jdbcPassword)
    try {
      val stmt = connection.createStatement()
      val sql = "INSERT INTO dbo.Validation VALUES ('TestApp','2020-05-15','MemoryUsageAnomaly','2300MB')"
      stmt.execute(sql)
    } finally {
      // Close the connection even if the INSERT fails.
      connection.close()
    }
    

    You could use pyodbc too, but the SQL Server ODBC drivers aren't installed by default, and the JDBC drivers are.
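If the ODBC driver is installed on the cluster, a pyodbc version might look like the sketch below. It is a minimal sketch under assumptions: the driver name `ODBC Driver 17 for SQL Server` depends on what is actually installed, the helper names `build_odbc_connection_string` and `insert_validation_row` are hypothetical, and the column names are taken from the view defined later in this answer because the table definition in the question is truncated.

```python
def build_odbc_connection_string(server, database, user, password,
                                 driver="ODBC Driver 17 for SQL Server"):
    """Build an ODBC connection string for SQL Server.

    The default driver name is an assumption; use whichever version is installed.
    """
    return (
        f"DRIVER={{{driver}}};SERVER={server};DATABASE={database};"
        f"UID={user};PWD={password};Encrypt=yes;"
    )


def insert_validation_row(connection_string, row):
    """Insert one (AppId, Date, RuleName, Value) tuple into dbo.Validation."""
    import pyodbc  # imported here so the module loads even where pyodbc is missing

    # The IDENTITY column RuleId is left out; SQL Server generates it.
    sql = (
        "INSERT INTO dbo.Validation (AppId, [Date], RuleName, [Value]) "
        "VALUES (?, ?, ?, ?)"
    )
    connection = pyodbc.connect(connection_string)
    try:
        cursor = connection.cursor()
        cursor.execute(sql, row)  # values are bound as parameters, not concatenated
        connection.commit()
    finally:
        connection.close()
```

A call would then be `insert_validation_row(conn_str, ("TestApp", "2020-05-15", "MemoryUsageAnomaly", "2300MB"))`, with `conn_str` built from the same secrets as in the Scala cell.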

    A Spark solution would be to create a view in SQL Server that leaves out the identity column RuleId, and insert against that view instead of the table. For example:

    create view Validation2 as
    select AppId,Date,RuleName,Value
    from Validation
    

    Then load the view in Spark and insert through it:

    # jdbcUrl and connectionProperties (a dict with user, password and driver)
    # must be defined in this Python session.
    tableName = "Validation2"
    df = spark.read.jdbc(url=jdbcUrl, table=tableName, properties=connectionProperties)
    df.createOrReplaceTempView(tableName)
    spark.sql("INSERT INTO Validation2 VALUES ('TestApp','2020-05-15','MemoryUsageAnomaly','2300MB')")
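The Python snippet above refers to `jdbcUrl` and `connectionProperties`, but the Scala cell defines them only on the Scala side, so a Python cell needs its own copies. A minimal sketch of Python equivalents, assuming the same server, database and URL options as the Scala cell; the helper names `build_jdbc_url` and `build_connection_properties` are hypothetical:

```python
def build_jdbc_url(server: str, database: str, port: int = 1433) -> str:
    """Return a SQL Server JDBC URL with the same options as the Scala cell."""
    options = {
        "database": database,
        "encrypt": "true",
        "trustServerCertificate": "false",
        "hostNameInCertificate": "*.database.windows.net",
        "loginTimeout": "30",
    }
    tail = "".join(f"{key}={value};" for key, value in options.items())
    return f"jdbc:sqlserver://{server}:{port};{tail}"


def build_connection_properties(user: str, password: str) -> dict:
    """Return the properties dict passed to spark.read.jdbc / df.write.jdbc."""
    return {
        "user": user,
        "password": password,
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }


# In a Databricks Python cell (dbutils is provided by the notebook):
# jdbcUrl = build_jdbc_url("xxxx.database.windows.net", "AdventureWorks")
# connectionProperties = build_connection_properties(
#     dbutils.secrets.get(scope="kv", key="sqluser"),
#     dbutils.secrets.get(scope="kv", key="sqlpassword"),
# )
```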
    

    If you want to encapsulate the Scala code and call it from another language (such as Python), you can use a Scala package cell, for example:

    %scala
    
    package example
    
    import java.sql.DriverManager
    
    object JDBCFacade 
    {
      // Opens a connection, runs a single SQL statement, and always closes the connection.
      def runStatement(url : String, sql : String, userName : String, password: String): Unit = 
      {
        val connection = DriverManager.getConnection(url, userName, password)
        try
        {
          val stmt = connection.createStatement()
          stmt.execute(sql)  
        }
        finally
        {
          connection.close()  
        }
      }
    }
    

    You can then call it from a Python cell like this:

    jdbcUsername = dbutils.secrets.get(scope = "kv", key = "sqluser")
    jdbcPassword = dbutils.secrets.get(scope = "kv", key = "sqlpassword")
    
    jdbcUrl = "jdbc:sqlserver://xxxx.database.windows.net:1433;database=AdventureWorks;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;"
    
    sql = "select 1 a into #foo from sys.objects"
    
    sc._jvm.example.JDBCFacade.runStatement(jdbcUrl,sql, jdbcUsername, jdbcPassword)
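Because `runStatement` takes the SQL as a plain string, the INSERT from the question has to be assembled on the Python side before the call. The sketch below is one way to do that, assuming all four values are strings and that the column names match the view shown earlier; `sql_literal` and `build_validation_insert` are hypothetical helpers. Doubling single quotes is the T-SQL escaping rule for string literals, but for untrusted input a parameterised statement is the safer choice.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a T-SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_validation_insert(app_id: str, date: str, rule_name: str, value: str) -> str:
    """Build an INSERT for dbo.Validation that skips the IDENTITY column RuleId."""
    values = ", ".join(sql_literal(v) for v in (app_id, date, rule_name, value))
    return (
        "INSERT INTO dbo.Validation (AppId, [Date], RuleName, [Value]) "
        f"VALUES ({values})"
    )


# In the notebook, pass the result to the Scala facade:
# sql = build_validation_insert("TestApp", "2020-05-15", "MemoryUsageAnomaly", "2300MB")
# sc._jvm.example.JDBCFacade.runStatement(jdbcUrl, sql, jdbcUsername, jdbcPassword)
```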
    