How to protect password and username in Spark (such as for JDBC connections/accessing RDBMS databases)?

隐瞒了意图╮ 2021-01-04 13:14

We have a use case where we need to export data from HDFS to an RDBMS. I saw this example. There, the username and password are stored in the code. Is there any way to hi

2 Answers
  • 2021-01-04 13:49

    If you launch the application from an interactive console using spark-submit, you can prompt for the password at runtime through the Java Console API:

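    import java.io.Console;
    // System.console() returns null when no interactive terminal is attached
    Console console = System.console();
    char[] passwordArray = console.readPassword("Enter your secret password: ");
    // "account" stands for whatever object holds your connection settings
    account.setPassword(passwordArray);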
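
The snippet above assumes a console is always present. A minimal sketch of a more defensive version follows, assuming you want to fall back to another mechanism when `System.console()` returns null (which can happen when the JVM is not started from an interactive terminal). The class name `ConsolePasswordSketch` and the method `promptForPassword` are hypothetical names used only for illustration.

```java
import java.io.Console;
import java.util.Arrays;

public class ConsolePasswordSketch {
    /** Returns the password typed at the terminal, or null when no console is attached. */
    static char[] promptForPassword(Console console) {
        if (console == null) {
            return null; // no interactive terminal, e.g. input was redirected
        }
        // readPassword disables echo, so the typed characters are not shown
        return console.readPassword("Enter your secret password: ");
    }

    public static void main(String[] args) {
        char[] password = promptForPassword(System.console());
        if (password == null) {
            System.err.println("No interactive console available; use another mechanism.");
            return;
        }
        try {
            // Hand the password to the JDBC properties here instead of printing it
            System.out.println("Read " + password.length + " characters.");
        } finally {
            Arrays.fill(password, '\0'); // wipe the in-memory copy once it is no longer needed
        }
    }
}
```

Keeping the password in a `char[]` rather than a `String` lets you overwrite it after use, although JDBC properties generally need a `String` at some point anyway.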
    
  • 2021-01-04 13:52

    Setting the password

    On the command line, as a plaintext Spark config value:

    spark-submit --conf spark.jdbc.password=test_pass ... 
    

    Using an environment variable:

    export jdbc_password=test_pass_export
    spark-submit --conf spark.jdbc.password="$jdbc_password" ...
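
Note that the shell expands `$jdbc_password` before spark-submit starts, so the value still ends up on the command line. As a possible alternative, the driver could read the variable itself. The following is only a sketch under that assumption: `EnvPasswordSketch` and `lookup` are hypothetical names, and it presumes the variable is actually set in the environment of the JVM that runs the driver.

```java
import java.util.Map;
import java.util.Optional;

public class EnvPasswordSketch {
    /** Looks up a variable in the given environment map; empty or missing values count as absent. */
    static Optional<String> lookup(Map<String, String> env, String name) {
        String value = env.get(name);
        return (value == null || value.isEmpty()) ? Optional.empty() : Optional.of(value);
    }

    public static void main(String[] args) {
        // System.getenv() returns the environment of the current JVM process
        Optional<String> password = lookup(System.getenv(), "jdbc_password");
        // Report only whether the variable is set; never print the secret itself
        System.out.println(password.isPresent() ? "jdbc_password is set" : "jdbc_password is not set");
    }
}
```

Passing the environment in as a `Map` keeps the lookup easy to exercise without touching the real process environment.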
    

    Using a Spark config properties file (the value is plaintext here, so the key is spark.jdbc.password rather than the base64 key):

    echo "spark.jdbc.password=test_pass_prop" > credentials.properties
    spark-submit --properties-file credentials.properties ...
    

    With base64 encoding to "obfuscate" (this is an encoding, not encryption, so it only hides the value from a casual glance):

    echo "spark.jdbc.b64password=$(echo -n test_pass_prop | base64)" > credentials_b64.properties
    spark-submit --properties-file credentials_b64.properties
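
To make concrete how little protection the base64 step gives, here is a small sketch in plain Java that parses the line written by the echo command above and recovers the original value. `Base64PropertySketch` and `decodeProperty` are hypothetical names, and the sketch uses `java.util.Properties` only to illustrate the file format.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Properties;

public class Base64PropertySketch {
    /** Parses properties-file text and returns the base64-decoded value of the given key. */
    static String decodeProperty(String fileContents, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileContents));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String encoded = props.getProperty(key);
        if (encoded == null) {
            throw new IllegalArgumentException("Missing property: " + key);
        }
        byte[] raw = Base64.getDecoder().decode(encoded.trim());
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Same content that the echo command writes to credentials_b64.properties
        String fileContents = "spark.jdbc.b64password=dGVzdF9wYXNzX3Byb3A=\n";
        System.out.println(decodeProperty(fileContents, "spark.jdbc.b64password")); // prints "test_pass_prop"
    }
}
```

Anyone who can read the file can do the same, so the file itself still needs to be protected.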
    

    Using the password in code

    import java.util.Base64 // decodes the obfuscated password
    import java.nio.charset.StandardCharsets // charset for the decoded bytes
    val properties = new java.util.Properties()
    properties.setProperty("driver", "com.mysql.jdbc.Driver")
    properties.setProperty("url", "jdbc:mysql://mysql-host:3306")
    properties.setProperty("user", "test_user")
    // Base64 variant; for the plaintext options, read spark.jdbc.password directly instead
    val password = new String(Base64.getDecoder().decode(spark.conf.get("spark.jdbc.b64password")), StandardCharsets.UTF_8)
    properties.setProperty("password", password)
    // Load the ml_models table over JDBC using the assembled properties
    val models = spark.read.jdbc(properties.getProperty("url"), "ml_models", properties)
    

    Edit: the spark-submit command-line help for --conf and --properties-file:

      --conf PROP=VALUE           Arbitrary Spark configuration property.
      --properties-file FILE      Path to a file from which to load extra properties. If not
                                  specified, this will look for conf/spark-defaults.conf.
    

    The properties-file name is arbitrary.
