Failing integration test for Apache Spark Streaming

野趣味 2021-02-07 10:19

I've been trying to track down an issue with some unit/integration tests I've been writing for an Apache Spark project.

When using Spark 1.1.1 my test passed. When I t

1 Answer
  • 2021-02-07 10:58

    Looking at the source code for SparkContext, the line causing your exception is the one that tries to get the current user name. In version 1.2 there was a fallback default, SparkContext.SPARK_UNKNOWN_USER, so it did not require a currently logged-in user:

    // Set SPARK_USER for user who is running SparkContext.
    val sparkUser = Option {
      Option(System.getenv("SPARK_USER")).getOrElse(System.getProperty("user.name"))
    }.getOrElse {
      SparkContext.SPARK_UNKNOWN_USER
    }
    

    The code introduced in version 1.3 no longer has a default user, which is why you don't get this error with earlier versions:

    // Set SPARK_USER for user who is running SparkContext.
    val sparkUser = Utils.getCurrentUserName()
    

    This calls the following code in Utils:

    /**
     * Returns the current user name. This is the currently logged in user, unless that's been
     * overridden by the `SPARK_USER` environment variable.
     */
    def getCurrentUserName(): String = {
      Option(System.getenv("SPARK_USER"))
        .getOrElse(UserGroupInformation.getCurrentUser().getShortUserName())
    }
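
    To make the difference between the two lookups concrete, here is a minimal sketch in plain Scala with no Spark or Hadoop dependency. The names UserLookupSketch, UnknownUser, withFallback and withoutFallback are hypothetical and exist only for illustration; the environment is passed in as a Map and the Hadoop lookup is replaced by an arbitrary function that may throw.

```scala
object UserLookupSketch {
  // Hypothetical placeholder standing in for SparkContext.SPARK_UNKNOWN_USER.
  val UnknownUser = "unknown-user-placeholder"

  // 1.2-style lookup: environment variable, then the user.name system
  // property, then a fixed default. It never throws.
  def withFallback(env: Map[String, String], userNameProp: Option[String]): String =
    env.get("SPARK_USER").orElse(userNameProp).getOrElse(UnknownUser)

  // 1.3-style lookup: environment variable, otherwise delegate to a lookup
  // that may throw (a stand-in for the UserGroupInformation call).
  // Map.getOrElse takes its default by name, so currentUser() runs only
  // when SPARK_USER is absent.
  def withoutFallback(env: Map[String, String], currentUser: () => String): String =
    env.getOrElse("SPARK_USER", currentUser())
}
```

    With a currentUser function that always throws, withoutFallback(Map("SPARK_USER" -> "tester"), ...) still returns "tester" because the failing lookup is never evaluated, whereas withoutFallback(Map.empty, ...) propagates the exception.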
    

    If you set the environment variable SPARK_USER, getCurrentUserName() returns its value directly, so you should be able to avoid the branch into UserGroupInformation that leads to your exception.
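
    One way to set the variable for tests only, assuming the project is built with sbt 1.x (the question does not say which build tool is in use), is the build.sbt fragment below. The value "test-user" is an arbitrary placeholder. sbt applies envVars only to forked JVMs, hence the fork setting.

```scala
// build.sbt (assumes sbt 1.x; adapt for other build tools)

// Run tests in a separate JVM so that environment variables can be applied.
Test / fork := true

// Any non-empty value works; "test-user" is only a placeholder.
Test / envVars := Map("SPARK_USER" -> "test-user")
```

    With another build tool, or when running tests from an IDE, the same effect comes from exporting SPARK_USER in the environment of the process that launches the test JVM.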

    UserGroupInformation is a Hadoop Security class, and it looks like the use of PowerMock is preventing it from working properly.
