I am trying to work with PySpark in IntelliJ, but I cannot figure out how to correctly install it and set up the project. I can work with Python in IntelliJ and I can use the pyspark
For example, something of this kind:
from pyspark import SparkContext, SparkConf
spark_conf = SparkConf().setAppName("scavenge some logs")
spark_context = SparkContext(conf=spark_conf)
address = "/path/to/the/log/on/hdfs/*.gz"
log = spark_context.textFile(address)
my_result = (log
    # ...here go your actions and transformations...
)
# saveAsTextFile returns None, so call it on the RDD instead of assigning its result
my_result.saveAsTextFile('my_result')
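
Before running a script like the one above from the IDE, it can help to confirm that the interpreter configured for the project can actually see the pyspark package. The following is only a minimal sketch using the standard library; the helper name module_available is made up for illustration, and it only checks importability, not whether Spark itself is configured correctly.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the current interpreter can locate the module `name`.

    Hypothetical helper for illustration; it does not import the module.
    """
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Show which Python executable the IDE run configuration is really using.
    print("interpreter:", sys.executable)
    # If this prints False, the project interpreter cannot import pyspark.
    print("pyspark importable:", module_available("pyspark"))
```

If the second line prints False when run from IntelliJ but True from a terminal, the two are probably using different interpreters or environments.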