I'm using Spark Structured Streaming (PySpark) to join a Kafka stream with a static DataFrame using a MinHashLSH approxSimilarityJoin (which under the hood does a SortMergeJoin