Cache a DataFrame in PySpark

醉酒成梦 2021-02-19 06:34

I want to understand more precisely how the cache method works for a DataFrame in PySpark.

When I run df.cache() it returns a dataframe. Therefore, if I do <

1 answer
  • 2021-02-19 07:17

    I found the source code of DataFrame.cache:

    def cache(self):
        """Persists the :class:`DataFrame` with the default storage level (`MEMORY_AND_DISK`).
    
        .. note:: The default storage level has changed to `MEMORY_AND_DISK` to match Scala in 2.0.
        """
        self.is_cached = True
        self._jdf.cache()
        return self
    

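    Therefore, the answer is: both. As the source shows, cache() marks the DataFrame as cached in place and then returns that same object (return self).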
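
To illustrate the return-self pattern without needing a Spark installation, here is a minimal sketch. FakeDataFrame is a hypothetical stand-in written for this example, not part of PySpark; it copies only the two relevant lines of the quoted source (set the flag, return self) and leaves out the call to the underlying JVM DataFrame.

```python
class FakeDataFrame:
    """Hypothetical stand-in that mimics only the return-self pattern of DataFrame.cache."""

    def __init__(self):
        self.is_cached = False

    def cache(self):
        # Mirrors the quoted source: flag the object, then return the same object.
        # (The real method also calls self._jdf.cache(), omitted here.)
        self.is_cached = True
        return self


# Style 1: call cache() only for its side effect and keep using the same name.
df = FakeDataFrame()
df.cache()
side_effect_worked = df.is_cached

# Style 2: rebind a name to the return value of cache().
df2 = FakeDataFrame()
returned = df2.cache()
same_object = returned is df2

print(side_effect_worked, same_object)
```

Under this pattern both styles end up with the same cached object, which is why calling the method on its own and assigning its result are interchangeable.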
