I want to understand more precisely how the cache method works for DataFrames in PySpark.
When I run df.cache(), it returns a DataFrame.
Therefore, if I assign the result to another variable, which one is actually cached: the original DataFrame or the returned one?
I found the source code of DataFrame.cache:
def cache(self):
    """Persists the :class:`DataFrame` with the default storage level (`MEMORY_AND_DISK`).

    .. note:: The default storage level has changed to `MEMORY_AND_DISK` to match Scala in 2.0.
    """
    self.is_cached = True
    self._jdf.cache()
    return self
Therefore, the answer is: both. Since cache() returns self, the original DataFrame and the returned one are the same object, so caching applies to both.
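To confirm this, here is a minimal sketch; the local SparkSession setup and the example DataFrame built with spark.range are illustrative assumptions, not part of the original question:

from pyspark.sql import SparkSession

# Illustrative setup: a local session and a small example DataFrame.
spark = SparkSession.builder.master("local[*]").appName("cache-demo").getOrCreate()
df = spark.range(10)

cached = df.cache()      # marks the DataFrame for caching (lazy until an action runs)

print(cached is df)      # True: cache() returns self, so both names point to the same object
print(df.is_cached)      # True: the flag set in cache() is visible through either name

df.count()               # running an action materializes the cached data

So whether you keep using df or the variable you assigned the result to, you are working with one and the same cached DataFrame.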