Is there an API function to display “Fraction Cached” for an RDD?
Question: On the Storage tab of the PySparkShell application UI ([server]:8088) I can see information about an RDD I am using. One of the columns is "Fraction Cached". How can I retrieve this percentage programmatically? I can use getStorageLevel() to get some information about RDD caching, but not "Fraction Cached". Do I have to calculate it myself?

Answer 1: SparkContext.getRDDStorageInfo is probably what you're looking for. It returns an Array of RDDInfo objects, which provide information such as memory size.
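The "Fraction Cached" shown in the UI is the ratio of cached partitions to total partitions, both of which RDDInfo exposes (as numCachedPartitions and numPartitions on the Scala side). PySpark has no public wrapper for getRDDStorageInfo, so the commented usage below goes through the internal JVM gateway (sc._jsc), which is an assumption and not a stable API; the helper function itself is just the arithmetic:

```python
def fraction_cached(num_cached_partitions, num_partitions):
    """Fraction of an RDD's partitions currently cached (0.0 if empty)."""
    if num_partitions == 0:
        return 0.0
    return num_cached_partitions / num_partitions

# Hypothetical PySpark usage via the internal JVM gateway (unstable API):
# for info in sc._jsc.sc().getRDDStorageInfo():
#     print(info.name(),
#           fraction_cached(info.numCachedPartitions(), info.numPartitions()))

print(fraction_cached(3, 4))  # 0.75
```

In Scala, the same computation works directly on the objects returned by sc.getRDDStorageInfo, with no gateway workaround needed.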