pyspark - What additional preprocessing steps should I take to create a spark dataframe?


I started with a few dozen PDF files and I've extracted the text from each one by looping through the following:

import pdfplumber

def get_text(file):
  with


                      