How best to convert Azure blob CSV files to a pandas DataFrame when running a notebook in Azure ML

闹比i 2020-12-10 14:56

I have a number of large CSV (tab-delimited) files stored as Azure blobs, and I want to create a pandas DataFrame from them. I can do this locally as follows:



        
4 Answers
  •  有刺的猬
    2020-12-10 15:35

    The accepted answer will not work with the latest Azure Storage SDK, which Microsoft has rewritten completely. That is annoying if you were using the old version and then update it. The code below should work with the new version.

    from io import BytesIO

    import pandas as pd
    from azure.storage.blob import ContainerClient

    conn_str = ""    # storage account connection string
    container = ""   # container name
    blob_name = ""   # blob (file) name within the container

    container_client = ContainerClient.from_connection_string(
        conn_str=conn_str,
        container_name=container,
    )
    # download_blob returns a StorageStreamDownloader object;
    # readall() then pulls the whole blob into memory as bytes
    downloaded_blob = container_client.download_blob(blob_name)

    # The files in the question are tab-delimited, so set sep explicitly
    df = pd.read_csv(BytesIO(downloaded_blob.readall()), sep="\t")
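Since the question mentions a number of blobs rather than one, here is a minimal sketch of how the same approach might be extended to several files. The helper names `tsv_bytes_to_dataframe` and `load_blobs` are hypothetical, not part of the SDK, and the sketch assumes every blob is tab-delimited with the same columns and fits in memory.

```python
from io import BytesIO

import pandas as pd


def tsv_bytes_to_dataframe(raw: bytes) -> pd.DataFrame:
    """Parse the raw bytes of one tab-delimited file into a DataFrame."""
    return pd.read_csv(BytesIO(raw), sep="\t")


def load_blobs(container_client, blob_names) -> pd.DataFrame:
    """Download each named blob and stack the results into one DataFrame.

    container_client is assumed to behave like azure.storage.blob.ContainerClient:
    download_blob(name) returns an object whose readall() yields bytes.
    """
    frames = [
        tsv_bytes_to_dataframe(container_client.download_blob(name).readall())
        for name in blob_names
    ]
    # ignore_index gives the combined frame a fresh 0..n-1 index
    return pd.concat(frames, ignore_index=True)
```

With the `container_client` created above, this could be called as `df = load_blobs(container_client, ["a.tsv", "b.tsv"])`, where the blob names are placeholders. Note that `pd.concat` raises an error on an empty list, so pass at least one blob name.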
