Moving data from a database to Azure blob storage

Submitted by 懵懂的女人 on 2020-04-18 05:41:12

Question


I'm able to read the data with dask.dataframe.read_sql_table, e.g. df = dd.read_sql_table(table='TABLE', uri=uri, index_col='field', npartitions=N)
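
A minimal sketch of that read step, assuming a hypothetical SQLAlchemy connection string (substitute your own) and that 'field' is an indexed numeric or datetime column dask can partition on:

import dask.dataframe as dd

# Hypothetical connection string; replace with your own database URI.
uri = 'mssql+pyodbc://USER:PASSWORD@SERVER/DATABASE?driver=ODBC+Driver+17+for+SQL+Server'

# Mirrors the call above: dask splits the table into npartitions chunks
# along index_col, so that column should be indexed and orderable.
df = dd.read_sql_table(table='TABLE', uri=uri, index_col='field', npartitions=8)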

What would be the next (best) steps for saving it as a Parquet file in Azure Blob Storage?

From my brief research, there are a couple of options:

  • Save the Parquet output locally and upload it with AzCopy (https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-blobs?toc=/azure/storage/blobs/toc.json), which is not great for big data (see the sketch after this list)
  • I believe adlfs can be used to read from (and write to) blob storage
  • Use dask.dataframe.to_parquet and work out how to point it at the blob container
  • The intake project (not sure where to start)
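
For the first option, a rough sketch (assuming pyarrow is installed) would be to write the partitions to a local directory and then push that directory to the container with azcopy:

# df is the dask dataframe returned by read_sql_table above.
# This writes one Parquet part file per partition under ./table_parquet/;
# that directory can then be uploaded to the blob container with azcopy.
df.to_parquet('table_parquet/', engine='pyarrow')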

Answer 1:


$ pip install adlfs

import dask.dataframe as dd

# adlfs registers the abfs:// protocol with fsspec, so dask can write the
# partitions straight to the blob container without saving locally first.
dd.to_parquet(
    df=df,
    path='abfs://{BLOB}/{FILE_NAME}.parquet',
    storage_options={'account_name': 'ACCOUNT_NAME',
                     'account_key': 'ACCOUNT_KEY'},
)
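
To sanity-check the upload (and to illustrate the adlfs read path mentioned in the question), the same storage_options can be passed to dd.read_parquet; the container and file names below are the same placeholders used above:

import dask.dataframe as dd

# abfs:// is the protocol adlfs registers with fsspec; dask will list and
# read the part files it wrote to the container.
df_check = dd.read_parquet(
    'abfs://{BLOB}/{FILE_NAME}.parquet',
    storage_options={'account_name': 'ACCOUNT_NAME',
                     'account_key': 'ACCOUNT_KEY'},
)
print(df_check.head())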


Source: https://stackoverflow.com/questions/60765331/moving-data-from-a-database-to-azure-blob-storage
