azure-data-lake-gen2

Why does a Java Azure Function App freeze when trying to access Azure Data Lake?

生来就可爱ヽ(ⅴ<●) submitted on 2021-01-29 12:28:00
Question: I am developing a Java Azure Function that needs to download a file from Azure Data Lake Gen2. When the function tries to read the file, it freezes: no exception is thrown and nothing is written to the console. I am using the azure-storage-file-datalake SDK for Java as a dependency, and this is my code: import com.azure.storage.common.StorageSharedKeyCredential; import com.azure.storage.file.datalake.DataLakeDirectoryClient; import com.azure.storage.file.datalake.DataLakeFileClient; import com

Copy different types of files from Azure Data Lake Gen1 to Azure Data Lake Gen2 with attributes (like last updated)

余生颓废 submitted on 2021-01-28 06:24:51
Question: I need to migrate all my data from Azure Data Lake Gen1 to Gen2. Our lake contains a mix of file types (.txt, .zip, .json, and many others), and we want to move them as-is to the Gen2 lake. We also want to preserve each file's last-updated time as it was in the Gen1 lake. I was looking at ADF for this use case, but that requires defining a dataset, and defining a dataset requires choosing a data format (Avro, JSON, XML, binary, etc.). As we have different types of data mixed together, I tried

How to connect AMLS to ADLS Gen 2?

戏子无情 submitted on 2020-12-11 02:33:07
Question: I would like to register a dataset from ADLS Gen2 in my Azure Machine Learning workspace (azureml-core==1.12.0). Given that service principal information is not required in the Python SDK documentation for .register_azure_data_lake_gen2(), I successfully used the following code to register ADLS Gen2 as a datastore: from azureml.core import Datastore adlsgen2_datastore_name = os.environ['adlsgen2_datastore_name'] account_name=os.environ['account_name'] # ADLS Gen2 account name file_system

Transfer from ADLS2 to compute target is very slow in Azure Machine Learning

岁酱吖の submitted on 2020-07-09 13:20:12
Question: During a training script executed on a compute target, we're trying to download a registered Dataset from an ADLS2 Datastore. The problem is that it takes hours to download ~1.5 GB (split into ~8500 files) to the compute target with the following method: from azureml.core import Datastore, Dataset, Run, Workspace # Retrieve the run context to get Workspace RUN = Run.get_context(allow_offline=True) # Retrieve the workspace ws = RUN.experiment.workspace # Creating the Dataset object based on

Not able to see 'Lifecycle management' option for ADLS Gen2

前提是你 submitted on 2020-01-14 04:34:25
Question: I have created an ADLS (Azure Data Lake Storage) Gen2 resource (StorageV2 with hierarchical namespace enabled). I created the resource in the Central US region; the performance/access tier is Standard/Hot and replication is LRS. But I can't see the 'Lifecycle management' option for this resource in the portal. ADLS Gen2 is simply a StorageV2 account with hierarchical namespace enabled, and since the lifecycle management option exists for StorageV2 as per the Microsoft documentation, it should be