Filesystem interface to Azure-Datalake Gen1 and Gen2 Storage
Quickstart
This package can be installed using:
pip install adlfs
or
conda install -c conda-forge adlfs
The adl:// and abfs:// protocols are included in fsspec's known_implementations registry
in fsspec > 0.6.1. With older versions, users must explicitly register the adlfs protocols with fsspec.
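For older fsspec versions, explicit registration can be sketched roughly as follows. This is a minimal illustration, not adlfs's own code: the helper name register_adlfs is hypothetical, and the class paths and the {'class': ...} entry shape are assumptions that should be checked against the installed adlfs and fsspec versions.

```python
# Hypothetical helper: the class paths below are assumptions about adlfs's
# public classes; verify them against the installed adlfs version.
ADLFS_PROTOCOLS = {
    "adl": "adlfs.AzureDatalakeFileSystem",  # Gen1 (assumed class path)
    "abfs": "adlfs.AzureBlobFileSystem",     # Gen2 (assumed class path)
}


def register_adlfs(registry):
    """Add adlfs entries to a registry dict without overwriting existing ones.

    Returns the list of protocols that were newly added.
    """
    added = []
    for protocol, class_path in ADLFS_PROTOCOLS.items():
        if protocol not in registry:
            registry[protocol] = {"class": class_path}
            added.append(protocol)
    return added
```

With fsspec installed, this would presumably be called as register_adlfs(fsspec.registry.known_implementations) before opening any adl:// or abfs:// URL. Entries already present are left untouched, so the call is harmless on newer fsspec versions.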
To use the Gen1 filesystem:
import dask.dataframe as dd
storage_options={'tenant_id': TENANT_ID, 'client_id': CLIENT_ID, 'client_secret': CLIENT_SECRET}
dd.read_csv(f'adl://{STORE_NAME}/{FOLDER}/*.csv', storage_options=storage_options)
To use the Gen2 filesystem you can use the protocol abfs or az:
import dask.dataframe as dd
storage_options={'account_name': ACCOUNT_NAME, 'account_key': ACCOUNT_KEY}
ddf = dd.read_csv(f'abfs://{CONTAINER}/{FOLDER}/*.csv', storage_options=storage_options)
ddf = dd.read_parquet(f'az://{CONTAINER}/folder.parquet', storage_options=storage_options)
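The Gen2 URLs used above follow a protocol://container/path shape. A small helper along these lines could assemble them consistently; gen2_url and the container and folder names are hypothetical and not part of adlfs.

```python
# Protocols the text above names for the Gen2 filesystem.
GEN2_PROTOCOLS = ("abfs", "az")


def gen2_url(container, path, protocol="abfs"):
    """Build a 'protocol://container/path' URL for the Gen2 filesystem."""
    if protocol not in GEN2_PROTOCOLS:
        raise ValueError(f"unsupported protocol: {protocol!r}")
    # Strip stray slashes so the pieces join with exactly one separator.
    return f"{protocol}://{container.strip('/')}/{path.lstrip('/')}"


# Placeholder container and folder names, for illustration only.
csv_glob = gen2_url("mycontainer", "myfolder/*.csv")
parquet_url = gen2_url("mycontainer", "folder.parquet", protocol="az")
```

The resulting strings can be passed to dd.read_csv or dd.read_parquet together with storage_options, as in the lines above.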
Accepted protocol / uri formats include:
'PROTOCOL://container