Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

AlexWeh
by New Contributor II
  • 13621 Views
  • 1 reply
  • 2 kudos

Universal Azure Credential Passthrough

At the moment, Azure Databricks supports Azure AD login for the workspace and single-user clusters with Azure Data Lake Storage credential passthrough. But this can only be used for Data Lake Storage. Is there already a way, or are...

Latest Reply
polivbr
New Contributor II
  • 2 kudos

I have exactly the same issue. I need to call a protected API within a notebook but have no access to the current user's access token. I've had to resort to nasty workarounds involving installing and running the Azure CLI from within the not...

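For readers hitting the same wall, here is a minimal sketch of the kind of Azure CLI workaround described in the reply above, assuming the Azure CLI is installed and logged in on the cluster; the resource URI and API endpoint are hypothetical placeholders, not anything Databricks provides.

# Sketch of the Azure CLI workaround mentioned above: obtain an Azure AD access
# token by shelling out to `az`, then call a protected API with it.
# The resource URI and endpoint below are hypothetical placeholders.
import json
import subprocess
import requests

API_RESOURCE = "api://my-protected-api"                  # hypothetical app ID URI
API_URL = "https://my-protected-api.example.com/data"    # hypothetical endpoint

# `az account get-access-token` prints JSON containing an accessToken field.
result = subprocess.run(
    ["az", "account", "get-access-token", "--resource", API_RESOURCE, "--output", "json"],
    capture_output=True, text=True, check=True,
)
token = json.loads(result.stdout)["accessToken"]

response = requests.get(API_URL, headers={"Authorization": f"Bearer {token}"})
response.raise_for_status()
print(response.json())
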
repcak
by New Contributor III
  • 2351 Views
  • 1 reply
  • 2 kudos

Init Scripts with mounted azure data lake storage gen2

I'm trying to access an init script stored on Azure Data Lake Storage Gen2 mounted to DBFS. I mounted the storage to dbfs:/mnt/storage/container/script.sh and when I try to access it I get an error: Cluster scoped init script dbfs:/mnt/storage/containe...

Latest Reply
User16752239289
Databricks Employee
  • 2 kudos

I do not think an init script saved under a mount point works, and we do not suggest that. If you specify abfss, then the cluster needs to be configured so that it can authenticate and access the ADLS Gen2 folder. Otherwise, the cluster will no...

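As an illustration of the reply above, here is a rough sketch of the OAuth (service principal) settings that let a cluster read directly from abfss:// instead of a mount. The storage account, tenant, and secret scope names are placeholders, and for a cluster-scoped init script these entries belong in the cluster's Spark config rather than in a notebook.

# Sketch: service principal (OAuth) configuration for direct abfss:// access.
# Storage account, tenant, and secret scope names are placeholders.
storage_account = "mystorageaccount"
tenant_id = "<tenant-id>"
client_id = dbutils.secrets.get("my-scope", "sp-client-id")          # hypothetical secret scope
client_secret = dbutils.secrets.get("my-scope", "sp-client-secret")

suffix = f"{storage_account}.dfs.core.windows.net"

spark.conf.set(f"fs.azure.account.auth.type.{suffix}", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{suffix}",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{suffix}", client_id)
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{suffix}", client_secret)
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{suffix}",
               f"https://login.microsoftonline.com/{tenant_id}/oauth2/token")

# With the same keys in the cluster's Spark config, the init script can be
# referenced as abfss://container@mystorageaccount.dfs.core.windows.net/script.sh
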
manasa
by Contributor
  • 4585 Views
  • 3 replies
  • 1 kudos

Need help to insert huge data into cosmos db from azure data lake storage using databricks

I am trying to insert 6 GB of data into Cosmos DB using the OLTP connector.
Container RUs: 40000
Cluster config:
cfg = { "spark.cosmos.accountEndpoint" : cosmosdbendpoint, "spark.cosmos.accountKey" : cosmosdbmasterkey, "spark.cosmos.database" : cosmosd...

Latest Reply
ImAbhishekTomar
New Contributor III
  • 1 kudos

Did anyone find a solution for this? I'm also using a similar cluster and RU setting, and data ingestion is taking a lot of time…

2 More Replies
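For context, a minimal sketch of the write path this thread is about, using the Cosmos DB Spark 3 OLTP connector with bulk writes enabled. The variable names mirror the config shown in the question, the database/container values are placeholders, and actual throughput still depends on the container's provisioned RUs.

# Sketch: bulk write into Cosmos DB with the Spark 3 OLTP connector.
# Assumes the azure-cosmos-spark connector is installed on the cluster;
# database and container values are placeholders in the spirit of the question.
cfg = {
    "spark.cosmos.accountEndpoint": cosmosdbendpoint,
    "spark.cosmos.accountKey": cosmosdbmasterkey,
    "spark.cosmos.database": cosmosdatabase,          # hypothetical, mirrors the question
    "spark.cosmos.container": cosmoscontainer,        # hypothetical, mirrors the question
    "spark.cosmos.write.strategy": "ItemOverwrite",   # upsert semantics
    "spark.cosmos.write.bulk.enabled": "true",        # batch requests to use RUs efficiently
}

(df.write
   .format("cosmos.oltp")
   .options(**cfg)
   .mode("APPEND")
   .save())
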
Netty
by New Contributor III
  • 4764 Views
  • 5 replies
  • 7 kudos

Resolved! What's the easiest way to upsert data into a table? (Azure ADLS Gen2)

I had been trying to upsert rows into a table in Azure Blob Storage (ADLS Gen 2) based on two partitions (sample code below).

insert overwrite table new_clicks_table partition(client_id, mm_date)
select click_id
  ,user_id
  ,click_timestamp_gmt
  ,ca...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 7 kudos

The below code might help you.

Python:
(df.write
   .mode("overwrite")
   .option("partitionOverwriteMode", "dynamic")
   .saveAsTable("default.people10m")
)

SQL:
SET spark.sql.sources.partitionOverwriteMode=dynamic;
INSERT OVERWRITE TABLE default.people10m...

4 More Replies
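If the goal is a true upsert rather than rewriting whole partitions, a Delta Lake MERGE is another common approach. The sketch below assumes new_clicks_table is a Delta table and that click_id plus the two partition columns identify a row; new_clicks_df is a hypothetical DataFrame of incoming rows.

# Sketch: upsert with Delta Lake MERGE instead of INSERT OVERWRITE.
# Assumes new_clicks_table is a Delta table; join keys are illustrative.
from delta.tables import DeltaTable

target = DeltaTable.forName(spark, "new_clicks_table")

(target.alias("t")
   .merge(new_clicks_df.alias("s"),
          "t.click_id = s.click_id AND t.client_id = s.client_id AND t.mm_date = s.mm_date")
   .whenMatchedUpdateAll()
   .whenNotMatchedInsertAll()
   .execute())
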
enavuio
by New Contributor II
  • 1559 Views
  • 2 replies
  • 3 kudos

Count on External Table to Azure Data Storage is taking too long

I have created an external table to Azure Data Lake Storage Gen2. The container has about 200K JSON files. The structure of the JSON files is created with:

CREATE EXTERNAL TABLE IF NOT EXISTS dbo.table(
    ComponentInfo STRUCT<ComponentHost: STRING, ...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Ena Vu, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

1 More Reply
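The thread does not record a resolution; a commonly suggested approach for slow counts over many small JSON files is to materialize the data once into Delta (or Parquet) and query that copy instead. A minimal sketch, with the path and table name as placeholders:

# Sketch: counts over ~200K small JSON files are dominated by file listing and
# JSON parsing; materializing once into Delta makes later queries much cheaper.
# The path and table name are placeholders.
raw = spark.read.json("abfss://container@account.dfs.core.windows.net/json-files/")

(raw.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable("component_info_delta"))

spark.table("component_info_delta").count()
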
Aran_Oribu
by New Contributor II
  • 4254 Views
  • 5 replies
  • 2 kudos

Resolved! Create and update a csv/json file in ADLSG2 with Eventhub in Databricks streaming

Hello, this is my first post here and I am a total beginner with Databricks and Spark. Working on an IoT cloud project with Azure, I'm looking to set up continuous stream processing of data. A current architecture already exists thanks to Stream Ana...

Latest Reply
-werners-
Esteemed Contributor III
  • 2 kudos

So the Event Hub creates files (JSON/CSV) on ADLS. You can read those files into Databricks with the spark.read.csv/json method. If you want to read many files in one go, you can use wildcards, e.g. spark.read.json("/mnt/datalake/bronze/directory/*/*...

4 More Replies
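To round out the reply above, a short sketch of the wildcard batch read it describes, plus a streaming variant using Databricks Auto Loader for continuous processing; the paths, checkpoint location, and target table name are placeholders.

# Sketch: batch read of the captured JSON files with wildcards (as in the reply),
# and a streaming variant with Auto Loader. Paths and table names are placeholders.

# Batch: read every JSON file two directory levels down in one go.
df = spark.read.json("/mnt/datalake/bronze/directory/*/*.json")

# Streaming: pick up new files continuously as they land.
stream = (spark.readStream
          .format("cloudFiles")
          .option("cloudFiles.format", "json")
          .option("cloudFiles.schemaLocation", "/mnt/datalake/checkpoints/bronze_directory/schema")
          .load("/mnt/datalake/bronze/directory/"))

(stream.writeStream
   .option("checkpointLocation", "/mnt/datalake/checkpoints/bronze_directory")
   .toTable("bronze_events"))
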
Hubert-Dudek
by Esteemed Contributor III
  • 17140 Views
  • 3 replies
  • 26 kudos

How to connect your Azure Data Lake Storage to Azure Databricks Standard Workspace 👉 Private link. In your storage accounts please go to “Networ...

How to connect your Azure Data Lake Storage to Azure Databricks Standard Workspace 👉 Private link. In your storage accounts please go to “Networking” -> “Private endpoint connections” and click Add Private Endpoint. It is important to add private links in ...

Latest Reply
Anonymous
Not applicable
  • 26 kudos

@Hubert Dudek - Have I told you lately that you're the best!?!

2 More Replies