Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Abdul-Mannan
by New Contributor III
  • 151 Views
  • 1 reply
  • 0 kudos

Notifications have file information but DataFrame is empty using Auto Loader file notification mode

Using DBR 13.3, I'm ingesting data from one ADLS storage account using Auto Loader with file notification mode enabled, and writing to a container in another ADLS storage account. This is older code that uses a foreachBatch sink to process the data ...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Here are some potential steps and considerations to troubleshoot and resolve the issue: Permissions and Configuration: Ensure that the necessary permissions are correctly set up for file notification mode. This includes having the appropriate roles ...
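
For readers who land here, a minimal sketch of the setup under discussion, assuming Auto Loader's file notification option and a foreachBatch sink (the paths, the cloudFiles.format value, and the batch handler are placeholders, not the poster's code):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

def process_batch(batch_df, batch_id):
    # Hypothetical per-batch handler; replace with the real transformation.
    (batch_df.write.format("delta").mode("append")
        .save("abfss://target@otheraccount.dfs.core.windows.net/bronze"))

(spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.useNotifications", "true")  # file notification mode
    .load("abfss://source@sourceaccount.dfs.core.windows.net/landing")
    .writeStream
    .option("checkpointLocation",
            "abfss://target@otheraccount.dfs.core.windows.net/_checkpoints/ingest")
    .foreachBatch(process_batch)
    .start())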

thecodecache
by New Contributor II
  • 1745 Views
  • 2 replies
  • 0 kudos

Transpile a SQL Script into PySpark DataFrame API equivalent code

Input SQL Script (assume any dialect): SELECT b.se10, b.se3, b.se_aggrtr_indctr, b.key_swipe_ind FROM (SELECT se10, se3, se_aggrtr_indctr, ROW_NUMBER() OVER (PARTITION BY SE10 ...

Latest Reply
MathieuDB
Databricks Employee
  • 0 kudos

Hello @thecodecache, have a look at the SQLGlot project: https://github.com/tobymao/sqlglot?tab=readme-ov-file#faq It can easily transpile SQL to Spark SQL, like this: import sqlglot from pyspark.sql import SparkSession # Initialize Spark session spar...
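
Since the snippet above is cut off, here is a self-contained sketch of the transpilation step SQLGlot performs (the input query is shortened from the original post, and the table name and source dialect are assumptions):

import sqlglot

sql = """
SELECT se10, se3, se_aggrtr_indctr,
       ROW_NUMBER() OVER (PARTITION BY se10 ORDER BY se3) AS rn
FROM some_table
"""
# transpile() returns a list of statements rendered in the target dialect.
print(sqlglot.transpile(sql, read="tsql", write="spark")[0])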

1 More Reply
William_Scardua
by Valued Contributor
  • 8134 Views
  • 2 replies
  • 1 kudos

PySpark or Scala?

Hi guys, many people use PySpark to develop their pipelines. In your opinion, in which cases is it better to use one or the other? Or is it better to choose a single language? Thanks

Latest Reply
hari-prasad
Valued Contributor II
  • 1 kudos

Hi @William_Scardua, it is advisable to consider using Python (or PySpark) due to Spark's comprehensive API support for Python. Furthermore, Databricks currently supports Delta Live Tables (DLT) with Python, but does not support Scala at this time. Ad...

1 More Reply
JrV
by New Contributor
  • 55 Views
  • 1 reply
  • 0 kudos

SPARQL and RDF data

Hello Databricks Community, does anyone have experience with running SPARQL (https://en.wikipedia.org/wiki/SPARQL) queries in Databricks? Make a connection to the Community SolidServer https://github.com/CommunitySolidServer/CommunitySolidServer and que...

Latest Reply
User16502773013
Databricks Employee
  • 0 kudos

Hello @JrV, for this use case Databricks currently supports the Bellman SPARQL engine, which can run on Databricks as a Scala library operating on a DataFrame of triples (S, P, O). Integration is also available for Stardog through Databricks Partner Conne...
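
For illustration only, the "DataFrame of triples (S, P, O)" shape the reply refers to can be mocked up in PySpark like this (the data and prefixes are invented; Bellman's own API is not shown here):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

triples = spark.createDataFrame(
    [("ex:alice", "foaf:knows", "ex:bob"),
     ("ex:bob", "foaf:name", "Bob")],
    ["s", "p", "o"])

# The SPARQL pattern (?who foaf:knows ex:bob) expressed as a plain filter:
triples.filter("p = 'foaf:knows' AND o = 'ex:bob'").select("s").show()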

Gajju
by New Contributor
  • 57 Views
  • 1 reply
  • 0 kudos

[Deprecation Marker Required]: MERGE INTO Clause

Dear friends: Considering that MERGE INTO may generate wrong results (The APPLY CHANGES APIs: Simplify change data capture with Delta Live Tables | Databricks on AWS), may I ask why its API is still floating around in the technical documentation without a "Deprec...

Latest Reply
User16502773013
Databricks Employee
  • 0 kudos

Hello @Gajju, MERGE INTO is not being deprecated. APPLY CHANGES should be seen as an enhanced merge process in Delta Live Tables that handles out-of-sequence records automatically, as shown in the example in the documentation shared. The notion of wr...
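
For comparison, a hedged sketch of the APPLY CHANGES Python API the reply mentions (the table and column names are illustrative):

import dlt
from pyspark.sql import functions as F

dlt.create_streaming_table("customers_target")

dlt.apply_changes(
    target="customers_target",
    source="customers_cdc_feed",     # hypothetical CDC source table
    keys=["customer_id"],
    sequence_by=F.col("event_ts"),   # this ordering handles out-of-sequence records
    stored_as_scd_type=1)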

milind2000
by New Contributor
  • 60 Views
  • 1 reply
  • 0 kudos

Question about Data Management for Supply-Demand Allocation

I have a scenario where I am trying to parallelize supply-demand allotment between sellers and buyers with many-to-many links. I am unsure whether I can parallelize the calculation using PySpark operations. I have two columns to keep track of in...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Parallelizing supply-demand allotment in PySpark can be challenging due to the need for sequential updates to supply and demand values across rows. However, it is possible to achieve this using PySpark operations, though it may require a different ap...
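
One way that "different approach" can look, sketched as a greedy, priority-ordered fill using a window cumulative sum (every table and column name here is hypothetical):

from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

demand = spark.createDataFrame(
    [("s1", "b1", 10), ("s1", "b2", 30), ("s2", "b1", 20)],
    ["seller", "buyer", "demand_qty"])
supply = spark.createDataFrame([("s1", 25), ("s2", 50)], ["seller", "supply_qty"])

# Running total of demand per seller in priority order (here: by buyer id).
w = Window.partitionBy("seller").orderBy("buyer")
allocated = (demand
    .withColumn("cum_demand", F.sum("demand_qty").over(w))
    .join(supply, "seller")
    # Each request gets whatever supply remains before it, capped at its demand.
    .withColumn("allocated_qty", F.greatest(F.lit(0), F.least(
        F.col("demand_qty"),
        F.col("supply_qty") - F.col("cum_demand") + F.col("demand_qty")))))

allocated.show()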

glevine
by New Contributor
  • 141 Views
  • 1 reply
  • 0 kudos

Resolved! DNSResolve Error while establishing JDBC connection to Azure Databricks

I am using the Databricks JDBC driver (https://databricks.com/spark/jdbc-drivers-download) to connect to Azure Databricks through a VPN. I am connecting through a SaaS low-code platform, Appian, so I don't have access to any more logs. We have set up ...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

It seems that DNS is not able to resolve the domain name of your workspace. From a browser over the VPN connection, are you able to access it?

eballinger
by New Contributor III
  • 231 Views
  • 6 replies
  • 0 kudos

Resolved! DLT Pipeline Event Logs

There seems to be an issue now with our DLT pipeline event logs. I am not sure if this is a recent bug or not (they were OK in December), but the issue is in dev, QC, and prod, and we only have a couple of days of history logs now visible in the UI. From wha...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Great to hear your issue got resolved.

5 More Replies
Costas96
by New Contributor III
  • 119 Views
  • 1 reply
  • 1 kudos

Resolved! Delta Live Tables: Add sequential column

Hello everyone, I have a DLT table (examp_table) and I want to add a sequential column whose values are incremented every time a record gets ingested. I tried to do that with the monotonically_increasing_id and Window.orderBy("a column") functions...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Hi @Costas96, thanks for your question. You can use the identity column feature: https://www.databricks.com/blog/2022/08/08/identity-columns-to-generate-surrogate-keys-are-now-available-in-a-lakehouse-near-you.html
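
For anyone skimming, a minimal sketch of the identity-column approach from that post (run in a notebook where spark is available; the table and column names are illustrative):

spark.sql("""
    CREATE TABLE IF NOT EXISTS examp_table_with_id (
        seq_id  BIGINT GENERATED ALWAYS AS IDENTITY,
        payload STRING
    ) USING DELTA
""")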

BenceCzako
by New Contributor II
  • 275 Views
  • 5 replies
  • 0 kudos

Databricks mount bug

Hello, I have a weird problem in Databricks for which I hope you can suggest some solutions. I have an Azure ML blob storage mounted to Databricks with a folder structure that can be accessed from a notebook as /dbfs/mnt/azuremount/foo/bar/something.t...

Latest Reply
BenceCzako
New Contributor II
  • 0 kudos

Hello, can you figure out the issue?

4 More Replies
drag7ter
by Contributor
  • 104 Views
  • 3 replies
  • 0 kudos

Disable caching in Serverless SQL Warehouse

I have a Serverless SQL Warehouse cluster, and I run my SQL code in the SQL editor. When I run a query for the first time, it takes 30 secs of total time, but on every subsequent run I see in the query profile that it gets the result set from the cache and takes 1-2 secs total...

Latest Reply
drag7ter
Contributor
  • 0 kudos

As I mentioned above, this setting doesn't work for a SQL warehouse cluster: SET use_cached_result = false

2 More Replies
dbx-user7354
by New Contributor III
  • 3238 Views
  • 5 replies
  • 1 kudos

PySpark DataFrame orderBy only orders within partitions when having multiple workers

I came across a PySpark issue when sorting a DataFrame by a column. It seems like PySpark only orders the data within partitions when there are multiple workers, even though it shouldn't. from pyspark.sql import functions as F import matplotlib.pyplot...

Latest Reply
NemesisMF
New Contributor II
  • 1 kudos

@NandiniN Did you try with a multi-worker cluster? Which Runtime and which Spark version did you use? Maybe it would be good to test with Runtime 13.3; then we would know whether it was fixed in the meantime. I found this on Stack Overflow. Seems someo...
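
A minimal repro sketch for anyone who wants to check this on a multi-worker cluster; after orderBy, collect() should return a globally sorted result, not per-partition order:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Enough partitions that the sort has to shuffle across workers.
df = spark.range(1_000_000).withColumn("x", F.rand(seed=42)).repartition(64)

values = [row.x for row in df.orderBy("x").select("x").collect()]
assert values == sorted(values), "global ordering violated"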

4 More Replies
Costas96
by New Contributor III
  • 426 Views
  • 7 replies
  • 0 kudos

Resolved! Delta Live Tables: Creating table with spark.sql and everything gets ingested into the first column

Hello everyone. I am new to DLT and I am trying to practice with it by doing some basic ingestions. I have a query like the following where I am getting data from two tables using UNION. I have noticed that everything gets ingested into the first colum...

Latest Reply
Costas96
New Contributor III
  • 0 kudos

Actually I found the solution: I used spark.readStream to read the external tables a and b into two DataFrames and then just did combined_df = df_a.union(df_b) to create my DLT table. Thank you!
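
In DLT Python terms, the fix described above would look roughly like this (the target table name is illustrative, and spark here is the session the DLT runtime provides):

import dlt

@dlt.table(name="examp_table")
def examp_table():
    df_a = spark.readStream.table("a")
    df_b = spark.readStream.table("b")
    # union matches columns by position, so both tables must share column order
    return df_a.union(df_b)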

6 More Replies
udara_zure
by New Contributor II
  • 176 Views
  • 3 replies
  • 0 kudos

Resolved! What is the best way to deploy workflows with different notebooks to execute in different workspaces

I have a workflow in the QA workspace with one notebook attached. I need to deploy the same workflow to the PRD workspace, with all the notebooks in the Azure DevOps repo, and attach and run a different notebook in the PRD workflow.

Latest Reply
ashraf1395
Valued Contributor
  • 0 kudos

Databricks Asset Bundles can be a great solution for this. Clear and straightforward. https://docs.databricks.com/en/dev-tools/bundles/index.html
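
To make that concrete, a hedged sketch of a databricks.yml in which each target attaches a different notebook (hostnames, job names, and paths are placeholders; deploy with databricks bundle deploy -t prd):

bundle:
  name: my_workflow

resources:
  jobs:
    my_job:
      name: my_job
      tasks:
        - task_key: main
          notebook_task:
            notebook_path: ./notebooks/qa_notebook.py

targets:
  qa:
    workspace:
      host: https://adb-1111111111111111.11.azuredatabricks.net
  prd:
    workspace:
      host: https://adb-2222222222222222.22.azuredatabricks.net
    resources:
      jobs:
        my_job:
          tasks:
            - task_key: main
              notebook_task:
                notebook_path: ./notebooks/prd_notebook.py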

2 More Replies
