Data Engineering

Forum Posts

sasi2
by New Contributor II
  • 245 Views
  • 1 replies
  • 0 kudos

Connecting to MuleSoft from Databricks

Hi, is there an existing connectivity pipeline for accessing MuleSoft or Anypoint Exchange data from Databricks? I have seen many options for accessing Databricks data in MuleSoft, but can we read data from MuleSoft into Databricks? Please gi...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

  Hi @sasi2, Connecting MuleSoft or AnyPoint to exchange data with Databricks is possible, and there are several options you can explore. Let’s dive into some solutions: Using JDBC Driver for Databricks in Mule Applications: The CData JDBC Driver...

MartinH
by New Contributor II
  • 2476 Views
  • 7 replies
  • 4 kudos

Azure Data Factory and Photon

Hello, we have Databricks Python workbooks accessing Delta tables. These workbooks are scheduled/invoked by Azure Data Factory. How can I enable Photon on the linked services that are used to call Databricks? If I specify a new job cluster, there does n...

Latest Reply
CharlesReily
New Contributor III
  • 4 kudos

When you create a cluster on Databricks, you can enable Photon by selecting the "Photon" option in the cluster configuration settings. This is typically done when creating a new cluster, and you would find the option in the advanced cluster configura...

6 More Replies
subha2
by New Contributor II
  • 340 Views
  • 1 replies
  • 0 kudos

Not able to read tables in Unity Catalog in parallel

There are some tables under a schema/database in Unity Catalog. The notebook needs to read the tables in parallel using a loop and threads, and execute the configured query. But the SQL statement is not getting executed via spark.sql() or spark.read.table(). It ...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @subha2, It seems you’re encountering an issue related to executing SQL statements in Spark. Let’s troubleshoot this step by step: Check the Unity Catalog Configuration: Verify that the Unity Catalog configuration is correctly set up. Ensure t...
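
For readers following this thread, here is a minimal sketch of the loop-and-thread pattern the question describes. It is not the poster's code: `run_in_parallel`, `fake_query`, and the `main.demo.*` table names are hypothetical, and the query is passed in as a plain callable so the sketch runs without a cluster. In a notebook the callable might be something like `lambda name: spark.read.table(name).count()`, assuming an active `spark` session. Collecting exceptions per table is one way to see why a statement appears not to execute, since errors raised inside worker threads are otherwise easy to miss.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def run_in_parallel(
    table_names: Iterable[str],
    run_query: Callable[[str], object],
    max_workers: int = 4,
) -> Dict[str, object]:
    """Call run_query once per table on a thread pool; map each table to its result or its exception."""
    names = list(table_names)
    results: Dict[str, object] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every table first so the queries can overlap.
        futures = {name: pool.submit(run_query, name) for name in names}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:
                # Keep the failure visible instead of losing it in a worker thread.
                results[name] = exc
    return results


def fake_query(name: str) -> int:
    """Hypothetical stand-in for a Spark read; fails for one table to show error capture."""
    if name == "main.demo.broken":
        raise ValueError(f"cannot read {name}")
    return len(name)


outcome = run_in_parallel(
    ["main.demo.a", "main.demo.bb", "main.demo.broken"], fake_query
)
for table, value in outcome.items():
    print(table, "->", repr(value))
```

Inspecting `outcome` after the run shows which tables returned a value and which raised, which is usually the first thing to check when threaded reads seem to do nothing.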

DBX-2024
by New Contributor
  • 74 Views
  • 1 replies
  • 0 kudos

Job Cluster's CPU utilization goes higher than 100% a few times during the workload run

I have a Data Engineering pipeline workload that runs on Databricks. The job cluster has the following configuration: Worker: i3.4xlarge with 122 GB memory and 16 cores. Driver: i3.4xlarge with 122 GB memory and 16 cores. Min workers: 4, max workers: 8. We noticed...

Data Engineering
Databricks
Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @DBX-2024, Let’s break down your questions: High CPU Utilization Spikes: Are They Problematic? High CPU utilization spikes can be problematic depending on the context. Here are some considerations: Normal Behavior: It’s common for CPU utilizat...

smukhi
by New Contributor
  • 82 Views
  • 1 replies
  • 0 kudos

Encountering Error UNITY_CREDENTIAL_SCOPE_MISSING_SCOPE

As of this morning we started receiving the following error message on a Databricks job with a single Pyspark Notebook task. The job has not had any code changes in 2 months. The cluster configuration has also not changed. The last successful run of ...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @smukhi, The error message you’re encountering, specifically the “Py4JJavaError” with the “Missing Credential Scope” issue, can be quite puzzling. Let’s explore some potential solutions and ideas to troubleshoot this problem: Check Cluster Con...

Skr7
by New Contributor
  • 7 Views
  • 0 replies
  • 0 kudos

Databricks Asset Bundles

Hi, I'm implementing Databricks Asset Bundles. My scripts are in GitHub, and my /resource has all the .yml files for my Databricks workflows, which point to the main branch:
git_source:
  git_url: https://github.com/xxxx
  git_provider: ...

Data Engineering
Databricks
gabe123
by New Contributor
  • 203 Views
  • 1 replies
  • 0 kudos

Strange Error with custom module in delta live table pipeline

The chunk of code in question:
sys.path.append(
    spark.conf.get("util_path", "/Workspace/Repos/Production/loch-ness/utils/")
)
from broker_utils import extract_day_with_suffix, proper_case_address_udf, proper_case_last_name_first_udf, proper_case_ud...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @gabe123 , It seems like you’re encountering a ModuleNotFoundError when trying to import the broker_utils module in your Python code. Let’s troubleshoot this issue step by step: Check Module Location: First, ensure that the broker_utils.py fil...
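
The "check module location" step in the reply can be sketched as below. This is an illustrative helper, not the poster's pipeline code: `import_from_dir` and `demo_broker_utils` are hypothetical names, a temporary directory stands in for the repo's utils folder, and the check only covers single-file modules (not packages). The idea is to fail early with a clear message when the expected `.py` file is missing, rather than hitting a bare ModuleNotFoundError later.

```python
import importlib
import os
import sys
import tempfile


def import_from_dir(module_name: str, directory: str):
    """Import module_name after confirming that directory contains <module_name>.py."""
    candidate = os.path.join(directory, module_name + ".py")
    if not os.path.isfile(candidate):
        # Report the exact path that was checked so a wrong directory is obvious.
        raise FileNotFoundError(f"{candidate} does not exist")
    if directory not in sys.path:
        sys.path.append(directory)
    # Make sure the import system notices files created after startup.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Throwaway directory standing in for the utils folder (assumption for the demo).
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "demo_broker_utils.py"), "w") as handle:
    handle.write("def proper_case(text):\n    return text.title()\n")

demo_module = import_from_dir("demo_broker_utils", demo_dir)
print(demo_module.proper_case("loch ness"))
```

Printing `sys.path` and the resolved directory inside the pipeline is a similar low-cost check when the path comes from a configuration value.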

lieber_augustin
by New Contributor
  • 102 Views
  • 1 replies
  • 0 kudos

Reading from one Postgres table result in several Scan JDBCRelation operations

Hello, I am working on a Spark job where I'm reading several tables from PostgreSQL into DataFrames as follows:
df = (spark.read
    .format("postgresql")
    .option("query", query)
    .option("host", database_host)
    .option("port...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @lieber_augustin, Optimizing the performance of your PostgreSQL queries involves several considerations. Let’s address both the potential optimizations and the reason behind multiple Scan JDBCRelation operations. Database Design: Properly des...

Husky
by New Contributor III
  • 1198 Views
  • 4 replies
  • 1 kudos

Resolved! Upload file from local file system to Unity Catalog Volume (via databricks-connect)

Context:
IDE: IntelliJ 2023.3.2
Library: databricks-connect 13.3
Python: 3.10
Description:
I develop notebooks and python scripts locally in the IDE and I connect to the spark cluster via databricks-connect for a better developer experience. I download a...

Latest Reply
lathaniel
New Contributor III
  • 1 kudos

Late to the discussion, but I too was looking for a way to do this _programmatically_, as opposed to the UI. The solution I landed on was using the Python SDK (though you could assuredly do this using an API request instead if you're not in Python): w ...

3 More Replies
jainshasha
by New Contributor
  • 77 Views
  • 4 replies
  • 0 kudos

Job Cluster in Databricks workflow

Hi, I have configured 20 different workflows in Databricks. Each is configured with a job cluster with a different name. All 20 workflows are scheduled to run at the same time. But even with a different job cluster configured in each of them, they run sequentially w...

Latest Reply
jainshasha
New Contributor
  • 0 kudos

Hi @Kaniz, attaching screenshots of 5 of the workflows that are scheduled at the same time.

3 More Replies
dbdude
by New Contributor II
  • 4513 Views
  • 4 replies
  • 0 kudos

AWS Secrets Works In One Cluster But Not Another

Why can I use boto3 to retrieve a secret from Secrets Manager on a personal cluster, but get an error on a shared cluster? NoCredentialsError: Unable to locate credentials

Latest Reply
Husky
New Contributor III
  • 0 kudos

Hey @dbdude, I am facing the same error. Did you find a solution to access the AWS credentials on a Shared Cluster? This article describes a way of storing credentials in a Unity Catalog Volume to fetch by the Shared Cluster: https://medium.com/@amluci...

3 More Replies
madhumitha
by Visitor
  • 51 Views
  • 2 replies
  • 0 kudos

Connect Power BI Desktop semantic model output to Databricks

Hello, I am trying to connect the Power BI semantic model output (basically the data that has already been preprocessed) to Databricks. Does anybody know how to do this? I would like it to be an automated process, so I would like to know any way to p...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @madhumitha, Connecting Power BI semantic model output to Databricks can be done in a few steps. Here are a couple of options: Databricks Power Query Connector: The new Databricks connector is natively integrated into Power BI. You can configu...

1 More Replies
Ulman
by Visitor
  • 58 Views
  • 3 replies
  • 0 kudos

Switching to File Notification Mode with ADLS Gen2 - Encountering StorageException

Hello, we are currently using Auto Loader in file listing mode for a stream, which is experiencing significant latency due to the non-incremental naming of files in the directory, a condition that cannot be altered. In an effort to mitigate this...

Data Engineering
ADLS gen2
autoloader
file notification mode
Latest Reply
Wojciech_BUK
Contributor III
  • 0 kudos

Hi @Ulman, I think that by default this method will try to create the Event Grid and Storage Queue on the same storage account as your data. Please note that Premium Blob Storage does not have the Queue service. In my opinion the easiest way would be to create ma...

2 More Replies
RabahO
by New Contributor II
  • 8 Views
  • 0 replies
  • 0 kudos

Dashboard always display truncated data

Hello, we're working with a serverless SQL cluster to query Delta tables and display some analytics in dashboards. We have some basic group-by queries that generate around 36k rows, and they are executed without the "limit" keyword. So in the data ...

pragarwal
by New Contributor II
  • 25 Views
  • 1 replies
  • 0 kudos

Adding Member to group using account databricks rest api

Hi all, I want to add a member to a group at the Databricks account level using the REST API (https://docs.databricks.com/api/azure/account/accountgroups/patch), as mentioned in this link. I was able to authenticate but am not able to add a member while using belo...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @pragarwal,  The body you’ve shared is almost correct. However, there’s a small issue. Instead of directly providing the email address as the value, you need to provide an object with the "value" field set to the email address. Here’s the correcte...
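
The shape described in the reply, each member wrapped in an object with a "value" field, can be sketched as a SCIM PatchOp body like the one below. This is a sketch under assumptions, not the corrected body from the thread (which is cut off above): `build_add_members_patch` is a hypothetical helper, and `<member-identifier>` is a placeholder. The schema URN is the standard SCIM 2.0 PatchOp one; whether the API accepts an email address or requires the member's ID in "value" should be confirmed against the API reference linked in the question, since SCIM member values are commonly IDs.

```python
import json
from typing import List


def build_add_members_patch(member_values: List[str]) -> dict:
    """Build a SCIM 2.0 PatchOp body that adds members to a group."""
    if not member_values:
        raise ValueError("at least one member value is required")
    return {
        "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
        "Operations": [
            {
                "op": "add",
                "path": "members",
                # Each member is an object with a "value" field, not a bare string.
                "value": [{"value": member} for member in member_values],
            }
        ],
    }


# "<member-identifier>" is a placeholder; substitute whatever identifier the API expects.
payload = build_add_members_patch(["<member-identifier>"])
body = json.dumps(payload, indent=2)
print(body)
```

The resulting `body` string is what would be sent as the JSON body of the PATCH request, alongside the authentication the poster already has working.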
