Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

ashraf1395
by Honored Contributor
  • 7221 Views
  • 1 replies
  • 1 kudos

Optimising Clusters in Databricks on GCP

Hi there everyone, we are trying to get hands-on with Databricks Lakehouse for a prospective client's project. Our major aim for the project is to compare a data lakehouse on Databricks with a BigQuery data warehouse in terms of costs and time to set up and run que...

smedegaard
by New Contributor III
  • 2363 Views
  • 1 replies
  • 0 kudos

DLT run fails with "com.databricks.cdc.spark.DebeziumJDBCMicroBatchProvider not found"

I've created a streaming live table from a foreign catalog. When I run the DLT pipeline it fails with "com.databricks.cdc.spark.DebeziumJDBCMicroBatchProvider not found". I haven't seen any documentation that suggests I need to install Debezium manuall...

MartinH
by New Contributor II
  • 20543 Views
  • 7 replies
  • 6 kudos

Resolved! Azure Data Factory and Photon

Hello, we have Databricks Python workbooks accessing Delta tables. These workbooks are scheduled/invoked by Azure Data Factory. How can I enable Photon on the linked services that are used to call Databricks? If I specify a new job cluster, there does n...

Latest Reply
CharlesReily
New Contributor III
  • 6 kudos

When you create a cluster on Databricks, you can enable Photon by selecting the "Photon" option in the cluster configuration settings. This is typically done when creating a new cluster, and you would find the option in the advanced cluster configura...

6 More Replies
dbdude
by New Contributor II
  • 14461 Views
  • 3 replies
  • 1 kudos

AWS Secrets Works In One Cluster But Not Another

Why can I use boto3 to retrieve a secret from Secrets Manager on a personal cluster, but get an error on a shared cluster? NoCredentialsError: Unable to locate credentials

Latest Reply
Husky
New Contributor III
  • 1 kudos

Hey @dbdude, I am facing the same error. Did you find a solution to access the AWS credentials on a Shared Cluster? This article describes a way of storing credentials in a Unity Catalog Volume for the Shared Cluster to fetch: https://medium.com/@amluci...

2 More Replies
mamiya
by New Contributor II
  • 1801 Views
  • 1 replies
  • 0 kudos

ODBC PowerBI 2 commands in one query

Hello everyone, I'm trying to use the ODBC DirectQuery option in Power BI, but I keep getting an error about another command. The SQL query works in the SQL Editor. Do I need to change the setup of my ODBC connector? DECLARE dateFrom DATE = DA...

Deepak_Kandpal
by New Contributor III
  • 13819 Views
  • 3 replies
  • 3 kudos

Resolved! Invalid configuration value detected for fs.azure.account.key with com.crealytics:spark-excel

I have set up my Databricks notebook to use a Service Principal to access ADLS using the configuration below.
service_credential = dbutils.secrets.get(scope="<scope>",key="<service-credential-key>")
spark.conf.set("fs.azure.account.auth.type.<storage-accou...

Latest Reply
Harsha_Dbrs
New Contributor II
  • 3 kudos

Below is the implementation of the same code in Scala:
spark.sparkContext.hadoopConfiguration.set("fs.azure.account.key.<accountName>.dfs.core.windows.net", <accountKey>)
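
For notebooks written in Python rather than Scala, a rough PySpark counterpart of the Scala line above might look like the sketch below. The helper name `set_adls_account_key` is hypothetical, and the sketch assumes the cluster exposes `spark.sparkContext` and its private `_jsc` handle, which may not be the case on every cluster access mode.

```python
def set_adls_account_key(spark, account_name: str, account_key: str) -> str:
    """Set an ADLS Gen2 account key on the Hadoop configuration.

    Returns the configuration key that was set, which is handy for logging.
    """
    conf_key = f"fs.azure.account.key.{account_name}.dfs.core.windows.net"
    # PySpark reaches the Hadoop configuration through the JVM-backed
    # context handle (_jsc); this is a private attribute, so treat it as
    # an assumption rather than a stable API.
    spark.sparkContext._jsc.hadoopConfiguration().set(conf_key, account_key)
    return conf_key


# Usage inside a notebook (placeholders left as-is, not run here):
# set_adls_account_key(spark, "<accountName>", "<accountKey>")
```

Passing the session in as an argument keeps the helper easy to exercise outside a cluster with a stand-in object.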

2 More Replies
prats33
by New Contributor
  • 1125 Views
  • 1 replies
  • 0 kudos

schedule job termination

Hi, I want to terminate my Databricks job daily at 11:59 AM. How can I achieve this in Databricks?

Latest Reply
Ajay-Pandey
Databricks MVP
  • 0 kudos

Hi @prats33, you can use the Databricks Clusters API to terminate your cluster at a specific time: create a notebook that calls the API and schedule it as a Databricks Workflows job on a job cluster at 11:59.
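
The notebook suggested above could contain something like the following minimal sketch. It only builds the Clusters API request (`POST /api/2.0/clusters/delete`, which terminates a cluster rather than permanently deleting it) and leaves the actual send commented out. The function name `build_terminate_request` and the host, token, and cluster ID values are placeholders assumed for illustration; the 11:59 schedule itself would be configured on the job, not in this code.

```python
import json
import urllib.request


def build_terminate_request(host: str, token: str, cluster_id: str) -> urllib.request.Request:
    """Build (but do not send) a Clusters API call that terminates a cluster."""
    # The request body identifies the cluster to terminate.
    payload = json.dumps({"cluster_id": cluster_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/clusters/delete",
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; in a notebook the token would normally be read
# from a secret scope instead of being written inline.
request = build_terminate_request(
    host="https://my-workspace.example.com",
    token="example-token",
    cluster_id="example-cluster-id",
)
# urllib.request.urlopen(request)  # uncomment to actually send the call
```

Keeping request construction separate from sending makes it possible to check the URL and payload before the job is scheduled.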

srikanth2
by New Contributor II
  • 3502 Views
  • 2 replies
  • 0 kudos

Can we use Managed Identity to create a mount point for ADLS Gen2?

Hi, we would like to use Azure Managed Identity to create a mount point to read/write data from/to ADLS Gen2. We are also using the following code snippet to use MSI authentication to read data from ADLS Gen2, but it is giving an error: storage_account_name = "<<...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

It seems that using User Assigned Managed Identity to read/write from ADLS Gen2 inside a notebook is not directly supported at the moment.

1 More Replies
stepysamud
by New Contributor
  • 1325 Views
  • 1 replies
  • 0 kudos

Workflow UI broken after creating job via the api

Hi all, I'm in the process of migrating from Databricks on Azure to Databricks on AWS. One part of this is migrating all our workflows, which I wanted to do via the /api/2.1/jobs/create API with the workflow passed in the JSON body. I have successfully created...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Hello, many thanks for your question. The error message shown mentions a possible timeout or network issue. As a first step, have you tried opening the page in another browser or in incognito mode? Also, have you tried using different ...

Sasikala
by New Contributor
  • 1688 Views
  • 1 replies
  • 0 kudos

Service Principal Managed by Databricks

I have done the steps below:
1. Created a Databricks-managed service principal
2. Created an OAuth secret
3. Gave all necessary permissions to the service principal
I'm trying to use this service principal in Azure DevOps to automate CI/CD, but it fails as...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Have you followed the steps for using a service principal for CI/CD, available here: https://learn.microsoft.com/en-us/azure/databricks/dev-tools/ci-cd/ci-cd-sp

radothede
by Valued Contributor II
  • 2216 Views
  • 1 replies
  • 0 kudos

Can on-demand clusters be shared across multiple jobs using a cluster pool with max capacity?

I have a cluster pool with max capacity. I run multiple jobs against that cluster pool. Can on-demand clusters created within this cluster pool be shared across multiple different jobs at the same time? The reason I'm asking is I can see a downgrade...

gabe123
by New Contributor
  • 1351 Views
  • 0 replies
  • 0 kudos

Strange Error with custom module in delta live table pipeline

The chunk of code in question:
sys.path.append(spark.conf.get("util_path", "/Workspace/Repos/Production/loch-ness/utils/"))
from broker_utils import extract_day_with_suffix, proper_case_address_udf, proper_case_last_name_first_udf, proper_case_ud...

AKUMAR_DEngg
by New Contributor II
  • 2676 Views
  • 0 replies
  • 0 kudos

Job cluster's CPU utilization goes higher than 100% a few times during the workload run

I have a Data Engineering pipeline workload that runs on Databricks. The job cluster has the following configuration: Worker: i3.4xlarge with 122 GB memory and 16 cores; Driver: i3.4xlarge with 122 GB memory and 16 cores; Min Workers: 4 and Max Workers: 8. We noticed...

RicardoS
by New Contributor II
  • 10685 Views
  • 3 replies
  • 1 kudos

Value of SQL variable in IF statement using Spark SQL

Hi there, I am new to Spark SQL and would like to know if it is possible to reproduce the T-SQL query below in Databricks. This is a sample query, but I want to determine whether a query needs to be executed or not. DECLARE       @VariableA AS INT ,     @Vari...

Latest Reply
Edthehead
Contributor III
  • 1 kudos

Since you are looking for a single value back, you can use the CASE function to achieve what you need.
%sql
SET var.myvarA = (SELECT 6);
SET var.myvarB = (SELECT 7);
SELECT CASE WHEN ${var.myvarA} = ${var.myvarB} THEN 'Equal' ELSE 'Not equal' END AS resu...

2 More Replies