Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

prats33
by New Contributor
  • 767 Views
  • 1 reply
  • 0 kudos

Schedule job termination

Hi, I want to terminate my Databricks job daily at 11:59 AM. How can I achieve this in Databricks?

Latest Reply
Ajay-Pandey
Esteemed Contributor III

Hi @prats33, you can use the Databricks Clusters API to terminate your cluster at any specific time: create a notebook that calls the API, then schedule it as a Databricks workflow job on a job cluster at 11:59.
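
For illustration, a minimal notebook sketch of that approach (the workspace URL, secret scope, and cluster ID below are placeholders; the 11:59 schedule itself is configured on the workflow job, not in code):

# Sketch: terminate a cluster via the Clusters API from a scheduled notebook.
# All identifiers below are placeholders.
import requests

HOST = "https://<your-workspace>.cloud.databricks.com"
TOKEN = dbutils.secrets.get(scope="ops", key="api-token")  # hypothetical secret scope
CLUSTER_ID = "<cluster-id>"

# clusters/delete terminates the cluster (it does not permanently delete it).
resp = requests.post(
    f"{HOST}/api/2.1/clusters/delete",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"cluster_id": CLUSTER_ID},
)
resp.raise_for_status()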

srikanth2
by New Contributor II
  • 2288 Views
  • 2 replies
  • 0 kudos

Can we use Managed Identity to create a mount point for ADLS Gen2?

Hi, we would like to use Azure Managed Identity to create a mount point to read/write data from/to ADLS Gen2. We are also using the following code snippet to use MSI authentication to read data from ADLS Gen2, but it is giving an error: storage_account_name = "<<...

Latest Reply
Walter_C
Databricks Employee

It seems that using User Assigned Managed Identity to read/write from ADLS Gen2 inside a notebook is not directly supported at the moment.
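
For reference, the commonly documented alternative is mounting with a service principal via OAuth rather than MSI; a minimal sketch, with all names, IDs, and secret scopes as placeholders:

# Sketch: mount ADLS Gen2 using a service principal (OAuth), not MSI.
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret":
        dbutils.secrets.get(scope="kv", key="sp-secret"),
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

dbutils.fs.mount(
    source="abfss://<container>@<storage-account>.dfs.core.windows.net/",
    mount_point="/mnt/mydata",
    extra_configs=configs,
)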

1 More Replies
stepysamud
by New Contributor
  • 832 Views
  • 1 reply
  • 0 kudos

Workflow UI broken after creating job via the API

Hi all, I'm in the process of migrating from Databricks Azure to Databricks AWS. One part of this is migrating all our workflows, which I wanted to do via the /api/2.1/jobs/create API with the workflow passed in the JSON body. I have successfully created...
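
For context, a minimal sketch of that create call (host, token, and the job definition below are placeholders, not the poster's actual workflow):

import requests

HOST = "https://<your-workspace>.cloud.databricks.com"
TOKEN = "<personal-access-token>"

# A minimal Jobs 2.1 definition: one notebook task on a new job cluster.
job_spec = {
    "name": "migrated-workflow",
    "tasks": [
        {
            "task_key": "main",
            "notebook_task": {"notebook_path": "/Workspace/example"},
            "new_cluster": {
                "spark_version": "14.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 2,
            },
        }
    ],
}

resp = requests.post(
    f"{HOST}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=job_spec,
)
print(resp.json())  # returns {"job_id": ...} on success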

Latest Reply
Walter_C
Databricks Employee

Hello, many thanks for your question. The error message shown mentions a possible timeout or network issue. As a first step, have you tried opening the page in another browser or in incognito mode? Also, have you tried using different ...

Sasikala
by New Contributor
  • 1185 Views
  • 1 reply
  • 0 kudos

Service Principal Managed by Databricks

I have done the below steps:
1. Created a Databricks managed service principal
2. Created an OAuth secret
3. Gave all necessary permissions to the service principal
I'm trying to use this service principal in Azure DevOps to automate CI/CD, but it fails as...

Latest Reply
Walter_C
Databricks Employee

Have you followed the steps for using a service principal for CI/CD, available here: https://learn.microsoft.com/en-us/azure/databricks/dev-tools/ci-cd/ci-cd-sp
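
As a starting point, a sketch of the OAuth machine-to-machine token exchange that flow relies on (host, client ID, and secret are placeholders):

import requests

HOST = "https://<your-workspace>.azuredatabricks.net"
CLIENT_ID = "<service-principal-application-id>"
CLIENT_SECRET = "<oauth-secret>"

# Exchange the service principal's OAuth secret for a short-lived access token.
resp = requests.post(
    f"{HOST}/oidc/v1/token",
    auth=(CLIENT_ID, CLIENT_SECRET),
    data={"grant_type": "client_credentials", "scope": "all-apis"},
)
access_token = resp.json()["access_token"]  # use as a Bearer token in API calls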

radothede
by Contributor III
  • 1556 Views
  • 1 reply
  • 0 kudos

Can on-demand clusters be shared across multiple jobs using a cluster pool with max capacity?

I have a cluster pool with max capacity, and I run multiple jobs against that cluster pool. Can on-demand clusters created within this cluster pool be shared across multiple different jobs at the same time? The reason I'm asking is that I can see a downgrade...

gabe123
by New Contributor
  • 1054 Views
  • 0 replies
  • 0 kudos

Strange error with custom module in Delta Live Tables pipeline

The chunk of code in question:

sys.path.append(spark.conf.get("util_path", "/Workspace/Repos/Production/loch-ness/utils/"))
from broker_utils import extract_day_with_suffix, proper_case_address_udf, proper_case_last_name_first_udf, proper_case_ud...

AKUMAR_DEngg
by New Contributor II
  • 1977 Views
  • 0 replies
  • 0 kudos

Job cluster's CPU utilization goes higher than 100% a few times during the workload run

I have a data engineering pipeline workload that runs on Databricks. The job cluster has the following configuration: worker i3.4xlarge with 122 GB memory and 16 cores; driver i3.4xlarge with 122 GB memory and 16 cores; min workers 4 and max workers 8. We noticed...

RicardoS
by New Contributor II
  • 8542 Views
  • 3 replies
  • 1 kudos

Value of SQL variable in IF statement using Spark SQL

Hi there, I am new to Spark SQL and would like to know if it is possible to reproduce the below T-SQL query in Databricks. This is a sample query, but I want to determine whether a query needs to be executed or not. DECLARE @VariableA AS INT, @Vari...

Latest Reply
Edthehead
Contributor III

Since you are looking for a single value back, you can use the CASE function to achieve what you need.

%sql
SET var.myvarA = (SELECT 6);
SET var.myvarB = (SELECT 7);
SELECT CASE WHEN ${var.myvarA} = ${var.myvarB} THEN 'Equal' ELSE 'Not equal' END AS resu...
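
A complete version of that pattern, since the snippet above is truncated (run from a notebook; SET stores text that ${...} later substitutes back into the query):

spark.sql("SET var.myvarA = (SELECT 6)")
spark.sql("SET var.myvarB = (SELECT 7)")

# The substituted scalar subqueries are compared inside CASE.
spark.sql(
    "SELECT CASE WHEN ${var.myvarA} = ${var.myvarB} "
    "THEN 'Equal' ELSE 'Not equal' END AS result"
).show()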

2 More Replies
Fnazar
by New Contributor II
  • 1562 Views
  • 1 reply
  • 0 kudos

Billing of Databricks Job clusters

Hi all, please help me understand how billing is calculated for using a job cluster. The documentation says they are charged on an hourly basis, so if my job ran for 1 hr 30 mins, will I be charged for the 30 mins based on the hourly rate, or will it be charged f...

Latest Reply
PL_db
Databricks Employee

Job clusters consume DBUs per hour depending on the VM size. Databricks billing happens at per-second granularity. That means if you run your job for 1.5 hours, you will be charged DBUs/hour * 1.5 * SKU_price; accordingly, if you run your...
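
As a worked example of that formula (the DBU rate and SKU price below are made-up, illustrative values):

dbu_per_hour = 2.0   # DBUs/hour for the chosen VM size (example value)
sku_price = 0.15     # $ per DBU for the SKU (example value)
hours = 1.5          # billed per second, so 1 hr 30 mins = exactly 1.5 hours

cost = dbu_per_hour * hours * sku_price
print(f"${cost:.2f}")  # 2.0 * 1.5 * 0.15 = $0.45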

Kayl669
by New Contributor III
  • 3198 Views
  • 5 replies
  • 0 kudos

SQL code against tables with '>' in headers suddenly failing?

Just want to post this issue we're experiencing here in case other people are facing something similar. Below is the wording of the support ticket I've raised: SQL code that has been working is suddenly failing due to syntax errors today. Ther...

Latest Reply
Kayl669
New Contributor III

The point we've got to with this is that MS Support / Databricks have acknowledged that they did something and are working on a fix: "The issue occurred due to the regression in the recent DBR maintenance release... Our engineering team is workin...
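
For anyone hitting similar parser errors while the fix rolls out, backticks let Spark SQL reference column names containing characters like '>'; a small illustration with hypothetical names:

# Hypothetical table and column names.
df = spark.createDataFrame([(1, 10)], ["id", "price>usd"])
df.createOrReplaceTempView("quotes")
spark.sql("SELECT `price>usd` FROM quotes").show()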

4 More Replies
Red1
by New Contributor III
  • 4100 Views
  • 6 replies
  • 2 kudos

Autoingest not working with Unity Catalog in DLT pipeline

Hey everyone, I've built a very simple pipeline with a single DLT using auto ingest, and it works, provided I don't specify the output location. When I build the same pipeline but set UC as the output location, it fails when setting up S3 notification...

Latest Reply
Red1
New Contributor III

Hey @Babu_Krishnan, I was! I had to reach out to my Databricks support engineer directly, and the resolution was to add "cloudfiles.awsAccessKey" and "cloudfiles.awsSecretKey" to the params as in the screenshot below (apologies, I don't know why the sc...
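
A sketch of that workaround as a plain Auto Loader stream (outside DLT for brevity; the bucket, path, and secret scope are placeholders, and the option keys are camel-cased cloudFiles.*):

df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.useNotifications", "true")  # S3 notification mode
    .option("cloudFiles.awsAccessKey", dbutils.secrets.get("aws", "access-key"))
    .option("cloudFiles.awsSecretKey", dbutils.secrets.get("aws", "secret-key"))
    .load("s3://example-bucket/landing/")
)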

5 More Replies
Mado
by Valued Contributor II
  • 15490 Views
  • 4 replies
  • 3 kudos

Resolved! Using "Select Expr" and "Stack" to Unpivot PySpark DataFrame doesn't produce expected results

I am trying to unpivot a PySpark DataFrame, but I don't get the correct results. Sample dataset:

# Prepare Data
data = [("Spain", 101, 201, 301), \
        ("Taiwan", 102, 202, 302), \
        ("Italy", 103, 203, 303), \
        ("China", 104, 204, 304...

Latest Reply
lukeoz
New Contributor III

You can also use backticks around the column names that would otherwise be recognised as numbers.

from pyspark.sql import functions as F

unpivotExpr = "stack(3, '2018', `2018`, '2019', `2019`, '2020', `2020`) as (Year, CPI)"
unPivotDF = df.select("C...
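
A complete, runnable version of that pattern, using the sample data from the question:

from pyspark.sql import functions as F

data = [("Spain", 101, 201, 301), ("Taiwan", 102, 202, 302),
        ("Italy", 103, 203, 303), ("China", 104, 204, 304)]
df = spark.createDataFrame(data, ["Country", "2018", "2019", "2020"])

# Backticks keep the year columns from being parsed as integer literals.
unpivotExpr = "stack(3, '2018', `2018`, '2019', `2019`, '2020', `2020`) as (Year, CPI)"
df.select("Country", F.expr(unpivotExpr)).show()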

3 More Replies
6502
by New Contributor III
  • 2238 Views
  • 1 reply
  • 0 kudos

Delete on streaming table and startingVersion

I deleted some records from a streaming table by mistake, and of course, the streaming job stopped working. So I restored the table to the version before the delete was done, and attempted to restart the job using startingVersion set to the new vers...

Latest Reply
raphaelblg
Databricks Employee

Hello @6502, It appears you've used the `startingVersion` parameter in your streaming query, which causes the stream to begin processing data from the version prior to the DELETE operation. However, the DELETE operation will still be processe...
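
For reference, a minimal sketch of that kind of restart (the table name and version number are placeholders):

# Resume a Delta stream from a specific table version, e.g. after a RESTORE.
df = (
    spark.readStream.format("delta")
    .option("startingVersion", "42")  # placeholder version
    .table("my_streaming_table")
)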

Erik_L
by Contributor II
  • 1070 Views
  • 0 replies
  • 0 kudos

BUG: Unity Catalog kills UDF

We have UDFs in a few locations, and today we noticed their performance collapsed. This seems to be caused by Unity Catalog.

Test environment 1:
Databricks Runtime Environment: 14.3 / 15.1
Compute: 1 master, 4 nodes
Policy: Unrestricted
Access Mode: Shared
Tes...

