Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

srikanth2
by New Contributor II
  • 861 Views
  • 2 replies
  • 0 kudos

Can we use Managed Identity to create mount point for ADLS Gen2

Hi, We would like to use Azure Managed Identity to create a mount point to read/write data from/to ADLS Gen2. We are also using the following code snippet to use MSI authentication to read data from ADLS Gen2, but it is giving an error: storage_account_name = "<<...

Latest Reply
Walter_C
Honored Contributor
  • 0 kudos

It seems that using User Assigned Managed Identity to read/write from ADLS Gen2 inside a notebook is not directly supported at the moment.

1 More Replies
stepysamud
by New Contributor
  • 430 Views
  • 1 reply
  • 0 kudos

Workflow UI broken after creating job via the api

Hi all, I'm in the process of migrating from Databricks on Azure to Databricks on AWS. One part of this is migrating all our workflows, which I wanted to do via the /api/2.1/jobs/create API with the workflow passed in the JSON body. I have successfully created...

Latest Reply
Walter_C
Honored Contributor
  • 0 kudos

Hello, many thanks for your question. The error message shown mentions a possible timeout or network issue. As a first step, have you tried opening the page in another browser or in incognito mode? Also, have you tried using different ...

Sasikala
by New Contributor
  • 612 Views
  • 1 reply
  • 0 kudos

Service Principal Managed by Databricks

I have done the steps below:
1. Created a Databricks-managed service principal
2. Created an OAuth secret
3. Gave all necessary permissions to the service principal
I'm trying to use this service principal in Azure DevOps to automate CI/CD, but it fails as...

Latest Reply
Walter_C
Honored Contributor
  • 0 kudos

Have you followed the steps for using a service principal for CI/CD, available here: https://learn.microsoft.com/en-us/azure/databricks/dev-tools/ci-cd/ci-cd-sp

radothede
by New Contributor III
  • 706 Views
  • 2 replies
  • 1 kudos

Can on-demand clusters be shared across multiple jobs using cluster pool with max capacity ?

I have a cluster pool with max capacity. I run multiple jobs against that cluster pool. Can on-demand clusters, created within this cluster pool, be shared across multiple different jobs at the same time? The reason I'm asking is I can see a downgrade...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @radothede, Cluster Pools and On-Demand Clusters: In Azure Databricks, a cluster pool is a set of idle, ready-to-use instances that clusters can draw from, and it can be shared among multiple users or jobs. Instead of giving each user their own dedicated cluster, you...

1 More Replies
Manzilla
by New Contributor II
  • 751 Views
  • 2 replies
  • 1 kudos

Delta Live table - Adding streaming to existing table

Currently, the bronze table ingests JSON files using the @dlt.table decorator on a spark.readStream function. A daily batch job does some transformation on the bronze data and stores the results in the silver table. New process: Bronze is still the same. A stream has bee...

Latest Reply
Manzilla
New Contributor II
  • 1 kudos

Thank you, that's what I understood too. It is just nice to get validation from someone else who works with this.

1 More Replies
gabrieleladd
by New Contributor II
  • 860 Views
  • 2 replies
  • 1 kudos

Clearing data stored by pipelines

Hi everyone! I'm new to Databricks and taking my first steps with Delta Live Tables, so please forgive my inexperience. I'm building my first DLT pipeline and there's something that I can't really grasp: how to clear all the objects generated or upda...

Data Engineering
Data Pipelines
Delta Live Tables
Latest Reply
Lakshay
Esteemed Contributor
  • 1 kudos

If you want to reprocess all the data, you can simply use the "Full Refresh" option in the DLT pipeline. You can read more about it here: https://docs.databricks.com/en/delta-live-tables/updates.html#how-delta-live-tables-updates-tables-and-views
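
For anyone who prefers to script this rather than click through the UI, here is a rough sketch of how a full refresh might be requested over the REST API. It assumes the pipeline updates endpoint (`POST /api/2.0/pipelines/{pipeline_id}/updates`) accepts a `full_refresh` flag in the body; the host, pipeline ID, token, and the helper name `build_full_refresh_request` are placeholders made up for illustration. The sketch only builds the request and does not send it.

```python
import json
import urllib.request


def build_full_refresh_request(host: str, pipeline_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request asking for a full refresh of a pipeline.

    Assumption: the workspace exposes POST /api/2.0/pipelines/{pipeline_id}/updates
    and honours {"full_refresh": true} in the JSON body.
    """
    url = f"{host.rstrip('/')}/api/2.0/pipelines/{pipeline_id}/updates"
    body = json.dumps({"full_refresh": True}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder values, not real identifiers.
req = build_full_refresh_request(
    host="https://example.cloud.databricks.com",
    pipeline_id="1234-abcd",
    token="dapi-placeholder",
)
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen`. Check the pipeline documentation linked above before relying on this, since a full refresh reprocesses everything.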

1 More Replies
RicardoS
by New Contributor II
  • 5000 Views
  • 3 replies
  • 1 kudos

Value of SQL variable in IF statement using Spark SQL

Hi there, I am new to Spark SQL and would like to know if it is possible to reproduce the below T-SQL query in Databricks. This is a sample query, but I want to determine if a query needs to be executed or not. DECLARE @VariableA AS INT, @Vari...

Latest Reply
Edthehead
New Contributor III
  • 1 kudos

Since you are looking for a single value back, you can use the CASE function to achieve what you need. %sql SET var.myvarA = (SELECT 6); SET var.myvarB = (SELECT 7); SELECT CASE WHEN ${var.myvarA} = ${var.myvarB} THEN 'Equal' ELSE 'Not equal' END AS resu...

2 More Replies
jaredrohe
by New Contributor III
  • 2923 Views
  • 5 replies
  • 2 kudos

Instance Profiles Do Not Work with Delta Live Tables Default Cluster Policy Access Mode "Shared"

Hello, I am attempting to configure Auto Loader in file notification mode with Delta Live Tables. I configured an instance profile, but it is not working because I immediately get AWS access denied errors. This is the same issue that is referenced here...

Data Engineering
Access Mode
Delta Live Tables
Instance Profiles
No Isolation Shared
Latest Reply
jaredrohe
New Contributor III
  • 2 kudos

Unfortunately, I never got this to work.

4 More Replies
vinayaka_pallak
by New Contributor
  • 625 Views
  • 1 reply
  • 0 kudos

Pytest on Notebook

I am currently exploring testing methodologies for Databricks notebooks and would like to inquire whether it's possible to write pytest tests for notebooks that contain code not encapsulated within functions or classes. *********************** a = 4 b ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @vinayaka_pallak, Testing Databricks Notebooks is essential to ensure the correctness and reliability of your code. While notebooks are often used for exploratory analysis and prototyping, it’s still possible to write tests for code blocks withi...
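
One possible way to test top-level code that is not wrapped in functions is sketched below. It assumes the notebook has been exported as a plain `.py` source file and does not depend on notebook-only globals such as `spark` or `dbutils`. The standard-library `runpy.run_path` executes the file and returns its global variables, which a pytest test can then assert on. The helper name `run_script` and the variables `x`, `y`, and `total` are made up for illustration and are not taken from the original post.

```python
import runpy
import tempfile
import textwrap
from pathlib import Path


def run_script(source: str) -> dict:
    """Write top-level code to a temporary .py file, execute it,
    and return the globals it defined."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "notebook_export.py"  # hypothetical file name
        path.write_text(textwrap.dedent(source))
        return runpy.run_path(str(path))


def test_total():
    # Stand-in for an exported notebook containing only top-level statements.
    namespace = run_script(
        """
        x = 2
        y = 3
        total = x + y
        """
    )
    assert namespace["total"] == 5
```

With a real export, `runpy.run_path` would be pointed at the exported file directly instead of a temporary one. Moving the logic into functions and importing them is usually easier to maintain, but this approach avoids touching the notebook code.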

htu
by New Contributor II
  • 1507 Views
  • 2 replies
  • 0 kudos

Installing Databricks Connect breaks pyspark local cluster mode

Hi, It seems that when databricks-connect is installed, pyspark is modified at the same time so that it no longer works with a local master node. This has been especially useful in testing, when unit tests for Spark-related code without any remot...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @htu, When you install Databricks Connect, it modifies the behaviour of PySpark in a way that prevents it from working with the local master node. This can be frustrating, especially when you’re trying to run unit tests for Spark-related code w...

1 More Replies
Fnazar
by New Contributor
  • 449 Views
  • 1 reply
  • 0 kudos

Billing of Databricks Job clusters

Hi All, Please help me understand how billing is calculated for a job cluster. The documentation says they are charged on an hourly basis, so if my job ran for 1 hr 30 mins, will I be charged for the 30 mins based on the hourly rate or will it be charged f...

Latest Reply
PL_db
New Contributor III
  • 0 kudos

Job clusters consume DBUs per hour depending on the VM size. The Databricks billing happens at "per second granularity", see here. That means if you run your job for 1.5 hours, you will be charged DBUs/hour*1.5*SKU_price; accordingly, if you run your...
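
The arithmetic in this reply can be sketched as follows. This is only an illustration of the DBUs/hour × hours × price formula with per-second granularity; the DBU rate and price used here are made-up numbers, not real Databricks pricing, and the function name `job_cluster_dbu_cost` is hypothetical.

```python
def job_cluster_dbu_cost(runtime_seconds: float, dbus_per_hour: float, price_per_dbu: float) -> float:
    """DBU cost of a job run, billed per second rather than rounded up to full hours."""
    hours = runtime_seconds / 3600  # fractional hours, e.g. 5400 s -> 1.5 h
    return dbus_per_hour * hours * price_per_dbu


# Hypothetical figures: a cluster consuming 4 DBUs/hour at 0.25 per DBU.
cost_90_min = job_cluster_dbu_cost(runtime_seconds=90 * 60, dbus_per_hour=4.0, price_per_dbu=0.25)
cost_2_hours = job_cluster_dbu_cost(runtime_seconds=2 * 3600, dbus_per_hour=4.0, price_per_dbu=0.25)
```

With these numbers a 90-minute run costs 1.5 and a two-hour run costs 2.0, so the 90-minute job is charged for 1.5 hours, not rounded up to 2.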

Erik_L
by Contributor II
  • 705 Views
  • 1 reply
  • 0 kudos

BUG: Unity Catalog kills UDF

We have UDFs in a few locations, and today we noticed their performance had dropped sharply. This seems to be caused by Unity Catalog. Test environment 1: Databricks Runtime Environment: 14.3 / 15.1; Compute: 1 master, 4 nodes; Policy: Unrestricted; Access Mode: Shared. Tes...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Erik_L , It appears that you’re experiencing performance issues related to Unity Catalog in your Databricks environment. Let’s explore some potential reasons and solutions: Mismanagement of Metastores: Unity Catalog, with one metastore per re...

Kayl669
by New Contributor III
  • 1327 Views
  • 5 replies
  • 0 kudos

SQL code against tables with '>' in headers suddenly failing?

Just want to post this issue we're experiencing here in case other people are facing something similar. Below is the wording of the support ticket request I've raised: SQL code that has been working is suddenly failing due to syntax errors today. Ther...

Latest Reply
Kayl669
New Contributor III
  • 0 kudos

The point that we've got to with this is that MS Support / Databricks have acknowledged that they did something and are working on a fix. "The issue occurred due to the regression in the recent DBR maintenance release...Our engineering team is workin...

4 More Replies
Mathias_Peters
by Contributor
  • 300 Views
  • 1 reply
  • 0 kudos

On the fly transformations on DLT tables

Hi, I am loading data from a Kinesis data stream using DLT. CREATE STREAMING TABLE Consumers_kinesis_2 ( ..., unbase64(data) String, ... ) AS SELECT * FROM STREAM read_kinesis (...) Is it possible to directly cast, unbase64, and/or transform the resu...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Mathias_Peters, When working with Amazon Kinesis Data Analytics, you can indeed transform data before writing it into a streaming table. Let’s explore some options: Unbase64 Transformation: To decode Base64-encoded data, you can use the unba...
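
To make the decode step concrete, here is a plain-Python illustration of what unbase64 followed by a cast to string does to a single record. It does not use DLT or Spark, and it assumes the payload is Base64-encoded UTF-8 JSON, which the original post does not confirm. The field name `consumer_id` and the helper `decode_record` are invented for the example.

```python
import base64
import json


def decode_record(data_b64: str) -> dict:
    """Decode a Base64 payload to bytes, interpret the bytes as UTF-8,
    then parse the resulting string as JSON."""
    raw_bytes = base64.b64decode(data_b64)
    return json.loads(raw_bytes.decode("utf-8"))


# Build a sample payload the way a producer might (hypothetical content).
sample = base64.b64encode(json.dumps({"consumer_id": 42}).encode("utf-8")).decode("ascii")
record = decode_record(sample)
```

In a pipeline the same three steps (decode, cast to string, parse) would be expressed as column expressions in the SELECT rather than as a per-record function.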
