Hi all. Two questions about serverless compute: 1) How do I check or query the runtime version used by serverless compute/workflows? This is important, as I use some features supported only on a specific runtime version or higher. 2) Can you confirm whether spark.conf.set(......) ...
Sure: https://learn.microsoft.com/en-us/azure/databricks/sql/language-manual/functions/current_version and https://learn.microsoft.com/en-us/azure/databricks/sql/language-manual/functions/version. Only some properties are allowed to be changed.
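A minimal sketch of that check from a notebook, based on the two SQL functions linked above (the exact output fields depend on your environment):

```python
# version() returns the Spark version string; current_version() returns a
# struct of Databricks version fields (see the docs linked above).
spark.sql("SELECT version()").show(truncate=False)
spark.sql("SELECT current_version()").show(truncate=False)
```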
Hi all, I am quite new to Databricks. Overall I have enjoyed the experience so far, but now I ran into a problem for which I was not able to find an acceptable solution. Here is my setup: I have a bunch of S3 buckets and need to get the data into Databricks, preferab...
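No answer is visible in the thread; as a starting point, a minimal sketch of reading S3 data into a table (assuming credentials for the bucket are already set up, e.g. via an instance profile or a Unity Catalog external location; bucket, format, and table name are placeholders):

```python
# Read raw files from S3 into a DataFrame and land them in a Delta table.
df = spark.read.format("json").load("s3://my-bucket/raw/")  # placeholders
df.write.mode("append").saveAsTable("bronze.raw_events")    # hypothetical table
```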
Hi, I am supposed to create a transformation notebook, but I am having trouble when trying to save the transformed file into blob storage. I didn't use any layers, just the layer in which I am performing the transformation. If I use wasbs I receive dif...
Hi @yagmur, did you assign the required permissions to the service principal on the storage account? And make sure you're configuring the connection to the storage account in the proper way. You should have something similar to the code below: configs = {
"fs.a...
Hi, previously I was using the old preview Databricks API version. Now I have switched to v1 of the API, which uses "Execute a SQL statement" (POST /api/2.0/sql/statements/). I wanted to know how to pass the parameters class when my value is a list of strings fo...
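The thread shows no answer; one hedged workaround, given that the Statement Execution API's parameters carry string values, is to serialize the list and split it server-side (warehouse ID, host, token, and the query itself are placeholders):

```python
import requests

ids = ["a", "b", "c"]
payload = {
    "warehouse_id": "<warehouse-id>",
    "statement": "SELECT * FROM t WHERE array_contains(split(:ids, ','), t.id)",
    "parameters": [{"name": "ids", "value": ",".join(ids), "type": "STRING"}],
}
resp = requests.post(
    "https://<workspace-host>/api/2.0/sql/statements/",
    headers={"Authorization": "Bearer <token>"},
    json=payload,
)
print(resp.json())
```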
I have two notebooks created for my Delta Live Tables pipeline. The first is a utils notebook with functions I will be reusing for other pipelines. The second contains my actual creation of the Delta Live Tables. I added both notebooks to the pipeline...
Hi Dave, you can solve this by putting your utils into a Python file and referencing that .py file in the DLT notebook. I provided a template for the Python file below. STEP 1: # import functions
from pyspark.sql import SparkSession
import IPython
dbut...
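The template is truncated in the thread; a hedged completion of the same pattern (the dbutils lookup and the helper function are illustrative assumptions, not the original author's code):

```python
# utils.py: obtain spark/dbutils handles in a plain .py file so that helper
# functions can use them outside a notebook context.
from pyspark.sql import SparkSession
import IPython

spark = SparkSession.builder.getOrCreate()
dbutils = IPython.get_ipython().user_ns["dbutils"]  # assumes a notebook-backed runtime

def with_audit_column(df, col_name="ingested_at"):
    """Hypothetical example helper; replace with your own utilities."""
    from pyspark.sql.functions import current_timestamp
    return df.withColumn(col_name, current_timestamp())
```

In the DLT notebook you would then add the folder to sys.path (e.g. `import sys; sys.path.append("/Workspace/<path-to-folder>")`) and `import utils` before using the helpers.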
Hi Databricks community, I'm facing a challenge extracting JSON data from Elasticsearch in Azure Databricks efficiently while maintaining header information. Previously I had to use RDDs for parallel extraction, but they're no longer supported in Databrick...
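No answer is visible in the thread; one DataFrame-based alternative to RDDs is the elasticsearch-hadoop Spark connector (a sketch, assuming the connector library is installed on the cluster; host, port, and index are placeholders):

```python
# Read an Elasticsearch index in parallel as a DataFrame instead of an RDD.
df = (spark.read
      .format("org.elasticsearch.spark.sql")
      .option("es.nodes", "<es-host>")
      .option("es.port", "9200")
      .load("<index-name>"))
```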
Hi, I have a string column containing a number in EU format, i.e. with a comma instead of a dot, e.g. 10,35. I need to convert this string into a proper decimal data type as part of the data transformation into the target table. I could do it as below by replacing the ",...
Hi @Harsha777, your solution looks good! However, you may also try the to_number function, although unfortunately you will still need to replace "," with "." first. from pyspark.sql.functions import to_number, regexp_replace, lit
data = [("10,6523",), ("10,23"...
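The example is cut off in the thread; a hedged completion of the same idea (the format string and column names are assumptions):

```python
from pyspark.sql.functions import to_number, regexp_replace, lit

data = [("10,6523",), ("10,23",)]
df = spark.createDataFrame(data, ["raw"])

# Swap the EU decimal comma for a dot, then parse with an explicit format.
df = df.withColumn(
    "amount",
    to_number(regexp_replace("raw", ",", "."), lit("999999.9999")),
)
df.show()
```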
I am facing issues while importing the dlt library on Databricks Runtime 14.3. Previously, on Runtime 13.1, `import dlt` was working fine, but after updating the Runtime it is giving me an error. This is the cluster's configuration. Also ...
@Retired_mod, do you have any solution for the above problem? I saw your reply at this link: https://community.databricks.com/t5/data-engineering/no-module-named-dlt/td-p/21105, so I'm asking you. Thank you!
Hi there, I have a scenario where the source XML files may have all the fields in one run but perhaps only 80% of the fields in the next run. How do we load the files into Delta tables so that both XML files with the full field list and files with only a few fields are handled? In smalle...
Auto Loader is not an acceptable solution in my case. I tried creating an empty table from the XSD file and then loading the DataFrame. Somehow that worked and met the objective.
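A hedged sketch of that approach: define the full schema up front (e.g. derived from the XSD) so that files missing some fields still load, with the absent columns as nulls. Row tag, paths, and table name are placeholders; `format("xml")` assumes native XML support (DBR 14.3+) or the spark-xml library:

```python
from pyspark.sql.types import StructType, StructField, StringType

# Full schema covering every field the XSD allows.
full_schema = StructType([
    StructField("id", StringType(), True),
    StructField("name", StringType(), True),
    StructField("optional_field", StringType(), True),  # may be absent in some files
])

df = (spark.read
      .format("xml")
      .option("rowTag", "record")
      .schema(full_schema)
      .load("/path/to/xml/files"))

df.write.mode("append").saveAsTable("target_delta_table")
```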
As someone who frequently works with large datasets in Power BI, I’ve had my fair share of frustrations with slow query performance, especially when pulling data from Databricks. After countless hours of tweaking and experimenting, I’ve finally found...
Hey folks, I have been trying to set up a SQL warehouse for Databricks on AWS (on a new account), but I keep getting this: "Cluster Start-up Delayed. Please wait while we continue to try and start the cluster. No action is required from you." This kept h...
Hey Brahma, thanks for the hints. I actually tried a lot of things back and forth, and the only thing that finally worked was to create a new workspace. Since this was a new account, that was easy. I suppose something must have gone wrong with the con...
Hello, I'm working on a Delta Live Tables (DLT) pipeline where I need to implement a conditional step that only triggers under specific conditions. Here's the challenge I'm facing: I have a function that checks if the data meets certain thresholds. If...
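The question is truncated, so this may not match the exact need; one common pattern is to decide at pipeline-definition time whether to register a table at all (the threshold check, table names, and filter are illustrative assumptions):

```python
import dlt
from pyspark.sql.functions import col

def data_meets_threshold():
    # Placeholder check, e.g. only proceed when the source has enough rows.
    return spark.read.table("source_table").count() > 1000

if data_meets_threshold():
    @dlt.table(name="conditional_table")
    def conditional_table():
        return spark.read.table("source_table").where(col("value") > 0)
```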
We are using Databricks extensively in the company. We found out that we can't clone/copy a *.py file using the UI. We can clone notebooks but not Python files. If we clone a folder, we only clone the notebooks inside the folder, not the Python files.
@shiva1212 When working with Databricks and managing Python files, it's true that the UI limitations can sometimes be restrictive. You can use the Databricks CLI and REST API for file management and copying.
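For example, a sketch using the Databricks SDK for Python (assumes `databricks-sdk` is installed and authentication is configured; the paths are placeholders, and the exact parameters are worth checking against the SDK docs):

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.workspace import ExportFormat, ImportFormat

w = WorkspaceClient()
src = "/Workspace/Users/me@example.com/utils.py"       # placeholder source
dst = "/Workspace/Users/me@example.com/utils_copy.py"  # placeholder destination

# Export the .py file (content is returned base64-encoded) and re-import it.
exported = w.workspace.export(src, format=ExportFormat.SOURCE)
w.workspace.import_(dst, content=exported.content,
                    format=ImportFormat.AUTO, overwrite=True)
```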