Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

RishabhGarg
by New Contributor II
  • 368 Views
  • 3 replies
  • 2 kudos

Keywords and Functions supported in SQL but not in Databricks SQL.

Actually, I have around 2000 SQL queries. I have to convert them to Databricks-supported SQL so that I can run them in the Databricks environment. So I want to know the list of all keywords, functions, or anything else that is different in Databricks SQL. Pl...

Latest Reply
Kaniz_Fatma
Community Manager
  • 2 kudos

Hi @RishabhGarg, thank you for reaching out to our community! We're here to help you. To ensure we provide you with the best support, could you please take a moment to review the response and choose the one that best answers your question? Your f...

2 More Replies
ptambe
by New Contributor III
  • 4146 Views
  • 7 replies
  • 3 kudos

Resolved! Is Concurrent Writes from multiple databricks clusters to same delta table on S3 Supported?

Does Databricks support writing to the same Delta table from multiple clusters concurrently? I am specifically interested to know if there is any solution for https://github.com/delta-io/delta/issues/41 implemented in Databricks OR if you have a...

Latest Reply
dennyglee
New Contributor III
  • 3 kudos

Please note, the issue noted above [Storage System] Support for AWS S3 (multiple clusters/drivers/JVMs) is for Delta Lake OSS. As noted in this issue as well as Issue 324, as of this writing, S3 lacks putIfAbsent transactional consistency. For Del...

6 More Replies
talenik
by New Contributor III
  • 768 Views
  • 4 replies
  • 1 kudos

Resolved! Ingesting logs from Databricks (GCP) to Azure log Analytics

Hi everyone, I wanted to ask if there is any way to ingest logs from GCP Databricks into Azure Log Analytics in a store-sync fashion. Meaning we will save logs into some cloud bucket, let's say, and then from there we should be able to send l...

Data Engineering
azure log analytics
Databricks
GCP databricks
google cloud
Latest Reply
talenik
New Contributor III
  • 1 kudos

Hi @Kaniz_Fatma, thanks for the help. We decided to develop our own library for logging to Azure Log Analytics. We used a buffer for this. We are currently on timer-based logs, but in future versions we want to move to memory-based ones. Thanks, Nikhil
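
The buffering approach described in this reply can be sketched roughly as follows. This is not the poster's library: the class name `BufferedLogShipper`, the `send` callable, and both thresholds are hypothetical, and "memory-based" is interpreted here as flushing once the buffer holds a given number of records. The age check runs on each `log` call rather than on a background timer, which keeps the sketch single-threaded.

```python
import time


class BufferedLogShipper:
    """Collect log records and hand them to `send` in batches (hypothetical helper)."""

    def __init__(self, send, max_age_seconds=5.0, max_records=None, clock=time.monotonic):
        self._send = send                  # callable that receives a list of records
        self._max_age = max_age_seconds    # timer-based threshold
        self._max_records = max_records    # optional size-based threshold
        self._clock = clock
        self._buffer = []
        self._last_flush = clock()

    def _should_flush(self):
        too_old = self._clock() - self._last_flush >= self._max_age
        too_big = self._max_records is not None and len(self._buffer) >= self._max_records
        return too_old or too_big

    def log(self, record):
        self._buffer.append(record)
        if self._should_flush():
            self.flush()

    def flush(self):
        # Swap the buffer out before sending so a failing send cannot resend old records.
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._send(batch)
        self._last_flush = self._clock()
```

In a real pipeline `send` would post the batch to the log ingestion endpoint, and `flush()` would also be called once at shutdown so that the last partial batch is not lost.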

3 More Replies
Gary_Irick
by New Contributor III
  • 7198 Views
  • 10 replies
  • 12 kudos

Delta table partition directories when column mapping is enabled

I recently created a table on a cluster in Azure running Databricks Runtime 11.1. The table is partitioned by a "date" column. I enabled column mapping, like this: ALTER TABLE {schema}.{table_name} SET TBLPROPERTIES('delta.columnMapping.mode' = 'nam...

Latest Reply
talenik
New Contributor III
  • 12 kudos

Hi @Kaniz_Fatma, I have a few questions on directory names with column mapping. I have this Delta table on ADLS and I am trying to read it, but I am getting the error below. How can we read Delta tables with column mapping enabled with PySpark? Can you pleas...

9 More Replies
kodexolabs
by New Contributor
  • 212 Views
  • 0 replies
  • 0 kudos

Federated Learning for Decentralized, Secure Model Training

Federated learning allows you to train machine learning models on decentralized data while ensuring data privacy and security by storing data on local devices and only sharing model updates. This approach assures that raw data never leaves its source...

guangyi
by Contributor
  • 484 Views
  • 1 replies
  • 0 kudos

Resolved! How exactly to create cluster policy via Databricks CLI ?

I tried these approaches and none of them work: Save the JSON config into a JSON file locally and run databricks cluster-policies create --json cluster-policy.json. Error message: Error: invalid character 'c' looking for beginning of value. Save the json ...

Latest Reply
szymon_dybczak
Contributor
  • 0 kudos

Hi @guangyi, try adding @ before the name of the JSON file: databricks cluster-policies create --json @policy.json. Also make sure that you're escaping quotation marks as they do in the documentation below: Create a new policy | Cluster Policies API | REST API ...
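
One way to avoid escaping quotation marks by hand is to let a script produce the request file. The sketch below assumes the create request takes a `name` plus a `definition` field that holds the policy definition as a JSON string, which is why the inner quotes end up escaped. The helper name, file name, and policy rule are illustrative only.

```python
import json
import shlex
from pathlib import Path


def write_policy_request(path, name, definition):
    """Write a policy request body to `path` and return the matching CLI command."""
    body = {
        "name": name,
        # The definition is serialized to a string first, so that the outer
        # json.dumps call escapes its quotation marks in the file.
        "definition": json.dumps(definition),
    }
    Path(path).write_text(json.dumps(body, indent=2))
    # The leading @ makes the CLI read the JSON from the file rather than
    # parse the file name itself as JSON.
    return f"databricks cluster-policies create --json @{shlex.quote(str(path))}"


if __name__ == "__main__":
    # Illustrative rule only; replace with the real policy definition.
    example = {"autotermination_minutes": {"type": "fixed", "value": 30}}
    print(write_policy_request("policy.json", "example-policy", example))
```

Without the @ prefix, the CLI tries to parse the literal text `cluster-policy.json` as JSON, which is consistent with the "invalid character 'c'" error quoted in the question.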

suqadi
by New Contributor
  • 153 Views
  • 1 replies
  • 0 kudos

systems table predictive_optimization_operations_history stays empty

Hi, for our lakehouse with Unity Catalog enabled, we enabled the predictive optimization feature for several catalogs to clean up storage with VACUUM. When we describe the catalogs, we can see that predictive optimization is enabled. The system table for ...

Latest Reply
Walter_C
Honored Contributor
  • 0 kudos

Hello, as per the docs, data could take 24 hours to be retrieved. Can you confirm that the requirements below are met? Your region must support predictive optimization (see Databricks clouds and regions).

prasadvaze
by Valued Contributor II
  • 294 Views
  • 1 replies
  • 2 kudos

Resolved! Grant permission on catalog but revoke from schema for the same user

I have a catalog (in Unity Catalog) containing multiple schemas. I need an AD group to have SELECT permission on all the schemas, so at catalog level I granted SELECT to the AD group. Then, I need to revoke permission on one particular schema in this cat...

Latest Reply
Walter_C
Honored Contributor
  • 2 kudos

Unfortunately, this is not possible due to the hierarchical permission model in UC. You will need to grant permissions on the specific schemas directly rather than granting a broad permission at the catalog level.
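
The per-schema approach can be scripted so that the excluded schema is simply skipped. The sketch below only builds the SQL text. The catalog, schema, and group names are made up, and it assumes the group already holds whatever USE CATALOG and USE SCHEMA privileges it needs to reach the data.

```python
def schema_select_grants(catalog, schemas, principal, excluded=()):
    """Build one GRANT SELECT statement per schema, skipping excluded schemas."""
    skipped = set(excluded)
    return [
        f"GRANT SELECT ON SCHEMA `{catalog}`.`{schema}` TO `{principal}`;"
        for schema in schemas
        if schema not in skipped
    ]


if __name__ == "__main__":
    # Hypothetical names for illustration.
    for statement in schema_select_grants(
        "sales", ["raw", "curated", "restricted"], "analysts", excluded=["restricted"]
    ):
        print(statement)
```

The statements would then be run by a principal allowed to manage grants on those schemas, and the catalog-level SELECT grant would be revoked so that it no longer cascades down to the excluded schema.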

Abhot
by New Contributor II
  • 5204 Views
  • 4 replies
  • 0 kudos

Temp Table Vs Temp View Vs temp table function- which one is better for large Databrick data processing

Hello, 1) Which one is better for large data processing: temp table, temporary view, or temp table function? 2) How is lazy evaluation better for processing, and which of the above helps with lazy evaluation?

Latest Reply
Abhot
New Contributor II
  • 0 kudos

Does anyone have any suggestions regarding the question above?

3 More Replies
lozik
by New Contributor II
  • 250 Views
  • 2 replies
  • 0 kudos

Python callback functions fail to trigger

How can I get sys.excepthook and the atexit module to trigger a callback function on exit of a Python notebook? These fail to work when an unhandled exception is encountered (excepthook) or the program exits (atexit).

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @lozik, to achieve this, you can combine sys.excepthook with a monkey-patched sys.exit().
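
A minimal sketch of that combination in plain Python follows. The helper name `install_exit_hooks` is hypothetical, and whether these hooks fire inside a notebook is an open assumption: notebook kernels usually handle exceptions themselves, so `sys.excepthook` may never be called there.

```python
import atexit
import sys


def install_exit_hooks(callback):
    """Call callback(reason) once on sys.exit(), an unhandled exception, or interpreter exit."""
    state = {"fired": False}
    previous_hook = sys.excepthook
    original_exit = sys.exit

    def fire(reason):
        # Guard so the callback runs only once even if several hooks trigger.
        if not state["fired"]:
            state["fired"] = True
            callback(reason)

    def excepthook(exc_type, exc_value, traceback):
        fire(f"unhandled {exc_type.__name__}")
        previous_hook(exc_type, exc_value, traceback)

    def patched_exit(code=None):
        fire("sys.exit")
        original_exit(code)

    def at_interpreter_exit():
        fire("atexit")

    sys.excepthook = excepthook
    sys.exit = patched_exit
    atexit.register(at_interpreter_exit)

    def restore():
        # Undo all three hooks.
        sys.excepthook = previous_hook
        sys.exit = original_exit
        atexit.unregister(at_interpreter_exit)

    return restore
```

Calling `install_exit_hooks(print)` at the top of a script would print the reason for the exit. If the hooks do not fire in a notebook, wrapping the notebook's main logic in `try`/`finally` is a simpler fallback.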

1 More Replies
greyamber
by New Contributor II
  • 219 Views
  • 1 replies
  • 0 kudos

Python UDF vs Scala UDF in pyspark code

Is there a performance difference between a Python UDF and a Scala UDF in PySpark code?

Latest Reply
szymon_dybczak
Contributor
  • 0 kudos

Hi @greyamber, yes, there is a difference. Scala would be faster. You can read about the reason and a benchmark in the following blog: Spark UDF — Deep Insights in Performance | by QuantumBlack, AI by McKinsey | QuantumBlack, AI by McKinsey | Medium

hpant
by New Contributor III
  • 465 Views
  • 3 replies
  • 0 kudos
Latest Reply
szymon_dybczak
Contributor
  • 0 kudos

Hi @hpant, I think they are really similar to overall best practices when it comes to Python logging, like having a centralized logging configuration, using correct log levels, etc. Look, for example, at the article below: 10 Best Practices for Logging in Pytho...
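
As a rough illustration of a centralized logging configuration, the sketch below uses the standard library's `logging.config.dictConfig`. The format string, the levels, and the logger name `pipeline.ingest` are arbitrary choices for the example, not recommendations taken from the linked article.

```python
import logging
import logging.config

# One shared configuration, applied once at program start.
LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "standard": {"format": "%(asctime)s %(levelname)s %(name)s: %(message)s"},
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "standard",
            "level": "INFO",
        },
    },
    "root": {"handlers": ["console"], "level": "INFO"},
}


def configure_logging():
    """Apply the shared configuration; call this once, before any logging."""
    logging.config.dictConfig(LOGGING_CONFIG)


if __name__ == "__main__":
    configure_logging()
    logger = logging.getLogger("pipeline.ingest")  # hypothetical module name
    logger.debug("Suppressed: below the INFO threshold")
    logger.info("Loaded batch")
    logger.warning("Retrying after a transient error")
```

Each module then calls `logging.getLogger(__name__)` and picks a level per message, while handlers and formats stay defined in one place.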

2 More Replies
Tom_Greenwood
by New Contributor III
  • 6658 Views
  • 10 replies
  • 2 kudos

UDF importing from other modules

Hi community, I am using a PySpark UDF. The function is being imported from a repo (in the Repos section) and registered as a UDF in the notebook. I am getting a PythonException error when the transformation is run. This is coming from the databric...

Latest Reply
DennisB
New Contributor III
  • 2 kudos

I was getting a similar error (full traceback below), and determined that it's related to this issue. Setting the env variables DATABRICKS_HOST and DATABRICKS_TOKEN as suggested in that Github issue resolved the problem for me (albeit it's not a grea...

9 More Replies
Tahseen0354
by Valued Contributor
  • 3648 Views
  • 5 replies
  • 4 kudos

Resolved! Why I am not receiving any mail sent to the Azure AD Group mailbox when databricks job fails ?

I have created an Azure AD group of the "Microsoft 365" type with its own email address, which has been added to the notifications of a Databricks job (on failure). But no mail is sent to the Azure group mailbox when the job fails. I am able to send a d...

Latest Reply
SalmanDB2024
New Contributor II
  • 4 kudos

Hey Guide, appreciate your quick response. Actually, I am using GCP Databricks, and I am not concerned about the scope of groups. The thing is that I, as a user of the job, am unable to receive the email. Groups are not configured as of now. Need help regar...

4 More Replies
SalmanDB2024
by New Contributor II
  • 197 Views
  • 1 replies
  • 0 kudos

Email Notification not received even when configured in Alerts

Hi Experts, in the alerts of a job, when I configure an email ID that is whitelisted and auto-suggested in the dropdown by Databricks, no email is sent even though it is configured to receive email notifications, whereas the same notification of job...

Latest Reply
szymon_dybczak
Contributor
  • 0 kudos

Hi @SalmanDB2024, are you on Azure? If so, look at the solution below. Maybe it will be helpful: Solved: Re: Why I am not receiving any mail sent to the Az... - Databricks Community - 15156
