Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Hubert-Dudek
by Esteemed Contributor III
  • 2701 Views
  • 1 replies
  • 7 kudos

SQL cells in Databricks notebooks can now be run in parallel, which means faster query processing and analysis. This new feature is especially helpful...

SQL cells in Databricks notebooks can now be run in parallel, which means faster query processing and analysis. This new feature is especially helpful for queries that take longer to run or analyze large datasets. With parallel processing, Databricks...

paraler
Latest Reply
Rishabh-Pandey
Esteemed Contributor
  • 7 kudos

Informative ​

oleole
by Contributor
  • 11559 Views
  • 1 replies
  • 1 kudos

Resolved! MERGE to update a column of a table using Spark SQL

Coming from an MS SQL background, I'm trying to write a query in Spark SQL that simply updates a column value of table A (source table) by INNER JOINing a new table B with a filter. The MS SQL query looks like this: UPDATE T SET T.OfferAmount = OSE.EndpointEve...

Latest Reply
oleole
Contributor
  • 1 kudos

Posting the answer to my own question: MERGE INTO TempOffer VIEW USING OfferSeq OSE ON VIEW.OfferId = OSE.OfferID AND OSE.OfferId = 1 WHEN MATCHED THEN UPDATE SET VIEW.OfferAmount = OSE.EndpointEventAmountValue;
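
For anyone who wants to run this MERGE from a Python notebook cell, a minimal sketch might look like the following. The table and column names come from the post. The helper name `update_offer_amount` is hypothetical, the target alias is changed from `VIEW` to `tgt` purely for readability, and the sketch assumes `TempOffer` is a Delta table.

```python
# Table and column names are taken from the post; everything else is illustrative.
MERGE_SQL = """
MERGE INTO TempOffer AS tgt
USING OfferSeq AS ose
  ON tgt.OfferId = ose.OfferID AND ose.OfferId = 1
WHEN MATCHED THEN
  UPDATE SET tgt.OfferAmount = ose.EndpointEventAmountValue
"""


def update_offer_amount(spark):
    """Run the MERGE on an active SparkSession (hypothetical helper).

    Assumes TempOffer is a Delta table and OfferSeq is readable in the
    current catalog/schema.
    """
    return spark.sql(MERGE_SQL)
```

In a Databricks notebook, `update_offer_amount(spark)` would use the session the notebook already provides.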

RyanHager
by Contributor
  • 2437 Views
  • 5 replies
  • 2 kudos

Is there a stream / Kafka topic that we can connect to for monitoring all Databricks jobs/workflows (create/status update/fail/error/complete)?

Currently we are creating and monitoring jobs using the API. This results in a lot of polling of the API for job status. Is there a Kafka stream we could listen to for job updates, to significantly reduce the number of calls to the Databricks jobs...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Ryan Hager, hope everything is going great. Just wanted to check in on whether you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we...

4 More Replies
Ramana
by Contributor
  • 2651 Views
  • 3 replies
  • 3 kudos

Resolved! How do we set spark_version in cluster policies to select the latest GPU ML LTS version as defaultValue?

Currently, I use the below two different JSON snippets to choose either Standard or ML runtime. Similar to the below, what is the defaultValue for spark_version to select the latest GPU ML LTS runtime version? "spark_version": {  "type": "regex",  "p...

Latest Reply
LandanG
Databricks Employee
  • 3 kudos

Hi @Ramana Kancharana, as of right now these options are only available for non-GPU DBRs.

2 More Replies
irfanaziz
by Contributor II
  • 3753 Views
  • 1 replies
  • 3 kudos

TimestampFormat issue

The Databricks notebook failed yesterday due to a timestamp format issue. Error: "SparkUpgradeException: You may get a different result due to the upgrading of Spark 3.0: Fail to parse '2022-08-10 00:00:14.2760000' in the new parser. You can set spark.s...

Latest Reply
searchs
New Contributor II
  • 3 kudos

You must have solved this issue by now, but for the sake of those who encounter this again, here's the solution that worked for me: `spark.sql("set spark.sql.legacy.timeParserPolicy=LEGACY")`
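
The setting above switches Spark back to its pre-3.0 parser. A different route, sketched here in plain Python rather than Spark, is to trim the fractional seconds before parsing. This assumes the seven fractional digits in the sample value are what the new parser rejects, which the post does not confirm. `trim_fraction` is a hypothetical helper.

```python
from datetime import datetime


def trim_fraction(ts: str, digits: int = 6) -> str:
    """Cut the fractional-second part of a timestamp string to `digits` digits."""
    head, sep, frac = ts.partition(".")
    # If there is no "." in the string, return it unchanged.
    return head + sep + frac[:digits] if sep else ts


raw = "2022-08-10 00:00:14.2760000"  # value quoted in the error message
parsed = datetime.strptime(trim_fraction(raw), "%Y-%m-%d %H:%M:%S.%f")
print(parsed)
```

The same idea could be applied to the column as a string operation before the cast, leaving the default parser policy in place.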

yzhang
by New Contributor III
  • 2894 Views
  • 5 replies
  • 0 kudos

I cannot find any information on whether Databricks supports nested jobs or tasks. For example, I have a 'job_a', which contains a list of tasks, and another...

I cannot find any information on whether Databricks supports nested jobs or tasks. For example, I have a 'job_a', which contains a list of tasks, and another 'job_b', which also contains a list of tasks. Now I'd like to have a 'job_all' that will run both 'job_a' and 'job_b...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Yanan Zhang, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the response and select the one that best answers yo...

4 More Replies
Chris_Shehu
by Valued Contributor III
  • 3425 Views
  • 4 replies
  • 2 kudos

Resolved! No Explicit Deny for User security configurations at the group level?

Currently when you add new users to the Databricks workspace they get added to a "Users" group that has full access to the workspace. There should be a way to use group security to explicitly deny access to those same settings. This setting should ov...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

@dean james I am not sure why, in your case, you want to deny access to the group once you create it. Anyhow, we can deactivate/activate a user using the "2.0/preview/scim/v2/Users/{id}" REST API endpoint. We can also deactivate users that have no...

3 More Replies
andrew0117
by Contributor
  • 4307 Views
  • 4 replies
  • 0 kudos

Resolved! Can the merge() function be applied to a DataFrame?

If I have two DataFrames, df_target and df_source, can I do df_target.as("t").merge(df_source.as("s"), "s.id=t.id").whenMatched().updateAll().whenNotMatched.insertAll.execute()? When I tried the code above, I got the error "merge is not a member of the...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @andrew li, hope all is well! Just wanted to check in on whether you were able to resolve your issue, and whether you would be happy to share the solution or mark an answer as best. Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

3 More Replies
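
On the `merge()` question above: in the Delta Lake API, `merge` is a method of `DeltaTable`, not of a DataFrame, which is consistent with the quoted error. A minimal Python sketch of the upsert follows, assuming the target is already stored as a Delta table. `upsert_by_id` is a hypothetical helper, and the join key `id` comes from the post.

```python
def upsert_by_id(target_table, source_df):
    """Upsert source_df into target_table, matching rows on `id`.

    target_table: a delta.tables.DeltaTable (not a DataFrame)
    source_df:    a Spark DataFrame whose columns match the target
    """
    (
        target_table.alias("t")
        .merge(source_df.alias("s"), "s.id = t.id")
        .whenMatchedUpdateAll()      # overwrite every column of matched rows
        .whenNotMatchedInsertAll()   # insert source rows that have no match
        .execute()
    )


# Typical call on a cluster (table name is a placeholder):
#   from delta.tables import DeltaTable
#   upsert_by_id(DeltaTable.forName(spark, "target"), df_source)
```

Taking the `DeltaTable` as a parameter keeps the helper independent of how the table is located (by name or by path).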
JJL
by New Contributor II
  • 14561 Views
  • 3 replies
  • 3 kudos

Resolved! Can Spark SQL perform UPDATE with INNER JOIN and LIKE with '%' + [column] + '%'?

Hi all, I come from MS SQL and have just started learning more about Spark SQL. Here is one part that I'm trying to perform. In MS SQL it can be done easily, but it seems it can't in Spark. So, I want to make a simple update to the record, if the co...

Latest Reply
oleole
Contributor
  • 3 kudos

@Hubert Dudek Hello, I'm having the same issue with using UPDATE in Spark SQL and came across your answer. When you say "replace source_table_reference with view" in MERGE, do you mean replacing "P" with "VIEW", so that it looks something like the below: %sql ME...

2 More Replies
Anonymous
by Not applicable
  • 5681 Views
  • 1 replies
  • 1 kudos

Databricks-connect configured with service principal token but unable to retrieve information to local machine

I installed databricks-connect and configured it with a service principal token. I am able to start the cluster when I use the command spark = SparkSession.builder.getOrCreate(). But when trying to retrieve S3 bucket data to my local machine, or even when I run a test command, ex...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @divya08, hope all is well! Just wanted to check in on whether you were able to resolve your issue, and whether you would be happy to share the solution or mark an answer as best. Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

Gaurav_Raj
by New Contributor III
  • 2668 Views
  • 3 replies
  • 3 kudos

Resolved! Lakehouse Fundamentals Accreditation Badge not received after the course completion

I completed the Databricks Lakehouse Fundamentals Accreditation course today, but I haven't received my badge yet. I even checked https://credentials.databricks.com/ but it shows no record or credentials. See the screenshot below. Please help me out with...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Gaurav Raj, thank you for posting your question in our community! We are happy to assist you. Every best answer marked contributes to the growth and success of our community. Regards

2 More Replies
RengarLee
by Contributor
  • 8287 Views
  • 10 replies
  • 3 kudos

Resolved! Databricks writes to Azure Data Explorer suddenly became slower

I currently write to Azure Data Explorer using Spark Streaming. One day, writes suddenly became slower. Restarting has no effect. I have a few questions about Spark Streaming to Azure Data Explorer. Q1: What should I do to get performance to recover? Figure 1 shows th...

Latest Reply
RengarLee
Contributor
  • 3 kudos

I'm so sorry, I just thought the issue wasn't resolved. Solution: set maxFilesPerTrigger and maxBytesPerTrigger, and enable auto optimize. Reason: on the first day, it processes larger files and then eventually processes smaller files. Detailed reason B...

9 More Replies
MetaRossiVinli
by Contributor
  • 4522 Views
  • 1 replies
  • 1 kudos

Resolved! Find root path to Repo for .py file import

I want to import a Python function stored in the following file path: `<repo>/lib/lib_helpers.py`. I want to import the function from any file in my repo. For instance from these: `<repo>/notebooks/etl/bronze/dlt_bronze_elt`, `<repo>/workers/job_worker`. It ...

Latest Reply
MetaRossiVinli
Contributor
  • 1 kudos

Ok, I figured it out. If you just make it a Python module by adding an empty `__init__.py`, Databricks will load it on start. Then, you can just import it.
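
As a rough illustration of the package layout described here, the sketch below rebuilds `lib/lib_helpers.py` in a temporary directory and imports from it. It adds the repo root to `sys.path` by hand, which is an assumption standing in for whatever Databricks does at startup. The function `add_one` is hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate <repo>/lib/lib_helpers.py in a throwaway directory.
repo_root = Path(tempfile.mkdtemp())
lib_dir = repo_root / "lib"
lib_dir.mkdir()
(lib_dir / "__init__.py").write_text("")  # empty file marks `lib` as a package
(lib_dir / "lib_helpers.py").write_text("def add_one(x):\n    return x + 1\n")

# Assumption: the repo root is on sys.path; here we add it explicitly.
sys.path.insert(0, str(repo_root))
importlib.invalidate_caches()

from lib.lib_helpers import add_one

print(add_one(41))
```

With the root on the path, the same `from lib.lib_helpers import ...` line works from any file in the repo, regardless of how deeply it is nested.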

Ancil
by Contributor II
  • 9189 Views
  • 7 replies
  • 4 kudos

Resolved! Cannot create table: The associated location ('dbfs:/mnt/[REDACTED]/folder/table_location') is not empty but it's not a Delta table

Hi Team, I have created a new Databricks workspace that uses a private network. I want to read/write/update a Delta table that was created in the old Databricks workspace and is stored in ADLS as Delta. Can anyone please help me with this? I tried to create a Delta table in the same loca...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Ancil P A, hope all is well! Just wanted to check in on whether you were able to resolve your issue, and whether you would be happy to share the solution or mark an answer as best. Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

6 More Replies
