Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Arnold_Souza (New Contributor III)
  • 5782 Views
  • 4 replies
  • 2 kudos

SAT - Security Analysis Tool implementation error

I want to implement SAT in my workspace account. I was able to execute the Terraform that enables the necessary infra for it to work. When I try to execute the workflow "SAT Initializer Notebook (one-time)" it fails with the error: AnalysisException: ...

Latest Reply from Anonymous (Not applicable):

Hi @Arnold Souza, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...

3 More Replies
Hubert-Dudek (Databricks MVP)
  • 4683 Views
  • 1 reply
  • 7 kudos


SQL cells in Databricks notebooks can now be run in parallel, which means faster query processing and analysis. This new feature is especially helpful for queries that take longer to run or that analyze large datasets. With parallel processing, Databricks...

Latest Reply from Rishabh-Pandey (Databricks MVP):

Informative

oleole (Contributor)
  • 15772 Views
  • 1 reply
  • 1 kudos

Resolved! MERGE to update a column of a table using Spark SQL

Coming from an MS SQL background, I'm trying to write a query in Spark SQL that simply updates a column value of table A (source table) by INNER JOINing a new table B with a filter. The MS SQL query looks like this: UPDATE T SET T.OfferAmount = OSE.EndpointEve...

Latest Reply from oleole (Contributor):

Posting the answer to my question:

    MERGE INTO TempOffer VIEW
    USING OfferSeq OSE
    ON VIEW.OfferId = OSE.OfferID AND OSE.OfferId = 1
    WHEN MATCHED THEN UPDATE SET VIEW.OfferAmount = OSE.EndpointEventAmountValue;

RyanHager (Contributor)
  • 3541 Views
  • 5 replies
  • 2 kudos

Is there a stream / Kafka topic that we can connect to for monitoring all Databricks jobs/workflows (create/status update/fail/error/complete)?

Currently we are creating and monitoring jobs using the API. This results in a lot of polling of the API for job status. Is there a Kafka stream we could listen to for job updates, to significantly reduce the number of calls to the Databricks jobs...
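
For context, a minimal sketch of the polling pattern described above, against the Jobs API 2.1 runs/list endpoint (the host/token environment variables, job ID, and interval are assumptions for illustration):

    import os
    import time
    import requests

    # Assumed workspace URL (https://...) and token, supplied via environment variables.
    HOST = os.environ["DATABRICKS_HOST"]
    TOKEN = os.environ["DATABRICKS_TOKEN"]

    def poll_active_runs(job_id: int, interval_s: int = 60) -> None:
        """Repeatedly list active runs for one job; this is the API chatter a push stream would avoid."""
        while True:
            resp = requests.get(
                f"{HOST}/api/2.1/jobs/runs/list",
                headers={"Authorization": f"Bearer {TOKEN}"},
                params={"job_id": job_id, "active_only": "true"},
            )
            resp.raise_for_status()
            for run in resp.json().get("runs", []):
                print(run["run_id"], run["state"]["life_cycle_state"])
            time.sleep(interval_s)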

Latest Reply from Anonymous (Not applicable):

Hi @Ryan Hager, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we...

4 More Replies
Ramana (Valued Contributor)
  • 3878 Views
  • 3 replies
  • 3 kudos

Resolved! How do we set spark_version in cluster policies to select the latest GPU ML LTS version as defaultValue?

Currently, I use the below two different JSON snippets to choose either Standard or ML runtime. Similar to the below, what is the defaultValue for spark_version to select the latest GPU ML LTS runtime version? "spark_version": {  "type": "regex",  "p...

Latest Reply from LandanG (Databricks Employee):

Hi @Ramana Kancharana, as of right now these options are only available for non-GPU DBRs.
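
For anyone needing a workaround in the meantime, a sketch of a regex-based policy that defaults to a specific GPU ML runtime (the policy name, pattern, and default version string are assumptions; substitute the GPU ML LTS release current in your workspace):

    import json
    import os
    import requests

    HOST = os.environ["DATABRICKS_HOST"]
    TOKEN = os.environ["DATABRICKS_TOKEN"]

    # GPU ML runtime strings look like "13.3.x-gpu-ml-scala2.12"; the default below
    # is an example and should be replaced with the current GPU ML LTS version.
    definition = {
        "spark_version": {
            "type": "regex",
            "pattern": "1[0-9]\\.[0-9]+\\.x-gpu-ml-scala.*",
            "defaultValue": "13.3.x-gpu-ml-scala2.12",
        }
    }

    resp = requests.post(
        f"{HOST}/api/2.0/policies/clusters/create",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"name": "gpu-ml-lts-default", "definition": json.dumps(definition)},
    )
    resp.raise_for_status()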

2 More Replies
irfanaziz (Contributor II)
  • 5241 Views
  • 1 reply
  • 3 kudos

TimestampFormat issue

The Databricks notebook failed yesterday due to a timestamp format issue. Error: "SparkUpgradeException: You may get a different result due to the upgrading of Spark 3.0: Fail to parse '2022-08-10 00:00:14.2760000' in the new parser. You can set spark.s...

Latest Reply from searchs (New Contributor II):

You must have solved this issue by now, but for the sake of those that encounter this again, here's the solution that worked for me:

    spark.sql("set spark.sql.legacy.timeParserPolicy=LEGACY")
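
The equivalent from Python, for anyone applying it programmatically (the valid values EXCEPTION, CORRECTED, and LEGACY come from the Spark migration guide; setting it at cluster level via Spark config also works):

    # Python equivalent of the SQL command above; falls back to the pre-Spark-3.0
    # time parser so strings like '2022-08-10 00:00:14.2760000' parse as before.
    spark.conf.set("spark.sql.legacy.timeParserPolicy", "LEGACY")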

yzhang (Contributor)
  • 4136 Views
  • 5 replies
  • 0 kudos


I cannot find any info on whether Databricks supports nested jobs or tasks. For example, I have a 'job_a', which contains a list of tasks, and another 'job_b', which also contains a list of tasks. Now I'd like to have a 'job_all' that will run both 'job_a' and 'job_b...
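
In newer workspaces one way to express this is a wrapper job whose tasks use the Jobs API 2.1 run_job_task to trigger existing jobs; a sketch, where the job IDs are placeholders:

    import os
    import requests

    HOST = os.environ["DATABRICKS_HOST"]
    TOKEN = os.environ["DATABRICKS_TOKEN"]

    JOB_A_ID = 111  # hypothetical id of the existing 'job_a'
    JOB_B_ID = 222  # hypothetical id of the existing 'job_b'

    payload = {
        "name": "job_all",
        "tasks": [
            {"task_key": "run_job_a", "run_job_task": {"job_id": JOB_A_ID}},
            {
                "task_key": "run_job_b",
                # Remove depends_on to run both child jobs in parallel instead.
                "depends_on": [{"task_key": "run_job_a"}],
                "run_job_task": {"job_id": JOB_B_ID},
            },
        ],
    }

    resp = requests.post(
        f"{HOST}/api/2.1/jobs/create",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json=payload,
    )
    resp.raise_for_status()
    print(resp.json()["job_id"])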

Latest Reply from Anonymous (Not applicable):

Hi @Yanan Zhang, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the response and select the one that best answers yo...

4 More Replies
Chris_Shehu (Valued Contributor III)
  • 5810 Views
  • 4 replies
  • 2 kudos

Resolved! No Explicit Deny for User security configurations at the group level?

Currently, when you add new users to the Databricks workspace, they get added to a "Users" group that has full access to the workspace. There should be a way to use group security to explicitly deny access to those same settings. This setting should ov...

Latest Reply from Anonymous (Not applicable):

@dean james, I am not sure why, in your case, you want to deny access to the group once you create it. Anyhow, we can deactivate/activate a user using the "2.0/preview/scim/v2/Users/{id}" REST API endpoint. We can also deactivate users that have no...
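
A sketch of the SCIM deactivation mentioned in the reply (the user ID and credentials are placeholders); note this disables the user entirely rather than denying specific workspace settings:

    import os
    import requests

    HOST = os.environ["DATABRICKS_HOST"]
    TOKEN = os.environ["DATABRICKS_TOKEN"]
    USER_ID = "1234567890"  # hypothetical SCIM user id

    # Standard SCIM PatchOp that flips the user's "active" flag to false.
    payload = {
        "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
        "Operations": [{"op": "replace", "value": {"active": False}}],
    }

    resp = requests.patch(
        f"{HOST}/api/2.0/preview/scim/v2/Users/{USER_ID}",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json=payload,
    )
    resp.raise_for_status()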

3 More Replies
andrew0117 (Contributor)
  • 6276 Views
  • 4 replies
  • 0 kudos

Resolved! Can the merge() function be applied to a DataFrame?

If I have two DataFrames df_target and df_source, can I do df_target.as("t").merge(df_source.as("s"), "s.id=t.id").whenMatched().updateAll().whenNotMatched.insertAll.execute()? When I tried the code above, I got the error "merge is not a member of the...
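
The likely cause of that error is that merge is a method of DeltaTable, not DataFrame. A minimal Python sketch, assuming the target is a Delta table named "target_table" and df_source is the source DataFrame from the question:

    from delta.tables import DeltaTable

    # merge() lives on DeltaTable, not on DataFrame, hence the error above.
    target = DeltaTable.forName(spark, "target_table")  # hypothetical table name

    (
        target.alias("t")
        .merge(df_source.alias("s"), "s.id = t.id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )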

Latest Reply from Anonymous (Not applicable):

Hi @andrew li, hope all is well! Just wanted to check in if you were able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!

3 More Replies
JJL (New Contributor II)
  • 19941 Views
  • 3 replies
  • 3 kudos

Resolved! Can Spark SQL perform UPDATE with INNER JOIN and LIKE with '%' + [column] + '%'?

Hi All, I came from MS SQL and just started learning more about Spark SQL. Here is one part that I'm trying to perform. In MS SQL it can be easily done, but it seems it can't in Spark. So, I want to make a simple update to the record, if the co...

Latest Reply from oleole (Contributor):

@Hubert Dudek, hello, I'm having the same issue with using UPDATE in Spark SQL and came across your answer. When you say "replace source_table_reference with view" in MERGE, do you mean to replace "P" with "VIEW", looking something like below: %sql ME...
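
For anyone landing here, a sketch of the UPDATE-with-LIKE expressed as a MERGE (table and column names are made up; note Spark SQL concatenates with concat() rather than T-SQL's '+'):

    # Hypothetical tables/columns illustrating UPDATE via MERGE with a LIKE predicate.
    spark.sql("""
        MERGE INTO target_table AS t
        USING source_table AS s
        ON t.id = s.id
        WHEN MATCHED AND t.description LIKE concat('%', s.keyword, '%')
          THEN UPDATE SET t.flag = 'matched'
    """)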

2 More Replies
Anonymous (Not applicable)
  • 6305 Views
  • 1 reply
  • 1 kudos

Databricks-connect configured with service principal token but unable to retrieve information to local machine

I installed databricks-connect and configured it with a service principal token. I am able to start the cluster when I use the command spark = SparkSession.builder.getOrCreate(), but when trying to retrieve S3 bucket data to my local machine, or even when I run a test command ex...
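
For context, a minimal databricks-connect session of the kind described (the bucket path is a placeholder): with databricks-connect the read executes on the remote cluster, so it is the cluster, not the local machine, that needs S3 access (e.g. an instance profile).

    from pyspark.sql import SparkSession

    # With databricks-connect configured, this attaches to the remote cluster.
    spark = SparkSession.builder.getOrCreate()

    # Hypothetical bucket/path; the read runs on the cluster, which must be able
    # to reach S3, while only the results are pulled back to the local machine.
    df = spark.read.json("s3://my-bucket/events/")
    df.show()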

Latest Reply from Anonymous (Not applicable):

Hi @divya08, hope all is well! Just wanted to check in if you were able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!

Gaurav_Raj (New Contributor III)
  • 3863 Views
  • 3 replies
  • 3 kudos

Resolved! Lakehouse Fundamentals Accreditation Badge not received after the course completion

I completed the Databricks Lakehouse Fundamentals Accreditation course today, but I haven't received my badge yet. I even checked https://credentials.databricks.com/ but it shows no record/credentials. See the screenshot below. Please help me out with...

[screenshot]
Latest Reply from Anonymous (Not applicable):

Hi @Gaurav Raj, thank you for posting your question in our community! We are happy to assist you. Every best answer marked contributes to the growth and success of our community. Regards

2 More Replies
RengarLee (Contributor)
  • 12538 Views
  • 10 replies
  • 3 kudos

Resolved! Databricks writes to Azure Data Explorer suddenly become slower

Now, I write to Azure Data Explorer using Spark Streaming. One day, writes suddenly became slower; a restart has no effect. I have a question about Spark Streaming to Azure Data Explorer. Q1: What should I do to get performance to recover? Figure 1 shows th...

Latest Reply from RengarLee (Contributor):

I'm so sorry, I just thought the issue wasn't resolved. Solution: set maxFilesPerTrigger and maxBytesPerTrigger, and enable autoOptimize. Reason: on the first day it processes larger files and then eventually processes smaller files. Detailed reason: B...
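
A sketch of the two options from the solution, shown on a Delta streaming source (the path and limit values are placeholders; both options cap how much data each micro-batch ingests):

    # Hypothetical Delta source path; tune the limits to your file sizes.
    df = (
        spark.readStream
        .format("delta")
        .option("maxFilesPerTrigger", 100)   # at most 100 files per micro-batch
        .option("maxBytesPerTrigger", "1g")  # soft cap on bytes per micro-batch
        .load("/mnt/source/events_delta")
    )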

9 More Replies
MetaRossiVinli (Contributor)
  • 6481 Views
  • 1 reply
  • 1 kudos

Resolved! Find root path to Repo for .py file import

I want to import a Python function stored in the following file path: `<repo>/lib/lib_helpers.py`. I want to import the function from any file in my repo, for instance from these: `<repo>/notebooks/etl/bronze/dlt_bronze_elt`, `<repo>/workers/job_worker`. It ...

Latest Reply from MetaRossiVinli (Contributor):

Ok, I figured it out. If you just make it a Python module by adding an empty `__init__.py`, Databricks will load it on start. Then, you can just import it.
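
Concretely, a sketch using the paths from the question (the helper function name is made up); in Databricks Repos the repo root is on sys.path, so once `lib` has an `__init__.py` the import works from any notebook or file in the repo:

    # File layout (from the question):
    #   <repo>/lib/__init__.py      <- empty file that makes `lib` a package
    #   <repo>/lib/lib_helpers.py   <- defines the helper function
    #
    # From <repo>/notebooks/etl/bronze/dlt_bronze_elt or <repo>/workers/job_worker:
    from lib.lib_helpers import clean_df  # `clean_df` is a hypothetical name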

