Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

raghu2
by New Contributor III
  • 1798 Views
  • 3 replies
  • 0 kudos

DAB run

Hello All, I am running this command: databricks bundle run -t dev dltPpl_job --debug. Bundle name: dltPpl. The bundle was generated using: databricks bundle init --target dev. Error message: Error: exit status 1 Failed to marshal state to json: unsupported a...

Latest Reply
prar_shah
New Contributor II
  • 0 kudos

Update: the error is: Error: cannot create job: No task defined for my_custom_task

2 More Replies
Nid-cbs
by Visitor
  • 89 Views
  • 7 replies
  • 3 kudos

Ownership change for table using SQL

It's not possible to use the ALTER TABLE tblname OWNER TO serviceprinc1 command in Azure Databricks, as this isn't supported. I was trying to set a catalog table's ownership, but it resulted in an error. How can I achieve this using a script?

Latest Reply
Nid-cbs
Visitor
  • 3 kudos

It turns out that the ALTER TABLE statement in Databricks SQL does not support changing the owner of a table directly. It seems we need to use the REST API to do the same.

6 More Replies
Nis
by New Contributor II
  • 1604 Views
  • 5 replies
  • 3 kudos

Can we commit offsets in Spark Structured Streaming in Databricks?

We are storing offset details in the checkpoint location and wanted to know whether there is a way to commit the offset once we consume a message from Kafka.

Latest Reply
dmytro
New Contributor II
  • 3 kudos

Hi @raphaelblg, thanks a lot for providing an elaborate answer. Do you happen to know, by any chance, of some solutions that developers use to track consumer lag when streaming with Spark from a Kafka topic? It's a rather essential knowledge to hav...
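
As a rough illustration of what "consumer lag" means here, the sketch below computes per-partition lag as the latest available offset minus the last processed offset. It assumes the offsets are already available as nested dicts of the form topic -> partition -> offset (the shape in which a Kafka source's end and latest offsets typically appear in a streaming query's progress report; verify this against your own runtime). The function names `consumer_lag` and `total_lag` are hypothetical helpers, not part of any Spark or Kafka API.

```python
from typing import Dict

# Assumed shape: topic -> partition id (as a string) -> offset
Offsets = Dict[str, Dict[str, int]]


def consumer_lag(end_offsets: Offsets, latest_offsets: Offsets) -> Offsets:
    """Return per-partition lag: latest available offset minus processed offset.

    Partitions missing from end_offsets are treated as "nothing processed yet"
    (offset 0), which is an assumption made for this sketch.
    """
    lag: Offsets = {}
    for topic, partitions in latest_offsets.items():
        processed = end_offsets.get(topic, {})
        lag[topic] = {
            partition: max(latest - processed.get(partition, 0), 0)
            for partition, latest in partitions.items()
        }
    return lag


def total_lag(lag: Offsets) -> int:
    """Sum the lag over every topic and partition."""
    return sum(value for parts in lag.values() for value in parts.values())
```

A monitoring job could feed each micro-batch's offsets into `consumer_lag` and alert when `total_lag` stays above a chosen threshold.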

4 More Replies
sandy311
by New Contributor
  • 86 Views
  • 4 replies
  • 3 kudos

If-else conditions in Databricks Asset Bundles

Can I use if-else conditions in databricks.yml and parameterize my asset bundles similarly to Azure Pipelines YAML?

Latest Reply
filipniziol
New Contributor II
  • 3 kudos

Hi @sandy311, could you please provide more details on what you’re trying to achieve? It seems like you are looking to use Databricks Asset Bundles as complete CI/CD pipelines. While Databricks Asset Bundles are a crucial part of the CI/CD process, th...

3 More Replies
cm04
by New Contributor III
  • 81 Views
  • 2 replies
  • 3 kudos

Resolved! Why does my job run on shared compute instead of job compute?

I have configured a job using `databricks.yml`:
```
resources:
  jobs:
    my_job:
      name: my_job
      tasks:
        - task_key: create_feature_tables
          job_cluster_key: my_job_cluster
          spark_python_task:
            python_file: ../src/c...
```

Latest Reply
Slash
Contributor
  • 3 kudos

Hi @cm04, you can try upgrading the CLI to the newest version. I've seen a similar issue before, and upgrading the CLI was the solution back then. Solved: Yml file replacing job cluster with all-purpose cl... - Databricks Community - 72248

1 More Replies
Deloitte_DS
by New Contributor II
  • 2716 Views
  • 3 replies
  • 0 kudos

Unable to install poppler-utils

Hi, I'm trying to install the system-level package "Poppler-utils" for the cluster. I added the following line to the init.sh script: sudo apt-get -f -y install poppler-utils. I got the following error: PDFInfoNotInstalledError: Unable to get page count. Is ...

Latest Reply
dheeraj-cir
  • 0 kudos

Use a personal cluster and run !sudo apt-get update and !sudo apt-get install -y poppler-utils
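
One way to confirm whether the install actually took effect on the node running the code is to check that the Poppler binaries are on the PATH. The sketch below uses only the Python standard library. It assumes that the error comes from the `pdfinfo` executable (shipped with poppler-utils) not being found, and `poppler_available` is a hypothetical helper name.

```python
import shutil


def poppler_available(binary: str = "pdfinfo") -> bool:
    """Return True if the given Poppler binary can be found on the PATH."""
    # shutil.which returns the full path to the executable, or None if absent.
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if poppler_available():
        print("pdfinfo found at:", shutil.which("pdfinfo"))
    else:
        print("pdfinfo not found; poppler-utils may not be installed on this node")
```

Running this in a notebook cell checks the driver only; if PDF parsing happens inside Spark tasks, the package also needs to be present on the workers, which is what an init script is normally for.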

2 More Replies
Graham
by New Contributor III
  • 3902 Views
  • 6 replies
  • 4 kudos

Resolved! Inline comment next to un-tickmarked SET statement = Syntax error

Running this code in Databricks SQL works great: SET USE_CACHED_RESULT = FALSE; -- Result: -- key value -- USE_CACHED_RESULT FALSE. If I add an inline comment, however, I get a syntax error: SET USE_CACHED_RESUL...

Latest Reply
rafal_walisko
New Contributor II
  • 4 kudos

Hi, I'm getting the same error when trying to execute a statement through the API: "statement": "SET `USE_CACHED_RESULT` = FALSE; SELECT COUNT(*) FROM TABLE". Every combination fails: "status": { "state": "FAILED", "error": { "e...

5 More Replies
ahsan_aj
by Contributor
  • 2229 Views
  • 23 replies
  • 13 kudos

Databricks Connect 14.3.2 SparkConnectGrpcException Not found any cached local relation with the hash

Hi All, I am using Databricks Connect 14.3.2 with Databricks Runtime 14.3 LTS to execute the code below. The CSV file is only 7 MB. The code runs without issues on Databricks Runtime 15+ clusters but consistently produces the error message shown below ...

Data Engineering
databricks-connect
spark-connect
Latest Reply
CarlDaniel
Visitor
  • 13 kudos

Now I have the same issue with version 15.4 LTS. Does the fix for 14.3 LTS work? Thanks!

22 More Replies
MrJava
by New Contributor III
  • 6584 Views
  • 11 replies
  • 12 kudos

How to know who started a job run?

Hi there! We have different jobs/workflows configured in our Databricks workspace running on AWS and would like to know who actually started a job run. Are they started by a user or by a service principal using curl? Currently one can only see, who is t...

Latest Reply
ahsan_aj
Contributor
  • 12 kudos

It looks like there is no update on this. This is such a useful piece of information that it ideally should have been available from the start.

10 More Replies
shri0509
by New Contributor II
  • 526 Views
  • 5 replies
  • 1 kudos

How to avoid iteration/loops in Databricks in the given scenario

Hi all, I need your input. I am new to Databricks and working with a dataset that consists of around 10,000 systems, each containing approximately 100 to 150 parts. These parts have attributes such as name, version, and serial number. The dataset size...

Data Engineering
data engineering
Latest Reply
AnnieWhite
New Contributor II
  • 1 kudos

Thank you so much for the link.

4 More Replies
TimB
by New Contributor III
  • 7309 Views
  • 6 replies
  • 0 kudos

Foreign catalog - Connections using insecure transport are prohibited --require_secure_transport=ON

I have added a connection to a MySQL database in Azure, and I have created a foreign catalog in Databricks. But when I go to query the database, I get the following error: Connections using insecure transport are prohibited while --require_secure_trans...

Latest Reply
TimB
New Contributor III
  • 0 kudos

I was trying to connect to our database (Azure MySQL) from DBX, but we wanted require_secure_transport to stay set to ON and didn't want to turn it off. We ended up moving DBX into a VNet and setting up a private link to get around this.

5 More Replies
