Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

prabhjot
by Databricks Partner
  • 5589 Views
  • 4 replies
  • 2 kudos

Resolved! Data lineage graph is not working

Hi Team, The issue: the data lineage graph is not working (16 Feb, 17-18 Feb). I created the tables below, but when I click the lineage graph I am not able to see the upstream or downstream table. The + sign goes away after a few seconds, but I am not able to clic...

Latest Reply
Sikki
New Contributor III
  • 2 kudos

Hi Kaniz, We're encountering the same issue where the lineage is not getting populated for a few tables. Could you let us know if a fix has been implemented in any runtime? We are using job cluster 12.2.x.

3 More Replies
kjoth
by Contributor II
  • 29536 Views
  • 9 replies
  • 7 kudos

How to make the job fail via code after handling exception

Hi, We are capturing the exception with try/except when an error occurs, but we want the job status to be Failed once we get the exception. What's the best way to do that? We are using PySpark.

Latest Reply
kumar_ravi
New Contributor III
  • 7 kudos

You can use a workaround:

    dbutils = get_dbutils(spark)
    tables_with_exceptions = []
    for table_config in table_configs:
        try:
            process(spark, table_config)
        except Exception as e:
            exception_detail = f"Error p...
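A minimal sketch of the pattern this reply appears to describe, in plain Python: record each exception so the loop keeps going, then raise once at the end so the error is no longer swallowed. `process_all` and the `process` callback are hypothetical names for illustration, not Databricks APIs, and the reply's own code is truncated, so this is not necessarily what the author wrote.

```python
from typing import Callable, Iterable, List


def process_all(items: Iterable[str], process: Callable[[str], None]) -> None:
    """Run process() on every item, then raise if any item failed."""
    failures: List[str] = []
    for item in items:
        try:
            process(item)
        except Exception as exc:
            # Handle the error (here: record it) so the remaining items still run.
            failures.append(f"{item}: {exc!r}")
    if failures:
        # Re-raising after the loop leaves an uncaught exception, which is
        # what should mark a notebook task (and so the job run) as failed.
        raise RuntimeError("Failed items: " + "; ".join(failures))
```

Raising only after the loop keeps the "handle and continue" behaviour while still surfacing the failure to the scheduler.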

8 More Replies
monil
by Databricks Partner
  • 3486 Views
  • 3 replies
  • 1 kudos

/api/2.1/jobs/runs/get-output api response

/api/2.1/jobs/runs/get-output: what are the possible status or state values of this API? I am trying to check the status of my job run based on the run ID, but there is not enough detail about the response body that contains the status of the run.

Latest Reply
daniel_sahal
Databricks MVP
  • 1 kudos

@monil It's documented well in the API documentation: https://docs.databricks.com/api/workspace/jobs/getrunoutput
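As a rough illustration of reading the run state from such a response, here is a small sketch. It assumes the response nests the state under `metadata.state` with `life_cycle_state` and `result_state` fields, and that a finished, successful run reports `TERMINATED` and `SUCCESS`; treat those names as assumptions and check them against the linked API documentation. The helper names are hypothetical.

```python
from typing import Any, Dict, Optional, Tuple


def summarize_run_state(response: Dict[str, Any]) -> Tuple[Optional[str], Optional[str]]:
    """Return (life_cycle_state, result_state) from a get-output style response."""
    # Assumed layout: {"metadata": {"state": {"life_cycle_state": ..., "result_state": ...}}}
    state = response.get("metadata", {}).get("state", {})
    return state.get("life_cycle_state"), state.get("result_state")


def finished_successfully(response: Dict[str, Any]) -> bool:
    """True only when the run has terminated and reported success."""
    life_cycle, result = summarize_run_state(response)
    return life_cycle == "TERMINATED" and result == "SUCCESS"
```

A run that is still in progress would typically have no result state yet, so `finished_successfully` returns False until both fields match.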

2 More Replies
janwoj
by New Contributor II
  • 14501 Views
  • 4 replies
  • 1 kudos

PowerApps connection to Azure Databricks

Hello, I would like to read a Databricks Delta table to show the data on the screen using a PowerApps gallery, and also insert new records into the same table. What is the best method to achieve an efficient connection and perform the above? Cheers

Latest Reply
Chris_Shehu
Valued Contributor III
  • 1 kudos

Has anyone found a solution to this yet? I'm currently investigating the same issue. Currently the only one I can find is paying for a third-party tool to set it up. Thanks,

3 More Replies
PiotrU
by Contributor II
  • 4996 Views
  • 6 replies
  • 1 kudos

Resolved! Adding extra libraries to databricks (rosbag)

Hello, I have an interesting challenge: I am required to install a few libraries that are part of the rosbag packages, to allow some data deserialization tasks. While creating the cluster I use an init_script that installs this software using apt    sudo apt upd...

Latest Reply
amandaK
New Contributor II
  • 1 kudos

@PiotrU Did adding the path to sys.path resolve all of your ModuleNotFoundErrors? I'm trying to do something similar, and adding the path to sys.path resolved the ModuleNotFoundError for rclpy, but I continue to see others related to ros.
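For readers unfamiliar with the `sys.path` approach mentioned here, a minimal sketch follows. The helper name and any directory passed to it are hypothetical; the actual install location of the apt-installed ROS packages is not given in the thread. Remaining ModuleNotFoundErrors may mean other packages live in further directories that also need to be added, though that is a guess.

```python
import os
import sys
from typing import Iterable, List


def add_to_sys_path(paths: Iterable[str]) -> List[str]:
    """Append existing directories to sys.path and return the ones added."""
    added: List[str] = []
    for path in paths:
        # Skip directories that do not exist or are already importable.
        if os.path.isdir(path) and path not in sys.path:
            sys.path.append(path)
            added.append(path)
    return added
```

Checking the returned list makes it easy to spot a mistyped or missing directory before attempting the import.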

5 More Replies
pranathisg97
by New Contributor III
  • 9827 Views
  • 3 replies
  • 0 kudos

Control query caching using SQL statement execution API

I want to execute this statement using the Databricks SQL Statement Execution API: curl -X POST -H 'Authorization: Bearer <access-token>' -H 'Content-Type: application/json' -d '{"warehouse_id": "<warehouse_id>", "statement": "set us...

Latest Reply
SumitAgrawal454
New Contributor II
  • 0 kudos

Looking for a solution, as I am facing the exact same problem.

2 More Replies
LauJohansson
by Databricks Partner
  • 2759 Views
  • 3 replies
  • 3 kudos

Resolved! Delta live table: Retrieve CDF columns

I want to use the apply_changes feature from a bronze table to a silver table. The bronze table has no "natural" sequence_by column. Therefore, I want to use the CDF column "_commit_timestamp" as the sequence_by. How do I retrieve the columns in ...

Latest Reply
LauJohansson
Databricks Partner
  • 3 kudos

Thank you @raphaelblg! I chose to write an article on the subject after this discussion: https://www.linkedin.com/pulse/databricks-delta-live-tables-merging-lau-johansson-cdtce/?trackingId=L872gj0yQouXgJudM75gdw%3D%3D

2 More Replies
BillMarshall
by Databricks Partner
  • 4350 Views
  • 2 replies
  • 0 kudos

workflow permissions errors

I have a notebook that outputs an Excel file. Through trial and error, and after consulting various forums, I discovered that the .xlsx file needed to be written to a temp file and then copied to the volume in Unity Catalog. When I run the notebook by...

Latest Reply
emora
New Contributor III
  • 0 kudos

Hello, yes, of course you need to write the Excel file in the tmp folder, but then you can move it wherever you want without a problem. In my current project we implemented this method to create the file in the tmp folder, and then move it to one spe...
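The "write to tmp, then copy" approach described in this thread could look roughly like the sketch below. `write_then_copy` and the `write_file` callback are hypothetical names; the callback stands in for whatever saves the workbook (the thread does not say which Excel library is used), and the destination would be a volume path of your own.

```python
import os
import shutil
import tempfile
from typing import Callable


def write_then_copy(write_file: Callable[[str], None], destination: str) -> str:
    """Write a file into a local temp directory, then copy it to destination."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        tmp_path = os.path.join(tmp_dir, os.path.basename(destination))
        write_file(tmp_path)  # e.g. a function that saves an .xlsx workbook
        parent = os.path.dirname(destination)
        if parent:
            os.makedirs(parent, exist_ok=True)
        # copyfile copies contents only, without permission bits or metadata.
        shutil.copyfile(tmp_path, destination)
    return destination
```

Passing the writer as a callback keeps the sketch independent of any particular Excel library; the temp directory is cleaned up automatically once the copy has finished.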

1 More Replies
Subhasis
by New Contributor III
  • 2810 Views
  • 5 replies
  • 0 kudos

Autoloader checkpoint fails, and after changing the checkpoint path all data needs to be reloaded

The Autoloader checkpoint fails, and after changing the checkpoint path all the data needs to be reloaded. I want to load only the data that has not been processed. I don't want to reload all the data.

Latest Reply
Subhasis
New Contributor III
  • 0 kudos

Does a checkpoint have some benchmark capacity, after which it stops writing data?

4 More Replies
SowmyaDesai
by New Contributor II
  • 3006 Views
  • 3 replies
  • 2 kudos

Run pyspark queries from outside databricks

I have written a notebook that executes a PySpark query. I then execute it remotely from outside the Databricks environment using /api/2.1/jobs/run-now, which runs the notebook. I also want to retrieve the results from this job execution. H...

Latest Reply
SowmyaDesai
New Contributor II
  • 2 kudos

Thanks for responding. I did go through this link. It talks about executing on SQL warehouse though. Is there a way we can execute queries on Databricks clusters instead? Databricks has this connector for SQL https://docs.databricks.com/en/dev-tools/p...
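One possible way to get results back from a notebook run triggered through run-now is sketched below. It assumes the notebook ends by returning a JSON string via `dbutils.notebook.exit(...)`, and that the caller then fetches /api/2.1/jobs/runs/get-output, whose response carries that string under `notebook_output.result` with a `truncated` flag. Those field names are my reading of the Jobs API and should be verified; `parse_notebook_result` is a hypothetical helper.

```python
import json
from typing import Any, Dict


def parse_notebook_result(response: Dict[str, Any]) -> Any:
    """Decode the JSON value a notebook returned, from a get-output style response."""
    output = response.get("notebook_output", {})
    if output.get("truncated"):
        # A truncated string is unlikely to be valid JSON, so fail loudly.
        raise ValueError("notebook output was truncated")
    result = output.get("result")
    return None if result is None else json.loads(result)
```

This suits small results such as counts or summaries; larger query results are usually better written to a table or file and read from there.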

2 More Replies
FrancisApel
by New Contributor II
  • 10194 Views
  • 4 replies
  • 0 kudos

[TASK_WRITE_FAILED] Task failed while writing rows to abfss

I am trying to insert into an already created delta table in Unity Catalog. I am getting the error: [TASK_WRITE_FAILED] Task failed while writing rows to abfss://xxxx@xxxxxxxxxxxxxxxx.dfs.core.windows.net/__unitystorage/catalogs/xxxxxxxx-c6c8-45d8-ac3...

Latest Reply
NikunjKakadiya
New Contributor II
  • 0 kudos

Any chance this issue got resolved? I am also seeing the same error when I try to incrementally read the system tables using the readStream method and write them using the writeStream method. This generally comes for the audit table but other t...

3 More Replies
Gilg
by Contributor II
  • 5592 Views
  • 1 replies
  • 0 kudos

DLT: Waiting for resources took a long time

Hi Team, I have a DLT pipeline that has been running in Production for quite some time now. When I check the pipeline, a couple of jobs took longer than expected. Usually, one job takes only 10-15 minutes to complete, with 2 to 3 minutes to provision a resource. Then I ha...

Latest Reply
speaker_city
New Contributor II
  • 0 kudos

I am currently trying projects from dbdemos [Full Delta Live Tables Pipeline - Loan]. I keep running into this error. How do I resolve this?

Saf4Databricks
by Contributor
  • 1819 Views
  • 2 replies
  • 1 kudos

Resolved! Testing PySpark - Document links broken

The top paragraph of this Testing PySpark page from the Apache Spark team states the following, where it points to some links titled 'see here'. But no link is provided to click on. Can someone please provide the links the document is referring to...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @Saf4Databricks, Sure, here they are:
- To view the docs for PySpark test utils, see here. spark.apache.org
- To see the code for PySpark built-in test utils, check out the Spark repository
pyspark.testing.utils — PySpark 3.5.2 documentation (apache....

1 More Replies
hanish
by New Contributor II
  • 6582 Views
  • 5 replies
  • 2 kudos

Job cluster support in jobs/runs/submit API

We are using the jobs/runs/submit API of Databricks to create and trigger a one-time run with new_cluster and existing_cluster configuration. We would like to check if there is a provision to pass "job_clusters" in this API to reuse the same cluster across...

Latest Reply
Nagrjuna
New Contributor II
  • 2 kudos

Hi, Any update on the above-mentioned issue? Unable to submit a one-time new job run (api/2.0 or 2.1/jobs/runs/submit) with a shared job cluster, or one new cluster has to be used for all tasks in the job.

4 More Replies
sakuraDev
by New Contributor II
  • 2170 Views
  • 1 replies
  • 1 kudos

Resolved! schema is not enforced when using autoloader

Hi everyone, I am currently trying to enforce the following schema:  StructType([ StructField("site", StringType(), True), StructField("meter", StringType(), True), StructField("device_time", StringType(), True), StructField("data", St...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @sakuraDev, I'm afraid your assumption is wrong. Here you define the data field as a struct type, and the result is as expected. Once you have this column as a struct type, you can refer to nested objects using dot notation. So if you would like to get e...
