Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

eriodega
by Contributor
  • 994 Views
  • 2 replies
  • 0 kudos

system.access.table_lineage - source and target table meanings

I've been using the system.access.table_lineage table, and I'm trying to understand when the source and target tables are defined. For example, picking a specific job run and looking at the lineage: select source_type, source_table_full_name, target_typ...
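For context, a minimal sketch of the kind of lineage query being discussed; the run-id filter and the extra columns are assumptions based on the documented system-table schema, not quoted from the post:

# Inspect the lineage rows recorded for one job run (the run id is a placeholder).
spark.sql("""
    SELECT source_type, source_table_full_name,
           target_type, target_table_full_name, event_time
    FROM system.access.table_lineage
    WHERE entity_run_id = '123456'   -- assumed filter column; replace with a real run id
    ORDER BY event_time
""").show(truncate=False)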

Latest Reply
eriodega
Contributor
  • 0 kudos

@Sidhant07 thanks for the answer, I think it is good, but I am questioning scenario #5 (source=table, target=view). I'm looking at some examples in our table_lineage, and we aren't modifying the view or creating the view from within a job. I think scena...

1 More Replies
serg-v
by New Contributor III
  • 6037 Views
  • 5 replies
  • 3 kudos

Resolved! databricks-connect 11.3

Will there be databricks-connect for cluster version 11.3? If yes, when should we expect it?

Latest Reply
Oliver_Floyd
Contributor
  • 3 kudos

It looks like there are other issues. I saved the model generated with the code above in MLflow. When I try to reload it with this code: import mlflow; model = mlflow.spark.load_model('runs:/cb6ff62587a0404cabeadd47e4c9408a/model'). It works in a notebook...
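A minimal, self-contained version of the reload step quoted above, assuming the model was logged with mlflow.spark.log_model under the artifact path "model"; the input DataFrame is hypothetical:

import mlflow

# Run id copied from the post; it only resolves inside that user's workspace.
model = mlflow.spark.load_model("runs:/cb6ff62587a0404cabeadd47e4c9408a/model")
predictions = model.transform(input_df)  # input_df is a placeholder Spark DataFrame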

4 More Replies
JothyGanesan
by New Contributor III
  • 821 Views
  • 1 replies
  • 1 kudos

DLT - Handling Merge

Hi, in our DLT pipeline we are reading two tables: an Apply Changes Delta table and a streaming live table. We are able to read the latest records from the streaming live table incrementally, but from the Apply Changes table we are not able to read ...

Latest Reply
Walter_C
Databricks Employee
  • 1 kudos

To address the challenges you are facing with your Delta Live Tables (DLT) pipeline, here are some steps and considerations to help you manage the incremental data reading and joining of the Apply Changes table and the streaming live table for SCD Ty...
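A minimal sketch of one pattern that fits this advice, assuming a Python DLT pipeline; the table names and join key are hypothetical. The APPLY CHANGES target is read as a full snapshot on each update, while the append-only source is streamed:

import dlt

@dlt.table(name="joined_output")
def joined_output():
    events = dlt.read_stream("streaming_live_table")   # incremental, append-only source
    dims = dlt.read("apply_changes_target")            # snapshot read; rows are updated in place
    return events.join(dims, on="business_key", how="left")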

biafch
by Contributor
  • 803 Views
  • 2 replies
  • 0 kudos

Upgrading runtime 10.4 to 11.3 causing errors in my code (CASTING issues?)

Hi all, we have our medallion architecture transformation on Databricks. I'm currently testing the upgrade to 11.3, as 10.4 won't be supported anymore from March 2025. However, I keep getting errors like this: Error inserting data into table. Type AnalysisEx...

Latest Reply
biafch
Contributor
  • 0 kudos

Hi @Alberto_Umana, thank you for your response. That's the weird thing: RawDataStartDate only consists of records with datetime stamps. Furthermore, nowhere in my code am I casting any of this to a boolean, or casting anything at all. All I am ...

1 More Replies
gadapagopi1
by New Contributor III
  • 1778 Views
  • 7 replies
  • 2 kudos

Resolved! Databricks Community Edition login issue

I have a Databricks Community Edition account, and I know the username and password. I used this account a long time ago. When I try to log in to this account, it sends a verification code to my email address, but I am unable to log in to my Gmail account because I forgot ...

Latest Reply
RajathKudtarkar
New Contributor II
  • 2 kudos

Hi, I'm having an issue while logging into Databricks Community Edition. Even if I give the correct email address and OTP, it says "We were not able to find a Community Edition workspace with this email." Could you please help?

6 More Replies
RobsonNLPT
by Contributor III
  • 1694 Views
  • 3 replies
  • 1 kudos

Google BigQuery Foreign Catalog - Incorrect Data Format

I've tested a foreign catalog connected to a Google BigQuery project. The connection was OK and I was able to see my datasets and tables. The problem: for columns with regular data types the data format is perfect, but the columns with type RECORD and re...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Hi @RobsonNLPT, this is a limitation: the data conversion issue you are facing is expected behavior due to the current data type mappings supported by the Lakehouse Federation platform. Unfortunately, this means that the JSON format you see in Google...
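Given that limitation, one possible workaround (not from the thread; the table and column names are hypothetical) is to parse the JSON text of a RECORD column after reading the federated table:

from pyspark.sql import functions as F, types as T

df = spark.table("bq_catalog.my_dataset.my_table")   # federated BigQuery table

# Schema of the nested RECORD as it appears in the JSON string.
address_schema = T.StructType([
    T.StructField("city", T.StringType()),
    T.StructField("zip", T.StringType()),
])

parsed = df.withColumn("address", F.from_json("address", address_schema))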

2 More Replies
RiyazAliM
by Honored Contributor
  • 792 Views
  • 2 replies
  • 2 kudos

Parsing Japanese characters in Spark & Databricks

I'm trying to read data that has Japanese headers, and it may have Japanese values as well. Currently, when I set header to True, all I see are jumbled characters. Can anyone help with how I can parse these Japanese characters correctly?

Latest Reply
RiyazAliM
Honored Contributor
  • 2 kudos

Thank you, @Avinash_Narala. I used the encoding options to parse the data again, but this time with an encoding called `SHIFT_JIS`, which solved the problem. Appreciate the quick response!
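For anyone hitting the same garbled headers, a minimal sketch of that approach, assuming a CSV source (the path is a placeholder):

# Read a CSV whose headers and values are encoded in Shift JIS rather than UTF-8.
df = (spark.read
      .option("header", True)
      .option("encoding", "SHIFT_JIS")   # CSV reader encoding option; "charset" is an alias
      .csv("/Volumes/main/raw/japanese_data.csv"))
df.show()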

1 More Replies
amitca71
by Contributor II
  • 9959 Views
  • 6 replies
  • 5 kudos

Resolved! exception when using java SQL client

Hi, I am trying to use the Java SQL client. I can see that the query on Databricks is executed properly. However, on my client I get an exception (see below). Versions: JDK jdk-20.0.1 (also tried version 16, same results). https://www.oracle.com/il-en/java/technologies/...

Latest Reply
xebia
New Contributor II
  • 5 kudos

I am using Java 17 and getting the same error.

5 More Replies
kbmv
by Contributor
  • 1384 Views
  • 3 replies
  • 0 kudos

Resolved! Init script works fine on All Purpose compute but has issues with Job compute created from DLT ETL

Hi, I was following the Databricks tutorial from https://notebooks.databricks.com/demos/llm-rag-chatbot (the old one), which had a reference on how to install OCR on the nodes (install poppler on the cluster) to read the PDF content. I created the init script below t...

Latest Reply
kbmv
Contributor
  • 0 kudos

Hi Alberto_Umana, thanks for looking into it. I got a solution from the Databricks support team assigned to my corporation. The issue was more with the cluster type, not with Streaming or DLT. For Streaming I was able to use Single User compute, but for DLT, since we ca...

2 More Replies
yash_verma
by New Contributor III
  • 2586 Views
  • 7 replies
  • 2 kudos

Resolved! error while setting up permission for job via api

Hi guys, I am getting the error below when I try to set up permissions for the job via the API, though I am able to create a job via the API. Can anyone help identify the issue, or has anyone faced this error? {"error_code": "INVALID_PARAMETER_VALUE","m...

Latest Reply
JohnKruebbe
New Contributor II
  • 2 kudos

I get that the solution was accepted, but it is very confusing when you run the databricks command as follows: databricks clusters get-permissions my-joyous-db-cluster "access_control_list": [{"all_permissions": [{"inherited":false,"permission_level":"...

6 More Replies
lmorrissey
by New Contributor II
  • 2085 Views
  • 1 replies
  • 1 kudos

Resolved! Cluster install of Python libraries versus notebook install

If a base set of libraries is installed on the cluster and pinned to a specific version, can/would this conflict with a notebook submitted to the cluster that defines a conflicting set of libraries to install? Is there a way to override the cluster p...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

When a base set of libraries is installed on a cluster, it can indeed conflict with a notebook submitted to the cluster that defines a conflicting set of libraries for installation. This is because the libraries installed at the cluster level take prece...
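As a sketch of the override question, a notebook-scoped install can be tried in the first cell (the package name and version are placeholders); per the reply above, the cluster-pinned version may still take precedence:

# Notebook cell: install a library scoped to this notebook session only.
%pip install some-package==1.2.3

# Restart the Python process so the freshly installed version is importable.
dbutils.library.restartPython()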

lmorrissey
by New Contributor II
  • 3873 Views
  • 0 replies
  • 0 kudos

GC Allocation Failure

There are a couple of related posts here and here. I'm seeing a similar issue with a long-running job. Processes are in a "RUNNING" state and the cluster is active, but the stdout log shows the dreaded GC Allocation Failure. Env: I've set the following on the config:...

Austin1
by New Contributor
  • 3708 Views
  • 0 replies
  • 0 kudos

VSCode Integration for Data Science Analysts

Probably not posting this in the right forum, but I can't find a good fit. This is a bit convoluted because we make things hard at work. I have access to a single LLM via VSCode (Amazon Q). Since I can't use that within Databricks, but I want my team to...

alejandrofm
by Valued Contributor
  • 3968 Views
  • 3 replies
  • 1 kudos

Can't enable CLI 2.1 on CI

Hi! This is my CI configuration. I added the databricks jobs configure --version=2.1 command but it still shows this error; any idea what I could be doing wrong? Error: Resetting Databricks Job with job_id 1036... WARN: Your CLI is configured to use...

Latest Reply
karthik-kandiko
New Contributor II
  • 1 kudos

I managed to solve this by downgrading the Databricks Runtime to 13.3 and used the commands below for optimization; it worked well in my case. spark.conf.set("spark.sql.shuffle.partitions", "200") spark.conf.set("spark.sql.execution.arrow.pyspark.enabled...
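Written out, the truncated settings from this reply look like the sketch below; the value of the second option is assumed to be "true":

# Tune shuffle parallelism and enable Arrow-accelerated pandas conversion.
spark.conf.set("spark.sql.shuffle.partitions", "200")
spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")  # assumed value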

2 More Replies
TX-Aggie-00
by New Contributor III
  • 5094 Views
  • 7 replies
  • 2 kudos

Installing linux packages on cluster

Hey everyone! We need to use LibreOffice in one of our automated tasks via a notebook. I have tried to install it via an init script that I attach to the cluster, but sometimes the program gets installed and sometimes it doesn't. For obviou...

Latest Reply
virtualdvid
New Contributor II
  • 2 kudos

It only works on the driver; when I try to use the whole cluster, the worker nodes can't access the command.

6 More Replies
