Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Maatari
by New Contributor III
  • 1982 Views
  • 4 replies
  • 0 kudos

Resolved! What is the behaviour of startingVersion with Spark Structured Streaming?

Looking into the following: https://docs.databricks.com/en/structured-streaming/delta-lake.html#specify-initial-position I am unclear as to what the exact difference (if any) is between "startingVersion: The Delta Lake version to start from. Databricks ...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Hi @dlorenzo, interesting take! I don’t agree with your statement, though. According to both the documentation and my own testing, startingVersion = "latest" explicitly skips all historical data and starts from the latest committed version at the tim...

3 More Replies
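The behaviour discussed in this thread can be sketched as follows. This is a hedged illustration only: the table path is hypothetical, and the Spark call is shown commented out since it needs a cluster. Per the reply above, `"latest"` skips all historical commits, while a numeric version replays from that commit onward.

```python
# Option dict for a Delta-source stream; "startingVersion" controls the
# initial position as described in the thread above.
stream_options = {
    "startingVersion": "latest",  # or e.g. "5" to replay from version 5
}

# On a cluster the stream would be started roughly like this
# (path is a placeholder, not from the thread):
# df = (spark.readStream
#       .format("delta")
#       .options(**stream_options)
#       .load("/mnt/tables/events"))
```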
jeremy98
by Honored Contributor
  • 1097 Views
  • 10 replies
  • 0 kudos

Allow serverless compute to connect to a Postgres DB

Hi Community, is it possible to enable VNet peering between Databricks Serverless Compute and a private PostgreSQL database that is already configured with a VNet? Currently, everything works fine when I create my personal cluster because I have set up ...

Latest Reply
Rjdudley
Honored Contributor
  • 0 kudos

Is that PostgreSQL server going to go away after you migrate to Databricks, or is it going to continue to be used? Either way, federation works for you. If you're going to discontinue it, just do a full extract into an archive location and a one-ti...

9 More Replies
ClaudeR
by New Contributor III
  • 4420 Views
  • 3 replies
  • 2 kudos

Resolved! [Simba][SparkJDBCDriver](500177) Error getting http path from connection string

I'm trying to use a very basic Java program to connect to Databricks using the Spark JDBC driver (SparkJDBC42.jar), but I get the error mentioned above: [Simba][SparkJDBCDriver](500177) Error getting http path from connection string. Here is my code snip...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hello @Claude Repono, thank you for posting your question in the community. It seems you were able to find the solution by yourself. That's awesome. We are going to go ahead and mark your answer as the best solution.

2 More Replies
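Error 500177 typically means the JDBC connection string is missing its `httpPath` parameter. A hedged sketch of a well-formed URL follows (shown as a Python string for brevity); the host, path, and token are placeholders, not values from the thread.

```python
# Placeholder coordinates for a Databricks cluster JDBC endpoint;
# substitute real values from the cluster's JDBC/ODBC tab.
host = "adb-1234567890123456.7.azuredatabricks.net"
http_path = "sql/protocolv1/o/1234567890123456/0123-456789-abcde123"

# The 500177 error is raised when the httpPath=... segment is absent.
jdbc_url = (
    f"jdbc:spark://{host}:443/default;"
    "transportMode=http;ssl=1;AuthMech=3;"
    f"httpPath={http_path};"
    "UID=token;PWD=<personal-access-token>"
)
```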
eriodega
by Contributor
  • 577 Views
  • 2 replies
  • 0 kudos

system.access.table_lineage - source and target table meanings

I've been using the system.access.table_lineage table, and I'm trying to understand when the source and target tables are defined. For example, picking a specific job run and looking at the lineage: select source_type, source_table_full_name, target_typ...

Latest Reply
eriodega
Contributor
  • 0 kudos

@Sidhant07 thanks for the answer. I think it is good, but I am questioning scenario #5 (source=table, target=view). I'm looking at some examples in our table_lineage, and we aren't modifying the view or creating the view from within a job. I think scena...

1 More Replies
serg-v
by New Contributor III
  • 5350 Views
  • 5 replies
  • 3 kudos

Resolved! databricks-connect 11.3

Will there be databricks-connect for cluster version 11.3? If yes, when should we expect it?

Latest Reply
Oliver_Floyd
Contributor
  • 3 kudos

It looks like there are other issues. I saved the model generated with the code above in MLflow. When I try to reload it with this code: import mlflow; model = mlflow.spark.load_model('runs:/cb6ff62587a0404cabeadd47e4c9408a/model') it works in a notebook...

4 More Replies
JothyGanesan
by New Contributor III
  • 466 Views
  • 1 reply
  • 1 kudos

DLT - Handling Merge

Hi, in our DLT pipeline we are reading two tables: an Apply Changes Delta table and a streaming live table. We are able to read the latest records from the streaming live table incrementally, but from the Apply Changes table we are not able to read ...

Latest Reply
Walter_C
Databricks Employee
  • 1 kudos

To address the challenges you are facing with your Delta Live Tables (DLT) pipeline, here are some steps and considerations to help you manage the incremental data reading and joining of the Apply Changes table and the streaming live table for SCD Ty...

biafch
by Contributor
  • 455 Views
  • 2 replies
  • 0 kudos

Upgrading runtime 10.4 to 11.3 causing errors in my code (CASTING issues?)

Hi all, we have our medallion architecture transformation on Databricks. I'm currently testing upgrading to 11.3, as 10.4 won't be supported anymore from March 2025. However, I keep getting errors like this: Error inserting data into table. Type AnalysisEx...

Latest Reply
biafch
Contributor
  • 0 kudos

Hi @Alberto_Umana, thank you for your response. That's the weird thing: the RawDataStartDate only consists of records with datetime stamps. Furthermore, I am nowhere in my code casting any of this to a boolean, or casting anything at all. All I am ...

1 More Replies
gadapagopi1
by New Contributor III
  • 1070 Views
  • 7 replies
  • 2 kudos

Resolved! Databricks Community Edition login issue

I have a Databricks Community Edition account. I know the username and password. I used this account a long time ago. When I try to log in, it sends a verification code to my mail id, but I am unable to log in to my Gmail account because I forgot ...

Latest Reply
RajathKudtarkar
New Contributor II
  • 2 kudos

Hi, I'm having an issue while logging into the Databricks Community Edition: even if I give the correct email address and OTP, it says "We were not able to find a Community Edition workspace with this email." Could you please help?

6 More Replies
RobsonNLPT
by Contributor III
  • 1280 Views
  • 3 replies
  • 0 kudos

Google BigQuery Foreign Catalog - Incorrect Data Format

I've tested a foreign catalog connected to a Google BigQuery project. The connection was OK and I was able to see my datasets and tables. The problem: for columns with regular data types the data format is perfect, but the columns with type record and re...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @RobsonNLPT, this is a limitation: the data conversion issue you are facing is expected behavior due to the current data type mappings supported by the Lakehouse Federation platform. Unfortunately, this means that the JSON format you see in Google...

2 More Replies
aayrm5
by Honored Contributor
  • 497 Views
  • 2 replies
  • 2 kudos

Parsing Japanese characters in Spark & Databricks

I'm trying to read data which has Japanese headers and may well have Japanese data too. Currently, when I set header to True, I see all jumbled characters. Can anyone help with how I can parse these Japanese characters correctly?

Latest Reply
aayrm5
Honored Contributor
  • 2 kudos

Thank you, @Avinash_Narala. I definitely used the encoding options to parse the data again, but this time I used an encoding called `SHIFT_JIS` to solve the problem. Appreciate the quick response!

1 More Replies
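The accepted fix can be sketched as below. The options dict mirrors the approach named in the reply (an explicit `SHIFT_JIS` encoding); the Spark call is commented out since it needs a cluster, and the file path is a placeholder. The pure-Python lines demonstrate why the wrong codec produces "jumbled" text in the first place.

```python
# Read options mirroring the accepted answer: declare the codec explicitly
# instead of relying on the reader's default (UTF-8).
read_options = {
    "header": "true",
    "encoding": "SHIFT_JIS",  # common codec for Japanese-origin CSV files
}
# df = spark.read.options(**read_options).csv("/mnt/raw/sales_jp.csv")  # path is hypothetical

# Why the headers looked jumbled: the same bytes decode very differently
# depending on the codec applied.
raw = "売上".encode("shift_jis")       # bytes as they sit in the file
correct = raw.decode("shift_jis")      # round-trips to 売上
mojibake = raw.decode("latin-1")       # garbled under a Western codec
```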
amitca71
by Contributor II
  • 8833 Views
  • 6 replies
  • 5 kudos

Resolved! exception when using java SQL client

Hi, I try to use Java SQL. I can see that the query on Databricks is executed properly. However, on my client I get an exception (see below). Versions: JDK jdk-20.0.1 (tried also with version 16, same results) https://www.oracle.com/il-en/java/technologies/...

Latest Reply
xebia
New Contributor II
  • 5 kudos

I am using java 17 and getting the same error.

5 More Replies
kbmv
by Contributor
  • 834 Views
  • 3 replies
  • 0 kudos

Resolved! Init script works fine on All purpose compute but have issues with Job compute created from DLT ETL

Hi, I was following the Databricks tutorial from https://notebooks.databricks.com/demos/llm-rag-chatbot (the old one), where it had a reference on how to install OCR on the nodes (install poppler on the cluster) to read the PDF content. I created the below init script t...

Latest Reply
kbmv
Contributor
  • 0 kudos

Hi Alberto_Umana, thanks for looking into it. I got the solution from the Databricks support assigned to my corporation. The issue was more with the cluster type and not Streaming or DLT. For Streaming I was able to use Single User compute, but for DLT, since we ca...

2 More Replies
AvneeshSingh
by New Contributor
  • 1483 Views
  • 0 replies
  • 0 kudos

Autoloader Data Reprocess

Hi, if possible can anyone please help me with some Autoloader options? I have 2 open queries: (i) Let's assume I am running an Autoloader stream and my job fails; instead of resetting the whole checkpoint, I want to run the stream from a specified timest...

Data Engineering
autoloader
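For query (i), one commonly used lever is Auto Loader's `modifiedAfter` option, which limits a fresh stream to files modified after a given timestamp rather than replaying everything after a checkpoint reset. A hedged sketch follows; the format, timestamp, and path are placeholders, and the Spark call is commented out since it needs a cluster.

```python
# Auto Loader (cloudFiles) options: modifiedAfter restricts discovery to
# files changed after the given timestamp, which helps when restarting a
# stream without its original checkpoint.
cloudfiles_options = {
    "cloudFiles.format": "json",                       # placeholder format
    "modifiedAfter": "2024-06-01 00:00:00.000000 UTC",  # placeholder cutoff
}

# df = (spark.readStream
#       .format("cloudFiles")
#       .options(**cloudfiles_options)
#       .load("/mnt/landing/events/"))  # path is hypothetical
```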
boitumelodikoko
by Contributor II
  • 3359 Views
  • 4 replies
  • 1 kudos

[RETRIES_EXCEEDED] Error When Displaying DataFrame in Databricks Using Serverless Compute

Hi Databricks Community, I am encountering an issue when trying to display a DataFrame in a Python notebook using serverless compute. The operation seems to fail after several retries, and I get the following error message: [RETRIES_EXCEEDED] The maxim...

Latest Reply
mohammedkhu
New Contributor II
  • 1 kudos

@boitumelodikoko I am facing the exact same issue, but on all-purpose compute. It works well for smaller datasets, but for a large dataset it fails with the same error. The dataset I am working on has 13M rows, and I have scaled up to n2-highmem-8 (same f...

3 More Replies
yash_verma
by New Contributor III
  • 1577 Views
  • 7 replies
  • 2 kudos

Resolved! error while setting up permission for job via api

Hi guys, I am getting the below error when I am trying to set up permissions for a job via the API, though I am able to create the job via the API. Can anyone help identify the issue, or has anyone faced the below error? {"error_code": "INVALID_PARAMETER_VALUE","m...

Latest Reply
JohnKruebbe
New Contributor II
  • 2 kudos

I get that the solution was accepted, but it is very confusing when you run the databricks command as follows: databricks clusters get-permissions my-joyous-db-cluster "access_control_list": [{"all_permissions": [{"inherited":false,"permission_level":"...

6 More Replies
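The kind of request discussed in this thread can be sketched as below. This is a hedged illustration: the workspace URL, token, job id, and principals are placeholders. An INVALID_PARAMETER_VALUE response often points at a malformed `access_control_list` entry, e.g. an unknown `permission_level` string.

```python
import json

# Example permissions payload for the Jobs permissions REST endpoint;
# the user, group, and levels below are illustrative placeholders.
payload = {
    "access_control_list": [
        {"user_name": "someone@example.com", "permission_level": "CAN_MANAGE_RUN"},
        {"group_name": "data-engineers", "permission_level": "CAN_VIEW"},
    ]
}
body = json.dumps(payload)

# A PUT replaces the job's permissions with this list (workspace URL,
# job id, and token are placeholders):
# import requests
# requests.put(
#     "https://<workspace-url>/api/2.0/permissions/jobs/<job-id>",
#     headers={"Authorization": "Bearer <personal-access-token>"},
#     data=body,
# )
```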
