Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

biafch
by Contributor
  • 494 Views
  • 2 replies
  • 0 kudos

Upgrading runtime 10.4 to 11.3 causing errors in my code (CASTING issues?)

Hi all, We have our medallion architecture transformation on Databricks. I'm currently testing upgrading to 11.3, as 10.4 won't be supported anymore from March 2025. However, I keep getting errors like this: Error inserting data into table. Type AnalysisEx...

Latest Reply
biafch
Contributor
  • 0 kudos

Hi @Alberto_Umana, thank you for your response. That's the weird thing: RawDataStartDate only consists of records with datetime stamps. Furthermore, I am nowhere in my code casting any of this to a boolean, or casting anything at all. All I am ...

1 More Replies
gadapagopi1
by New Contributor III
  • 1185 Views
  • 7 replies
  • 2 kudos

Resolved! Databricks Community Edition login issue

I have a Databricks Community Edition account and know the username and password. I used this account a long time ago. When I try to log in, it sends a verification code to my mail ID, but I am unable to log in to my Gmail account because I forgot ...

Latest Reply
RajathKudtarkar
New Contributor II
  • 2 kudos

Hi, I'm having an issue while logging into Databricks Community Edition: even if I give the correct email address and OTP, it says "We were not able to find a Community Edition workspace with this email." Could you please help?

6 More Replies
RobsonNLPT
by Contributor III
  • 1320 Views
  • 3 replies
  • 0 kudos

Google BigQuery Foreign Catalog - Incorrect Data Format

I've tested a foreign catalog connected to a Google BigQuery project. The connection was OK and I was able to see my datasets and tables. The problem: for columns with regular data types the data format is perfect, but the columns with type record and re...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @RobsonNLPT, this is a limitation: the data conversion issue you are facing is expected behavior due to the current data type mappings supported by the Lakehouse Federation platform. Unfortunately, this means that the JSON format you see in Google...

2 More Replies
aayrm5
by Honored Contributor
  • 528 Views
  • 2 replies
  • 2 kudos

Parsing Japanese characters in Spark & Databricks

I'm trying to read data that has Japanese headers and may also contain Japanese data. Currently, when I set header to True, I see all jumbled characters. Can anyone help with how I can parse these Japanese characters correctly?

Latest Reply
aayrm5
Honored Contributor
  • 2 kudos

Thank you, @Avinash_Narala. I used the encoding options to parse the data again, but this time with an encoding called `SHIFT_JIS`, which solved the problem. Appreciate the quick response!
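
For readers hitting the same issue: the fix boils down to telling the reader which codec the file uses. The decoding behavior can be sketched in plain Python (the column names and sample values below are made up for illustration; the Spark reader call in the comment is the usual way to apply it on Databricks):

```python
# In Spark, the CSV reader accepts an "encoding" option, e.g.:
#   spark.read.option("header", True).option("encoding", "SHIFT_JIS").csv("/path/to/file.csv")
# The underlying decoding can be demonstrated without a cluster:
raw = "売上,日付\n100,2025-01-01\n".encode("shift_jis")  # bytes as they sit on disk

# Decoding with the wrong codec (e.g. UTF-8) yields the jumbled characters
# described above; decoding with Shift_JIS recovers the Japanese header.
header = raw.decode("shift_jis").splitlines()[0]
print(header)  # → 売上,日付
```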

1 More Replies
amitca71
by Contributor II
  • 8938 Views
  • 6 replies
  • 5 kudos

Resolved! Exception when using Java SQL client

Hi, I'm trying to use the Java SQL client. I can see that the query on Databricks is executed properly. However, on my client I get an exception (see below). Versions: JDK jdk-20.0.1 (tried also with version 16, same results). https://www.oracle.com/il-en/java/technologies/...

Latest Reply
xebia
New Contributor II
  • 5 kudos

I am using Java 17 and getting the same error.

5 More Replies
kbmv
by Contributor
  • 912 Views
  • 3 replies
  • 0 kudos

Resolved! Init script works fine on All purpose compute but have issues with Job compute created from DLT ETL

Hi, I was following the Databricks tutorial from https://notebooks.databricks.com/demos/llm-rag-chatbot (the old one, which referenced how to install OCR on the nodes, i.e. install poppler on the cluster, to read the PDF content). I created the init script below t...

Latest Reply
kbmv
Contributor
  • 0 kudos

Hi Alberto_Umana, thanks for looking into it. I got a solution from the Databricks support assigned to my corporation. The issue was more with the cluster type, not Streaming or DLT. For Streaming I was able to use Single User compute, but for DLT, since we ca...

2 More Replies
AvneeshSingh
by New Contributor
  • 1860 Views
  • 0 replies
  • 0 kudos

Autoloader Data Reprocess

Hi, if possible, can anyone please help me with some Auto Loader options? I have 2 open queries. (i) Let's assume I am running an Auto Loader stream and my job fails; instead of resetting the whole checkpoint, I want to run the stream from a specified timest...

Data Engineering
autoloader
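
One avenue worth checking for query (i) is the file source's `modifiedAfter` filter, which Auto Loader also honors: it limits ingestion to files modified after a given timestamp. A minimal sketch of the reader options (format, paths, and timestamp are assumptions, not from the post):

```python
# Hypothetical Auto Loader option set for reprocessing from a point in time.
# Note: Auto Loader tracks already-ingested files in its checkpoint, so to
# re-ingest older files the stream typically needs a fresh checkpoint location
# combined with a modifiedAfter filter to avoid rereading everything.
autoloader_options = {
    "cloudFiles.format": "json",                # source format (assumed)
    "cloudFiles.schemaLocation": "/tmp/schema", # hypothetical path
    "modifiedAfter": "2025-01-01T00:00:00",     # only files modified after this
}
# In a notebook this would feed:
#   spark.readStream.format("cloudFiles").options(**autoloader_options).load(src_path)
```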
yash_verma
by New Contributor III
  • 1740 Views
  • 7 replies
  • 2 kudos

Resolved! Error while setting up permissions for a job via API

Hi guys, I am getting the below error when I try to set up permissions for a job via the API, though I am able to create a job via the API. Can anyone help identify the issue, or has anyone faced the error below? {"error_code": "INVALID_PARAMETER_VALUE","m...

Latest Reply
JohnKruebbe
New Contributor II
  • 2 kudos

I get that the solution was accepted, but it is very confusing when you run the databricks command as follows: databricks clusters get-permissions my-joyous-db-cluster "access_control_list": [{"all_permissions": [{"inherited": false, "permission_level": "...
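
For anyone comparing against the Jobs permissions endpoint: a sketch of a request body for `PUT /api/2.0/permissions/jobs/<job_id>` is below. The principal names are hypothetical; a malformed principal field or an invalid `permission_level` string is a common trigger for `INVALID_PARAMETER_VALUE`.

```python
import json

# Hypothetical request body: each access_control_list entry carries exactly one
# principal field (user_name, group_name, or service_principal_name) plus a
# permission_level valid for jobs (CAN_VIEW, CAN_MANAGE_RUN, IS_OWNER, CAN_MANAGE).
payload = {
    "access_control_list": [
        {"user_name": "someone@example.com", "permission_level": "CAN_MANAGE_RUN"},
        {"group_name": "data-engineers", "permission_level": "CAN_VIEW"},
    ]
}
body = json.dumps(payload)  # send as the PUT request body
```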

6 More Replies
lmorrissey
by New Contributor II
  • 450 Views
  • 1 replies
  • 1 kudos

Resolved! Cluster install of Python libraries versus notebook install

If a base set of libraries is installed on the cluster and pinned to a specific version, can/would this conflict with a notebook submitted to the cluster that defines a conflicting set of libraries for install? Is there a way to override the cluster p...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

When a base set of libraries is installed on a cluster, it can indeed conflict with a notebook submitted to the cluster that defines a conflicting set of libraries for installation. This is because the libraries installed at the cluster level take prece...

lmorrissey
by New Contributor II
  • 2155 Views
  • 0 replies
  • 0 kudos

GC Allocation Failure

There are a couple of related posts here and here. I'm seeing a similar issue with a long-running job. Processes are in a "RUNNING" state and the cluster is active, but the stdout log shows the dreaded GC Allocation Failure. Env: I've set the following on the config: ...

Austin1
by New Contributor
  • 2119 Views
  • 0 replies
  • 0 kudos

VSCode Integration for Data Science Analysts

Probably not posting this in the right forum, but I can't find a good fit. This is a bit convoluted because we make things hard at work. I have access to a single LLM via VSCode (Amazon Q). Since I can't use that within Databricks, but I want my team to...

alejandrofm
by Valued Contributor
  • 3591 Views
  • 3 replies
  • 1 kudos

Can't enable CLI 2.1 on CI

Hi! This is my CI configuration. I added the databricks jobs configure --version=2.1 command, but it still shows this error. Any idea what I could be doing wrong? Error: Resetting Databricks Job with job_id 1036...WARN: Your CLI is configured to use...
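
For context: with the legacy CLI, `databricks jobs configure --version=2.1` just records the Jobs API version in the CLI config file (`~/.databrickscfg`), so in CI that write must land in the same container/step that later runs `databricks jobs ...`. A sketch of the resulting stanza (the host value is a placeholder):

```python
import configparser
import io

# Hypothetical sketch: build the ~/.databrickscfg stanza the legacy CLI reads.
# The jobs-api-version key is what `databricks jobs configure --version=2.1` sets.
cfg = configparser.ConfigParser()
cfg["DEFAULT"] = {
    "host": "https://example.cloud.databricks.com",  # placeholder workspace URL
    "jobs-api-version": "2.1",
}
buf = io.StringIO()
cfg.write(buf)
print(buf.getvalue())
```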

Latest Reply
karthik-kandiko
New Contributor II
  • 1 kudos

I managed to solve this by downgrading the Databricks runtime to 13.3 and used the below commands for optimization; it worked well in my case: spark.conf.set("spark.sql.shuffle.partitions", "200") spark.conf.set("spark.sql.execution.arrow.pyspark.enabled...

2 More Replies
TX-Aggie-00
by New Contributor III
  • 3414 Views
  • 7 replies
  • 2 kudos

Installing linux packages on cluster

Hey everyone! We need to use LibreOffice in one of our automated tasks via a notebook. I have tried to install it via an init script attached to the cluster, but sometimes the program gets installed and sometimes it doesn't. For obviou...

Latest Reply
virtualdvid
New Contributor II
  • 2 kudos

It only works on the driver; when I try to use the whole cluster, the nodes can't access the command.
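
That driver-only symptom usually means the install ran in a `%sh` cell (which executes only on the driver) rather than as a cluster-scoped init script (which runs on every node). A sketch of such a script follows, generated as text here for illustration; the package choice and retry count are assumptions, and the retries address the "sometimes installs, sometimes doesn't" flakiness from transient apt mirror failures:

```python
# Hypothetical sketch: build the init script text that would be uploaded and
# attached under the cluster's Advanced options -> Init Scripts, so it runs on
# every node at startup (unlike %sh, which runs only on the driver).
init_script = """#!/bin/bash
set -euo pipefail
# Retry apt operations a few times to ride out transient mirror failures.
for i in 1 2 3; do
    sudo apt-get update -y && sudo apt-get install -y libreoffice && break
    sleep 10
done
"""
print(init_script)
```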

6 More Replies
mangosta
by New Contributor II
  • 2804 Views
  • 5 replies
  • 1 kudos

Resolved! Query text truncated for queries longer than 153,596 characters

Hi, when using the `query_history.list` function of the Python SDK workspace client, queries that have more than 153,596 characters are truncated. I could not find this limit anywhere in the documentation, so I wanted to know if this is documented s...

Latest Reply
brockb
Databricks Employee
  • 1 kudos

Hi @mangosta, I did some testing internally and was able to replicate the behavior you described. The query text limit is a limitation not of the SDK or the API, but rather of the backing system table `system.query.history`. More information on thi...

4 More Replies
Sujith_i
by New Contributor
  • 1920 Views
  • 0 replies
  • 0 kudos

Databricks SDK for Python authentication failing

I am trying to use the Databricks SDK for Python to do some account-level operations like creating groups. I created a Databricks config file locally and provided the profile name as an argument to AccountClient, but authentication keeps failing. The same con...
