Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

by sandeep91 (New Contributor III)
  • 11406 Views
  • 5 replies
  • 2 kudos

Resolved! Databricks Job: Package Name and EntryPoint parameters for the Python Wheel file

I have created a Python wheel file with a simple file structure and uploaded it as a cluster library, and I was able to run the packages in a notebook. But when I try to create a Job using the Python wheel, provide the package name, and run the task, it fails...

Latest Reply
AndréSalvati
New Contributor III
  • 2 kudos

You can see a complete template project with the new Databricks Asset Bundles tool and a Python wheel task at https://github.com/andre-salvati/databricks-template. Please follow the instructions for deployment.
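For reference, a hedged sketch of how the job's "Package name" and "Entry Point" fields map onto a wheel's metadata; the project, module, and function names below are hypothetical, not taken from the thread. The entry point must be declared as a console script in the package, or the task fails to start:

```python
# setup.py for a minimal wheel (names are examples only).
from setuptools import setup, find_packages

setup(
    name="my_package",          # -> "Package name" field of the Python wheel task
    version="0.1.0",
    packages=find_packages(),
    entry_points={
        "console_scripts": [
            # The "Entry Point" field is the key on the left; it must resolve
            # to a callable, here the main() function in my_package/main.py.
            "main = my_package.main:main",
        ],
    },
)
```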

4 More Replies
by DavMes (New Contributor)
  • 4654 Views
  • 2 replies
  • 0 kudos

Databricks Asset Bundles: error in demo project

Hi, I am using v0.205.0 of the CLI. I wanted to test the demo project (databricks bundle init) of Databricks Asset Bundles; however, I am getting an error after databricks bundle deploy (validate is OK): artifacts.whl.AutoDetect: Detec...

Data Engineering
DAB
Databricks Asset Bundles
Latest Reply
AndréSalvati
New Contributor III
  • 0 kudos

You can see a complete template project with Databricks Asset Bundles and a Python wheel task at https://github.com/andre-salvati/databricks-template. Please follow the instructions for deployment.
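If wheel auto-detection is the failing step, one hedged workaround is to declare the artifact explicitly in databricks.yml; the artifact key and build command below are assumptions, not taken from the demo project:

```yaml
# Explicit artifact declaration so the bundle does not rely on whl auto-detect.
artifacts:
  my_wheel:                          # hypothetical key
    type: whl
    build: python -m build --wheel   # assumes the "build" package is installed
    path: .
```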

1 More Replies
by jwilliam (Contributor)
  • 3478 Views
  • 2 replies
  • 1 kudos

Resolved! [BUG] Databricks install WHL as JAR in Python Wheel Task?

I'm using a Python wheel task in a Databricks job with wheel dependencies. However, the cluster installed the dependencies as JARs instead of wheels. Is this expected behavior or a bug?

Latest Reply
AndréSalvati
New Contributor III
  • 1 kudos

You can see a complete template project with a Python wheel task and Databricks Asset Bundles at https://github.com/andre-salvati/databricks-template. Please follow the instructions for deployment.

1 More Replies
by GGG_P (New Contributor III)
  • 8500 Views
  • 3 replies
  • 0 kudos

Databricks Tasks Python wheel : How access to JobID & runID ?

I'm using Python (as a Python wheel application) on Databricks. I deploy and run my jobs using dbx. I defined some Databricks Workflows using Python wheel tasks. Everything is working fine, but I'm having issues extracting "databricks_job_id" & "databricks_ru...

Latest Reply
AndréSalvati
New Contributor III
  • 0 kudos

You can see a complete template project with Databricks Asset Bundles and a Python wheel task at https://github.com/andre-salvati/databricks-template. Please follow the instructions for deployment. In particular, take a look at the workflow definitio...
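One hedged pattern for getting these IDs into a wheel task is to pass Databricks dynamic value references as task parameters and parse them in the entry point; the flag names below are made up for illustration:

```python
# Entry point that receives the job and run IDs as ordinary CLI arguments.
import argparse

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--job-id")
    parser.add_argument("--run-id")
    args = parser.parse_args()
    print(f"databricks_job_id={args.job_id}, databricks_run_id={args.run_id}")

if __name__ == "__main__":
    main()

# In the task definition, pass dynamic value references, e.g.:
#   parameters: ["--job-id", "{{job.id}}", "--run-id", "{{job.run_id}}"]
```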

2 More Replies
by Oliver_Angelil (Valued Contributor II)
  • 9284 Views
  • 2 replies
  • 3 kudos

Resolved! Cell by cell execution of notebooks with VS code

I have the Databricks VS Code extension set up to develop and run jobs remotely (with Databricks Connect). I enjoy working on notebooks within the native Databricks workspace, especially for exploratory work, because I can execute blocks of code step b...

Latest Reply
awadhesh14
New Contributor II
  • 3 kudos

Hi folks, is there a version upgrade for the resolution to this?
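For anyone landing here, one pattern that has worked with the VS Code extension (hedged; behavior may vary by extension version) is keeping code in a .py file with Databricks cell markers, which the extension can run cell by cell:

```python
# Databricks notebook source
# The header above marks this .py file as a Databricks notebook; the marker
# below separates cells, which can then be run one at a time from VS Code.
print("cell 1")

# COMMAND ----------

print("cell 2")
```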

1 More Replies
by DylanStout (Contributor)
  • 12908 Views
  • 9 replies
  • 2 kudos

Resolved! Problem with tables not showing

When I use the current "result table" option it does not show the table results. This occurs when running SQL commands and the display() function for DataFrames. It is not linked to a Databricks runtime, since it occurs on all runtimes. I am not allow...

Latest Reply
DylanStout
Contributor
  • 2 kudos

Resizing the table causes it to show its records in the cell.

8 More Replies
by Data_Engineer3 (Contributor III)
  • 1907 Views
  • 1 reply
  • 0 kudos

Identify the associated notebook for an application running from the Spark UI

In the Spark UI, I can see the application running with its application ID. From the Spark UI, can I see which notebook is running as that application? Is this possible? I am interested in learning more about the jobs and stages, how it works ...

Data Engineering
Databricks
Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 0 kudos

See https://spark.apache.org/docs/3.1.1/api/python/reference/api/pyspark.SparkContext.setJobDescription.html. Calling spark.sparkContext.setJobDescription("my name") will make your life easier. Just put it in the notebook. You should also put it after each action (show, count, ...
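As a minimal sketch (assuming a notebook where spark is the predefined session; the table name is hypothetical), labeling actions so they are identifiable in the Spark UI:

```python
# setJobDescription lives on the SparkContext; the label shows up in the
# Spark UI next to every job triggered until the description is changed.
spark.sparkContext.setJobDescription("my_notebook: load step")
df = spark.read.table("samples.nyctaxi.trips")  # hypothetical table
df.count()  # this action appears in the Spark UI under the label above

spark.sparkContext.setJobDescription("my_notebook: aggregate step")
df.groupBy("pickup_zip").count().collect()
```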

by Govind3331 (New Contributor)
  • 2366 Views
  • 1 reply
  • 0 kudos

How to capture/Identify Incremental rows when No primary key columns in tables

Q1. My source is SQL Server tables. I want to identify only the latest records (incremental rows) and load those into the Bronze layer. Instead of a full load to ADLS, we want to capture only incremental rows and load them into ADLS for further processing. NOTE: Prob...

Latest Reply
Slaw
New Contributor II
  • 0 kudos

Hi, what kind of SQL source is it? MS SQL, MySQL, PostgreSQL?
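Until the source flavor is confirmed, a generic, hedged approach for tables with no primary key is to fingerprint whole rows and anti-join against what already landed in Bronze; all table and column names below are hypothetical:

```python
# Detect incremental rows without a primary key by hashing every column.
from pyspark.sql import functions as F

source_df = spark.read.table("source_mirror")   # full extract from SQL Server
bronze_df = spark.read.table("bronze.events")   # rows already loaded

def with_hash(df):
    # Note: concat_ws skips nulls, so add an explicit null sentinel per column
    # if null vs. empty string must be distinguished.
    return df.withColumn("row_hash", F.sha2(F.concat_ws("||", *df.columns), 256))

incremental = (with_hash(source_df)
               .join(with_hash(bronze_df).select("row_hash"), "row_hash", "left_anti")
               .drop("row_hash"))
incremental.write.mode("append").saveAsTable("bronze.events")
```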

by Etyr (Contributor II)
  • 3549 Views
  • 2 replies
  • 0 kudos

Can not change databricks-connect port

I have a Databricks cluster on the 10.4 runtime. When I run databricks-connect configure I enter all the information needed, and with the default port 15001, databricks-connect test works. But changing the port to 443 does not work; I tried to do a ...

Data Engineering
databricks-connect
port
pyspark
spark
Latest Reply
Etyr
Contributor II
  • 0 kudos

@daniel_sahal Thank you for the reply; indeed, port 443 is used by a lot of applications and could be problematic. But I also tried port `15002` and it didn't work. No port other than the default one works.

1 More Replies
by Olaoye_Somide (New Contributor III)
  • 2819 Views
  • 1 reply
  • 1 kudos

Avoiding Duplicate Ingestion with Autoloader and Migrated S3 Data

Hi Team, we recently migrated event files from our previous S3 bucket to a new one. While utilizing Auto Loader for batch ingestion, we've encountered an issue where the migrated data is being processed as new events. This leads to duplicate records in...

Data Engineering
autoloader
RocksDB
S3
Latest Reply
daniel_sahal
Databricks MVP
  • 1 kudos

@Olaoye_Somide Changing the source means that Auto Loader discovers the files as new (technically, they are in a new location, so they are indeed new). To overcome the issue you can use the modifiedAfter property.
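A hedged sketch of what that could look like; the path, format, and cutoff timestamp are placeholders:

```python
# Ignore files last modified before the migration cutoff so Auto Loader
# does not re-ingest the copied history as new events.
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("modifiedAfter", "2024-03-01T00:00:00.000000Z")  # cutoff
      .load("s3://new-bucket/events/"))
```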

by joss (New Contributor II)
  • 1309 Views
  • 1 reply
  • 1 kudos

NPE on CreateJacksonParser and Databricks 14.3LTS with Spark StructuredStreaming

Hello, I have a Spark Structured Streaming job: the source is a Kafka topic in JSON. It works fine with Databricks 14.2, but when I change to 14.3 LTS, I get an NPE in CreateJacksonParser: Caused by: NullPointerException: at org.apache.spark.sql.catalys...

Latest Reply
joss
New Contributor II
  • 1 kudos

Hi, thank you for your quick reply. I found the problem: val newSchema = spark.read.json(df.select("data").as[String]).schema. If "data" has a single null value, it works in 14.2, but with 14.3 LTS this function returns an NPE. I don't know if it is a bug.
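If it helps others hitting this, a hedged PySpark workaround is to drop null payloads before inferring the schema (assuming df is a batch DataFrame with a string column named data):

```python
# Filter out null JSON payloads before schema inference to avoid the NPE
# observed on 14.3 LTS; spark.read.json accepts an RDD of JSON strings.
from pyspark.sql import functions as F

json_strings = df.select("data").where(F.col("data").isNotNull())
new_schema = spark.read.json(json_strings.rdd.map(lambda r: r[0])).schema
```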

by lawrence009 (Contributor)
  • 4289 Views
  • 5 replies
  • 1 kudos

Contact Support re Billing Error

How do I contact billing support? I am billed through AWS Marketplace and noticed last month the SQL Pro discount is not being reflected in my statement.

Latest Reply
santiagortiiz
New Contributor III
  • 1 kudos

Hi, could anybody provide a contact email? I have sent emails to many contacts described on the support page here and in AWS, but no response from any channel. My problem is that Databricks charged me for the resources used during a free trial, what i...

4 More Replies
by LukeD (New Contributor II)
  • 2699 Views
  • 3 replies
  • 1 kudos

Billing support contact

Hi, what is the best way to contact Databricks support? I see differences between the AWS billing and the Databricks report, and I'm looking for an explanation. I've sent 3 messages last week via this form https://www.databricks.com/company/contact but...

Latest Reply
santiagortiiz
New Contributor III
  • 1 kudos

Hi, I'm facing the same issue with signing in to my workspace, and I have a billing error: Databricks charged me for a free trial. I have sent a lot of emails, posted a topic in the community, and contacted people at AWS, who said that it must be ...

2 More Replies
by MCosta (New Contributor III)
  • 15006 Views
  • 10 replies
  • 19 kudos

Resolved! Debugging!

Hi ML folks, We are using Databricks to train deep learning models. The code, however, has a complex structure of classes. This would work fine in a perfect bug-free world like Alice in Wonderland. Debugging in Databricks is awkward. We ended up do...

Latest Reply
petern
New Contributor II
  • 19 kudos

Has this been solved yet, i.e. a mature way to debug code on Databricks? I'm running into the same kind of issue. The variable explorer and pdb can be used, but it's not really the same.
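For what it's worth, a minimal sketch of interactive debugging inside a Databricks notebook using the standard library pdb (assumed available in recent runtimes; not verified for every DBR version):

```python
import pdb

def train_step(x):
    y = x * 2
    pdb.set_trace()  # pauses here: inspect y, step with n/s, continue with c
    return y + 1

train_step(21)

# After an uncaught exception, the %debug magic opens a post-mortem session
# at the failing frame.
```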

9 More Replies
by DatBoi (Contributor)
  • 6371 Views
  • 2 replies
  • 2 kudos

Resolved! How big should a delta table be to benefit from liquid clustering?

My question is pretty straightforward: how big should a Delta table be to benefit from liquid clustering? I know the answer will most likely depend on the details of how you are querying the data, but what is the recommendation? I know Databricks re...

Latest Reply
daniel_sahal
Databricks MVP
  • 2 kudos

@DatBoi Once you watch this video you'll understand more about Liquid Clustering: https://www.youtube.com/watch?v=5t6wX28JC_M&ab_channel=DeltaLake Long story short: I know Databricks recommends not partitioning tables smaller than 1 TB and aiming for 1 GB ...
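For context, a hedged sketch of enabling liquid clustering instead of partitioning; table and column names are hypothetical:

```python
# Create a Delta table with liquid clustering; CLUSTER BY replaces
# PARTITIONED BY / ZORDER as the layout mechanism.
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales (
        order_id BIGINT,
        customer_id BIGINT,
        order_date DATE
    )
    CLUSTER BY (customer_id)
""")

# Clustering is applied incrementally when OPTIMIZE runs.
spark.sql("OPTIMIZE sales")
```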

1 More Replies
