Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

jim12321 (New Contributor II)
1834 Views · 0 replies · 0 kudos

Foreign Catalog SQL Server Dynamic Port

When creating a Foreign Catalog SQL Server connection, a port number is required. However, many SQL Servers have dynamic ports and the port number keeps changing. Is there a solution for this? In most common cases, it should allow an instance name instea...

Data Engineering
Foreign Catalog
JDBC
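For context, a minimal sketch (run from a notebook) of the Lakehouse Federation connection DDL this post is about. Host, port, and credentials below are placeholders, and the static port option is exactly where dynamic ports become a problem:

    # Hedged sketch; replace placeholders with real values (ideally Databricks secrets).
    spark.sql("""
        CREATE CONNECTION IF NOT EXISTS sqlserver_conn TYPE sqlserver
        OPTIONS (
          host 'myserver.example.com',  -- placeholder
          port '1433',                  -- static port: a dynamic port keeps invalidating this
          user 'svc_user',              -- placeholder
          password '<password>'         -- placeholder
        )
    """)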
397973 (New Contributor III)
8622 Views · 2 replies · 0 kudos

Spark submit - not reading one of my --py-files arguments

Hi. In Databricks workflows, I submit a Spark job (Type = "Spark Submit") and a bunch of parameters, starting with --py-files. This works when all the files are in the same S3 path, but I get errors when I put a "common" module in a different S3 pat...

Latest Reply
MichTalebzadeh (Valued Contributor) · 0 kudos

The below is catered for YARN mode. If your application code primarily consists of Python files and does not require a separate virtual environment with specific dependencies, you can use the --py-files argument in spark-submit: spark-submit --verbose ...

1 More Replies
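A hedged workaround sketch for this thread: instead of listing the "common" module in --py-files, add it from its own S3 path at runtime with SparkContext.addPyFile. The bucket paths below are hypothetical:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    # Ships the archive to executors and puts it on the driver's sys.path too.
    spark.sparkContext.addPyFile("s3://my-bucket/shared/common.zip")  # hypothetical path

    import common  # importable only after addPyFile has run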
sandeep91 (New Contributor III)
8379 Views · 5 replies · 2 kudos

Resolved! Databricks Job: Package Name and EntryPoint parameters for the Python Wheel file

I have created a Python wheel file with a simple file structure and uploaded it as a cluster library, and I was able to run the packages in a notebook. But when I try to create a job using the Python wheel, provide the package name, and run the task, it fails...

Latest Reply
AndréSalvati (New Contributor III) · 2 kudos

There you can see a complete template project with (the new!) Databricks Asset Bundles tool and a Python wheel task. Please follow the instructions for deployment: https://github.com/andre-salvati/databricks-template

4 More Replies
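For reference on the wheel-task question above, a hedged sketch of how the job fields map to packaging metadata (all names hypothetical): the distribution name is the task's "Package name", and a console_scripts entry point supplies the "Entry point":

    from setuptools import setup, find_packages

    setup(
        name="my_package",                    # -> job task "Package name"
        version="0.1.0",
        packages=find_packages(),
        entry_points={
            "console_scripts": [
                "main = my_package.app:run",  # "main" -> job task "Entry point"
            ]
        },
    )

With that metadata, the wheel task resolves "main" and calls my_package.app.run() when the job starts.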
DavMes (New Contributor)
4078 Views · 2 replies · 0 kudos

Databricks Asset Bundles - error in demo project

Hi, I am using v0.205.0 of the CLI. I wanted to test the demo project (databricks bundle init) of Databricks Asset Bundles, but I am getting an error after databricks bundle deploy (validate is ok): artifacts.whl.AutoDetect: Detec...

Data Engineering
DAB
Databricks Asset Bundles
Latest Reply
AndréSalvati (New Contributor III) · 0 kudos

There you can see a complete template project with Databricks Asset Bundles and a Python wheel task. Please follow the instructions for deployment: https://github.com/andre-salvati/databricks-template

1 More Replies
jwilliam (Contributor)
2698 Views · 2 replies · 1 kudos

Resolved! [BUG] Databricks install WHL as JAR in Python Wheel Task?

I'm using a Python wheel task in a Databricks job with wheel dependencies. However, the cluster installed the dependencies as JARs instead of wheels. Is this expected behavior or a bug?

Latest Reply
AndréSalvati (New Contributor III) · 1 kudos

There you can see a complete template project with a Python wheel task and Databricks Asset Bundles. Please follow the instructions for deployment: https://github.com/andre-salvati/databricks-template

1 More Replies
GGG_P (New Contributor III)
6393 Views · 3 replies · 0 kudos

Databricks Tasks Python wheel: How to access JobID & runID?

I'm using Python (as a Python wheel application) on Databricks. I deploy and run my jobs using dbx. I defined some Databricks Workflows using Python wheel tasks. Everything is working fine, but I'm having issues extracting "databricks_job_id" & "databricks_ru...

Latest Reply
AndréSalvati (New Contributor III) · 0 kudos

There you can see a complete template project with Databricks Asset Bundles and a Python wheel task. Please follow the instructions for deployment: https://github.com/andre-salvati/databricks-template. In particular, take a look at the workflow definitio...

2 More Replies
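One common pattern for the question above (a hedged sketch; the parameter names are hypothetical): pass the dynamic value references {{job.id}} and {{job.run_id}} as wheel-task parameters in the job definition, then parse them in the entry point:

    import argparse

    def run():
        parser = argparse.ArgumentParser()
        parser.add_argument("--job-id")   # set to "{{job.id}}" in the task parameters
        parser.add_argument("--run-id")   # set to "{{job.run_id}}" in the task parameters
        args = parser.parse_args()
        print(f"job_id={args.job_id} run_id={args.run_id}")

    if __name__ == "__main__":
        run()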
Oliver_Angelil (Valued Contributor II)
8473 Views · 2 replies · 3 kudos

Resolved! Cell by cell execution of notebooks with VS code

I have the Databricks VS Code extension set up to develop and run jobs remotely (with Databricks Connect). I enjoy working in notebooks within the native Databricks workspace, especially for exploratory work, because I can execute blocks of code step b...

Latest Reply
awadhesh14 (New Contributor II) · 3 kudos

Hi folks, is there a version upgrade for the resolution to this?

1 More Replies
DylanStout (Contributor)
9368 Views · 9 replies · 2 kudos

Resolved! Problem with tables not showing

When I use the current "result table" option, it does not show the table results. This occurs when running SQL commands and the display() function for DataFrames. It is not linked to a Databricks runtime, since it occurs on all runtimes. I am not allow...

Latest Reply
DylanStout (Contributor) · 2 kudos

Resizing the table causes the table to show its records in the cell.

8 More Replies
Data_Engineer3 (Contributor III)
1434 Views · 1 reply · 0 kudos

Identify the associated notebook for the application running from the Spark UI

In the Spark UI, I can see the application running with the application ID. From the Spark UI, can I see which notebook is running as that application? Is this possible? I am interested in learning more about the jobs and stages, how it works ...

Data Engineering
Databricks
Latest Reply
Hubert-Dudek (Esteemed Contributor III) · 0 kudos

See https://spark.apache.org/docs/3.1.1/api/python/reference/api/pyspark.SparkContext.setJobDescription.html. spark.sparkContext.setJobDescription("my name") will make your life easier. Just put it in the notebook. You should also put it after each action (show, count, ...

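A short usage sketch of the suggestion above (the method lives on the SparkContext; the label is hypothetical):

    # Tag subsequent Spark jobs so they are identifiable in the Spark UI.
    spark.sparkContext.setJobDescription("notebook: my_etl_notebook")

    df = spark.range(1_000)
    df.count()  # this action's job now carries the description in the Spark UI
    # Re-set the description around each action if you want distinct labels per step.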
Govind3331 (New Contributor)
1997 Views · 1 reply · 0 kudos

How to capture/identify incremental rows when tables have no primary key columns

Q1. My source is SQL Server tables. I want to identify only the latest records (incremental rows) and load those into the bronze layer. Instead of a full load to ADLS, we want to capture only incremental rows and load them into ADLS for further processing. NOTE: Prob...

Latest Reply
Slaw (New Contributor II) · 0 kudos

Hi, what kind of SQL source is it? MS SQL, MySQL, PostgreSQL?

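Whatever the source flavor, one generic technique for the question above (a hedged sketch, not from the thread; table names are hypothetical): hash the full row to synthesize a change-detection key, then anti-join against rows already ingested:

    from pyspark.sql.functions import sha2, concat_ws

    src = spark.read.table("source_snapshot")   # hypothetical source extract
    tgt = spark.read.table("bronze.events")     # assumes bronze already stores row_hash

    hashed = src.withColumn("row_hash", sha2(concat_ws("||", *src.columns), 256))
    incremental = hashed.join(tgt.select("row_hash"), "row_hash", "left_anti")
    incremental.write.mode("append").saveAsTable("bronze.events")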
Etyr (Contributor)
2268 Views · 2 replies · 0 kudos

Cannot change databricks-connect port

I have a Databricks cluster with the 10.4 runtime. When I run databricks-connect configure I put in all the information needed, and with the default port 15001, databricks-connect test works. But changing the port to 443 does not work; I tried to do a ...

Data Engineering
databricks-connect
port
pyspark
spark
Latest Reply
Etyr (Contributor) · 0 kudos

@daniel_sahal Thank you for the reply; indeed, port 443 is used by a lot of applications and could be problematic. But I also tried port `15002` and it didn't work. No port other than the default one works.

1 More Replies
Olaoye_Somide (New Contributor III)
2003 Views · 1 reply · 1 kudos

Avoiding Duplicate Ingestion with Autoloader and Migrated S3 Data

Hi team, we recently migrated event files from our previous S3 bucket to a new one. While using Autoloader for batch ingestion, we've encountered an issue where the migrated data is being processed as new events. This leads to duplicate records in...

Data Engineering
autoloader
RocksDB
S3
Latest Reply
daniel_sahal (Esteemed Contributor) · 1 kudos

@Olaoye_Somide Changing the source means that Autoloader discovers the files as new (technically they are in a new location, so they are indeed new). To overcome the issue you can use the modifiedAfter property.

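A hedged sketch of the suggested fix (paths and the cutover timestamp are hypothetical): the modifiedAfter option makes Auto Loader skip files last modified before the given time, so pre-migration copies are ignored:

    df = (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", "s3://new-bucket/_schemas/events")  # hypothetical
        .option("modifiedAfter", "2024-03-01 00:00:00.000000 UTC+0")             # migration cutover
        .load("s3://new-bucket/events/")                                         # hypothetical
    )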
joss (New Contributor II)
1063 Views · 1 reply · 1 kudos

NPE in CreateJacksonParser on Databricks 14.3 LTS with Spark Structured Streaming

Hello, I have a Spark Structured Streaming job: the source is a Kafka topic in JSON. It works fine with Databricks 14.2, but when I change to 14.3 LTS I get an NPE in CreateJacksonParser: Caused by: NullPointerException: at org.apache.spark.sql.catalys...

Latest Reply
joss (New Contributor II) · 1 kudos

Hi, thank you for your quick reply. I found the problem: val newSchema = spark.read.json(df.select("data").as[String]).schema. If "data" has one null value, it works in 14.2, but in 14.3 LTS this function returns an NPE. I don't know if it is a bug.

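A hedged Python rendering of the workaround implied above (the thread's snippet is Scala): drop null payloads before letting spark.read.json infer the schema. The column name follows the thread; everything else is an assumption:

    from pyspark.sql.functions import col

    payload = df.select("data").where(col("data").isNotNull())
    new_schema = spark.read.json(payload.rdd.map(lambda r: r[0])).schema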
lawrence009 (Contributor)
3105 Views · 5 replies · 1 kudos

Contact Support re Billing Error

How do I contact billing support? I am billed through AWS Marketplace and noticed last month the SQL Pro discount is not being reflected in my statement.

Latest Reply
santiagortiiz (New Contributor III) · 1 kudos

Hi, could anybody provide a contact email? I have sent emails to many contacts described in the support page here and in AWS, but no response from any channel. My problem is that Databricks charged me for the resources used during a free trial, what i...

4 More Replies
