Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

User16826987838
by Contributor
  • 2313 Views
  • 2 replies
  • 0 kudos

Convert PDFs into structured data

Is there anything on Databricks that helps read PDFs (payment invoices and receipts, for example) and convert them to structured data?

Latest Reply
SoniaFoster
New Contributor II
  • 0 kudos

Thanks! Converting PDF format is sometimes a difficult task as not all converters provide accuracy. I want to share with you one interesting tool I recently discovered that can make your work even more efficient. I recently came across an amazing onl...

1 More Replies
Tam
by New Contributor III
  • 4145 Views
  • 3 replies
  • 0 kudos

Resolved! Error on Starting Databricks SQL Warehouse Serverless with Instance Profile

I have two workspaces, one in us-west-2 and the other in ap-southeast-1. I have configured the same instance profile for both workspaces. I followed the documentation to set up the instance profile for Databricks SQL Warehouse Serverless by adding th...

Latest Reply
Ayushi_Suthar
Databricks Employee
  • 0 kudos

Hi @Tam, hope you are doing well! I checked the error in detail, and it is likely because the Instance Profile Name and the Role ARN name don't match exactly. Please see points 3 and 4 in the docs: https://docs.databricks.com/sql/admin/serverle...
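
The name comparison this reply describes can be checked locally before touching the workspace settings. The snippet below is a minimal sketch, assuming both values are standard IAM ARNs; the function names and the example ARNs are hypothetical and do not come from the thread.

```python
def arn_resource_name(arn: str) -> str:
    """Return the final name segment of an IAM ARN.

    Example shape: arn:aws:iam::<account-id>:role/<optional/path/>name
    """
    # The sixth colon-separated field holds "<type>/<path>/<name>".
    resource = arn.split(":", 5)[5]
    return resource.rsplit("/", 1)[-1]


def names_match(role_arn: str, instance_profile_arn: str) -> bool:
    """True when the role name and instance profile name are identical (case-sensitive)."""
    return arn_resource_name(role_arn) == arn_resource_name(instance_profile_arn)


# Hypothetical ARNs, for illustration only.
role = "arn:aws:iam::123456789012:role/sql-serverless-access"
profile = "arn:aws:iam::123456789012:instance-profile/sql-serverless-access"
print(names_match(role, profile))
```

If the two names differ, the docs linked in the reply are the place to check how the mismatch should be handled.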

2 More Replies
Stellar
by New Contributor II
  • 2599 Views
  • 1 reply
  • 0 kudos

CDC DLT

Hi all, I would appreciate some clarity regarding DLT and CDC. My first question: when it comes to the "source" table in the syntax, is that the CDC table? Further, if we want to use only Databricks, would mounting a foreign catalog be a g...

Avinash_Narala
by Valued Contributor II
  • 5800 Views
  • 4 replies
  • 1 kudos

Rewrite Notebooks Programmatically

Hello, I want to refactor a notebook programmatically, so I wrote the following code:
import requests
import base64
# Databricks Workspace API URLs
workspace_url = f"{host}/api/2.0/workspace"
export_url = f"{workspace_url}/export"
import_url = f"{worksp...
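
The export-edit-import round trip this post starts to describe can be sketched as below. This is a minimal sketch, not the poster's code: it assumes the Workspace API returns the notebook as a base64 `content` field on export and accepts `path`, `format`, `language`, `overwrite`, and `content` on import. The helper names and the `refactor` rule are hypothetical.

```python
import base64


def decode_export_response(response_json: dict) -> str:
    """Decode the base64 'content' field of a workspace export response."""
    return base64.b64decode(response_json["content"]).decode("utf-8")


def refactor(source: str) -> str:
    """Hypothetical edit: rename one table reference in the notebook source."""
    return source.replace("old_table", "new_table")


def build_import_payload(path: str, source: str, language: str = "PYTHON") -> dict:
    """Build the JSON body for a workspace import call (SOURCE format)."""
    return {
        "path": path,
        "format": "SOURCE",
        "language": language,
        "overwrite": True,
        "content": base64.b64encode(source.encode("utf-8")).decode("ascii"),
    }
```

With `requests`, the exported JSON would typically come from a GET on the export URL with `path` and `format` query parameters, and the payload above would be sent as the JSON body of a POST to the import URL, both with a bearer token header.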

NT911
by New Contributor II
  • 2062 Views
  • 1 reply
  • 0 kudos

Databricks Error while executing this line of code

import geopandas as gpd
from shapely.geometry import *
Pd_csv_sel_pq_gg = gpd.GeoDataFrame(Points_csv_sel_pq_gg.toPandas(), geometry="geometry")
The error is given below:
/databricks/spark/python/pyspark/sql/pandas/utils.py:37: DeprecationWarning: distutil...

Avinash_Narala
by Valued Contributor II
  • 2161 Views
  • 2 replies
  • 1 kudos

Processing Notebook in python

Hi, I exported a notebook from my workspace to my local machine and want to read it in my Python code. Is there a way to read the content of my notebook programmatically, make the necessary changes, and save it as a DBC/HTML notebook?

Latest Reply
ossinova
Contributor II
  • 1 kudos

Not sure what you are trying to accomplish here. If you want to export a notebook as Python to edit it manually on your machine and then import it back into your workspace, why not use Repos and connect to it using VS Code etc.? You can export the notebook as...
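
If the notebook is exported in source format as a `.py` file (rather than DBC or HTML), editing it from Python is plain text processing. The sketch below assumes the usual layout of such exports: a first line `# Databricks notebook source` and cells separated by `# COMMAND ----------`. The function names are hypothetical.

```python
HEADER = "# Databricks notebook source"
CELL_SEPARATOR = "# COMMAND ----------"


def split_cells(source: str) -> list[str]:
    """Split a source-format notebook into a list of cell bodies."""
    body = source.removeprefix(HEADER)
    return [cell.strip("\n") for cell in body.split(CELL_SEPARATOR)]


def join_cells(cells: list[str]) -> str:
    """Rebuild a source-format notebook from a list of cell bodies."""
    return HEADER + "\n" + f"\n\n{CELL_SEPARATOR}\n\n".join(cells) + "\n"


# Example: append a comment to every cell, then rebuild the file contents.
example = f"{HEADER}\nprint(1)\n\n{CELL_SEPARATOR}\n\nprint(2)\n"
cells = [cell + "  # reviewed" for cell in split_cells(example)]
print(join_cells(cells))
```

The rebuilt text can be written back to a `.py` file and re-imported. Converting it to DBC or HTML is not covered by this sketch.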

1 More Replies
Brad
by Contributor II
  • 7063 Views
  • 5 replies
  • 1 kudos

Dash in Databricks notebook directly

Hi team, is there a way to embed Plotly Dash directly inside a Databricks notebook? Thanks

Latest Reply
calfromplotly
New Contributor II
  • 1 kudos

Hi @Brad - Unfortunately, it's not possible today to embed Dash in a Databricks notebook cell without our Enterprise-level databricks-dash library. Longer term, we are working towards Dash natively working within Databricks notebooks, but that timeli...

4 More Replies
jim12321
by New Contributor II
  • 1933 Views
  • 0 replies
  • 0 kudos

Foreign Catalog SQL Server Dynamic Port

When creating a Foreign Catalog SQL Server connection, a port number is required. However, many SQL Server instances use dynamic ports, so the port number keeps changing. Is there a solution for this? In most common cases, it should allow instance name instea...

Data Engineering
Foreign Catalog
JDBC
397973
by New Contributor III
  • 8720 Views
  • 2 replies
  • 0 kudos

Spark submit - not reading one of my --py-files arguments

Hi. In Databricks Workflows, I submit a Spark job (Type = "Spark Submit") with a bunch of parameters, starting with --py-files. This works when all the files are in the same S3 path, but I get errors when I put a "common" module in a different S3 pat...

Latest Reply
MichTalebzadeh
Valued Contributor
  • 0 kudos

The following applies to YARN mode. If your application code consists primarily of Python files and does not require a separate virtual environment with specific dependencies, you can use the --py-files argument in spark-submit:
spark-submit --verbose ...
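
As a small illustration of the `--py-files` argument mentioned here: spark-submit expects it as one comma-separated value, followed by the main file and the application arguments. The sketch below only assembles such a parameter list; the helper name and S3 paths are hypothetical, and it does not by itself explain the error in the original question.

```python
def spark_submit_parameters(py_files: list[str], main_file: str, app_args: list[str]) -> list[str]:
    """Assemble spark-submit parameters with a single comma-separated --py-files value."""
    return ["--py-files", ",".join(py_files), main_file, *app_args]


# Hypothetical paths: dependencies in two different S3 locations.
params = spark_submit_parameters(
    py_files=["s3://bucket-a/jobs/helpers.py", "s3://bucket-b/shared/common.zip"],
    main_file="s3://bucket-a/jobs/main.py",
    app_args=["--date", "2024-03-01"],
)
print(params)
```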

1 More Replies
sandeep91
by New Contributor III
  • 8610 Views
  • 5 replies
  • 2 kudos

Resolved! Databricks Job: Package Name and EntryPoint parameters for the Python Wheel file

I have created a Python wheel file with a simple file structure, uploaded it as a cluster library, and was able to run the package in a notebook. But when I try to create a job using the Python wheel, provide the package name, and run the task, it fails...

Latest Reply
AndréSalvati
New Contributor III
  • 2 kudos

Here you can see a complete template project with the (new!) Databricks Asset Bundles tool and a Python wheel task. Please follow the instructions for deployment. https://github.com/andre-salvati/databricks-template

4 More Replies
DavMes
by New Contributor
  • 4164 Views
  • 2 replies
  • 0 kudos

databricks asset bundles - error demo project

Hi, I am using v0.205.0 of the CLI. I wanted to test the demo project (databricks bundle init) of Databricks Asset Bundles, but I get an error after databricks bundle deploy (validate is OK): artifacts.whl.AutoDetect: Detec...

Data Engineering
DAB
Databricks Asset Bundles
Latest Reply
AndréSalvati
New Contributor III
  • 0 kudos

Here you can see a complete template project with Databricks Asset Bundles and a Python wheel task. Please follow the instructions for deployment. https://github.com/andre-salvati/databricks-template

1 More Replies
jwilliam
by Contributor
  • 2860 Views
  • 2 replies
  • 1 kudos

Resolved! [BUG] Databricks install WHL as JAR in Python Wheel Task?

I'm using a Python Wheel Task in a Databricks job with WHEEL dependencies. However, the cluster installed the dependencies as JAR instead of WHEEL. Is this expected behavior or a bug?

Latest Reply
AndréSalvati
New Contributor III
  • 1 kudos

Here you can see a complete template project with a Python wheel task and Databricks Asset Bundles. Please follow the instructions for deployment. https://github.com/andre-salvati/databricks-template

1 More Replies
GGG_P
by New Contributor III
  • 6534 Views
  • 3 replies
  • 0 kudos

Databricks Python wheel tasks: how to access JobID & runID?

I'm using Python (as a Python wheel application) on Databricks. I deploy and run my jobs using dbx. I defined some Databricks Workflows using Python wheel tasks. Everything is working fine, but I'm having trouble extracting "databricks_job_id" & "databricks_ru...
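
One common pattern for this is to pass the identifiers to the wheel's entry point as task parameters and parse them there. The sketch below is an assumption, not the thread's confirmed answer: it supposes the task parameters are set to something like `["--job-id", "{{job.id}}", "--run-id", "{{job.run_id}}"]` and that the workspace substitutes those dynamic value references at run time. The flag names and function name are hypothetical.

```python
import argparse


def parse_run_context(argv: list[str]) -> argparse.Namespace:
    """Parse job and run identifiers passed to the wheel entry point as CLI arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--job-id")
    parser.add_argument("--run-id")
    # Ignore any other task parameters the entry point may receive.
    args, _unknown = parser.parse_known_args(argv)
    return args


# In the entry point this would be parse_run_context(sys.argv[1:]).
context = parse_run_context(["--job-id", "123", "--run-id", "456", "--env", "dev"])
print(context.job_id, context.run_id)
```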

Latest Reply
AndréSalvati
New Contributor III
  • 0 kudos

Here you can see a complete template project with Databricks Asset Bundles and a Python wheel task. Please follow the instructions for deployment. https://github.com/andre-salvati/databricks-template In particular, take a look at the workflow definitio...

2 More Replies
Oliver_Angelil
by Valued Contributor II
  • 8603 Views
  • 2 replies
  • 3 kudos

Resolved! Cell-by-cell execution of notebooks with VS Code

I have the Databricks VS Code extension set up to develop and run jobs remotely (with Databricks Connect). I enjoy working on notebooks within the native Databricks workspace, especially for exploratory work because I can execute blocks of code step b...

Latest Reply
awadhesh14
New Contributor II
  • 3 kudos

Hi folks, is there a version upgrade that resolves this?

1 More Replies
