Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Splush
by New Contributor II
  • 339 Views
  • 1 reply
  • 0 kudos

JDBC Oracle Connection change Container Statement

Hey, I'm running into a weird issue while running the following code: def getDf(query, preamble_sql=None): jdbc_url = f"jdbc:oracle:thin:@//{host}:{port}/{service_name}" request = spark.read \ .format("jdbc") \ .o...

Latest Reply
BigRoux
Databricks Employee
  • 0 kudos

Here is something to consider: The issue you're experiencing likely stems from differences in behavior when accessing Oracle database objects via Spark JDBC versus other database clients like DBeaver. Specifically, Spark's JDBC interface may perform ...

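For illustration, here is a minimal sketch of attaching a container-switch statement to the same JDBC session Spark opens, using the Spark JDBC reader's sessionInitStatement option. The connection details, container name, and query below are placeholders, not the original poster's values.

# Sketch only: placeholder host, credentials, query, and container name.
host, port, service_name = "dbhost.example.com", 1521, "ORCLCDB"
user, password = "app_user", "app_password"
query = "SELECT * FROM some_schema.some_table"

jdbc_url = f"jdbc:oracle:thin:@//{host}:{port}/{service_name}"

df = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("user", user)
    .option("password", password)
    .option("driver", "oracle.jdbc.OracleDriver")
    # Runs on every new JDBC connection before the query, so the session-level
    # container switch is not lost between connections.
    .option("sessionInitStatement", "ALTER SESSION SET CONTAINER = MY_PDB")  # hypothetical PDB name
    .option("query", query)
    .load()
)
df.show()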
JD2
by Contributor
  • 5801 Views
  • 6 replies
  • 7 kudos

Resolved! Auto Loader for Shape File

Hello: As you can see from the link below, it supports 7 file formats. I am dealing with GeoSpatial Shape files and I want to know if Auto Loader can support Shape Files? Any help on this is greatly appreciated. Thanks. https://docs.microsoft.com/...

Latest Reply
-werners-
Esteemed Contributor III
  • 7 kudos

You could try to use the binary file type. But the disadvantage of this is that the content of the shape files will be put into a column, which might not be what you want. If you absolutely want to use Auto Loader, maybe some thinking outside the b...

5 More Replies
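To illustrate the binary-file route suggested above, here is a minimal sketch (paths and the target table name are placeholders): Auto Loader ingests the shapefiles as raw bytes into a content column, and any geospatial parsing happens downstream.

# Sketch only: placeholder paths and table name; each file lands as one row
# with path, modificationTime, length, and binary content columns.
raw = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "binaryFile")
    .option("cloudFiles.schemaLocation", "/tmp/shapefiles/_schema")
    .load("/mnt/landing/shapefiles/")
)

(
    raw.writeStream
    .option("checkpointLocation", "/tmp/shapefiles/_checkpoint")
    .trigger(availableNow=True)
    .toTable("bronze.shapefiles_raw")
)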
petehart92
by New Contributor II
  • 6266 Views
  • 6 replies
  • 6 kudos

Error While Rendering Visualization -- Map (Markers)

I have a table with latitude and longitude for a few addresses (no more than 10 at the moment), but when I select the appropriate columns in the visualization editor for Map (Markers) I get a message that states "error while rendering visualization"...

Not a lot of detail...
Latest Reply
Gabi_A
New Contributor II
  • 6 kudos

Having the same issue. Every time I update my SQL, all the widgets drop and show the error 'Unable to render visualization'. The only way I found to fix it is to manually duplicate all my widgets and delete the old ones with errors, which is a pain and ...

5 More Replies
martheelise
by New Contributor
  • 344 Views
  • 1 reply
  • 0 kudos

What happens when you change from .ipynb to .py as the default file format for notebooks

Hi, I was struggling to do Pull Requests with the "new" default file format for Notebooks and wanted to change it back to source (.py). My questions are: 1) Does this affect the whole workspace for all users? 2) Does this change the format of old .ipynb ...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Changing the default notebook file format from .ipynb to .py in Databricks has several implications based on current implementations and user scenarios: User Experience: The .ipynb format captures more comprehensive data, including environment setti...

Sahil0007
by New Contributor III
  • 1042 Views
  • 8 replies
  • 0 kudos

Databricks Delta table Merge Command Issue

I have one customer table and one temp view which I am creating from the incremental file and using as the source in a merge command. Earlier the notebook was working fine from the ADF pipeline, but for the past few days I have been getting an error stating that my ...

Latest Reply
MujtabaNoori
New Contributor III
  • 0 kudos

Hi @Sahil0007, the [REDACTED] value you're seeing is being retrieved from Key Vault. Here is a workaround: you can reverse the value twice to decode it and retrieve the original string. Alternatively, you can slice the string into two parts and conc...

7 More Replies
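Purely to illustrate the string manipulation described in the reply (the secret scope, key, and variable names are hypothetical, and whether this actually sidesteps redaction depends on the workspace configuration):

# Sketch of the reply's suggestion; scope and key are placeholders.
secret_value = dbutils.secrets.get(scope="kv-scope", key="customer-merge-key")

# Reversing twice returns the original characters.
decoded = secret_value[::-1][::-1]

# Alternatively, slice into two parts and concatenate them back together.
mid = len(secret_value) // 2
decoded_alt = secret_value[:mid] + secret_value[mid:]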
jorhona
by New Contributor III
  • 543 Views
  • 2 replies
  • 0 kudos

Resolved! Deleted schema leads to DLT pipeline problems

Hello. When testing a DLT table pipeline I accidentally misspelt the target schema. The pipeline worked and created the tables. After realising my mistake, I deleted the tables and the schema, thinking nothing of it. However, when I run the pipeline w...

Data Engineering
Databricks
dlt
pipeline
Latest Reply
jorhona
New Contributor III
  • 0 kudos

In the end I deleted and recreated the pipeline, which fixed the problem. Luckily it was only in dev, so I didn't lose any history of pipeline successes etc. in prod. Still, it is a bit of a pain for DLT, along with the problem of multiple developers not being ...

1 More Replies
ForestDD
by New Contributor
  • 8645 Views
  • 5 replies
  • 1 kudos

java.lang.NoSuchMethodError after upgrade to Databricks Runtime 13

We use the Spark MSSQL connector to connect to SQL Server; it works well on DBR runtimes 10.*, 11.*, and 12.*. But when we use DBR 13.*, we get the error below. It happens when we try to use df.write to save the data to the SQL database. We have encount...

Latest Reply
AradhanaSahu
New Contributor II
  • 1 kudos

I was also facing the same issue while writing to a SQL Server. I was able to resolve it by updating the format to "jdbc" instead of "com.microsoft.sqlserver.jdbc.spark". df.write.format("jdbc") works on DBR 13.3 LTS using the connector: com.microsoft.a...

4 More Replies
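As a rough sketch of the plain "jdbc" format the reply switched to (the server, database, table, and credentials below are placeholders):

# Sketch only: placeholder DataFrame, server, and credentials.
df = spark.range(5).withColumnRenamed("id", "value")

jdbc_url = (
    "jdbc:sqlserver://myserver.database.windows.net:1433;"
    "database=mydb;encrypt=true;"
)

(
    df.write.format("jdbc")  # instead of "com.microsoft.sqlserver.jdbc.spark"
    .option("url", jdbc_url)
    .option("dbtable", "dbo.target_table")
    .option("user", "sql_user")
    .option("password", "sql_password")
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .mode("append")
    .save()
)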
iarregui
by New Contributor
  • 4320 Views
  • 3 replies
  • 0 kudos

Getting a Databricks static IP

Hello. I want to connect from my Databricks workspace to an external API to extract some data. The owner of the API asks for an IP to provide the token necessary for the extraction of data. Therefore I would need to set a static IP in Databricks that...

Latest Reply
Wojciech_BUK
Valued Contributor III
  • 0 kudos

Hello, the easiest way (in Azure) is to deploy the workspace in VNET injection mode and attach a NAT Gateway to your VNET. The NAT GW requires a Public IP. This IP will be your static egress IP for all clusters in this workspace. Note: both the NAT GW and the IP address...

2 More Replies
samtech
by New Contributor
  • 370 Views
  • 1 reply
  • 0 kudos

Regional Workspaces: How to consolidate

Hi, we have a similar catalog (specific to regional data) in the APAC workspace and the America workspace. Our goal is to have silver tables created in each regional workspace and then consolidated as gold in one of the workspaces. So if I create silver in APAC and...

Latest Reply
lingareddy_Alva
Honored Contributor III
  • 0 kudos

Hi @samtech, yes, you're on the right track. For cross-workspace data access in Databricks, Delta Sharing is the recommended approach for accessing tables across different Databricks workspaces/regions.

Bart_DE
by New Contributor II
  • 468 Views
  • 1 reply
  • 0 kudos

Resolved! Databricks Asset Bundle conditional job cluster size?

Hey folks, can someone please suggest if there is a way to spawn a job cluster of a given size if a parameter of the job invocation (e.g. file_name) contains a desired value? I have a job which 90% of the time deals with very small files, but the remai...

Latest Reply
lingareddy_Alva
Honored Contributor III
  • 0 kudos

Hi @Bart_DE, no, a single job.yml file can't "look inside" a parameter like file_name and then decide to spin up a different job-cluster size on the fly. Job-cluster definitions in Databricks Workflows (Jobs) are static. All the heavy lifting has to b...

Vasu_Kumar_T
by New Contributor II
  • 302 Views
  • 1 reply
  • 0 kudos

Job performance issue : Configurations

Hello all, one job is taking more than 7 hrs; when we added the configuration below it takes <2:30 mins, but after deployment with the same parameters it is again taking 7+ hrs. 1) spark.conf.set("spark.sql.shuffle.partitions", 500) --> spark.conf.set("spark.sql.s...

Latest Reply
lingareddy_Alva
Honored Contributor III
  • 0 kudos

Hi @Vasu_Kumar_T, this is a classic Spark performance inconsistency issue. The fact that it works fine in your notebook but degrades after deployment suggests several potential causes. Here are the most likely culprits and solutions: Primary Suspects: 1. ...

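As a small sketch of pinning the settings mentioned in the post at the start of the job code, so that deployed runs and interactive runs use the same values (the numbers are examples, not recommendations):

# Sketch only: set the shuffle configuration explicitly in the job itself,
# so a deployed run cannot silently fall back to different cluster defaults.
spark.conf.set("spark.sql.shuffle.partitions", 500)

# Adaptive Query Execution can also coalesce shuffle partitions at runtime.
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true")

print(spark.conf.get("spark.sql.shuffle.partitions"))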
Mahtab67
by New Contributor
  • 649 Views
  • 1 reply
  • 0 kudos

Spark Kafka Client Not Using Certs from Default truststore

Hi Team, I'm working on connecting Databricks to an external Kafka cluster secured with SASL_SSL (SCRAM-SHA-512 + certificate trust). We've encountered an issue where certificates imported into the default JVM truststore (cacerts) via an init script ...

Latest Reply
lingareddy_Alva
Honored Contributor III
  • 0 kudos

Hi @Mahtab67, this is a common issue with Databricks and Kafka SSL connectivity. The problem stems from how Spark's Kafka connector handles SSL context initialization versus the JVM's default truststore. Root Cause Analysis: The Spark Kafka connector cre...

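A minimal sketch of passing the truststore directly to the Kafka source options instead of relying on the JVM default cacerts (broker, topic, credentials, and paths are placeholders; on Databricks runtimes the Kafka client classes are shaded, hence the kafkashaded prefix):

# Sketch only: placeholder broker, topic, credentials, and truststore path.
jaas = (
    "kafkashaded.org.apache.kafka.common.security.scram.ScramLoginModule required "
    'username="kafka_user" password="kafka_password";'
)

df = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker1.example.com:9093")
    .option("subscribe", "my_topic")
    .option("kafka.security.protocol", "SASL_SSL")
    .option("kafka.sasl.mechanism", "SCRAM-SHA-512")
    .option("kafka.sasl.jaas.config", jaas)
    # Point the connector at the truststore explicitly rather than relying on cacerts.
    .option("kafka.ssl.truststore.location", "/dbfs/certs/kafka.truststore.jks")
    .option("kafka.ssl.truststore.password", "truststore_password")
    .load()
)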
Sainath368
by New Contributor III
  • 425 Views
  • 1 reply
  • 0 kudos

COMPUTE DELTA STATISTICS vs COMPUTE STATISTICS - Data Skipping

Hi all, I recently altered the data skipping stats columns on my Delta Lake table to optimize data skipping. Now, I'm wondering about the best practice for updating statistics: is running ANALYZE TABLE <table_name> COMPUTE DELTA STATISTICS sufficient a...

Latest Reply
Advika
Databricks Employee
  • 0 kudos

Hello @Sainath368! Running ANALYZE TABLE <table_name> COMPUTE DELTA STATISTICS is a good practice after modifying data skipping stats columns on a Delta Lake table. However, this command doesn’t update query optimizer stats. For that, you’ll need to ...

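For illustration, the two commands side by side (the table name is a placeholder): the first refreshes file-level data-skipping stats after the stats columns change, the second refreshes the table-level statistics the query optimizer uses.

# Sketch only: my_catalog.my_schema.my_table is a placeholder table name.
spark.sql("ANALYZE TABLE my_catalog.my_schema.my_table COMPUTE DELTA STATISTICS")

# Separate command for query optimizer (CBO) statistics.
spark.sql("ANALYZE TABLE my_catalog.my_schema.my_table COMPUTE STATISTICS FOR ALL COLUMNS")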
Miloud_G
by New Contributor III
  • 851 Views
  • 2 replies
  • 2 kudos

Resolved! issue on databricks bundle deploy

Hi, I am trying to configure a Databricks Asset Bundle, but I got an error on deployment. Databricks bundle init: OK. Databricks bundle validate: OK. Databricks bundle deploy: Fail. Error: PS C:\Databricks_DABs\DABs_Init\DABS_Init> databricks b...

Latest Reply
Miloud_G
New Contributor III
  • 2 kudos

Thank you Advika. I was able to enable workspace files with the script: from databricks.sdk.core import ApiClient; client = ApiClient(); client.do("PATCH", "/api/2.0/workspace-conf", body={"enableWorkspaceFilesystem": "true"}, headers={"Content-Type": "applica...

1 More Replies
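Reconstructed from the truncated snippet above, a sketch of the full call; the Content-Type value is an assumed completion of the cut-off text.

# Sketch: assumes the truncated header is the usual JSON content type.
from databricks.sdk.core import ApiClient

client = ApiClient()
client.do(
    "PATCH",
    "/api/2.0/workspace-conf",
    body={"enableWorkspaceFilesystem": "true"},
    headers={"Content-Type": "application/json"},
)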
ankit001mittal
by New Contributor III
  • 434 Views
  • 1 reply
  • 0 kudos

How to stop SQL AI Functions usage

Hi guys, recently Databricks came up with a new feature, SQL AI Functions. Is there a way to stop users from using it without downgrading the runtime on the cluster, e.g. by using policies? Also, is there a way to stop users from using serverless, before there w...

Latest Reply
Advika
Databricks Employee
  • 0 kudos

Hello @ankit001mittal! Currently, there's no direct way to disable SQL AI Functions in Databricks. To restrict the use of serverless compute, you can set up serverless budget policies that allow you to monitor and limit usage to some extent. However,...

