Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

HansAdriaans
by New Contributor II
  • 3490 Views
  • 2 replies
  • 1 kudos

Can not open socket to local (127.0.0.1)

Hi, I'm running a Databricks pipeline hourly using Python notebooks checked out from Git on on-demand compute (r6gd.xlarge, 32 GB RAM + 4 vCPUs, Graviton). Most of the time the pipeline runs without problems. However, sometimes the first notebook f...

Latest Reply
prasad_dhongade
New Contributor II
  • 1 kudos

Hi, I am facing a similar error. The cluster runs 24/7 and this issue is observed for a few runs during the day. The data volume being processed is not huge, but the logic it needs to go through is complex. I do not want to include the display in produ...

1 More Replies
Dedescoat
by New Contributor
  • 367 Views
  • 1 reply
  • 2 kudos

Resolved! JDBC with serverless compute

Hi community, We have a scenario where we need to ingest data into Lakebase. Currently, we are trying to use JDBC to write data in a notebook with serverless compute. However, the documentation on serverless limitations (link) mentions that JAR librar...

Latest Reply
Louis_Frolio
Databricks Employee
  • 2 kudos

Hello @Dedescoat , I did some poking around in our documentation and would like to offer some tips and tricks to help you diagnose the issue further. Yes: using a Unity Catalog JDBC connection to load a driver from a UC volume and write from serverless...
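For readers hitting the same wall: since Lakebase is Postgres-compatible, a Spark JDBC write from a notebook is one route to try. The sketch below only builds the options dict; the host, table, and driver names are hypothetical placeholders, not values confirmed in this thread.

```python
def jdbc_options(host: str, port: int, database: str, user: str, password: str) -> dict:
    """Build a Spark JDBC options dict for a Postgres-compatible target.

    All identifiers here (host, table name) are hypothetical examples.
    """
    return {
        "url": f"jdbc:postgresql://{host}:{port}/{database}",
        "dbtable": "public.target_table",   # hypothetical target table
        "user": user,
        "password": password,
        "driver": "org.postgresql.Driver",
    }

# On a cluster, the write itself would look like:
# df.write.format("jdbc").options(**jdbc_options(...)).mode("append").save()
opts = jdbc_options("my-lakebase-host", 5432, "appdb", "svc_user", "***")
```

Whether the driver JAR can be loaded on serverless compute is exactly the limitation the thread discusses, so treat this as a sketch to adapt, not a confirmed recipe.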

rokata
by New Contributor II
  • 1446 Views
  • 3 replies
  • 1 kudos

Resolved! How to access artifacts from job run?

In a workflow, is there a way to access task artifacts from within the run? I have a job with a task TasksA, which is a dbt task that creates some artifacts. I want to store these artifacts, but the job artifacts seem to be saved in a location I cann...
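One approach sometimes suggested for dbt task artifacts is the Jobs API "get run output" endpoint, whose response for dbt tasks carries artifact download details (see the Jobs API reference for the exact field names). The helper below only constructs the request URL; the workspace URL and run ID are hypothetical.

```python
import urllib.parse

def run_output_url(workspace_url: str, run_id: int) -> str:
    """URL for the Jobs API 'get run output' endpoint.

    For dbt tasks the JSON response includes artifact download information;
    consult the Jobs API docs for the current field names.
    """
    query = urllib.parse.urlencode({"run_id": run_id})
    return f"{workspace_url}/api/2.1/jobs/runs/get-output?{query}"

# Hypothetical workspace and run ID:
url = run_output_url("https://adb-12345.6.azuredatabricks.net", 987654321)
```

The actual GET request would need a bearer token in the `Authorization` header, same as any other Databricks REST call.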

Latest Reply
BlackCurrantDS
New Contributor II
  • 1 kudos

Is there a better way to access artifacts now?

2 More Replies
ajay_wavicle
by Contributor
  • 177 Views
  • 3 replies
  • 1 kudos

Resolved! Connect to spark session and uc tables in python file

How do I connect to the Spark session and UC tables in a Python file? I want to read UC tables from Python modules in the Databricks workspace. How do I access the current SparkSession?

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @ajay_wavicle , Azure Databricks automatically creates a SparkContext for each compute cluster and an isolated SparkSession for each notebook or job executed against the cluster. So the following should work in a Python module in a Databricks Work...
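Building on that reply, a common pattern for a workspace Python module is to grab the session Databricks already created rather than building a new one. A minimal sketch (the UC table name in the usage comment is hypothetical):

```python
def current_session():
    """Return the SparkSession Databricks already created for this notebook/job.

    The import is deferred so this module can also be imported in environments
    without pyspark installed.
    """
    from pyspark.sql import SparkSession  # bundled with Databricks runtimes
    spark = SparkSession.getActiveSession()
    if spark is None:
        # Fallback for non-notebook contexts; on Databricks the active
        # session normally already exists.
        spark = SparkSession.builder.getOrCreate()
    return spark

# Usage inside a workspace Python module (hypothetical UC table):
# df = current_session().read.table("main.sales.orders")
```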

2 More Replies
JEAG
by New Contributor III
  • 43652 Views
  • 15 replies
  • 6 kudos

Error writing parquet files

Hi, we are having this chain of errors every day in different files and processes: An error occurred while calling o11255.parquet.: org.apache.spark.SparkException: Job aborted. Caused by: org.apache.spark.SparkException: Job aborted due to stage failu...

Latest Reply
Kolana
New Contributor II
  • 6 kudos

Hi, even I am facing this issue now. Did you identify the fix?

14 More Replies
yit337
by Contributor
  • 1025 Views
  • 1 reply
  • 2 kudos

Resolved! Database schema migration tools: Flyway vs Liquibase

I'm working on a comparison between these tools for database schema migration on Databricks Delta tables. Any experience with the tools? Pros/cons? I've been through most of the blogs on how to implement them. I'm looking for a comparison or real production exp...

Latest Reply
pradeep_singh
Contributor
  • 2 kudos

Pick Flyway if you prefer a simple, lightweight solution with sequential migrations and minimal overhead. It’s well suited for teams managing straightforward schema changes—mainly add, drop, or alter operations—through small, explicit, versioned scri...
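The "sequential migrations through versioned scripts" point rests on Flyway's `V<version>__<description>.sql` naming convention, where the version segment determines execution order. A small sketch of that ordering logic (the filenames are made-up examples):

```python
import re

def flyway_version(filename: str) -> tuple:
    """Parse the version from a Flyway-style migration name, e.g. 'V1_2__add_col.sql'."""
    match = re.match(r"V(\d+(?:_\d+)*)__", filename)
    if not match:
        raise ValueError(f"not a versioned migration: {filename}")
    # Compare numerically segment by segment, so V10 sorts after V2.
    return tuple(int(part) for part in match.group(1).split("_"))

migrations = ["V10__backfill.sql", "V2__add_index.sql", "V1_1__seed.sql"]
ordered = sorted(migrations, key=flyway_version)
```

Flyway applies each script exactly once, in this order, recording what ran in its schema history table; Liquibase instead tracks changesets in changelog files, which is where the extra flexibility and overhead come from.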

adeosthali
by New Contributor II
  • 450 Views
  • 1 reply
  • 1 kudos

Resolved! External to Managed

We are looking to migrate to managed tables using ALTER TABLE fq_table_name SET MANAGED. During the migration process we need the ability to switch between external and managed tables and vice versa. UNSET MANAGED works for 14 days. But I'm unable to just ...

Latest Reply
anshu_roy
Databricks Employee
  • 1 kudos

Hello, Thanks for sharing your investigation. You’re correct: you can’t immediately recreate the converted table as an external table on the same original path after dropping it. The Unity Catalog table object remains in a soft‑deleted state for 7 da...
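For reference, the round-trip under discussion boils down to two SQL statements, shown here as strings a notebook would pass to `spark.sql` (the fully-qualified table name is a placeholder):

```python
# Hypothetical fully-qualified table name for illustration only.
TABLE = "main.sales.orders"

migration_steps = [
    f"ALTER TABLE {TABLE} SET MANAGED",    # convert external -> managed
    f"ALTER TABLE {TABLE} UNSET MANAGED",  # reversible within the rollback window
]

# On a cluster, each step would run as: spark.sql(step)
```

As the reply above notes, after dropping the converted table the path stays tied to a soft-deleted UC object for a retention period, so the UNSET route, not drop-and-recreate, is the supported way back to external during the window.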

mj
by New Contributor
  • 162 Views
  • 2 replies
  • 2 kudos
Latest Reply
Louis_Frolio
Databricks Employee
  • 2 kudos

@mj , can you share a bit more context on what you’re ultimately trying to accomplish? As @balajij8 pointed out, Unity Catalog is the right foundation for sharing data, but with a little more detail I can be much more prescriptive about the best ap...

1 More Replies
DhivyaKeerthana
by New Contributor III
  • 323 Views
  • 3 replies
  • 6 kudos

Resolved! Lakeflow Spark Declarative Pipelines do not support read_files with format excel yet

Error: Failed to find the data source: excel. Make sure the provider name is correct and the package is properly registered and compatible with your Spark version. Upon checking the cluster details, it is still running 16.4. When will it be updated...
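The accepted fix isn't visible in this excerpt, but a workaround often used while `read_files` lacks Excel support is to read the workbook with pandas and hand the rows to Spark. A sketch, with the volume path hypothetical:

```python
def excel_rows(path: str):
    """Read an Excel sheet into plain records via pandas.

    pandas ships with Databricks runtimes; the import is deferred so this
    module can be imported where pandas is absent.
    """
    import pandas as pd
    return pd.read_excel(path).to_dict("records")

# Inside a pipeline step (sketch, hypothetical path):
# df = spark.createDataFrame(excel_rows("/Volumes/raw/finance/budget.xlsx"))
```

This sidesteps the `excel` data source entirely, at the cost of reading on the driver, so it suits small workbooks rather than large ones.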

Latest Reply
DhivyaKeerthana
New Contributor III
  • 6 kudos

Thank you @szymon_dybczak, it works!!

2 More Replies
Yoshe1101
by New Contributor III
  • 5628 Views
  • 4 replies
  • 1 kudos

Resolved! Cluster terminated. Reason: Npip Tunnel Setup Failure

Hi, I have recently deployed a new Workspace in AWS and am getting the following error when trying to start the cluster: "NPIP tunnel setup failure during launch. Please try again later and contact Databricks if the problem persists. Instance bootstrap f...

Latest Reply
Yoshe1101
New Contributor III
  • 1 kudos

Finally, this error was fixed by changing the DHCP configuration of the VPC.

3 More Replies
rcostanza
by New Contributor III
  • 838 Views
  • 2 replies
  • 0 kudos

Lakeflow pipeline (formerly DLT pipeline) performance progressively degrades on a persistent cluster

I have a small (under 20 tables, all streaming) DLT pipeline running in triggered mode, scheduled every 15 minutes during the workday. For development I've set `pipelines.clusterShutdown.delay` to avoid having to start a cluster on every update. I've noticed...

Latest Reply
JargerBiirli
New Contributor II
  • 0 kudos

I'm facing this exact issue, only with a standard job instead of a DLT pipeline. I can't use serverless or restart the cluster periodically due to things out of my control. Any specific advice on diagnosis and resolving? I don't think it can be check...

1 More Replies
seefoods
by Valued Contributor
  • 225 Views
  • 1 reply
  • 1 kudos

Resolved! ingest unstructure data like sharepoint with databricks

Hello everyone, I am currently using the Lakeflow connector to load SharePoint into a Delta table, but the connector is not mature. Does anyone know how I can use the SharePoint REST API to load data into a Delta table? Cordially, Seefoods
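For the REST route, SharePoint files are commonly listed through the Microsoft Graph API (`/sites/{site-id}/drives/{drive-id}/root/children`), downloaded with an OAuth bearer token, landed in cloud storage, and then appended into Delta. The helper below only builds the listing URL; the site and drive IDs are hypothetical.

```python
import urllib.parse

GRAPH_BASE = "https://graph.microsoft.com/v1.0"

def drive_children_url(site_id: str, drive_id: str) -> str:
    """Microsoft Graph endpoint listing files in a SharePoint document library."""
    return (f"{GRAPH_BASE}/sites/{urllib.parse.quote(site_id)}"
            f"/drives/{urllib.parse.quote(drive_id)}/root/children")

# With a bearer token you would then fetch each item, e.g.:
# req = urllib.request.Request(drive_children_url(site, drive),
#                              headers={"Authorization": f"Bearer {token}"})
url = drive_children_url("contoso.sharepoint.com,abc,def", "b!xyz")
```

From there, Auto Loader or a plain batch read over the landed files gets the data into the Delta table.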

Latest Reply
pradeep_singh
Contributor
  • 1 kudos

Hi @seefoods , curious to know what issues you are facing with the connector, assuming you are using the standard SharePoint connector with a Unity Catalog connection and then spark.read / Auto Loader / COPY INTO against SharePoint URLs.

Phani1
by Databricks MVP
  • 3882 Views
  • 10 replies
  • 0 kudos

Triggering DLT Pipelines with Dynamic Parameters

Hi Team, We have a scenario where we need to pass a dynamic parameter to a Spark job that will trigger a DLT pipeline in append mode. Can you please suggest an approach for this? Regards, Phani

Latest Reply
pradeep_singh
Contributor
  • 0 kudos

If you’re looking to build a dynamic, configuration-driven DLT pipeline, a better approach is to use a configuration table. This table should include fields such as table_name, pipeline_name, table_properties, and other relevant settings. Your notebo...
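The config-table approach above can be sketched as follows. The config rows, paths, and table names here are hypothetical; in production the rows would come from a Delta configuration table keyed by pipeline name.

```python
# Hypothetical config rows standing in for a Delta configuration table.
CONFIG = [
    {"table_name": "bronze_orders", "source_path": "/Volumes/raw/orders",
     "table_properties": {"quality": "bronze"}},
    {"table_name": "bronze_customers", "source_path": "/Volumes/raw/customers",
     "table_properties": {"quality": "bronze"}},
]

def register_tables(dlt, spark):
    """Define one streaming table per config row.

    `dlt` and `spark` are the module/object available inside a
    Lakeflow/DLT pipeline notebook; they are passed in here so this
    sketch stays importable outside a pipeline.
    """
    for cfg in CONFIG:
        def make_reader(c):
            # Bind the current row; a bare closure would capture only the
            # last loop value.
            def read():
                return spark.readStream.format("cloudFiles").load(c["source_path"])
            return read
        # dlt.table can be called as a plain function, not only used
        # as a decorator, which enables this metaprogramming style.
        dlt.table(make_reader(cfg), name=cfg["table_name"],
                  table_properties=cfg["table_properties"])
```

Dynamic values such as a run date can then be injected as pipeline configuration parameters and read from `spark.conf` inside each reader.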

9 More Replies
mguirao
by New Contributor II
  • 268 Views
  • 1 reply
  • 1 kudos

Resolved! Bigquery using foreign catalog change behavior from runtime 15.4 to 16.4

Hello, I wanted to upgrade my all-purpose cluster from 15.4 to something higher (like 16.4), but I noticed a change in the behavior of my query on a BigQuery catalog. Using 15.4, my query runs only 1 job. Using 16.4, the same query on the same resources produces 35 jobs (and...

Attachments: databricks_no_issue.jpg, databricks_issue.jpeg
Latest Reply
pradeep_singh
Contributor
  • 1 kudos

In DBR 16.1+, Databricks switched the BigQuery federation connector from JDBC to the BigQuery Storage API, which parallelizes reads (more jobs) and transfers data directly from BigQuery to Databricks compute, so cross-cloud queries (Azure → GCP) can inc...

Hari_P
by New Contributor II
  • 1878 Views
  • 11 replies
  • 0 kudos

IBM DataStage to Databricks Migration

Hi All, We are currently exploring a use case involving migration from IBM DataStage to Databricks. I noticed that LakeBridge supports automated code conversion for this process. If anyone has experience using LakeBridge, could you please share any be...

Latest Reply
thelogicplus
Contributor II
  • 0 kudos

@pradeep_singh , check the rankings on this website for the best DataStage-to-Databricks migration tools.

10 More Replies