Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

dkxxx-rc
by Contributor
  • 2326 Views
  • 2 replies
  • 1 kudos

Can't "run all below" - "command is part of a batch that is still running"

Weirdness in Databricks on AWS.  In a notebook that is doing absolutely nothing, I click the "Run All Above" or "Run All Below" button on a cell, and it won't do anything at all except pop up a little message near the general "Run All" button, saying...

Latest Reply
Advika
Community Manager
  • 1 kudos

Hello @dkxxx-rc! Can you check if any background processes are still running in your notebook that might be interfering with new executions? If you are using Databricks Runtime 14.0 or above, cells run in batches, so any error halts execution, and in...

1 More Replies
Prabakar
by Databricks Employee
  • 3598 Views
  • 1 reply
  • 2 kudos

Accessing the regions that are disabled by default in AWS from Databricks

In AWS we have 4 regions that are disabled by default. You must first enable them before you can create and manage resources. The following regions are disabled by default: Africa...

Latest Reply
AndreaCuda
New Contributor II
  • 2 kudos

Hello - We are looking to deploy and run Databricks in AWS in Bahrain or the UAE. Is this possible? This post is older, so wondering if this is a viable option.

JooseSauli
by New Contributor II
  • 2159 Views
  • 3 replies
  • 3 kudos

How to make .py files available for import?

Hello, I've looked around, but cannot find an answer. In my Azure Databricks workspace, users have Python notebooks which all make use of the same helper functions and classes. Instead of housing the helper code in notebooks and having %run magics in ...

Latest Reply
JooseSauli
New Contributor II
  • 3 kudos

Hi Brahmareddy, thanks for your reply. Your second approach is quite close to what I already tried earlier. Your post got me to do some more testing, and although I don't know how to set the sys.path via the init script (it says here and here that it'...

2 More Replies
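The sys.path approach discussed in this thread can be sketched as follows. This is a minimal, hedged example: the helper module and folder are stand-ins (a real workspace might use a shared path such as /Workspace/Shared/helpers), and a temp directory keeps the sketch self-contained and runnable anywhere.

```python
import os
import sys
import tempfile

# Stand-in for a shared workspace folder of helper .py files;
# a temp directory keeps this sketch runnable outside Databricks.
helpers_dir = tempfile.mkdtemp()
with open(os.path.join(helpers_dir, "helpers.py"), "w") as f:
    f.write("def add(a, b):\n    return a + b\n")

# Once the folder is on sys.path, a plain import replaces %run magics.
sys.path.append(helpers_dir)

import helpers

print(helpers.add(2, 3))  # prints 5
```

In a notebook, the same two steps (append the shared folder to sys.path, then import) are all that is needed once the helper files exist in the workspace.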
MDV
by Databricks Partner
  • 1092 Views
  • 2 replies
  • 0 kudos

Problem with df.first() or collect() when collation different from UTF8_BINARY

I'm getting an error when I select first() or collect() from a dataframe when using a collation different from UTF8_BINARY. Example that reproduces the issue: This works: df_result = spark.sql(f"""SELECT 'en-us' AS ET...

Latest Reply
SP_6721
Honored Contributor II
  • 0 kudos

Hi @MDV, the issue likely comes from how non-default collations like UTF8_LCASE behave during serialization when using first() or collect(). As a workaround, wrap the value in a subquery and re-cast the collation back to UTF8_BINARY before acce...

1 More Replies
21f3001806
by New Contributor III
  • 1312 Views
  • 3 replies
  • 1 kudos

Resolved! Dynamic inference tasks in workflows using dabs

I have some workflows where we use dynamic inference to set task values or capture job execution counts or output rows. I can set these dynamic values using the UI, but can I do the same at the time of DABs workflow creation? Can you...

Latest Reply
21f3001806
New Contributor III
  • 1 kudos

Thanks @ashraf1395, I got the idea of what I was looking for.

2 More Replies
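For reference, dynamic task values like those discussed above can also be written directly in a bundle's job YAML rather than through the UI. A minimal sketch, assuming hypothetical job, task, and key names; the `{{tasks.<task_key>.values.<key>}}` syntax is the same dynamic value reference the UI inserts:

```yaml
resources:
  jobs:
    inference_job:
      tasks:
        - task_key: score
          notebook_task:
            notebook_path: ./score.py
        - task_key: report
          depends_on:
            - task_key: score
          notebook_task:
            notebook_path: ./report.py
            base_parameters:
              # Set upstream via dbutils.jobs.taskValues.set("row_count", n)
              row_count: "{{tasks.score.values.row_count}}"
```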
bigkahunaburger
by New Contributor II
  • 2182 Views
  • 1 reply
  • 0 kudos

Databricks SQL row limits

Hi there, my dataset is approx. 408K rows. I am trying to run a query that will return everything, but the result set seems to stop at 64K rows. I've seen a few posts in here asking about it, but they are several years old and a solution is promised. B...

Latest Reply
Renu_
Valued Contributor II
  • 0 kudos

Hi @bigkahunaburger, the 64K row limit in Databricks SQL applies only to the UI display, not the actual data processing. To access your full dataset, you can use the Download full results option to save the query output. Or use Spark or JDBC/ODBC conne...

Soufiane_Darraz
by New Contributor II
  • 2084 Views
  • 2 replies
  • 4 kudos

Resolved! Generic pipeline with Databricks workflows with multiple triggers on a single job

A big limitation of Databricks Workflows is that you can’t have multiple triggers on a single job. If you have a generic pipeline using Databricks notebooks and need to trigger it at different times for different sources, there’s no built-in way to h...

Latest Reply
ashraf1395
Honored Contributor
  • 4 kudos

Hi there @Soufiane_Darraz, completely agreed with this point. It becomes frustrating when we cannot use multiple triggers in our workflows. Some examples we use in our Databricks work, or have seen used in the industry, are - Simple: Using an ...

1 More Replies
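One way to sketch the wrapper-job pattern this thread describes: thin jobs, each with its own schedule, that all call one shared generic job via `run_job_task`. The job names and cron expression below are hypothetical, and in a bundle the shared job's ID can be referenced rather than hard-coded:

```yaml
resources:
  jobs:
    generic_pipeline:
      tasks:
        - task_key: run_pipeline
          notebook_task:
            notebook_path: ./pipeline.py
    source_a_schedule:
      schedule:
        quartz_cron_expression: "0 0 6 * * ?"
        timezone_id: UTC
      tasks:
        - task_key: trigger
          run_job_task:
            job_id: ${resources.jobs.generic_pipeline.id}
            job_parameters:
              source: source_a
```

A second wrapper job with a different schedule and `source` parameter gives the same generic pipeline a second, independent trigger.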
Anonymous
by Not applicable
  • 8541 Views
  • 9 replies
  • 2 kudos

Resolved! Issue in creating workspace - Custom AWS Configuration

We have tried to create a new workspace using "Custom AWS Configuration", giving our own VPC (customer-managed VPC), but the workspace failed to launch. We are getting the below error and couldn't understand where the issue is. Workspace...

Latest Reply
Briggsrr
New Contributor II
  • 2 kudos

Experiencing workspace launch failures with custom AWS configuration is frustrating. The "MALFORMED_REQUEST" error and failed network validation checks suggest a VPC configuration issue. It feels like playing Infinite Craft, endlessly combining eleme...

8 More Replies
minhhung0507
by Valued Contributor
  • 3900 Views
  • 4 replies
  • 2 kudos

Optimizing Spark Read Performance on Delta Tables with Deletion Vectors Enabled

Hi Databricks Experts, I'm currently using Delta Live Tables to generate master data managed within Unity Catalog, with the data stored directly in Google Cloud Storage. I then utilize Spark to read these master data from the GCS bucket. However, I'm ...

Latest Reply
minhhung0507
Valued Contributor
  • 2 kudos

Hi @Louis_Frolio, thanks for your explanation. In case we can't optimize Spark locally as fast as Databricks, do you have any suggestions for optimizing performance in this scenario?

3 More Replies
Håkon_Åmdal
by New Contributor III
  • 7728 Views
  • 2 replies
  • 1 kudos

Resolved! Incorrect length for `string` returned by the Databricks ODBC driver

Dear Databricks and community, I have been struggling with a bug related to using Golang and the Databricks ODBC driver. It turns out that `SQLDescribeColW` consistently returns 256 as the length for `string` columns. However, in Spark, strings might b...

Latest Reply
JR-20773
New Contributor II
  • 1 kudos

Is there a permanent solution for this? We are still seeing this issue and it's been years.

1 More Replies
dixonantony
by New Contributor III
  • 2262 Views
  • 5 replies
  • 1 kudos

not able to create table from pyspark sql using databricks unity catalog open apis

I was trying to access Databricks and do DDL/DML operations using the Databricks Unity Catalog open APIs. The create schema and select tables are working, but create table is not working due to the below issues; could you please help? I was using PySpark SQL ...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Hello @dixonantony, can you try running this command? spark.sql("create table datatest.dischema.demoTab1(id int, name VARCHAR(10), age int)") Ensure that you have the necessary permissions to create tables in Unity Catalog. You need the CREATE TABLE ...

4 More Replies
SamAdams
by Contributor
  • 2536 Views
  • 4 replies
  • 3 kudos

Migrating source directory in an existing DLT Pipeline with Autoloader

I have a DLT pipeline that reads data in S3 into an append-only bronze layer using Autoloader. The data sink needs to be changed to a new S3 bucket in a new account, and data in the existing S3 bucket migrated to the new one. Will Autoloader still be ...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 3 kudos

Hi SamAdams, how are you doing today? Really appreciate you sharing what worked and what didn't in your case! You're absolutely right: when switching buckets, not just folders within a bucket, that spark.databricks.cloudFiles.checkSourceChanged config be...

3 More Replies
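The `spark.databricks.cloudFiles.checkSourceChanged` flag named in the reply above is set in the pipeline's settings. A minimal fragment of what that might look like; the value "false" is an illustration only, since the truncated reply does not show the recommended value, so verify the flag and value against your runtime before relying on it:

```json
{
  "configuration": {
    "spark.databricks.cloudFiles.checkSourceChanged": "false"
  }
}
```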
Shetty_1338
by New Contributor III
  • 6168 Views
  • 1 reply
  • 0 kudos

Trying to connect SFTP directly in Databricks

Hi, as a proof of concept, I have created an ADLS account and enabled SFTP, created a local user and an SSH private key. Now, I am trying to connect this SFTP connection directly in Databricks to create a table or data frame. Below is the code snippet I have used a...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

```scala
// For Spark 3.x with Scala 2.12 (common in newer Databricks runtimes)
com.springml:spark-sftp_2.12:1.1.5
```

```scala
com.springml:spark-sftp_2.11:1.1.5
```

```scala
val df = spark.read.format("com.springml.spark.sftp").option("host", "your-sft...
```

Bathri
by New Contributor II
  • 1154 Views
  • 2 replies
  • 0 kudos

Compute Launch Failed

Hi Team, facing this error while we are creating a compute in Databricks: Cannot launch the cluster because the user specified an invalid argument. Instance ID: failed-4d70d115-f338-43a0-9. Internal error message: The VM launch request to AWS failed, please...

Latest Reply
Bathri
New Contributor II
  • 0 kudos

@Renu_ I have the necessary permissions, even admin permissions. Moreover, with the same policy in another account it works fine, but it is not working for this specific account.

1 More Replies
mrstevegross
by Contributor III
  • 2267 Views
  • 5 replies
  • 1 kudos

Resolved! Attempt to use a custom container with an instance pool fails

I am trying to run a job with (1) custom containers, and (2) via an instance pool. Here's the setup: The custom container is just the DBR-provided `databricksruntime/standard:12.2-LTS`. The instance pool is defined via the UI (see screenshot, below). At ...

Latest Reply
mrstevegross
Contributor III
  • 1 kudos

If anyone from DBR is monitoring this thread, can y'all confirm my understanding and--if so--update the docs to reflect this requirement?

4 More Replies