Data Engineering

Forum Posts

by Avinash_Narala • Databricks Partner

01-07-2025 11:24:47 PM

879 Views
1 replies
1 kudos

Resolved! which type of cluster to use

Hi, recently I had some logic that collects the DataFrame and processes it row by row. I am using a 128 GB driver node, but it is taking significantly more time than expected (about 2 hours for just 700 rows of data). May I know which type of cluster I should use and the driver...

Latest Reply

Ayushi_Suthar
Databricks Employee

01-08-2025 12:35:24 AM

1 kudos

Hi @Avinash_Narala, good day! For right-sizing the cluster, the recommended approach is a hybrid approach to node provisioning, combined with autoscaling. This involves defining the number of on-demand instances and spot instances for t...

by Michael_Galli • Databricks Partner

06-12-2024 6:58:56 AM

4210 Views
5 replies
1 kudos

Resolved! Importing data into Excel from Databricks over ODBC OAuth / Simba Spark Driver

Hi all, I am referring to this article: Connect to Azure Databricks from Microsoft Excel - Azure Databricks | Microsoft Learn. I use the latest SimbaSparkODBC-2.8.2.1013-Windows-64bit driver and configured it as described in that documentation. In Databricks I use ...

Latest Reply

Aydin
New Contributor II

01-07-2025 8:07:57 PM

1 kudos

Hi @Michael_Galli, we're currently experiencing the same issue. I've just asked our internal support team to raise a ticket with Microsoft but thought it would be worth reaching out to you. Have you had any luck resolving this issue?

4 More Replies

by sgannavaram • New Contributor III

04-06-2022 9:23:45 AM

4507 Views
3 replies
1 kudos

How to connect to IBM MQ from Databricks notebook?

We are trying to connect to IBM MQ and post messages to MQ, which are eventually consumed by a mainframe application. Which IBM MQ client .jar files / libraries are installed in the cluster? If you have any sample code for connectivity, that would be helpful.

Latest Reply

none_ranjeet
New Contributor III

01-07-2025 1:46:59 PM

1 kudos

Were you able to make this connection by any means other than the REST API, which has problems reading binary messages? Please suggest.

2 More Replies

by aliacovella • Contributor

01-06-2025 12:30:22 PM

1529 Views
2 replies
2 kudos

Resolved! DLT Vs Notebook runs

I am seeing behavior that I don't understand. I have a notebook that defines a DLT from a Kinesis stream and a view from that DLT. This works when I run it from within a workflow configured using the DLT pipeline. If, however, I create a workflow a...

Latest Reply

hari-prasad
Valued Contributor II

01-07-2025 1:11:28 PM

2 kudos

Hi @aliacovella, DLT notebooks/code only work with DLT pipelines, and regular Spark or SQL notebooks work with workflows.

1 More Replies

by johnnwanosike • New Contributor III

01-07-2025 9:32:29 AM

993 Views
2 replies
0 kudos

Unable to connect internal hive metastore

I am unable to find the correct password for the internal Hive metastore I created. The protocol used was JDBC. What is the best way to connect to it? Additionally, I want to connect to an external Hive metastore as well.

Latest Reply

Alberto_Umana
Databricks Employee

01-07-2025 9:37:08 AM

0 kudos

Hi @johnnwanosike, how did you create the metastore? Did you follow any documentation? For the external Hive metastore, you can refer to: https://docs.databricks.com/ja/archive/external-metastores/external-hive-metastore.html

1 More Replies

by PabloCSD • Valued Contributor II

01-06-2025 1:05:28 PM

5599 Views
4 replies
1 kudos

Resolved! How to connect via JDBC to SAP-HANA in a Databricks Notebook?

I have a set of connection credentials for SAP-HANA. How can I retrieve data from that location using JDBC? I have already installed ngdbc.jar (the driver) on my cluster, but this simple query has already taken more than 5 minutes and I don't ...

Latest Reply

PabloCSD
Valued Contributor II

01-07-2025 8:52:57 AM

1 kudos

It worked after changing the port to 30041, the port for the next tenant (reference: https://community.sap.com/t5/technology-q-a/hana-connectivity-and-ports/qaq-p/12193927).
jdbcQuery = '(SELECT * FROM DUMMY)'
df_sap_hana_dummy_table = (spark.read .form...
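
As a rough illustration of the fix described in this reply (not the poster's truncated code), the connection settings can be collected in a plain dictionary. The sketch assumes the option names of Spark's generic JDBC data source (`url`, `driver`, `dbtable`, `user`, `password`) and the `jdbc:sap://host:port` URL form used by the SAP HANA driver; the helper name, host, and credentials are hypothetical placeholders.

```python
def hana_jdbc_options(host: str, port: int, query: str, user: str, password: str) -> dict:
    """Collect connection settings under Spark's generic JDBC option names."""
    return {
        "url": f"jdbc:sap://{host}:{port}",   # port 30041 is the value from the post
        "driver": "com.sap.db.jdbc.Driver",   # driver class shipped in ngdbc.jar
        "dbtable": query,                     # a parenthesised subquery, as in the post
        "user": user,
        "password": password,
    }


# Hypothetical host and credentials, for illustration only.
opts = hana_jdbc_options(
    "hana.example.com", 30041, "(SELECT * FROM DUMMY)", "my_user", "my_password"
)
print(opts["url"])  # jdbc:sap://hana.example.com:30041
```

In a notebook, a dictionary like this could then be passed along the lines of `spark.read.format("jdbc").options(**opts).load()`.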

3 More Replies

by jeremy98 • Honored Contributor

01-07-2025 8:38:25 AM

548 Views
1 replies
0 kudos

Dynamic scheduling again and again

Hi Community, is it possible to dynamically schedule a Databricks job definition, as can be done with Airflow DAGs? If not, what would be a way to handle it?

Latest Reply

Alberto_Umana
Databricks Employee

01-07-2025 8:43:55 AM

0 kudos

Hi @jeremy98, Databricks does not natively support dynamic scheduling of job definitions in the same way that Apache Airflow does with its Directed Acyclic Graphs (DAGs). However, there are ways to achieve similar functionality using Databricks Jobs:...
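
The reply is cut off before it lists the options, so the following is only one possible approach and not necessarily what was suggested: changing a job's schedule programmatically by sending new settings to the Jobs API. The sketch only builds the request body, assuming the `job_id` / `new_settings` / `schedule` field names of the Jobs API 2.1 update endpoint; the helper name and job ID are hypothetical.

```python
import json


def build_schedule_update(job_id: int, cron: str, timezone: str = "UTC") -> dict:
    """Build a request body that changes only the schedule of an existing job."""
    return {
        "job_id": job_id,
        "new_settings": {
            "schedule": {
                "quartz_cron_expression": cron,  # Quartz syntax: sec min hour day month weekday
                "timezone_id": timezone,
                "pause_status": "UNPAUSED",
            }
        },
    }


# Hypothetical job ID; the cron expression means "every day at 06:00".
payload = build_schedule_update(123, "0 0 6 * * ?")
print(json.dumps(payload, indent=2))
```

A scheduler-like script could regenerate this body whenever the desired timing changes and post it to the workspace, which loosely mirrors how an Airflow DAG computes its own schedule.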

by KristiLogos • Contributor

01-06-2025 8:51:39 AM

1901 Views
6 replies
0 kudos

Resolved! Connection from BigQuery to Databricks populating dictionary keys as "v"

I was able to connect our BigQuery account to our Databricks catalog. However, all the keys in the nested dictionary columns populate as 'v'. For example: {"v":[{"v":{"f":[{"v":"engagement_time_msec"},{"v":{"f":[{"v":null},{"v":"2"},{"v":null},{"v":nu...

Latest Reply

KristiLogos
Contributor

01-06-2025 2:02:18 PM

0 kudos

@szymon_dybczak I couldn't run select TO_JSON_STRING(event_params) as event_params FROM ... I don't think that's a built-in Databricks function. Is there another way you've had success? Error: [UNRESOLVED_ROUTINE] Cannot resolve routine `TO_JSON_STRING` on search...
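
One way to make the value quoted in this thread easier to read is to strip the wrappers after the fact. The sketch below assumes that `"v"` always wraps a single value and `"f"` always wraps a list of fields, which matches the quoted sample but is not confirmed anywhere in the thread; the function name and the shortened sample string are hypothetical. Field names are not present in this layout, so the result is positional.

```python
import json
from typing import Any


def unwrap(node: Any) -> Any:
    """Recursively strip single-key {"v": ...} and {"f": [...]} wrappers."""
    if isinstance(node, dict):
        if set(node) == {"v"}:
            return unwrap(node["v"])
        if set(node) == {"f"}:
            return [unwrap(field) for field in node["f"]]
        # Any other dictionary is kept, with its values unwrapped.
        return {key: unwrap(value) for key, value in node.items()}
    if isinstance(node, list):
        return [unwrap(item) for item in node]
    return node


# Hypothetical, shortened sample shaped like the value quoted in the post.
raw = '{"v":[{"v":{"f":[{"v":"engagement_time_msec"},{"v":{"f":[{"v":null},{"v":"2"}]}}]}}]}'
print(unwrap(json.loads(raw)))  # [['engagement_time_msec', [None, '2']]]
```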

5 More Replies

by LGABI • New Contributor

01-07-2025 7:17:05 AM

1196 Views
2 replies
0 kudos

How to connect to Tableau Server FROM within Databricks Notebooks and publish data to Tableau Serv?

My company is having trouble connecting Databricks to Tableau Server. We need to be able to publish Hyper Files that are developed using Python on Databricks Notebooks to our Tableau Server, but it seems impossible to get a connection established des...

Latest Reply

pgo
New Contributor III

01-07-2025 7:23:51 AM

0 kudos

Please use the netcat command to test the connection.
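
If netcat is not available on the cluster, a similar reachability check can be sketched in Python from a notebook cell. This is an assumption-laden stand-in for `nc -zv host port`, not the exact test the reply had in mind: it only verifies that a TCP connection can be opened, and the function name is hypothetical.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `port_is_open("tableau.example.com", 443)` (a placeholder host name) returning False from the notebook would point to a network path or firewall problem rather than a problem with the Hyper file or the publishing code.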

1 More Replies

by jeremy98 • Honored Contributor

01-07-2025 7:08:01 AM

987 Views
1 replies
0 kudos

Resolved! Is there a INTERVAL data type?

Hi community, I was using a column in PostgreSQL that is a DATETIME.TIMEDELTA. Is it possible to have the same data type in Databricks as well?

Latest Reply

Alberto_Umana
Databricks Employee

01-07-2025 7:09:51 AM

0 kudos

Hi @jeremy98, You can use the TIMESTAMP and TIMESTAMP_NTZ data types to handle date and time values, similar to the DATETIME type in PostgreSQL. However, Databricks does not have a direct equivalent to PostgreSQL's TIMEDELTA type https://docs.databri...
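
Since the thread title asks about an INTERVAL type, here is a small sketch for carrying a duration across. It assumes the value arrives in Python as a `datetime.timedelta` and that the target SQL engine accepts ANSI day-time interval literals of the form `INTERVAL '<days> <hh>:<mm>:<ss>' DAY TO SECOND`, with the sign placed inside the quotes; both points are assumptions here, and the function name is hypothetical.

```python
from datetime import timedelta


def to_day_time_interval_literal(delta: timedelta) -> str:
    """Format a timedelta as an ANSI day-time interval literal."""
    total_us = delta // timedelta(microseconds=1)  # exact integer microseconds
    sign = "-" if total_us < 0 else ""
    total_us = abs(total_us)
    seconds, micros = divmod(total_us, 1_000_000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    return (
        f"INTERVAL '{sign}{days} {hours:02d}:{minutes:02d}:"
        f"{seconds:02d}.{micros:06d}' DAY TO SECOND"
    )


print(to_day_time_interval_literal(timedelta(days=1, hours=2, minutes=3, seconds=4)))
# INTERVAL '1 02:03:04.000000' DAY TO SECOND
```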

by pavan_yndpl • New Contributor

01-07-2025 1:56:25 AM

2359 Views
1 replies
0 kudos

How to resolve SSL_connect error when VPN is enabled

I am trying to connect to Databricks using the ODBC protocol with a Simba Driver DSN. I can connect and access the data successfully when our corporate VPN is turned OFF, but when it is turned ON, I get the following error: "[Simba][ThriftExte...

Latest Reply

Walter_C
Databricks Employee

01-07-2025 2:54:41 AM

0 kudos

The error you are encountering, "(14) Unexpected response from server during a HTTP connection: SSL_connect:", when trying to connect to Databricks using the ODBC protocol with the Simba Driver DSN while the corporate VPN is turned on, is likely rel...

by raghu2 • Databricks Partner

01-06-2025 12:53:47 PM

1897 Views
2 replies
1 kudos

Liquid Cluster enabled table - concurrent writes

I am trying to insert rows into a Delta table with liquid clustering enabled, using multiple threads. This link states that liquid clustering is used for tables with concurrent write requirements. I get this error: [DELTA_CONCURRENT_APPEND] ConcurrentAppend...

Latest Reply

TejeshS
Contributor

01-07-2025 1:24:44 AM

1 kudos

We encountered a similar issue as well, and the workaround we tried was partitioning those columns, as Liquid clustering can sometimes trigger this error.

1 More Replies

by shanisolomon • New Contributor II

01-06-2025 3:47:43 PM

1379 Views
2 replies
0 kudos

Databricks inconsistent count and select

Hi, I have a table with 2 versions:
1. Add txn: path = "a.parquet", numRecords = 10, deletionVector = null
2. Add txn: path = "a.parquet", numRecords = 10, deletionVector = (..., cardinality = 2)
Please note both transactions point to the same physical path...

Latest Reply

Walter_C
Databricks Employee

01-06-2025 5:42:26 PM

0 kudos

Hello, the behavior observed does indeed seem inconsistent with the expected behavior in Delta. Do you have a support contract, so that a support ticket can be opened and this can be analyzed further?

1 More Replies

by Tej_04 • New Contributor

01-06-2025 2:13:27 PM

3148 Views
1 replies
0 kudos

Avoid scientific values

I am trying to insert data into catalog tables on Databricks, but the values are displayed in scientific notation, which I am trying to avoid. How do I view the data in standard format? For example, 0.0000000 is displayed as 0E-7.

catalogtables scientificnotation

Latest Reply

Alberto_Umana
Databricks Employee

01-06-2025 2:35:37 PM

0 kudos

Hi @Tej_04, can you try the format_number function? SELECT format_number(column_name, decimal_places) AS column_name FROM table_name; https://docs.databricks.com/en/sql/language-manual/functions/format_number.html
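
The `0E-7` rendering in the question is what a decimal zero with seven digits of scale looks like when it is converted to a string with its exponent kept. Python's `decimal` module shows the same effect, so the sketch below uses it as an analogy for what is probably happening in the display layer; it is not Databricks code, and the helper name is hypothetical.

```python
from decimal import Decimal


def to_plain_string(value: Decimal) -> str:
    """Render a Decimal in fixed-point notation, never with an exponent."""
    # The "f" format spec keeps the value's scale and suppresses the exponent.
    return format(value, "f")


zero = Decimal("0E-7")          # a zero with scale 7, as in the post
print(str(zero))                # 0E-7
print(to_plain_string(zero))    # 0.0000000
```

The stored value is the same in both cases; only the string rendering differs, which is why a formatting function applied at query or display time is enough.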

by noimeta • Contributor III

09-29-2022 1:11:43 AM

18677 Views
15 replies
12 kudos

Resolved! Error when create an external location using code

I'm trying to create an external location from a notebook, and I got this kind of error: [PARSE_SYNTAX_ERROR] Syntax error at or near 'LOCATION'(line 1, pos 16) == SQL == CREATE EXTERNAL LOCATION IF NOT EXISTS test_location URL 's3://test-bronze/db/tes...

Latest Reply

Lokeshv
New Contributor II

02-13-2024 8:48:41 AM

12 kudos

Hey everyone, I'm facing an issue with retrieving data from a volume or table that contains a string with a symbol, for example, 'databricks+'. Whenever I try to retrieve this data, I encounter a syntax error. Can anyone help me resolve this issue?

14 More Replies

Databricks Community

