Data Engineering

Forum Posts

Sorted by:

by KrishnaWahi • New Contributor II

04-28-2022 6:54:59 AM

1597 Views
4 replies
2 kudos

Resolved! Query Databricks SQL Endpoint Database from NodeJs

I have my Databricks SQL Endpoint running, I have created many tables there using AWS S3 Delta. Now I want to query the Databricks SQL Endpoint from NodeJs. So It is possible ?I tried to find and researched a lot but didn't get any useful tutorial fo...

Data Engineering

1597 Views
4 replies
2 kudos

04-28-2022 6:54:59 AM

View Replies

Latest Reply

Kaniz
Community Manager

05-13-2022 5:58:08 AM

2 kudos

Hi @Krishna Wahi, Just a friendly follow-up. Do you still need help, or @Bilal Aslam's response help you to find the solution? Please let us know.

2 kudos

05-13-2022 5:58:08 AM

3 More Replies

by findinpath • Contributor

05-04-2022 7:41:58 AM

2058 Views
9 replies
4 kudos

Resolved! Please share Databricks JDBC Driver on Maven Central

Can you please share the Databricks JDBC Driver on Maven Central ?I see it available on : https://databricks.com/spark/jdbc-drivers-download . However I can’t find it on Maven Central to make use of it in automated tests connecting to Databricks infr...

Data Engineering

2058 Views
9 replies
4 kudos

05-04-2022 7:41:58 AM

View Replies

Latest Reply

findinpath
Contributor

05-13-2022 5:15:47 AM

4 kudos

Thank you for the assistance and for releasing the jdbc driver to Maven Central.I consider the issue closed.

4 kudos

05-13-2022 5:15:47 AM

8 More Replies

by Hila_DG • New Contributor II

01-12-2022 2:40:56 PM

1622 Views
5 replies
4 kudos

Resolved! How to proactively monitor the use of the cache for driver node?

The problem:We have a dataframe which is based on the query:SELECT * FROM Very_Big_TableThis table returns over 4 GB of data, and when we try to push the data to Power BI we get the error message:ODBC: ERROR [HY000] [Microsoft][Hardy] (35) Error from...

Data Engineering

1622 Views
5 replies
4 kudos

01-12-2022 2:40:56 PM

View Replies

Latest Reply

Anonymous
Not applicable

05-13-2022 5:23:11 AM

4 kudos

Hey @Hila Galapo Hope everything is going good. Just wanted to check in if you were able to resolve your issue or do you need more help? We'd love to hear from you.Thanks!

4 kudos

05-13-2022 5:23:11 AM

4 More Replies

by herry • New Contributor III

12-09-2021 2:02:36 PM

1033 Views
4 replies
1 kudos

Resolved! Using AWS glue schema registry in Databricks Autoloader

Hi All,I plan to store the schema of my table in AWS glue schema registry. Is there any simple way to use it in Databricks Autoloader?My goal is to build a data pipeline with Autoloader for schema validation.

Data Engineering

1033 Views
4 replies
1 kudos

12-09-2021 2:02:36 PM

View Replies

Latest Reply

Anonymous
Not applicable

05-13-2022 5:14:02 AM

1 kudos

Hey there @Herry Ramli Hope all is well!Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution? Else please let us know if you need more help. We'd love to hear from you.Thanks!

1 kudos

05-13-2022 5:14:02 AM

3 More Replies

by Jessy • New Contributor

07-27-2021 7:10:26 AM

7534 Views
6 replies
3 kudos

Resolved! Azure databrick throwing 'Py4JJavaError: An error occurred while calling o267._run.' error while calling one notebook from another notebook. Has someone come across such error? Can some one suggest the solution if faced similar issue.

'Py4JJavaError: An error occurred while calling o267._run.'

Data Engineering

7534 Views
6 replies
3 kudos

07-27-2021 7:10:26 AM

View Replies

Latest Reply

Anonymous
Not applicable

05-13-2022 4:36:02 AM

3 kudos

Hi @Pierre MASSE Thank you so much for getting back to us. It's really great of you to send in the solution and mark the answer as best. We really appreciate your time.Wish you a great Databricks journey ahead!

3 kudos

05-13-2022 4:36:02 AM

5 More Replies

by Hubert-Dudek • Esteemed Contributor III

05-06-2022 8:36:29 AM

517 Views
1 replies
20 kudos

From Databricks runtime 10.5 you can get metadata using the hidden _metadata column. Currently, the column contains input files information (file_path...

From Databricks runtime 10.5 you can get metadata using the hidden _metadata column. Currently, the column contains input files information (file_path, file_name, file_size and file_modification_time)

Data Engineering

517 Views
1 replies
20 kudos

05-06-2022 8:36:29 AM

View Replies

Latest Reply

Kaniz
Community Manager

05-13-2022 3:54:26 AM

20 kudos

Amazing post @Hubert Dudek !

20 kudos

05-13-2022 3:54:26 AM

by Hubert-Dudek • Esteemed Contributor III

05-07-2022 3:34:11 AM

536 Views
1 replies
20 kudos

From databricks runtime 10.5 ARRAY_SIZE function was added.

Data Engineering

536 Views
1 replies
20 kudos

05-07-2022 3:34:11 AM

View Replies

Latest Reply

Kaniz
Community Manager

05-13-2022 3:47:02 AM

20 kudos

Thank you for sharing this information with the community @Hubert Dudek !

20 kudos

05-13-2022 3:47:02 AM

by karolinalbinsso • New Contributor II

05-03-2022 2:18:01 AM

1743 Views
2 replies
3 kudos

Resolved! How to access the job-Scheduling Date from within the notebook?

I have created a job that contains a notebook that reads a file from Azure Storage. The file-name contains the date of when the file was transferred to the storage. A new file arrives every Monday, and the read-job is scheduled to run every Monday. I...

Data Engineering

1743 Views
2 replies
3 kudos

05-03-2022 2:18:01 AM

View Replies

Latest Reply

Kaniz
Community Manager

05-13-2022 3:27:08 AM

3 kudos

Hi @Karolin Albinsson , Just a friendly follow-up. Do you still need help, or @Hubert Dudek (Customer) 's response help you to find the solution? Please let us know.

3 kudos

05-13-2022 3:27:08 AM

1 More Replies

by Kiedi7 • New Contributor

05-11-2022 1:08:53 PM

2519 Views
2 replies
3 kudos

Resolved! Databricks Community Edition - Enable Databricks SQL

Hi,I am trying to enable the Databricks SQL environment from the Community Edition workspace (using left menu pane). However, the options I see in the dropdown menu are or Data Science & Eng and Machine Learning workspaces/environments. Is Databricks...

Data Engineering

2519 Views
2 replies
3 kudos

05-11-2022 1:08:53 PM

View Replies

Latest Reply

BilalAslamDbrx
Honored Contributor II

05-12-2022 11:14:14 PM

3 kudos

+1 on what Andrew said. We are not currently planning on enabling Databricks SQL in Community. Please use a trial account on one of the supported clouds.

3 kudos

05-12-2022 11:14:14 PM

1 More Replies

by SimhadriRaju • New Contributor

08-17-2021 4:11:02 AM

2553 Views
2 replies
1 kudos

rename a mount point folder

I am reading the data from a folder /mnt/lake/customer where mnt/lake is the mount path referring to ADLS Gen 2, Now I would like to rename a folder from /mnt/lake/customer to /mnt/lake/customeraddress without copying the data from one folder to ano...

Data Engineering

2553 Views
2 replies
1 kudos

08-17-2021 4:11:02 AM

View Replies

Latest Reply

Atanu
Esteemed Contributor

05-12-2022 8:17:40 PM

1 kudos

https://docs.databricks.com/data/databricks-file-system.html#local-file-api-limitations this might help @Simhadri Raju

1 kudos

05-12-2022 8:17:40 PM

1 More Replies

by Vik1 • New Contributor II

01-21-2022 9:30:02 AM

3268 Views
4 replies
2 kudos

Resolved! Data persistence, Dataframe, and Delta

I am new to databricks platform. what is the best way to keep data persistent so that once I restart the cluster I don't need to run all the codes again?So that I can simply continue developing my notebook with the cached data.I have created many dat...

Data Engineering

3268 Views
4 replies
2 kudos

01-21-2022 9:30:02 AM

View Replies

Latest Reply

Anonymous
Not applicable

05-12-2022 6:50:49 AM

2 kudos

Hey there @Vivek Ranjan Hope you are doing great!Just wanted to check in if you were able to resolve your issue or do you need more help? We'd love to hear from you.Thanks!

2 kudos

05-12-2022 6:50:49 AM

3 More Replies

by Orianh • Valued Contributor II

05-12-2022 1:30:00 AM

1409 Views
0 replies
0 kudos

Retrieve a row from indexed spark data frame.

Hello guys, I'm having an issue when trying to get a row values from spark data frame.I have a DF with index column, and i need to be able to return a row based on index in fastest way possible .I tried to partitionBy index column, optimize with zor...

Data Engineering

1409 Views
0 replies
0 kudos

05-12-2022 1:30:00 AM

by CrisBerg_65149 • New Contributor III

04-14-2022 2:34:45 AM

1674 Views
6 replies
6 kudos

Resolved! SELECT * FROM delta doesn't work on Spark 3.2

Using DBR 10 or later and I’m getting an error when running the following querySELECT * FROM delta.`s3://some_path`getting org.apache.spark.SparkException: Unable to fetch tables of db deltaFor 3.2.0+ they recommend reading like this:CREATE TEMPORAR...

Data Engineering

1674 Views
6 replies
6 kudos

04-14-2022 2:34:45 AM

View Replies

Latest Reply

CrisBerg_65149
New Contributor III

05-11-2022 5:46:31 AM

6 kudos

Got support from Databricks.Unfortunately, someone created a DB called delta, so the query was done against that DB instead. Issue was solved

6 kudos

05-11-2022 5:46:31 AM

5 More Replies

by zyx • New Contributor II

04-22-2022 12:30:34 AM

656 Views
4 replies
3 kudos

data bricks bi tool Supported from pdf formatted ?

as per Reporting point of view pdf formatted supporting or not.

Data Engineering

656 Views
4 replies
3 kudos

04-22-2022 12:30:34 AM

View Replies

Latest Reply

Kaniz
Community Manager

05-11-2022 5:09:41 AM

3 kudos

Hi @Bhanu aravapalli , Just a friendly follow-up. Do you still need help? Please let us know.

3 kudos

05-11-2022 5:09:41 AM

3 More Replies

by Development • New Contributor III

04-12-2022 11:25:00 PM

2445 Views
8 replies
5 kudos

Delta Table with 130 columns taking time

Hi All,We are facing one un-usual issue while loading data into Delta table using Spark SQL. We have one delta table which have around 135 columns and also having PARTITIONED BY. For this trying to load 15 millions of data volume but its not loading ...

Data Engineering

2445 Views
8 replies
5 kudos

04-12-2022 11:25:00 PM

View Replies

Latest Reply

Development
New Contributor III

04-27-2022 8:27:46 AM

5 kudos

@Kaniz Fatma @Parker Temple I found an root cause its because of serialization. we are using UDF to drive an column on dataframe, when we are trying to load data into delta table or write data into parquet file we are facing serialization issue ....

5 kudos

04-27-2022 8:27:46 AM

7 More Replies

User

Count

1601

736

343

284

246

Databricks

Forum Posts

Resolved! Query Databricks SQL Endpoint Database from NodeJs

Resolved! Please share Databricks JDBC Driver on Maven Central

Resolved! How to proactively monitor the use of the cache for driver node?

Resolved! Using AWS glue schema registry in Databricks Autoloader

Resolved! Azure databrick throwing 'Py4JJavaError: An error occurred while calling o267._run.' error while calling one notebook from another notebook. Has someone come across such error? Can some one suggest the solution if faced similar issue.

From Databricks runtime 10.5 you can get metadata using the hidden _metadata column. Currently, the column contains input files information (file_path...

From databricks runtime 10.5 ARRAY_SIZE function was added.

Resolved! How to access the job-Scheduling Date from within the notebook?

Resolved! Databricks Community Edition - Enable Databricks SQL

rename a mount point folder

Resolved! Data persistence, Dataframe, and Delta

Retrieve a row from indexed spark data frame.

Resolved! SELECT * FROM delta doesn't work on Spark 3.2

data bricks bi tool Supported from pdf formatted ?

Delta Table with 130 columns taking time

DELTA_EXCEED_CHAR_VARCHAR_LIMIT

Not able to set run_as service_principal_name

Pyspark operations slowness in CLuster 14.3LTS as ...

[Databricks Assets Bundles] Workflow trigger on fi...

Addressing Pipeline Error Handling in Databricks b...