Data Engineering

Forum Posts

by DennisB • New Contributor III

07-24-2023 4:47:42 AM

2824 Views
4 replies
2 kudos

Resolved! Better Worker Node Core Utilisation

Hi everyone, hoping someone can help me with this problem. I have an embarrassingly parallel workload, which I'm parallelising over 4 worker nodes (of type Standard_F4, so 4 cores each). Each workload is single-threaded, so I believe that only one cor...

Latest Reply

DennisB
New Contributor III

07-26-2023 5:51:24 AM

2 kudos

So I managed to get the 1-core-per-executor working successfully. The bit that wasn't working was spark.executor.memory -- this was too high, but lowering it so that the sum of the executors' memory was ~90% of the worker node's memory allowed it to w...
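
The sizing rule described in this reply (one core per executor, with the executors' combined memory at roughly 90% of the worker node's memory) can be sketched as below. This is only an illustration: the helper names are hypothetical, the 8 GiB figure used in the usage note is an assumed nominal size for a Standard_F4 node rather than something stated in the post, and the memory actually available to Spark on a worker may be lower than the VM's nominal size.

```python
def executor_memory_mb(node_memory_mb: int, executors_per_node: int,
                       fraction: float = 0.9) -> int:
    """Split a fraction of one worker node's memory evenly across its executors.

    `fraction` defaults to 0.9, matching the ~90% figure quoted in the reply.
    """
    if executors_per_node < 1:
        raise ValueError("executors_per_node must be at least 1")
    usable_mb = int(node_memory_mb * fraction)
    return usable_mb // executors_per_node


def one_core_per_executor_conf(node_memory_mb: int, cores_per_node: int) -> dict:
    """Build Spark settings for one single-core executor per available core."""
    memory_mb = executor_memory_mb(node_memory_mb, cores_per_node)
    return {
        "spark.executor.cores": "1",
        "spark.executor.memory": f"{memory_mb}m",
    }
```

For example, `one_core_per_executor_conf(8192, 4)` splits about 90% of an assumed 8 GiB node across four single-core executors. The resulting keys would go into the cluster's Spark configuration.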

by MadrasSenpai • New Contributor II

06-12-2023 12:14:21 PM

1352 Views
3 replies
2 kudos

How to install cmdstanpy in dbx cluster

I have built an HMC model using cmdstanpy. On my local machine, I installed cmdstan with the following approach: import cmdstanpy; cmdstanpy.install_cmdstan(). But in Databricks I need to reinstall it every time I train a new model, from the noteb...

Latest Reply

Anonymous
Not applicable

06-15-2023 11:10:31 PM

2 kudos

Hi @Rajamannar Aanjaram Krishnamoorthy Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

by sarguido • New Contributor II

02-21-2023 5:13:09 AM

2215 Views
4 replies
2 kudos

Delta Live Tables: bulk import of historical data?

Hello! I'm very new to working with Delta Live Tables and I'm having some issues. I'm trying to import a large amount of historical data into DLT. However, letting the DLT pipeline run forever doesn't work with the database we're trying to import from...

Latest Reply

Anonymous
Not applicable

04-21-2023 11:31:20 PM

2 kudos

Hi @Sarah Guido Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers y...

by NWIEFInance • New Contributor

07-25-2023 9:45:28 AM

714 Views
1 reply
2 kudos

Connect to EXCEL

I'm having a hard time connecting to Excel. Any help connecting Databricks to Excel would be appreciated.

Latest Reply

Kaniz_Fatma
Community Manager

07-26-2023 2:38:15 AM

2 kudos

Hi @NWIEFInance, This article describes using the Databricks ODBC driver to connect Databricks to Microsoft Excel. After establishing the connection, you can access the data in Databricks from Excel. You can also use Excel to analyze the data further...

by Priyag1 • Honored Contributor II

05-05-2023 11:55:35 PM

1495 Views
2 replies
11 kudos

Query parameters in dashboards

Queries can optionally leverage parameters or static values. When a visualization based on a parameterized query is added to a dashboard, the visualization can either be configured to use a: Widget parameter, Widget paramet...

Latest Reply

Natalie_NL
New Contributor II

07-26-2023 2:31:59 AM

11 kudos

Hi, I built a dashboard with dashboard parameters, and it works pretty easily! The advantage of dashboard parameters is that you do not have to set a default (it can be: all). This is convenient when you need to filter on values that change every time the q...

by The_raj • New Contributor

07-26-2023 12:42:33 AM

3329 Views
1 reply
2 kudos

Error while reading file <file path>. [DEFAULT_FILE_NOT_FOUND]

Hi, I have created a workflow with 5 notebooks in it. One of the notebooks is failing with the error below. I have tried refreshing the table but am still facing the same issue. When I try to run the notebook manually, it works fine. Can someone plea...

Latest Reply

Kaniz_Fatma
Community Manager

07-26-2023 12:50:02 AM

2 kudos

Hi @The_raj, the error message you are encountering indicates a failure during the execution of a Spark job on Databricks. Specifically, it seems that Task 736 in Stage 92.0 failed multiple times, and the most recent loss was due to a "DEFAULT_FILE...

by mickniz • Contributor

10-12-2022 8:31:27 AM

18042 Views
7 replies
18 kudos

cannot import name 'sql' from 'databricks'

I am working on a Databricks version 10.4 premium cluster, and while importing sql from the databricks module I am getting the error below: cannot import name 'sql' from 'databricks' (/databricks/python/lib/python3.8/site-packages/databricks/__init__.py). Trying...

Latest Reply

wallystart
New Contributor II

07-25-2023 4:31:29 PM

18 kudos

I resolved the same error by installing the library from the cluster interface (UI).

by dvmentalmadess • Valued Contributor

06-28-2023 5:03:32 PM

1448 Views
3 replies
0 kudos

Ingestion Time Clustering on initial load

We are migrating our data into Databricks and I was looking at the recommendations for partitioning here: https://docs.databricks.com/tables/partitions.html. This recommends not specifying partitioning and allowing "Ingestion Time Partitioning" (ITP)...

Latest Reply

Anonymous
Not applicable

07-22-2023 9:08:50 PM

0 kudos

Hi @dvmentalmadess Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. T...

by RamozanbekS • New Contributor III

07-25-2023 6:25:17 AM

1361 Views
1 reply
0 kudos

Resolved! Databricks SQL Statement Execution API

I'm trying to follow the example provided here: https://github.com/databricks-demos/dbsql-rest-api/blob/main/python/external_links.py. It fails when it comes to downloading the data chunks. The statement status turns from SUCCEEDED to CLOSED right away ...

Latest Reply

RamozanbekS
New Contributor III

07-25-2023 6:52:10 AM

0 kudos

It turns out that if the response is small and fits within the 16 MB limit, then the status check will also provide a single external link to download the data. So I need a condition here. Maybe even something like this: if len(chunks) == 1: external_url = respons...
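
The condition described in this reply could look roughly like the sketch below. It is not the poster's code, which is truncated above. It assumes the status response is a JSON object with a `manifest.total_chunk_count` field and a `result.external_links` list whose entries carry `chunk_index` and `external_link`. Those field names are my reading of the Statement Execution API and should be checked against the API reference, and the helper names are hypothetical.

```python
from typing import Dict, List


def links_in_status_response(response: Dict) -> List[str]:
    """Return the download URLs already present in a status response.

    For a small result this list may hold the only link that is needed.
    """
    result = response.get("result") or {}
    return [link["external_link"] for link in result.get("external_links", [])]


def chunk_indexes_still_needed(response: Dict) -> List[int]:
    """Return chunk indexes that the status response did not include a link for."""
    manifest = response.get("manifest") or {}
    total = manifest.get("total_chunk_count", 0)
    result = response.get("result") or {}
    present = {link.get("chunk_index") for link in result.get("external_links", [])}
    return [index for index in range(total) if index not in present]
```

With this split, a single-chunk result needs no further chunk requests, while a multi-chunk result only requests the indexes that are still missing.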

by Asterol • New Contributor III

07-24-2023 12:00:45 AM

936 Views
1 reply
1 kudos

Creating a test schema - what is the best practice?

Hey, I've created a schema with a few tables with historical data (prod), and now I would like to have a Dev/testing environment with exactly the same data. What do you recommend? CTAS? Shallow clone? Deep clone? I wonder if shallow clone would be sufficien...

Clone ctas

Latest Reply

Tharun-Kumar
Honored Contributor II

07-24-2023 10:17:39 PM

1 kudos

@Asterol If you would like to have the same data for your Dev/testing environment, I would recommend using Deep Clone. Deep clone copies the metadata and creates an independent copy of the table data. Shallow clone only copies the metadata and will h...
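
As a rough illustration of the two options compared in this reply, the sketch below builds the corresponding Databricks SQL clone statements as strings. The helper name and the table names `prod.sales` and `dev.sales` are hypothetical; in a notebook, the resulting string would typically be passed to `spark.sql(...)`.

```python
def clone_statement(source: str, target: str, deep: bool = True,
                    replace: bool = False) -> str:
    """Build a CREATE TABLE ... CLONE statement for a Delta table.

    deep=True  -> DEEP CLONE (independent copy of the table data)
    deep=False -> SHALLOW CLONE (metadata only)
    """
    kind = "DEEP" if deep else "SHALLOW"
    create = "CREATE OR REPLACE TABLE" if replace else "CREATE TABLE IF NOT EXISTS"
    return f"{create} {target} {kind} CLONE {source}"


# Hypothetical table names, for illustration only.
print(clone_statement("prod.sales", "dev.sales"))
print(clone_statement("prod.sales", "dev.sales", deep=False, replace=True))
```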

by NathanSundarara • Contributor

07-24-2023 5:52:22 PM

1151 Views
0 replies
0 kudos

Sample code to read json from service bus queue in Azure

Hi, I'm looking for a sample notebook or code snippet to read messages from Azure Service Bus queues. I looked for documentation but couldn't find anything. Any help would be appreciated. First we are thinking of batch mode before we move on to Streaming. P...

azure

deltalivetable

messagequeue

servicebus

Servicebus azure deltalivetables message queue

by Navashakthi • New Contributor

06-13-2023 6:59:42 AM

1414 Views
4 replies
2 kudos

Resolved! Community Edition Sign-up Issue

Hi, I'm trying to sign up for Community Edition for learning purposes. The sign-up page has an issue with selecting a country: the select dropdown doesn't work, and the continue option redirects to the same page. I couldn't complete the sign-up. Kindly help!

Latest Reply

amitdas2k6
New Contributor II

07-24-2023 3:35:59 PM

2 kudos

For me, it always displays the error below even though I entered the correct username and password. My username: amit.das2k16@gmail.com. "Invalid email address or password. Note: Emails/usernames are case-sensitive"

by Shadowsong27 • New Contributor III

11-15-2021 5:38:33 PM

10613 Views
14 replies
4 kudos

Resolved! Mongo Spark Connector 3.0.1 seems not working with Databricks-Connect, but works fine in Databricks Cloud

On latest DB-Connect==9.1.3 and dbr == 9.1, retrieving data from mongo using Maven coordinate of Mongo Spark Connector: org.mongodb.spark:mongo-spark-connector_2.12:3.0.1 - https://docs.mongodb.com/spark-connector/current/ - working fine previously t...

Latest Reply

mehdi3x
New Contributor II

07-24-2023 6:53:45 AM

4 kudos

Hi everyone, the solution for me was to replace spark.read.format("mongo") with spark.read.format("mongodb"). My Spark version is 3.3.2 and my MongoDB version is 6.0.6.
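
One way to capture this fix is to pick the data source name from the connector's major version. This is a sketch under an assumption the reply does not state: that the rename corresponds to MongoDB Spark Connector 10.x registering the source as "mongodb", where 3.x and earlier used "mongo". The reply gives the Spark and MongoDB server versions but not the connector version, and the helper name is hypothetical.

```python
def mongo_format_name(connector_version: str) -> str:
    """Return the Spark data source name for a MongoDB Spark Connector version.

    Assumption: connector 10.x and later use "mongodb"; older releases use "mongo".
    """
    major = int(connector_version.split(".")[0])
    return "mongodb" if major >= 10 else "mongo"


# In a notebook this might be used as (connector version is illustrative):
# df = spark.read.format(mongo_format_name("10.2.0")).load()
```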

by erigaud • Honored Contributor

07-20-2023 11:45:10 PM

1550 Views
4 replies
1 kudos

Deploying existing queries and alerts to other workspaces

I have several queries and associated alerts in a workspace, and I would like to be able to deploy them to another workspace, for example a higher environment. Since both queries and objects are not supported in repos, what is the way to go to easi...

Latest Reply

Anonymous
Not applicable

07-22-2023 9:22:03 PM

1 kudos

Hi @erigaud We haven't heard from you since the last response from @btafur, and I was checking back to see if her suggestions helped you. Or else, if you have any solution, please share it with the community, as it can be helpful to others. Also, ...

by dprutean • New Contributor III

07-24-2023 2:12:15 AM

535 Views
0 replies
0 kudos

JDBC DatabaseMetaData.getCatalogs()

Calling DatabaseMetaData.getCatalogs() returns 'spark_catalogs' instead of 'hive_metastore' when connected to a traditional Databricks cluster that does not have the uc_catalog tag. Please check this.
