Data Engineering

Forum Posts

by Anonymous • Not applicable

04-26-2022 5:32:19 PM

2583 Views
0 replies
0 kudos

How can we pass parameters from Data Factory to a Databricks job that uses a notebook?

How can I pass parameters from Data Factory to a Databricks job that runs a notebook? I already know how to pass parameters from Data Factory to a Databricks notebook when ADF calls the notebook directly.
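
One possible approach, sketched under assumptions rather than confirmed in this thread: a job can be started through the Databricks Jobs REST API (`POST /api/2.1/jobs/run-now`) from any HTTP client, for example an ADF Web activity, with `job_id` and `notebook_params` in the request body. The notebook can then read the values through widgets (`dbutils.widgets.get`). The helper below only builds that request body; the job ID `123` and the parameter names are hypothetical.

```python
import json


def build_run_now_body(job_id, notebook_params):
    """Build the JSON body for POST /api/2.1/jobs/run-now."""
    # Notebook parameters are string key/value pairs, so coerce both sides.
    params = {str(k): str(v) for k, v in notebook_params.items()}
    return json.dumps({"job_id": int(job_id), "notebook_params": params}, sort_keys=True)


# Hypothetical job ID and parameter names, for illustration only.
body = build_run_now_body(123, {"run_date": "2022-04-26", "env": "dev"})
print(body)
```

In ADF, the parameter values would typically come from pipeline parameters or expressions rather than literals.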

by Emiel_Smeenk • New Contributor III

04-12-2022 10:15:02 AM

18503 Views
5 replies
8 kudos

Resolved! Databricks Runtime 10.4 LTS - AnalysisException: No such struct field id in 0, 1 after upgrading

Hello, We are working to migrate to Databricks Runtime 10.4 LTS from 9.1 LTS, but we're running into weird behavioral issues. Our existing code works up until runtime 10.3, and in 10.4 it stopped working. Problem: We have a nested json file that we are fl...

Latest Reply

Emiel_Smeenk
New Contributor III

04-20-2022 8:59:22 AM

8 kudos

It seems like the issue was miraculously resolved. I did not make any code changes but everything is now running as expected. Maybe the latest runtime 10.4 fix released on April 19th also resolved this issue unintentionally.

4 More Replies

by nickg • New Contributor III

03-30-2022 11:16:10 AM

6977 Views
6 replies
3 kudos

Resolved! I am looking to use the pivot function with Spark SQL (not Python)

Hello. I am trying to use the Pivot function for email addresses. This is what I have so far:
Select fname, lname, awUniqueID, Email1, Email2
From xxxxxxxx
Pivot ( count(Email) as Test For Email In (1 as Email1, 2 as Email2) )
I get everyth...

Latest Reply

nickg
New Contributor III

03-30-2022 11:46:23 AM

3 kudos

Source data:
fname lname awUniqueID Email
John Smith 22 jsmith@gmail.com
JODI JONES 22 jsmith@live.com
Desired output:
fname lname awUniqueID Em...
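
A note on why the `PIVOT` in the question may not produce the expected columns, offered as a sketch: `For Email In (1 as Email1, 2 as Email2)` compares the values of `Email` with the literals 1 and 2, and no address equals either one. A common workaround is to number each ID's addresses first (in Spark SQL, typically with `row_number()` over a window) and pivot on that number instead. The plain-Python sketch below illustrates the idea only. Grouping by `awUniqueID` and keeping the first name seen are assumptions, since the desired output is cut off above.

```python
from collections import defaultdict


def pivot_emails(rows, max_emails=2):
    """Spread each ID's emails into Email1..EmailN columns."""
    grouped = defaultdict(list)
    names = {}
    for fname, lname, uid, email in rows:
        names.setdefault(uid, (fname, lname))  # assumption: keep the first name seen
        grouped[uid].append(email)
    result = []
    for uid, emails in grouped.items():
        fname, lname = names[uid]
        # The position in the list plays the role of the row number to pivot on.
        padded = (emails + [None] * max_emails)[:max_emails]
        record = {"fname": fname, "lname": lname, "awUniqueID": uid}
        for i, email in enumerate(padded, start=1):
            record[f"Email{i}"] = email
        result.append(record)
    return result


rows = [
    ("John", "Smith", 22, "jsmith@gmail.com"),
    ("JODI", "JONES", 22, "jsmith@live.com"),
]
print(pivot_emails(rows))
```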

5 More Replies

by HarshaK • New Contributor III

04-07-2022 1:43:53 AM

20069 Views
4 replies
6 kudos

Resolved! Partition By () on Delta Files

Hi All, I am trying to use partitionBy() on a Delta file in PySpark with this command: df.write.format("delta").mode("overwrite").option("overwriteSchema","true").partitionBy("Partition Column").save("Partition file path") -- It doesn't seem to w...

Latest Reply

Anonymous
Not applicable

04-26-2022 9:36:14 AM

6 kudos

Hey @Harsha kriplani, hope you are well. Thank you for posting here. It is awesome that you found a solution. Would you like to mark Hubert's answer as best? It would be really helpful for the other members too. Cheers!

3 More Replies

by Manoj • Contributor II

04-04-2022 6:58:02 AM

2664 Views
2 replies
5 kudos

Resolved! Does a job cluster help jobs that are competing for resources on an all-purpose cluster?

Hi Team, does a job cluster help jobs that are competing for resources on an all-purpose cluster? The drawback I see with a job cluster is that a cluster is created every time the job starts; it takes 2 minutes to spin up the cluster. Instead of...

Latest Reply

Hubert-Dudek
Databricks MVP

04-04-2022 7:07:52 AM

5 kudos

@Manoj Kumar Rayalla, you can set the job to use an all-purpose cluster (that feature was added recently). You can also use a pool to reduce the job cluster start time (though it can still take a moment).

1 More Replies

by LorenRD • Contributor

04-04-2022 5:02:54 AM

14346 Views
9 replies
13 kudos

Resolved! Is it possible to connect Databricks SQL with AWS Redshift DB?

I would like to know if it's possible to connect Databricks SQL module with not just internal Metastore DB and tables from Data Science and Engineering module but also connect with an AWS Redshift DB to do queries and create alerts.

Latest Reply

LorenRD
Contributor

04-26-2022 6:20:33 AM

13 kudos

Hi @Kaniz Fatma I contacted Customer support explaining this issue, they told me that this feature is not implemented yet but it's in the roadmap with no ETA. It would be great if you ping me back when it's possible to access Redshift tables from SQ...

8 More Replies

by gazzyjuruj • Contributor II

04-18-2022 10:17:20 PM

2625 Views
1 replies
4 kudos

Resolved! databricks_error_message: time out placing nodes

Hi, today I'm receiving this error: databricks_error_message: Timed out while placing nodes. What should be done to fix it?

Latest Reply

User16764241763
Databricks Employee

04-26-2022 5:37:35 AM

4 kudos

Hello @Ghazanfar Uruj This can happen for a bunch of reasons. Could you please file a support case with details, if the issue still persists?

by AmanSehgal • Honored Contributor III

04-20-2022 9:44:53 PM

5178 Views
2 replies
10 kudos

Migrating data from delta lake to RDS MySQL and ElasticSearch

There are mechanisms (like DMS) to get data from RDS to Delta Lake and store the data in Parquet format, but is it possible to do the reverse of this in AWS? I want to send data from the data lake to MySQL RDS tables in batch mode. And the next step is to send th...

Latest Reply

AmanSehgal
Honored Contributor III

04-26-2022 5:05:28 AM

10 kudos

@Kaniz Fatma and @Hubert Dudek - writing to MySQL RDS is relatively simple. I'm looking for ways to export data into Elasticsearch.

1 More Replies

by kjoth • Contributor II

04-26-2022 4:20:29 AM

1801 Views
0 replies
0 kudos

Unmanaged Table - Newly added data directories are not reflected in the table We have created an unmanaged table with partitions on the dbfs location, using SQL. After creating the tables, via SQL we are running

We have created an unmanaged table with partitions on the dbfs location, using SQL. Example: %sql CREATE TABLE EnterpriseDailyTrafficSummarytest(EnterpriseID String, ServiceLocationID String, ReportDate String) USING parquet PARTITIONED BY(ReportDate)...
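
A hedged sketch of one common remedy: for a partitioned table defined over existing files, directories added later are usually not visible until their partitions are registered in the metastore, for example with `MSCK REPAIR TABLE` or `ALTER TABLE ... ADD PARTITION`. Whether that is the cause here is an assumption, since the post is cut off. The helper below turns Hive-style directory names such as `ReportDate=2022-04-25` (hypothetical values) into `ALTER TABLE` statements; it only builds the statements and does not run them.

```python
def add_partition_statements(table, dir_names, partition_col="ReportDate"):
    """Build ALTER TABLE statements for Hive-style partition directories."""
    statements = []
    prefix = partition_col + "="
    for name in sorted(dir_names):
        # Skip anything that is not a <partition_col>=<value> directory.
        if not name.startswith(prefix):
            continue
        value = name[len(prefix):].replace("'", "\\'")
        statements.append(
            f"ALTER TABLE {table} ADD IF NOT EXISTS "
            f"PARTITION ({partition_col}='{value}')"
        )
    return statements


# Hypothetical directory listing under the table location.
dirs = ["ReportDate=2022-04-26", "_SUCCESS", "ReportDate=2022-04-25"]
for stmt in add_partition_statements("EnterpriseDailyTrafficSummarytest", dirs):
    print(stmt)
```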

by Daba • New Contributor III

04-10-2022 8:22:33 AM

7232 Views
3 replies
5 kudos

Resolved! DLT+AutoLoader: where are the schema and checkpoint hidden?

Hi, I'm exploring DLT with the AutoLoader feature and wondering where the schema and checkpoint are hidden. I want to wipe these two to reset/reinitialize the flow, but unlike the "regular" AutoLoader, the checkpoint and schema folders are not there. Thank...

Latest Reply

Hubert-Dudek
Databricks MVP

04-11-2022 12:23:33 PM

5 kudos

@Alexander Plepler, there is a storage option in the pipeline settings: a path to a DBFS directory for storing checkpoints and tables created by the pipeline. Additionally, the Delta table is registered in the metastore, so the table schema is there.

2 More Replies

by Karthik1 • New Contributor II

04-22-2022 4:11:32 AM

3793 Views
2 replies
0 kudos

Datab

Hi Databricks Team, I took the Databricks certified spark developer-Python exam on 15th April’22 and passed with a score of 81.66%, but I still haven't received my certificate or badge. I need to submit my badge to my employer. Kindly release my badge. T...

by gazzyjuruj • Contributor II

04-23-2022 6:01:54 AM

3825 Views
1 replies
2 kudos

Client.UserInitiatedShutdown

Hi, everything seemed fine until just now; I've been getting a Client.UserInitiatedShutdown error. What is wrong? Thanks.

by sannycse • New Contributor II

03-30-2022 11:51:13 AM

2792 Views
2 replies
3 kudos

Resolved! display password as shown in example using spark scala

Table has the following columns: First_Name, Last_Name, Department_Id, Contact_No, Hire_Date. Display the employee First_name, count of characters in the first name, password. Password should be first 4 letters of first name in lower case and the date and ...

Latest Reply

Hubert-Dudek
Databricks MVP

03-30-2022 12:15:23 PM

3 kudos

@SANJEEV BANDRU, SELECT CONCAT(substring(First_Name, 0, 2), substring(Hire_Date, 0, 2), substring(Hire_Date, 3, 2)) as password FROM table; If Hire_date is a timestamp, you may need to add date_format().

1 More Replies

by Syed1 • New Contributor III

03-24-2022 5:23:49 PM

28787 Views
7 replies
13 kudos

Resolved! Python Graph not showing

Hi, I have run this code:
import matplotlib.pyplot as plt
import numpy as np
plt.style.use('bmh')
%matplotlib inline
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
p = plt.scatter(x, y)
display command r...

Latest Reply

User16725394280
Databricks Employee

04-08-2022 4:43:50 AM

13 kudos

@Syed Ubaid I tried with 7.3 LTS and it works fine.

6 More Replies

by Anonymous • Not applicable

03-09-2022 8:41:51 PM

12559 Views
12 replies
13 kudos

Resolved! Not able to run notebook even when cluster is running and databases/tables are not visible in "data" tab.

We are using Databricks in AWS. I am not able to run a notebook even when the cluster is running. When I run a cell, it returns "cancel". When I check the event log for the cluster, it shows "Metastore is down". Couldn't see any databases or tables that i...

Latest Reply

User16753725182
Databricks Employee

04-13-2022 5:34:41 AM

13 kudos

This means the network is fine, but something in the Spark config is amiss. What are the DBR version and the Hive version? Please check if you are using a compatible version. If you don't specify any version, it will take 1.3 and you wouldn't have to us...

11 More Replies

Databricks Community
