Data Engineering

Forum Posts

by Trung • Contributor

08-01-2022 12:18:20 AM

1790 Views
5 replies
5 kudos

Job fail due to Access Denied

Please help me solve this problem: my Databricks account cannot start the job, whether I trigger it manually or on a schedule, although I can run the script itself without error.

Latest Reply

Vivian_Wilfred
Honored Contributor

08-01-2022 3:41:05 PM

5 kudos

Hi @trung nguyen, please check whether the necessary instance profile is attached to the job cluster. You are definitely missing something related to IAM.
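One way to act on this suggestion is to check whether the job's cluster definition names an instance profile at all. The sketch below assumes an AWS workspace and a job settings dictionary with `job_clusters` entries shaped like the Jobs API `new_cluster` object. The helper name `has_instance_profile`, the job name, and the ARN are hypothetical.

```python
def has_instance_profile(job_settings: dict) -> bool:
    """Return True only if every job cluster names an instance profile."""
    clusters = [
        entry.get("new_cluster", {})
        for entry in job_settings.get("job_clusters", [])
    ]
    if not clusters:
        # No job cluster defined, so there is nothing carrying a profile.
        return False
    return all(
        bool(cluster.get("aws_attributes", {}).get("instance_profile_arn"))
        for cluster in clusters
    )


# Hypothetical job settings; the ARN below is a placeholder, not a real profile.
example_settings = {
    "name": "example-job",
    "job_clusters": [
        {
            "job_cluster_key": "main",
            "new_cluster": {
                "aws_attributes": {
                    "instance_profile_arn": (
                        "arn:aws:iam::123456789012:instance-profile/example-profile"
                    )
                }
            },
        }
    ],
}

print(has_instance_profile(example_settings))
```

A job that runs interactively but fails when scheduled may simply be using a different cluster definition, so comparing the two definitions is a reasonable first step.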

by ekdz__ • New Contributor III

06-28-2022 3:06:53 AM

3083 Views
5 replies
10 kudos

Is there any way to save the notebook in the "Results Only" view?

Hi! I'm looking for a way to save a notebook in HTML format in the "Results Only" view (without the executed code). Is it possible to do that? Thank you

Latest Reply

Kaniz
Community Manager

06-28-2022 12:42:51 PM

10 kudos

Hi @Eryk Kądziela, We haven’t heard from you since the last response from @hubert, and I was checking back to see if his suggestions helped you. Otherwise, if you have a solution, please share it with the community, as it can be helpful to others.

by Anonymous • Not applicable

06-14-2022 5:23:26 AM

801 Views
4 replies
4 kudos

Invalid shard address

I'm running PySpark through databricks-connect and getting this error: `ERROR SparkClientManager: Fail to get the SparkClient java.util.concurrent.ExecutionException: com.databricks.service.SparkServiceConnectionException: Invalid shard address:`...

Latest Reply

Prabakar
Esteemed Contributor III

06-15-2022 11:43:36 PM

4 kudos

Hi @Marco Wong, was this working before and failing now? Are you behind a VPN or firewall? If so, can you check by disabling it? Can you enable traces in Wireshark and collect a dump to check whether traffic is reaching the workspace? Check if you can get curl wor...
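Alongside these network checks, it may be worth ruling out a malformed workspace address. The sketch below assumes the address should be a full `https` URL with a host name. That is an assumption about what the client expects rather than a documented rule, and `looks_like_workspace_url` is a hypothetical helper.

```python
from urllib.parse import urlparse


def looks_like_workspace_url(address: str) -> bool:
    """Check that the address has an https scheme and a host name."""
    parsed = urlparse(address.strip())
    return parsed.scheme == "https" and bool(parsed.hostname)


# A bare host name without a scheme fails the check; a full URL passes.
print(looks_like_workspace_url("example.cloud.databricks.com"))
print(looks_like_workspace_url("https://example.cloud.databricks.com"))
```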

by krsimons • New Contributor

06-29-2022 3:06:25 PM

571 Views
3 replies
0 kudos

How do I automate my Databricks script?

Latest Reply

Vartika
Moderator

08-30-2022 8:33:25 AM

0 kudos

Hey there @Kayla Simons, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell...

by fshimamoto • New Contributor III

06-29-2022 12:59:26 PM

1092 Views
3 replies
2 kudos

What are the best practices for schema drift using Delta Live tables, in a scenario where the main source is a no sql database and we have a lot of ch...

What are the best practices for schema drift using Delta Live tables, in a scenario where the main source is a no sql database and we have a lot of changes in the schema?

Latest Reply

Vartika
Moderator

08-30-2022 8:28:51 AM

2 kudos

Hey there @Fernando Martin Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from...

by RajeshRK • Contributor

06-29-2022 7:10:18 AM

3196 Views
10 replies
4 kudos

Databricks job fails while creating table.

Hi Team, The Databricks job fails with the error below while creating an EXTERNAL table. com.simba.spark.jdbc41.internal.apache.http.wire - Error running query: MetaException(message:Got exception: org.apache.hadoop.fs.azure.AzureException com.microsoft.a...

Latest Reply

Vartika
Moderator

08-30-2022 8:06:59 AM

4 kudos

Hey there @Rajesh Kannan R Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from...

by 159312 • New Contributor III

08-16-2022 12:36:48 PM

1576 Views
1 reply
0 kudos

How to set pipelines.incompatibleViewCheck.enabled = false

I tried to load a static table as the source for a streaming DLT pipeline. I understand this is not optimal, but it provides the best path toward eventually having a full streaming pipeline. When I do, I get the following error: pyspark.sql.utils.Analysis...

Latest Reply

kfoster
Contributor

08-30-2022 4:57:25 AM

0 kudos

When you declare a table or view, you can pass something like this: @dlt.table( spark_conf={ "pipelines.incompatibleViewCheck.enabled": "false" } )
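Laid out as a full declaration, the suggestion above might look like the sketch below. It assumes the code runs inside a Delta Live Tables pipeline, where the `dlt` module and the `spark` session are provided by the runtime. The function name `static_source` and the table name are hypothetical, and the import guard is only there so the file also loads outside a pipeline.

```python
# Spark configuration applied to one table declaration.
# Note that the value is the string "false", not a Python boolean.
SPARK_CONF = {"pipelines.incompatibleViewCheck.enabled": "false"}

try:
    import dlt  # importable only inside a Delta Live Tables pipeline
except ImportError:
    dlt = None  # allows the file to load outside a pipeline


def static_source():
    # `spark` is supplied by the pipeline runtime; the table name is hypothetical.
    return spark.read.table("hypothetical_schema.static_table")


if dlt is not None:
    # Equivalent to decorating static_source with @dlt.table(spark_conf=SPARK_CONF).
    static_source = dlt.table(spark_conf=SPARK_CONF)(static_source)
```

Inside a pipeline the usual form is simply the `@dlt.table(spark_conf=...)` decorator placed directly above the function.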

by PrebenOlsen • New Contributor III

08-29-2022 8:49:25 AM

974 Views
1 reply
1 kudos

Resolved! Why does @dlt.table from a table give different results than from a view?

I have some data in silver that I read in as a view using the __apply_changes function on. I create a table based on this, and I then want to create my gold-table, after doing a .groupBy() and .pivot(). The transformations I do in the gold-table aren...

Latest Reply

PrebenOlsen
New Contributor III

08-30-2022 3:44:51 AM

1 kudos

I have found a temporary workaround. The .pivot("columnName") call should automatically pick up all the values it can find, but for some reason it does not. I need to specify the values, using .pivot("group_name", "group0", "group1", "group2"...) ...
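For reference, PySpark's `GroupedData.pivot` takes the pivot column followed by an optional list of values, so the explicit values are normally passed as one list argument. The sketch below assumes a DataFrame with a `group_name` column as in the post. The `id` and `value` columns and the helper name are hypothetical.

```python
def pivot_with_explicit_values(df, pivot_values):
    """Group, pivot on an explicit list of values, and sum.

    `df` is expected to be a PySpark DataFrame.
    """
    return (
        df.groupBy("id")                    # hypothetical grouping column
        .pivot("group_name", pivot_values)  # explicit values, passed as a list
        .sum("value")                       # hypothetical numeric column
    )


# Usage, given an existing DataFrame `silver_df`:
# gold_df = pivot_with_explicit_values(silver_df, ["group0", "group1", "group2"])
```

Listing the values also fixes the output columns up front, so the result no longer depends on which values happen to be present when the pivot runs.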

by SatishGunjal • New Contributor

07-19-2021 1:42:50 AM

1756 Views
1 reply
0 kudos

Data frame takes long time to print count of rows

We have a PySpark DataFrame with 50 million records. We can display records from it, but it takes around 10 minutes to print the shape of the DataFrame. We aim to use this data for modelling that will take some numerical features based on the final data fra...

Latest Reply

Hanna08
New Contributor II

08-30-2022 2:44:20 AM

0 kudos

Thanks for the detailed explanation. For those who want to have constant technical support for their work processes, I recommend JD Young. Here is only the latest information about the update in the world of information technology solutions and cyber...

by Cano • New Contributor III

08-29-2022 3:06:44 PM

730 Views
1 reply
2 kudos

How to add notebook to my Databricks jdbc url?

How do I add a notebook to the JDBC URL in order to run queries externally? jdbc:databricks://dbc-a1b2345c-d6e7.cloud.databricks.com:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/1234567890123456/1234-567890-reef123;AuthMech=3;...

Latest Reply

ranged_coop
Valued Contributor II

08-30-2022 1:01:51 AM

2 kudos

Not sure if it is possible. Alternatively, you could try adding your notebook to a job and then triggering that job via the Jobs API. Please refer to the link below: Jobs API 2.1 | Databricks on AWS
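A minimal sketch of that alternative, assuming the notebook is already wrapped in a job and you hold a personal access token. It builds a `POST` request to the Jobs API 2.1 `run-now` endpoint using only the standard library. The host, token, and job ID below are placeholders, and `build_run_now_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_run_now_request(host: str, token: str, job_id: int) -> urllib.request.Request:
    """Build (but do not send) a request that triggers an existing job."""
    body = json.dumps({"job_id": job_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.1/jobs/run-now",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; substitute your own workspace URL, token, and job ID.
request = build_run_now_request(
    "https://example.cloud.databricks.com", "hypothetical-token", 123
)

# Sending is left commented out so the sketch has no side effects:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```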

by Anonymous • Not applicable

04-21-2022 2:27:24 AM

2055 Views
7 replies
5 kudos

COPY INTO command can not recognise MAP type value from JSON file

I have a Delta table in Databricks with a single column of type map<string, string>, and I have a data file in JSON format created by Hive 3 for the table with the column of the same type. I want to load data from the file into the Databricks table using COPY IN...

Latest Reply

jose_gonzalez
Moderator

08-29-2022 3:57:22 PM

5 kudos

Hi Alexey, Just a friendly follow-up. Did any of the responses help you resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.

by tomnguyen_195 • New Contributor III

06-30-2022 9:31:29 AM

1055 Views
2 replies
3 kudos

DLT maintenance job got stuck

Hi all, We recently noticed a huge cost associated with our Databricks account. The main culprit is DLT's pipeline maintenance job, which was auto-scheduled to run but got stuck and cost us thousands of DBUs. Do you know what would be the r...

Latest Reply

tinai_long
New Contributor III

06-30-2022 9:41:49 AM

3 kudos

Same question. These maintenance jobs run for the maximum timeout (168 hours) and do not terminate. Example below:

by Sha_1890 • New Contributor III

07-25-2022 5:39:20 AM

2535 Views
8 replies
0 kudos

How to execute a series of stored procedures using scala in databricks

I am working on a migration project where the lift-and-shift method is used to migrate a SQL Server DB from on-prem to Azure Cloud. There are a lot of stored procedures used for integration on-prem. Now here in on-prem, to process the XML file and exec...

Latest Reply

Noopur_Nigam
Valued Contributor II

08-29-2022 10:12:40 AM

0 kudos

Hi @shafana Roohi Jahubar I hope that your queries are answered. Please let me know if you have more doubts.

by TMNGB • New Contributor II

07-26-2022 7:01:33 AM

1013 Views
2 replies
2 kudos

Resolved! Does MERGE statement preserve order? (Slowly Changing Dimensions)

In the case of processing multiple source files - with potentially, one or multiple entity versions per source - being able to use the MERGE statement whilst preserving the order is key to ensure the correct versioning of entity versions (aka, versio...

Latest Reply

Noopur_Nigam
Valued Contributor II

08-29-2022 10:09:22 AM

2 kudos

Hi @Guilherme Banhudo, I hope that werners' answer helped you. Please let me know if you still have doubts or queries.

by 77796 • New Contributor II

08-23-2022 9:21:15 AM

2701 Views
4 replies
0 kudos

Databricks S3A error - java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory not found

We are getting the error below on runtimes 10.x and 11.x when writing to S3 via the saveAsNewAPIHadoopFile function. The same jobs run fine on runtimes 9.x and 7.x. The difference between 9.x and 10.x is that the former has Hadoop 2.7 bindings with sp...

Latest Reply

77796
New Contributor II

08-28-2022 9:25:08 AM

0 kudos

We have resolved this issue by using the s3 scheme instead of s3a, i.e. pairRDD.saveAsNewAPIHadoopFile("s3://bucket/testout.dat",
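If many jobs build their output paths in one place, the scheme swap described above can be applied with a small helper. This is only a sketch of the string rewrite, not of the write call itself, and `to_s3_scheme` is a hypothetical name.

```python
def to_s3_scheme(path: str) -> str:
    """Rewrite an s3a:// path to s3://, leaving any other path unchanged."""
    prefix = "s3a://"
    if path.startswith(prefix):
        return "s3://" + path[len(prefix):]
    return path


print(to_s3_scheme("s3a://bucket/testout.dat"))
```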
