Data Engineering

Forum Posts

Sorted by:

by ossinova • Contributor II

11-29-2022 4:27:04 AM

1591 Views
1 replies
0 kudos

Schedule reload of system.information_schema for external tables in platform

Probably not feasible, but is there a way to update (via STORED PROCEDURE, FUNCTION or SQL query) the information schema of all external tables within Databricks. Last updated that I can see was when I converted the tables to Unity. From my understa...

Data Engineering

1591 Views
1 replies
0 kudos

11-29-2022 4:27:04 AM

View Replies

Latest Reply

Own
Contributor

11-29-2022 1:28:35 PM

0 kudos

You can try optimize and cache with the internal tables such as schema tables to fetch updated information.

0 kudos

11-29-2022 1:28:35 PM

by rammy • Contributor III

11-21-2022 10:17:34 PM

3603 Views
3 replies
11 kudos

How would i retrieve data JSON data with namespaces using spark SQL?

File.json from the below code contains huge JSON data with each key containing namespace prefix(This JSON file converted from the XML file).I could able to retrieve if JSON does not contain namespaces but what could be the approach to retrieve record...

Data Engineering

3603 Views
3 replies
11 kudos

11-21-2022 10:17:34 PM

View Replies

Latest Reply

SS2
Valued Contributor

11-29-2022 12:45:22 PM

11 kudos

I case of struct you can use (.) For extracting the value

11 kudos

11-29-2022 12:45:22 PM

2 More Replies

by allan-silva • New Contributor III

11-23-2022 5:42:11 AM

4351 Views
3 replies
4 kudos

Resolved! Can't create database - UnsupportedFileSystemException No FileSystem for scheme "dbfs"

I'm following a class "DE 3.1 - Databases and Tables on Databricks", but it is not possible create databases due to "AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got exception: org.apache.hadoop.fs.Unsupp...

Data Engineering

4351 Views
3 replies
4 kudos

11-23-2022 5:42:11 AM

View Replies

Latest Reply

allan-silva
New Contributor III

11-29-2022 12:15:06 PM

4 kudos

A colleague from my work figured out the problem: the cluster being used wasn't configured to use DBFS when running notebooks.

4 kudos

11-29-2022 12:15:06 PM

2 More Replies

by Shiva_Dsouz • New Contributor II

11-24-2022 6:22:13 AM

2091 Views
1 replies
1 kudos

How to get spark streaming metrics like input rows, processed rows and batch duration to Prometheus for monitoring

I have been reading this article https://www.databricks.com/session_na20/native-support-of-prometheus-monitoring-in-apache-spark-3-0 and it has been mentioned that we can get the spark streaming metrics like input rows, processing rate and batch dura...

Data Engineering

2091 Views
1 replies
1 kudos

11-24-2022 6:22:13 AM

View Replies

Latest Reply

SS2
Valued Contributor

11-29-2022 11:37:37 AM

1 kudos

I think you can use spark UI to see deep level details

1 kudos

11-29-2022 11:37:37 AM

by andalo • New Contributor II

11-24-2022 2:45:53 PM

3009 Views
3 replies
2 kudos

Databricks cluster failure

do you help me with the next error?MessageCluster terminated. Reason: Azure Vm Extension FailureHelpInstance bootstrap failed.Failure message: Cloud Provider Failure. Azure VM Extension stuck on transitioning state. Please try again later.VM extensio...

Data Engineering

3009 Views
3 replies
2 kudos

11-24-2022 2:45:53 PM

View Replies

Latest Reply

SS2
Valued Contributor

11-29-2022 11:36:44 AM

2 kudos

You can restart the cluster and check once.

2 kudos

11-29-2022 11:36:44 AM

2 More Replies

by mickniz • Contributor

11-24-2022 11:30:37 PM

4438 Views
6 replies
10 kudos

What is the best way to take care of Drop and Rename a column in Schema evaluation.

I would need some suggestion from DataBricks Folks. As per documentation in Schema Evaluation for Drop and Rename Data is overwritten. Does it means we loose data (because I read data is not deleted but kind of staged). Is it possible to query old da...

Data Engineering

4438 Views
6 replies
10 kudos

11-24-2022 11:30:37 PM

View Replies

Latest Reply

SS2
Valued Contributor

11-29-2022 11:31:31 AM

10 kudos

Overwritte option will overwritte your data. If you want to change column name then you can first alter the delta table as per your need then you can append new data as well. So both problems you can resolve

10 kudos

11-29-2022 11:31:31 AM

5 More Replies

by Shirley • New Contributor III

11-25-2022 9:12:10 AM

9650 Views
12 replies
8 kudos

Cluster terminated after 120 mins and cannot restart

Last night the cluster was working properly, but this morning the cluster was terminated automatically and cannot be restarted. Got an error message under sparkUI: Could not find data to load UI for driver 5526297689623955253 in cluster 1125-062259-i...

Data Engineering

9650 Views
12 replies
8 kudos

11-25-2022 9:12:10 AM

View Replies

Latest Reply

SS2
Valued Contributor

11-29-2022 11:24:15 AM

8 kudos

Then can use.

8 kudos

11-29-2022 11:24:15 AM

11 More Replies

by kodvakare • New Contributor III

11-25-2022 7:25:18 AM

6222 Views
5 replies
9 kudos

Resolved! How to write same code in different locations in the DB notebook?

The old version of the notebook had this feature, where you could Ctrl+click on different positions in a notebook cell to bring the cursor there, and type to update the code in both the positions like in JupyterLab. The newer version is awesome but s...

Old DataBricks version, update in multiple positions like Jupyter IDE

Data Engineering

6222 Views
5 replies
9 kudos

11-25-2022 7:25:18 AM

View Replies

Latest Reply

SS2
Valued Contributor

11-29-2022 11:13:11 AM

9 kudos

Alt+click is working fine

9 kudos

11-29-2022 11:13:11 AM

4 More Replies

by SindhujaRaghupa • New Contributor II

03-21-2018 9:44:37 AM

9573 Views
2 replies
1 kudos

Job aborted due to stage failure: Task 0 in stage 4.0 failed 1 times, most recent failure: Lost task 0.0 in stage 4.0 (TID 4, localhost, executor driver): java.lang.NullPointerException

I have uploaded a csv file which have well formatted data and I was trying to use display(questions) where questions=spark.read.option("header","true").csv("/FileStore/tables/Questions.csv")This is throwing an error as follows:SparkException: Job abo...

Data Engineering

9573 Views
2 replies
1 kudos

03-21-2018 9:44:37 AM

View Replies

Latest Reply

SS2
Valued Contributor

11-29-2022 11:05:45 AM

1 kudos

You can use inferschema

1 kudos

11-29-2022 11:05:45 AM

1 More Replies

by pkgltn • New Contributor III

11-29-2022 9:40:03 AM

1123 Views
0 replies
0 kudos

Mounting a Azure Storage Account path on Databricks

Hi,I have a Databricks instance and I mounted the Azure Storage Account. When I run the following command, the output is ExecutionError: An error occurred while calling o1168.ls.: shaded.databricks.org.apache.hadoop.fs.azure.AzureException: java.util...

Data Engineering

1123 Views
0 replies
0 kudos

11-29-2022 9:40:03 AM

by Muthumk255 • New Contributor

11-28-2022 10:18:34 PM

2252 Views
2 replies
0 kudos

Cannot sign in at databricks partner-academy portal

Hi thereI have used my company email to register an account for databricks learning .databricks.com a while back.Now what I need to do is create an account with partner-academy.databricks.com using my company email too.However when I register at part...

Data Engineering

2252 Views
2 replies
0 kudos

11-28-2022 10:18:34 PM

View Replies

Latest Reply

Harshjot
Contributor III

11-29-2022 9:02:57 AM

0 kudos

Hi @Muthukrishnan Balasubramanian I got the same issue a while back what worked for me is registering using personal account on partner academy then later I changed my email to my work email. Not sure if it's the best way to sort the issue.

0 kudos

11-29-2022 9:02:57 AM

1 More Replies

by db-avengers2rul • Contributor II

11-29-2022 5:18:27 AM

2542 Views
1 replies
0 kudos

Resolved! zip file not able to import in workspace

Dear Team,Using the community edition when i tried to import a zip file it is always throwing some error

Data Engineering

2542 Views
1 replies
0 kudos

11-29-2022 5:18:27 AM

View Replies

Latest Reply

db-avengers2rul
Contributor II

11-29-2022 5:19:52 AM

0 kudos

Please refer to the error in the attachment my question is this restriction is only for community edition ? or also for premium account ?

0 kudos

11-29-2022 5:19:52 AM

by yang • New Contributor II

11-29-2022 4:16:03 AM

1639 Views
1 replies
2 kudos

Resolved! Error in DE 4.1 - DLT UI Walkthrough (from Data Engineering with Databricks v3 course)

I am working on Data Engineering with Databricks v3 course. In notebook DE 4.1 - DLT UI Walkthrough, I countered an error in cmd 11: DA.validate_pipeline_config(pipeline_language)The error message is: AssertionError: Expected the parameter "suite" to...

Data Engineering

1639 Views
1 replies
2 kudos

11-29-2022 4:16:03 AM

View Replies

Latest Reply

Anonymous
Not applicable

11-29-2022 5:15:19 AM

2 kudos

The DA validate function is just to check that you named the pipeline correctly, set up the correct number of workers, 0, and other configurations. The name and directory aren't crucial to the learning process. The goal is to get familiar with the ...

2 kudos

11-29-2022 5:15:19 AM

by eimis_pacheco • Contributor

02-16-2022 8:49:28 PM

1385 Views
1 replies
1 kudos

How to remove more than 4 byte characters using pyspark in databricks?

Hi community,We have the need of removing more than 4 byte characters using pyspark in databricks since these are not supported by amazon Redshift. Does someone know how can I accomplish this?Thank you very much in advanceRegards

Data Engineering

1385 Views
1 replies
1 kudos

02-16-2022 8:49:28 PM

View Replies

Latest Reply

Shalabh007
Honored Contributor

11-29-2022 3:00:30 AM

1 kudos

assuming you are having a string type column in pyspark dataframe, one possible way could beidentify total number of characters for each value in column (say identify no of bytes taken by each character (say b)use substring() function to select first...

1 kudos

11-29-2022 3:00:30 AM

by Ullsokk • New Contributor III

11-29-2022 1:06:06 AM

3587 Views
1 replies
5 kudos

How do I import a notebook from workspaces to repos?

I have a few notebooks in workspaces that I created before linking repo to my git. I have tried importing them from the repo (databricks repo). The only two options are a local file from my pc or a url. The url for a notebook does not work. Do I need...

Data Engineering

3587 Views
1 replies
5 kudos

11-29-2022 1:06:06 AM

View Replies

Latest Reply

Geeta1
Valued Contributor

11-29-2022 2:55:19 AM

5 kudos

Hi @Stian Arntsen , when you click on the down arrow beside your notebook name (in your workspace), you will have a option called 'clone'. You can use it to clone your notebook from your workspace to repos. Hope it helps!

5 kudos

11-29-2022 2:55:19 AM

User

Count

1611

768

345

286

252

Databricks Community

Forum Posts

Schedule reload of system.information_schema for external tables in platform

How would i retrieve data JSON data with namespaces using spark SQL?

Resolved! Can't create database - UnsupportedFileSystemException No FileSystem for scheme "dbfs"

How to get spark streaming metrics like input rows, processed rows and batch duration to Prometheus for monitoring

Databricks cluster failure

What is the best way to take care of Drop and Rename a column in Schema evaluation.

Cluster terminated after 120 mins and cannot restart

Resolved! How to write same code in different locations in the DB notebook?

Job aborted due to stage failure: Task 0 in stage 4.0 failed 1 times, most recent failure: Lost task 0.0 in stage 4.0 (TID 4, localhost, executor driver): java.lang.NullPointerException

Mounting a Azure Storage Account path on Databricks

Cannot sign in at databricks partner-academy portal

Resolved! zip file not able to import in workspace

Resolved! Error in DE 4.1 - DLT UI Walkthrough (from Data Engineering with Databricks v3 course)

How to remove more than 4 byte characters using pyspark in databricks?

How do I import a notebook from workspaces to repos?

Join Us as a Local Community Builder!

Databricks data engineer associate exam

How to delete/empty notebook output

Databricks Cluster Policies

toml file syntax highlighting

Materialized Views Compute