Data Engineering

Forum Posts

by Jessevds • New Contributor II

08-15-2022 2:56:16 AM

1555 Views
2 replies
2 kudos

Create dropdown-list in Markdown

In the first cell of my notebooks, I record a changelog of all changes made to the notebook in Markdown. However, as this list grows longer and longer, I want to implement a dropdown list. Is there any way to do this in Markdown in Databricks? For t...

Latest Reply

Vidula
Honored Contributor

09-11-2022 12:18:31 AM

2 kudos

Hi @Jesse vd S, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

by mghildiy • New Contributor

08-13-2022 10:25:03 PM

616 Views
2 replies
0 kudos

A basic DataFrame transformation query

I want to know how DataFrame transformations work. Suppose I have a DataFrame instance df1. I apply some operation on it, say a filter. As every operation gives a new DataFrame, let's say we now have df2. So we have two DataFrame instances now, df1 ...

Latest Reply

Vidula
Honored Contributor

09-11-2022 12:14:22 AM

0 kudos

Hi @mghildiy, does @Kaniz Fatma's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

by Erik • Valued Contributor II

08-12-2022 11:30:43 AM

1140 Views
2 replies
2 kudos

Resolved! Where is Databricks Tunnel (and is Databricks connect cool again?)

Two related questions: 1: There have been several mentions in this forum of "Databricks Tunnel", which should allow us to connect from our local IDE to a remote Databricks cluster and develop locally. The rumors said early 2022; is there some...

Latest Reply

Vidula
Honored Contributor

09-11-2022 12:10:58 AM

2 kudos

Hi there @Erik Parmann, does @Youssef Mrini's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

by dimsh • Contributor

08-12-2022 4:24:07 AM

795 Views
3 replies
1 kudos

Any plans to provide Databricks SQL / Alerts API

Hi, Databricks! You are my favorite Big Data tool, but I've recently faced an issue I didn't expect to have. For our agriculture customers, we're trying to use Databricks SQL Platform to keep our data accurate all day. We use Alerts to validate our d...

Latest Reply

Vidula
Honored Contributor

09-10-2022 10:19:28 PM

1 kudos

Hi @Dmytro Imshenetskyi, does @Hubert Dudek's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

by Cosimo_F_ • Contributor

09-06-2022 1:32:54 PM

588 Views
4 replies
0 kudos

Autoloader schema inference

Hello, is it possible to turn off schema inference with AutoLoader? Thank you, Cosimo

Latest Reply

Kaniz
Community Manager

09-08-2022 8:14:44 AM

0 kudos

Hi @Cosimo Felline, Please check out this documentation.

by satishnamu • New Contributor II

08-28-2022 9:21:23 PM

492 Views
2 replies
0 kudos

Cannot sign in at databricks partner-academy portal

Hi there, I used my company email to register an account for customer-academy.databricks.com a while back. Now I need to create an account with partner-academy.databricks.com using my company email too. However, when I register at partner-...

Latest Reply

Kaniz
Community Manager

09-03-2022 2:01:22 PM

0 kudos

Hi @Satish Namu, thank you for reaching out! Let us look into this for you, and we'll follow up with an update.

by ronaldolopes • New Contributor

08-29-2022 10:37:44 AM

1358 Views
2 replies
1 kudos

Resolved! Error deleting a table

I'm trying to delete a table that was created from a CSV file. Because the file has since been deleted, the deletion fails with an error. I'm new to Databricks and I don't know how to fix this. Any help?

Latest Reply

AmanSehgal
Honored Contributor III

08-30-2022 9:06:56 PM

1 kudos

To delete the table, it looks for the underlying Delta log file, and because that file doesn't exist, it throws that error. Just drop the table:

DROP TABLE <table_name>

by RohitKulkarni • Contributor

09-01-2022 11:40:33 PM

1072 Views
4 replies
7 kudos

Resolved! Azure data bricks delta tables .Issue

Hello Team,

I have written a Spark SQL query in Databricks:

DROP TABLE IF EXISTS Salesforce.Location;
CREATE EXTERNAL TABLE Salesforce.Location (
  Id STRING,
  OwnerId STRING,
  IsDeleted bigint,
  Name STRING,
  CurrencyIsoCode STRING,
  CreatedDate bigint,
  CreatedById ...

Latest Reply

AmanSehgal
Honored Contributor III

09-02-2022 12:09:29 AM

7 kudos

You need to provide one of the following values for 'data_source': TEXT, AVRO, CSV, JSON, PARQUET, ORC, DELTA.

For example: USING PARQUET

If you skip the USING clause, the default data source is DELTA.

https://docs.databricks.com/sql/language-manual/sql-ref-syntax-ddl-create-t...
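The rule in this reply can be sketched as a small statement builder. This is only an illustration: the function name `create_table_sql` and the `(name, type)` column format are hypothetical, and the set of allowed sources is copied from the list in the reply rather than from any API. In a notebook, the resulting string would typically be passed to `spark.sql`, which is not shown here.

```python
# Data sources named in the reply above; this list is copied from the post,
# not queried from Databricks.
ALLOWED_DATA_SOURCES = {"TEXT", "AVRO", "CSV", "JSON", "PARQUET", "ORC", "DELTA"}


def create_table_sql(table, columns, data_source=None):
    """Build a CREATE TABLE statement with an optional USING clause.

    table: table name, e.g. "Salesforce.Location"
    columns: list of (column_name, column_type) pairs
    data_source: one of ALLOWED_DATA_SOURCES, or None to omit USING
    """
    column_list = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    statement = f"CREATE TABLE {table} ({column_list})"
    if data_source is not None:
        source = data_source.upper()
        if source not in ALLOWED_DATA_SOURCES:
            raise ValueError(f"Unsupported data_source: {data_source}")
        statement += f" USING {source}"
    return statement


print(create_table_sql("Salesforce.Location", [("Id", "STRING")], "parquet"))
```

Leaving `data_source` as `None` omits the USING clause, in which case, per the reply, the default data source applies.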

by LearnerShahid • New Contributor II

09-02-2022 1:31:35 AM

2861 Views
6 replies
4 kudos

Resolved! Lesson 6.1 of Data Engineering. Error when reading stream - java.lang.UnsupportedOperationException: com.databricks.backend.daemon.data.client.DBFSV1.resolvePathOnPhysicalStorage(path: Path)

The function below executes fine:

def autoload_to_table(data_source, source_format, table_name, checkpoint_directory):
    query = (spark.readStream
                  .format("cloudFiles")
                  .option("cloudFiles.format", source_format)
                  .option("cloudFile...

I have verified that source data exists.

Latest Reply

Anonymous
Not applicable

09-05-2022 4:48:52 AM

4 kudos

Autoloader is not supported on community edition.

by BenLambert • Contributor

09-06-2022 12:48:17 AM

1048 Views
2 replies
2 kudos

Resolved! Delta Live Tables not inferring table schema properly.

I have a Delta Live Tables pipeline that is loading and transforming data. The problem is that the schema inferred by DLT does not match the actual schema of the table. The table is generated via a groupby.pivot operation as follows:...

Latest Reply

BenLambert
Contributor

09-06-2022 1:44:58 AM

2 kudos

I was able to get around this by specifying the table schema in the table decorator.

by mick042 • New Contributor III

06-14-2022 6:04:25 AM

642 Views
1 reply
0 kudos

Optimal approach when using external script/executable for processing data

I need to process a number of files where I manipulate file text using an external executable that operates on stdin/stdout. I am quite new to Spark. What I am attempting is to use rdd.pipe, as in the following:

exe_path = " /usr/local/bin/external...

Latest Reply

User16753725469
Contributor II

09-09-2022 8:21:26 AM

0 kudos

Hi @Michael Lennon, can you please elaborate on the use case and on what the external app at exe_path is doing?
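For readers unfamiliar with the stdin/stdout pattern the question relies on, here is a rough local sketch in plain Python, without Spark. `RDD.pipe` feeds the elements of each partition to the external process as lines on stdin and returns the lines the process writes to stdout; the helper `pipe_lines` below imitates that for a single list of lines. The helper name and the stand-in command (a tiny Python program that upper-cases its input) are hypothetical and do not represent the poster's executable.

```python
import subprocess
import sys


def pipe_lines(lines, command):
    """Write lines to the command's stdin and return the lines it prints."""
    result = subprocess.run(
        command,
        input="".join(line + "\n" for line in lines),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command exits non-zero
    )
    return result.stdout.splitlines()


# Stand-in for the external executable: upper-cases whatever arrives on stdin.
UPPERCASE_COMMAND = [
    sys.executable,
    "-c",
    "import sys\nfor line in sys.stdin:\n    sys.stdout.write(line.upper())",
]

print(pipe_lines(["first line", "second line"], UPPERCASE_COMMAND))
```

Testing the executable this way on a small sample can help separate problems in the tool itself from problems in how Spark invokes it.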

by Leszek • Contributor

08-29-2022 1:34:17 AM

1392 Views
3 replies
4 kudos

How to set up partitions on the streaming Delta Table?

Let's assume that we have three streaming Delta tables: Bronze, Silver, and Gold. My aim is to add partitioning to the Silver table (for example by Date). As a result, the Gold table will throw an error that the source table has been updated, and I would need to set 'ignoreC...

Latest Reply

Kaniz
Community Manager

09-03-2022 1:37:07 PM

4 kudos

Hi @Leszek, we haven't heard from you since the last response from @Werner Stinckens, and I was checking back to see if his suggestions helped you. Otherwise, if you have a solution, please share it with the community, as it can be helpful to o...

by Mohit_m • Valued Contributor II

09-09-2022 4:44:45 AM

3590 Views
1 reply
1 kudos

Resolved! Databricks - python error when importing wheel distribution package

In previous days, all notebooks containing 'import anomalydetection' worked just fine. There was no change in any configuration of the cluster, notebook, or our imported library. However, recently the notebooks crashed with the error below. The same happens also...

Latest Reply

Mohit_m
Valued Contributor II

09-09-2022 4:45:37 AM

1 kudos

Solution: This is caused by the latest version of the protobuf library. Try downgrading the library, which should solve the issue:

pip install protobuf==3.20.*

protobuf library versions that work: 3.20.1. If that does not work, try 3.18.1.
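Before and after the downgrade, it can help to confirm which protobuf version the environment actually has. The following is a minimal sketch using only the standard library; the function names are hypothetical, and the "known good" rule simply encodes the versions named in this reply (the 3.20 series and 3.18.1), not an official compatibility list.

```python
from importlib import metadata


def is_version_from_reply(version):
    """Return True for the 3.20.x series or exactly 3.18.1 (versions named above)."""
    parts = version.split(".")
    return parts[:2] == ["3", "20"] or version == "3.18.1"


def installed_protobuf_version():
    """Return the installed protobuf version string, or None if it is not installed."""
    try:
        return metadata.version("protobuf")
    except metadata.PackageNotFoundError:
        return None


current = installed_protobuf_version()
if current is None:
    print("protobuf is not installed in this environment")
elif is_version_from_reply(current):
    print(f"protobuf {current} matches a version named in the reply")
else:
    print(f"protobuf {current} differs from the suggested versions; consider the downgrade")
```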

by noimeta • Contributor II

09-09-2022 3:42:43 AM

921 Views
0 replies
0 kudos

How to use Terraform to add Git provider credentials to a workspace in order to use service principal for CI/CD

Hi, I'm very new to Terraform. Currently, I'm trying to automate the service principal setup process using Terraform. Following this example, I successfully created a service principal and an access token. However, when I tried adding databricks_git_cr...

by jakubk • Contributor

09-07-2022 9:52:25 PM

1956 Views
2 replies
0 kudos

spark.read.parquet() - how to check for file lock before reading? (azure)

I have some Python code which takes Parquet files from an ADLS Gen2 location and merges them into Delta tables (run as a workflow job on a schedule). I have a try/catch wrapper around this so that any files that fail get moved into a failed folder using dbu...

Latest Reply

jakubk
Contributor

09-08-2022 7:33:57 PM

0 kudos

That's the problem: it's not being locked (or fs.mv() isn't checking/honoring the lock). The upload process/tool is a third-party external tool. I can see via the upload tool that the file upload is 'in progress'. I can also see the 0 byte destination file...
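Since the reply reports that no lock is visible and that the destination file shows up as 0 bytes while the upload is in progress, one possible workaround is a size-based heuristic instead of a lock check. The sketch below is an assumption-laden illustration, not a confirmed fix from this thread: the function name `looks_fully_uploaded` is hypothetical, it assumes the file is reachable through a local-style path, and a file whose size is stable for a short interval may still be incomplete if the uploader stalls.

```python
import os
import time


def looks_fully_uploaded(path, wait_seconds=1.0):
    """Heuristic: the file is non-empty and its size is unchanged after a short wait.

    This does not detect a lock; it only skips files that are still empty
    or still growing.
    """
    first_size = os.path.getsize(path)
    if first_size == 0:
        return False  # matches the 0 byte placeholder described in the reply
    time.sleep(wait_seconds)
    return os.path.getsize(path) == first_size
```

A scheduled job could call this for each candidate file and leave anything that returns False for the next run, rather than moving it to the failed folder.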
