Data Engineering

Forum Posts

by aladda • Databricks Employee

06-18-2021 12:11:14 PM

4844 Views
2 replies
3 kudos

Resolved! What does the run_as_repl parameter do in the databricks Jobs API - https://docs.databricks.com/dev-tools/api/latest/examples.html#jobs-api-examples

Latest Reply

User16255483290
Databricks Employee

03-02-2022 7:16:17 AM

3 kudos

@Anand Ladda @André Monteiro From comments in the code: Indicates whether the task should be run in a REPL. This value must be true to run on an existing cluster. Please ignore the 'run_as_repl' parameter; it will be removed from the public docs as it i...

1 More Replies

by al_joe • Contributor

02-05-2022 12:19:12 AM

5474 Views
2 replies
0 kudos

Where / how does DBFS store files?

I tried to use %fs head to print the contents of a CSV file used in a training: %fs head "/mnt/path/file.csv" but got an error saying it cannot head a directory!? Then I did %fs ls on the same CSV file and got a list of 4 files under a directory named as a ...

Latest Reply

User16753725182
Databricks Employee

03-02-2022 6:56:09 AM

0 kudos

Hi @Al Jo, are you still seeing the error while printing the contents of the CSV file?

1 More Replies
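
A possible explanation for the "cannot head a directory" error, offered as an assumption since the thread is truncated here: Spark writers save a dataset as a directory of part files (plus marker files such as _SUCCESS), so a path ending in .csv can be a directory. The sketch below uses plain Python on a temporary local directory to show the idea; the helper name part_files and the file names are hypothetical, and on Databricks you would list the path with %fs ls instead.

```python
import tempfile
from pathlib import Path


def part_files(path):
    """Return the data files behind `path`.

    If `path` is a regular file, it is returned as-is. If it is a directory
    (as Spark-written output usually is), return its part-* files, sorted.
    """
    p = Path(path)
    if p.is_file():
        return [p]
    return sorted(f for f in p.iterdir() if f.is_file() and f.name.startswith("part-"))


# Build a hypothetical Spark-style output directory named like a CSV file.
with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp) / "file.csv"
    out_dir.mkdir()
    for name in ("_SUCCESS", "part-00000.csv", "part-00001.csv"):
        (out_dir / name).write_text("id,value\n1,a\n")

    names = [f.name for f in part_files(out_dir)]
    print(names)
```

Once the part files are known, the head command can be pointed at one of them rather than at the directory.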

by Infosys_128139 • New Contributor III

02-16-2022 7:30:02 AM

9897 Views
8 replies
5 kudos

Resolved! Unable to start SQL End point in DATABRICKS SQL

Hello All, I am trying to use Databricks SQL, but the SQL endpoint is not starting. It stays in the starting state for a long time and then the session expires. Please note that the default SQL endpoint is also not starting. I am using...

Latest Reply

BilalAslamDbrx
Databricks Employee

03-01-2022 5:27:16 AM

5 kudos

@AMZ DUD did you get this working? With a quota of 500, 43 mins is a long time for a cluster to launch. Perhaps something in the account isn’t set up correctly. Can you please email me your workspace ID at bilal dot aslam at databricks dot ...

7 More Replies

by BasavarajAngadi • Contributor

02-18-2022 8:08:39 AM

6888 Views
6 replies
6 kudos

Resolved! Hi Experts I want to know the difference between connecting any BI Tool to Spark SQL and Databricks SQL end point?

It's all about spinning up the Spark cluster, and both the Spark SQL API and Databricks do the same operation, so what difference does it make to BI tools?

Latest Reply

Anonymous
Not applicable

02-22-2022 2:29:00 PM

6 kudos

Thanks @Bilal Aslam and @Aman Sehgal for jumping in! @Basavaraj Angadi I want to make sure you got your question(s) answered! Will you let us know? Don't forget, you can select any reply as the "best answer"!

5 More Replies

by hare • New Contributor III

02-25-2022 9:52:19 PM

4187 Views
4 replies
8 kudos

Azure DBR - Have to load list of json files but the column has special character.(ex: {"hydra:xxxx": {"hydra:value":"yyyy", "hydra:value1":"zzzzz"}

Azure DBR - I have to load a list of JSON files into a DataFrame and then from the DataFrame into a Databricks table, but the column has special characters and I am getting the error below. Both the column (key) and the value (as a JSON record) have special characters in the JSON file. # Can...

Latest Reply

Hubert-Dudek
Databricks MVP

02-26-2022 10:05:47 AM

8 kudos

The best approach is to define the schema manually. There is a nice article from a person who had exactly the same problem: https://towardsdev.com/create-a-spark-hive-meta-store-table-using-nested-json-with-invalid-field-names-505f215eb5bf

3 More Replies
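
As a complement to defining the schema manually, and not the method from the linked article, here is a minimal sketch of a different technique: renaming the offending keys before the records reach a table. It is plain Python on a parsed record shaped like the one in the post title; the helper name sanitize_keys and the choice of "_" as the replacement character are assumptions for illustration.

```python
import json


def sanitize_keys(value, bad=":", good="_"):
    """Recursively replace `bad` with `good` in every dictionary key.

    Lists are walked element by element; scalar values are returned unchanged.
    """
    if isinstance(value, dict):
        return {k.replace(bad, good): sanitize_keys(v, bad, good) for k, v in value.items()}
    if isinstance(value, list):
        return [sanitize_keys(v, bad, good) for v in value]
    return value


# A record shaped like the example in the post title (values are placeholders).
record = json.loads('{"hydra:xxxx": {"hydra:value": "yyyy", "hydra:value1": "zzzzz"}}')
clean = sanitize_keys(record)
print(json.dumps(clean))
```

Only keys are rewritten, so a value that happens to contain a colon is left untouched.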

by alejandrofm • Valued Contributor

02-28-2022 5:55:14 AM

1898 Views
1 replies
2 kudos

Resolved! Feature request for spark performance tuning

Hi, I don't think there's a place to see this; please correct me if I'm wrong. Right now, to see performance tuning tips I have to go to the Spark UI, then to the SQL view, and at the top I can see performance alerts that help me know if I need to apply a Spark config, co...

Latest Reply

-werners-
Esteemed Contributor III

02-28-2022 6:28:12 AM

2 kudos

I think that can be requested at ideas.databricks.com

by LukaszJ • Contributor III

02-23-2022 3:32:41 AM

13945 Views
4 replies
0 kudos

Resolved! Send UPDATE from Databricks to Azure SQL DataBase

Hello. I want to know how to do an UPDATE on an Azure SQL Database from Azure Databricks using PySpark. I know how to run a SELECT query and turn it into a DataFrame, but how do I send some data back (as an UPDATE on rows)? I want to use built-in PySpark instead...

Latest Reply

-werners-
Esteemed Contributor III

02-23-2022 11:42:44 PM

0 kudos

This is discussed on Stack Overflow. As you can see, for Azure Synapse there is a way, but for a plain SQL database you will have to use some kind of driver like ODBC/JDBC.

3 More Replies

by Atacama • New Contributor II

02-24-2022 10:18:52 AM

4697 Views
3 replies
1 kudos

Resolved! Does Databricks encrypt Spark's spilled data?

Latest Reply

-werners-
Esteemed Contributor III

02-24-2022 11:16:15 PM

1 kudos

The spilled data is written to some object store on the cloud provider. I believe all of them apply encryption by default. Of course it is up to you (or your colleagues) to restrict access to the storage.

2 More Replies

by KKo • Contributor III

01-24-2022 2:00:16 PM

7218 Views
3 replies
4 kudos

Resolved! Reading multiple parquet files from same _delta_log under a path

I have a path that contains a _delta_log and 3 snappy.parquet files. I am trying to read all those .parquet files using spark.read.format('delta').load(path), but I always get data from the same single file. Can't I read from all these files? If s...

Latest Reply

KKo
Contributor III

02-24-2022 5:42:16 AM

4 kudos

@Werner Stinckens Thanks for the reply and explanation, that was helpful to understand the delta feature.

2 More Replies
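
The explanation referred to above is not shown in this listing, so the following is an assumption about the behaviour in question: a Delta read only returns the files that the transaction log marks as live in the latest version, and older parquet files can stay on disk after being logically removed. The sketch replays simplified commit entries in plain Python. The file names are hypothetical, and real logs also carry other actions (commitInfo, metaData, protocol) and checkpoints, which are ignored here.

```python
import json


def active_files(commits):
    """Replay Delta-style commits in order and return the live data files.

    Each commit is a string of newline-delimited JSON actions. An "add"
    action makes a file part of the table; a "remove" action drops it from
    the current version, even though the file may still exist on disk.
    """
    live = set()
    for commit in commits:
        for line in commit.splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return sorted(live)


# Three overwrites: three parquet files on disk, but only the last one is live.
commits = [
    '{"add": {"path": "part-0000.snappy.parquet"}}',
    '{"remove": {"path": "part-0000.snappy.parquet"}}\n'
    '{"add": {"path": "part-0001.snappy.parquet"}}',
    '{"remove": {"path": "part-0001.snappy.parquet"}}\n'
    '{"add": {"path": "part-0002.snappy.parquet"}}',
]
current = active_files(commits)
print(current)
```

Under this assumption, the other two files belong to earlier table versions rather than to the current one.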

by SailajaB • Databricks Partner

02-23-2022 8:49:14 AM

5955 Views
5 replies
4 kudos

Resolved! when and otherwise issue

Hi, in our scenario we are reading JSON files as input, and they contain a nested structure. A few of the attributes are arrays of structs, where we need to rename the nested fields, so we created a new structure and cast to it. We are facing below pr...

Latest Reply

AmanSehgal
Honored Contributor III

02-23-2022 7:53:02 PM

4 kudos

Can you provide the structure that you're using? Also, a more elaborate sample input and output.

4 More Replies

by SailajaB • Databricks Partner

02-22-2022 11:14:56 PM

22454 Views
4 replies
4 kudos

Unable to mount the blob storage account as soft delete got enabled

Hi Team, when we try to mount or access the blob storage where soft delete is enabled, it fails with the error below: org.apache.hadoop.fs.FileAlreadyExistsException: Operation failed: "This endpoint does not support BlobStorageEvents or So...

Latest Reply

-werners-
Esteemed Contributor III

02-22-2022 11:49:27 PM

4 kudos

Jeez, I was planning on enabling soft delete on our ADLS Gen2, but I think I will wait a while after reading this.

3 More Replies

by JoeWMP • New Contributor III

02-22-2022 12:34:51 PM

7153 Views
5 replies
1 kudos

Resolved! Databricks Job IDs increasing in massive sequence gaps

Has anyone seen something like this before? Today around midnight, our Job IDs started increasing in increments of quadrillions. Was this a new change to how Job IDs are generated?

Latest Reply

JoeWMP
New Contributor III

02-24-2022 9:26:51 AM

1 kudos

Thank you Ravi! Glad that this confirms my understanding

4 More Replies

by Edmondo • New Contributor III

01-17-2022 11:05:12 AM

9816 Views
7 replies
3 kudos

Resolved! Limiting parallelism when external APIs are invoked (i.e. mlflow)

We are applying a groupby operation to a pyspark.sql.DataFrame and then training a single model per group, tracked with MLflow. We see intermittent failures because the MLflow server replies with a 429 because of too many requests/s. What are the best pract...

Latest Reply

Edmondo
New Contributor III

02-24-2022 7:57:12 AM

3 kudos

To me it's already resolved through professional services. The question I do have is how useful is this community if people with the right background aren't here, and if it takes a month to get a no-answer.

6 More Replies
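
The thread does not say what professional services recommended, so the following is only a general technique for 429 responses and not the resolution reached there: retry the call with exponential backoff and jitter so that parallel workers spread out their requests. Everything named here is hypothetical (TooManyRequests stands in for whatever error the client raises on HTTP 429, and flaky_log simulates a rate-limited call); no MLflow API is used.

```python
import random
import time


class TooManyRequests(Exception):
    """Hypothetical stand-in for the error a client raises on HTTP 429."""


def call_with_backoff(fn, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call `fn`, retrying on TooManyRequests with exponential backoff.

    The wait before retry k (counting from 0) is base_delay * 2**k plus a
    random jitter of up to the same amount. The last failure is re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except TooManyRequests:
            if attempt == max_attempts - 1:
                raise
            delay = base_delay * 2 ** attempt
            sleep(delay + random.uniform(0, delay))


# Simulated rate-limited call: fails twice, then succeeds.
calls = {"n": 0}


def flaky_log():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TooManyRequests()
    return "logged"


waits = []  # record the waits instead of actually sleeping
result = call_with_backoff(flaky_log, sleep=waits.append)
print(result, calls["n"], len(waits))
```

Backoff only smooths bursts; if the sustained request rate is too high, the number of groups trained concurrently would also have to be reduced.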
