Data Engineering

Forum Posts

by Gapy • New Contributor II

10-31-2021 6:43:09 AM

1013 Views
1 reply
1 kudos

Auto Loader Schema-Inference and Evolution for parquet files

Dear all, will Auto Loader also support schema inference and evolution for Parquet files, and if so, when? At this point it is only supported for JSON and CSV, if I am not mistaken. Thanks and regards, Gapy

Latest Reply

Sandeep
Contributor III

11-10-2021 7:46:01 AM

1 kudos

@Gasper Zerak, this will be available in the near future (DBR 10.3 or later). Unfortunately, we don't have an SLA at this moment.

by Maverick1 • Valued Contributor II

09-09-2021 11:05:22 PM

2116 Views
10 replies
9 kudos

Resolved! Lineage between model and source code breaks on movement of source notebook. How to rectify it?

If a registered model is linked with a notebook, the lineage breaks if you move the notebook to a different path or even pull/upload a new version of the notebook. This is not good because when someone doing its development/testin...

Latest Reply

sean_owen
Honored Contributor II

09-15-2021 4:50:53 PM

9 kudos

I also cannot reproduce this with these exact steps (I think). After moving the notebook and moving it back, the link to it (and the link to the revision) still works as expected. You are using the MLflow built into Databricks, right?

by RantoB • Valued Contributor

11-10-2021 12:19:18 AM

5244 Views
3 replies
3 kudos

Resolved! %run file not found

Hi, I was using the following command to import variables and functions from another notebook:
%run ./utils
For some reason it is not working any more and gives me this message:
Exception: File `'./utils.py'` not found.
utils.py is still at the same pl...

Latest Reply

RantoB
Valued Contributor

11-10-2021 12:54:00 AM

3 kudos

I finally solved my issue. In the same cell I had written a comment starting with #, and it was not working because of that...
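To illustrate the fix described in this reply: Databricks expects the %run command to sit in a cell by itself, so a comment line sharing the cell can break it. The sketch below is only a text check under that assumption; `run_cell_is_valid` is a hypothetical helper, not a Databricks API.

```python
def run_cell_is_valid(cell_source: str) -> bool:
    """Return True if the cell holds exactly one non-blank line and it is a %run command."""
    # Ignore blank lines; anything else (including a "#" comment) counts as extra content.
    lines = [line.strip() for line in cell_source.splitlines() if line.strip()]
    return len(lines) == 1 and lines[0].startswith("%run ")


# The cell layout reported as failing: a comment shares the cell with %run.
broken_cell = "# import shared helpers\n%run ./utils"

# The working layout: %run is the only content of the cell.
fixed_cell = "%run ./utils"

print(run_cell_is_valid(broken_cell))
print(run_cell_is_valid(fixed_cell))
```

In practice, the same effect is achieved by moving the comment into a separate cell above the %run cell.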

by Mohit_m • Valued Contributor II

11-08-2021 8:36:07 AM

687 Views
2 replies
4 kudos

Enabling of Task Orchestration feature in Jobs via API as well

Databricks supports the ability to orchestrate multiple tasks within a job. You must enable this feature in the admin console. Once enabled, this feature cannot be disabled. To enable orch...

Latest Reply

Kaniz
Community Manager

11-09-2021 7:43:42 PM

4 kudos

Thank you @Mohit Miglani for this amazing post.

by FemiAnthony • New Contributor III

11-05-2021 2:45:52 AM

1782 Views
5 replies
3 kudos

Resolved! Location of customer_t1 dataset

Can anyone tell me how I can access the customer_t1 dataset that is referenced in the book "Delta Lake - The Definitive Guide"? I am trying to follow along with one of the examples.

Latest Reply

Hubert-Dudek
Esteemed Contributor III

11-05-2021 7:41:44 AM

3 kudos

Some files are shown here: https://github.com/vinijaiswal/delta_time_travel/blob/main/Delta%20Time%20Travel.ipynb but it is quite strange that there is no source in the repository. I think the only way is to write to Vini Jaiswal on GitHub.

by Sandesh87 • New Contributor III

10-13-2021 10:33:04 PM

1879 Views
2 replies
2 kudos

Resolved! dbutils.secrets.get- NoSuchElementException: None.get

The code below executes a 'get' API method to retrieve objects from S3 and write them to the data lake. The problem arises when I use dbutils.secrets.get to get the keys required to establish the connection to S3.
my_dataframe.rdd.foreachPartition(partition ...

Latest Reply

Kaniz
Community Manager

11-09-2021 5:11:11 AM

2 kudos

Hi @Sandesh Puligundla, you just need to move the following two lines:
val AccessKey = dbutils.secrets.get(scope = "ADB_Scope", key = "AccessKey-ID")
val SecretKey = dbutils.secrets.get(scope = "ADB_Scope", key = "AccessKey-Secret")
outside of the fo...
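The pattern suggested in this reply can be sketched in Python as well: read the secrets once on the driver, then pass them as plain strings into the function that runs per partition. This is a sketch under assumptions, not the poster's code. `make_partition_handler` and the `object_key` column are hypothetical names, and the local stand-in at the bottom replaces Spark so the example runs anywhere.

```python
from typing import Callable, Dict, Iterable, Iterator


def make_partition_handler(
    access_key: str, secret_key: str
) -> Callable[[Iterable[Dict[str, str]]], Iterator[Dict[str, object]]]:
    """Build a per-partition function that closes over plain strings only."""

    def handle_partition(rows: Iterable[Dict[str, str]]) -> Iterator[Dict[str, object]]:
        # A real handler would open one S3 connection per partition here using
        # access_key and secret_key. This stand-in only records what it would fetch.
        for row in rows:
            yield {
                "object": row["object_key"],  # hypothetical column name
                "access_key_id": access_key,
                "has_secret": bool(secret_key),
            }

    return handle_partition


# On Databricks, the secrets would be read on the driver before the action runs:
#   access_key = dbutils.secrets.get(scope="ADB_Scope", key="AccessKey-ID")
#   secret_key = dbutils.secrets.get(scope="ADB_Scope", key="AccessKey-Secret")
#   handler = make_partition_handler(access_key, secret_key)
#   my_dataframe.rdd.foreachPartition(lambda rows: list(handler(rows)))

# Local stand-in with placeholder credentials and two fake rows.
handler = make_partition_handler("AKIAEXAMPLE", "not-a-real-secret")
results = list(handler([{"object_key": "a.json"}, {"object_key": "b.json"}]))
print(results)
```

The point of the design is that nothing driver-only (such as dbutils) is referenced inside the per-partition function; only the already-resolved strings travel to the executors.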

by Mohit_m • Valued Contributor II

11-09-2021 2:37:51 AM

381 Views
1 reply
2 kudos

docs.databricks.com

Are the EBS volumes used by Databricks clusters encrypted, especially the root volumes?

Latest Reply

Mohit_m
Valued Contributor II

11-09-2021 3:41:22 AM

2 kudos

Yes, these EBS volumes are encrypted. Earlier, root volume encryption was not supported, but it has recently been enabled as well (since April 2021). Please find more details on the docs page below: https://docs.databricks.com/clusters/configure.html#e...

by FemiAnthony • New Contributor III

11-05-2021 2:39:42 AM

2764 Views
6 replies
5 kudos

Resolved! /dbfs is empty

Why does /dbfs seem to be empty in my Databricks cluster? If I run %sh ls /dbfs, I get no output. I am looking for the databricks-datasets subdirectory, but I can't find it under /dbfs.

Latest Reply

FemiAnthony
New Contributor III

11-09-2021 1:09:39 AM

5 kudos

Thanks @Prabakar Ammeappin

by Sandesh87 • New Contributor III

10-13-2021 8:05:19 PM

1067 Views
3 replies
2 kudos

Resolved! log error to cosmos db

Objective: Retrieve objects from an S3 bucket using a 'get' API call, write the retrieved object to Azure Data Lake and, in case of errors like 404s (object not found), write the error message to Cosmos DB. "my_dataframe" consists of a column (s3Obje...

Latest Reply

User16763506477
Contributor III

10-26-2021 8:57:56 PM

2 kudos

Hi @Sandesh Puligundla, the issue is that you are using the Spark context inside foreachPartition. You can create a DataFrame only on the Spark driver. A few Stack Overflow references: https://stackoverflow.com/questions/46964250/nullpointerexception-creatin...

by SEOCO • New Contributor II

11-03-2021 4:01:47 AM

1691 Views
3 replies
3 kudos

Passing parameters from DevOps Pipeline/release to Databricks Notebook

Hi, this is all a bit new to me. Does anybody have any idea how to pass a parameter to a Databricks notebook? I have a DevOps pipeline/release that moves my Databricks notebooks towards the QA and Production environments. The only problem I am facing is th...

Latest Reply

Anonymous
Not applicable

11-08-2021 8:35:09 AM

3 kudos

@Mario Walle - If @Hubert Dudek's answer solved the issue, would you be happy to mark his answer as best so that it will be more visible to other members?

by Jeff_Luecht • New Contributor II

11-07-2021 10:32:49 AM

2160 Views
2 replies
4 kudos

Resolved! Restarting existing community edition clusters

I am new to Databricks Community Edition. I was following the quickstart guide and running through basic cluster management - create, start, etc. For whatever reason, I cannot restart an existing cluster. There is nothing in the cluster event logs or...

Latest Reply

Kaniz
Community Manager

11-07-2021 11:55:18 PM

4 kudos

Hi @Jeff Luecht, please refresh the event logs. You can clone your cluster. As a Community Edition user, your cluster will automatically terminate after an idle period of two hours. For more configuration options, please upgrade your Databricks subscri...

by Erik • Valued Contributor II

11-05-2021 11:45:45 AM

1639 Views
6 replies
2 kudos

Resolved! Does Z-ordering speed up reading of a single file?

Situation: we have one partition per date, and it just so happens that each partition ends up (after optimize) as *a single* 128 MB file. We partition on date and Z-order on userid, and our query is something like "find max value of column A where useri...

Latest Reply

-werners-
Esteemed Contributor III

11-07-2021 10:52:51 PM

2 kudos

Z-Order will make sure that in case you need to read multiple files, these files are co-located. For a single file this does not matter, as a single file is always local to itself. If you are certain that your Spark program will only read a single file,...

by Alexander1 • New Contributor III

10-08-2021 2:19:28 AM

1580 Views
5 replies
0 kudos

Databricks JDBC 2.6.19 documentation

I am searching for the Databricks JDBC 2.6.19 documentation page. I can find release notes from the Databricks download page (https://databricks-bi-artifacts.s3.us-east-2.amazonaws.com/simbaspark-drivers/jdbc/2.6.19/docs/release-notes.txt) but on Mag...

Latest Reply

Alexander1
New Contributor III

11-07-2021 10:38:59 PM

0 kudos

By the way, what is still wild is that the Simba docs say 2.6.16 only supports up to Spark 2.4, while the release notes on the Databricks download page say 2.6.16 already supports Spark 3.0. Strange that we get contradicting info from the actual driv...

by Daniel • New Contributor III

11-03-2021 2:44:57 PM

3793 Views
11 replies
6 kudos

Resolved! Autocomplete parentheses, quotation marks, brackets and square stopped working

Hello guys, can someone help me? Autocomplete for parentheses, quotation marks, brackets and square brackets stopped working in Python notebooks. How can I fix this? Daniel

Latest Reply

Daniel
New Contributor III

11-05-2021 6:09:01 AM

6 kudos

@Piper Wilson, @Werner Stinckens, thank you so much for your help. I followed @Jose Gonzalez's suggestion and now it works.
