Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

AP
by New Contributor III
  • 4073 Views
  • 5 replies
  • 3 kudos

Resolved! AutoOptimize, OPTIMIZE command and VACUUM command: order, production implementation best practices

So Databricks gives us a great toolkit in the form of the optimization and vacuum commands. But in terms of operationalizing them, I am really confused about the best practice. Should we enable "optimized writes" by setting the following at a workspace level? spark.conf.set...
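
A minimal sketch of how these pieces typically fit together, assuming a Delta table named events (the table name and the 168-hour retention window are placeholder assumptions):

# Sketch: enable optimized writes for the session, then compact and clean up.
# The table name `events` and the retention window are placeholders.
spark.conf.set("spark.databricks.delta.optimizeWrite.enabled", "true")
spark.sql("OPTIMIZE events")                  # compact small files
spark.sql("VACUUM events RETAIN 168 HOURS")   # drop files outside the retention window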

Latest Reply
Anonymous
Not applicable
  • 3 kudos

@AKSHAY PALLERLA just checking in to see if you got a solution to the issue you shared above. Let us know! Thanks to @Werner Stinckens for jumping in, as always!

4 More Replies
Jayesh
by New Contributor III
  • 2653 Views
  • 5 replies
  • 3 kudos

Resolved! How can we do data copy from Databricks SQL using notebook?

Hi Team, we have a scenario where we have to connect to Databricks SQL instance 1 from another Databricks instance 2 using a notebook or Azure Data Factory. Can you please help?
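
One common pattern (a sketch, not necessarily the approach accepted in this thread) is to read from the other workspace over JDBC with the Databricks driver; the hostname, HTTP path, token, and table name below are placeholders:

# Sketch: read a table from another workspace over JDBC.
# <server-hostname>, <http-path>, <pat-token>, and the table are placeholders.
url = ("jdbc:databricks://<server-hostname>:443/default;"
       "transportMode=http;ssl=1;httpPath=<http-path>;"
       "AuthMech=3;UID=token;PWD=<pat-token>")
df = (spark.read.format("jdbc")
      .option("url", url)
      .option("dbtable", "my_schema.my_table")
      .load())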

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Thanks for jumping in to help, @Arvind Ravish, @Hubert Dudek, and @Artem Sheiko!

4 More Replies
Jeade
by New Contributor II
  • 2913 Views
  • 3 replies
  • 1 kudos

Resolved! Pulling data from Azure Boards into Databricks

Looking for best practices/examples on how to pull data (epics, features, PBIs) from Azure Boards into Databricks for analysis. Any ideas/help appreciated!

Latest Reply
artsheiko
Databricks Employee
  • 1 kudos

You can use Export to CSV (link), push the file to storage mounted to Databricks, or simply import the exported file into DBFS.
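
For the DBFS route, a minimal sketch of reading such an export (the file path is a placeholder):

# Sketch: read an Azure Boards CSV export uploaded to DBFS.
# The path is a placeholder for wherever the export lands.
df = (spark.read
      .option("header", "true")
      .csv("dbfs:/FileStore/azure_boards_export.csv"))
display(df)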

2 More Replies
cralle
by New Contributor II
  • 5843 Views
  • 7 replies
  • 2 kudos

Resolved! Cannot display DataFrame when I filter by length

I have a DataFrame that I have created based on a couple of datasets and multiple operations. The DataFrame has multiple columns, one of which is an array of strings. But when I take the DataFrame and try to filter based upon the size of this array co...
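
For reference, filtering on array length is normally done with size(); a minimal sketch with a made-up DataFrame:

from pyspark.sql import functions as F

# Sketch: keep rows whose `tags` array has more than one element.
# The DataFrame and column names are made up for illustration.
df = spark.createDataFrame([(1, ["a", "b"]), (2, ["c"])], ["id", "tags"])
df.filter(F.size("tags") > 1).show()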

Latest Reply
-werners-
Esteemed Contributor III
  • 2 kudos

Strange, it works fine here. What version of Databricks are you on? What you could do to identify the issue is to output the query plan (.explain). Creating a new df for each transformation could also help; that way you can check step by step where...

6 More Replies
tej1
by New Contributor III
  • 3605 Views
  • 5 replies
  • 7 kudos

Resolved! Trouble accessing `_metadata` column using cloudFiles in Delta Live Tables

We are building a Delta Live Tables pipeline where we ingest CSV files from AWS S3 using cloudFiles, and we need to access each file's modification timestamp. As documented here, we tried selecting the `_metadata` column in a task in the Delta Live p...
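
For context, selecting the column inside a DLT source looks roughly like this (a sketch; the bucket path and table name are placeholders, and it requires a runtime new enough to expose _metadata):

import dlt
from pyspark.sql.functions import col

# Sketch: surface the file modification timestamp via _metadata.
# The S3 path and table name are placeholders.
@dlt.table
def raw_events():
    return (spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "csv")
            .load("s3://my-bucket/landing/")
            .select("*", col("_metadata.file_modification_time").alias("file_modified_at")))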

Latest Reply
tej1
New Contributor III
  • 7 kudos

Update: We were able to test the `_metadata` column feature in DLT "preview" mode (which is DBR 11.0). Databricks doesn't recommend running production workloads in "preview" mode, but nevertheless, we're glad to be using this feature in DLT.

4 More Replies
alexgv12
by New Contributor III
  • 2609 Views
  • 2 replies
  • 3 kudos

Delta table: separate gold zones for different tenants

Hello, we currently have a process that builds the bronze and silver zones with Delta tables. When data reaches gold, we must create specific zones for each client because the schema changes; for this we create separate databases and tables, but when ...

Latest Reply
Noopur_Nigam
Databricks Employee
  • 3 kudos

Hi @alexander grajales vanegas, are you creating all the databases and tables in the gold zone manually? If so, please check out DLT (https://docs.databricks.com/data-engineering/delta-live-tables/index.html); it will take care of your complete pipeline by ...
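
A common DLT pattern for per-tenant gold tables is to generate them in a loop (a sketch; the tenant list, silver table, and filter column are placeholder assumptions):

import dlt

# Sketch: one gold table per tenant from a shared silver table.
# Tenant names, the silver table, and the tenant column are placeholders.
tenants = ["acme", "globex"]

def make_gold(tenant):
    @dlt.table(name=f"gold_{tenant}")
    def gold():
        return dlt.read("silver_events").where(f"tenant = '{tenant}'")

for t in tenants:
    make_gold(t)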

1 More Replies
GKKarthi
by New Contributor
  • 5050 Views
  • 6 replies
  • 2 kudos

Resolved! Databricks - Simba SparkJDBCDriver 500550 exception

We have a Denodo big data platform hosted on Databricks. Recently we have been facing an exception with the message '[Simba][SparkJDBCDriver](500550)', which interrupts the Databricks connection after a certain time interval, usuall...

Latest Reply
PFBOLIVEIRA
New Contributor II
  • 2 kudos

Hi all, we are also experiencing the same behavior: [Simba][SimbaSparkJDBCDriver] (500550) The next rowset buffer is already marked as consumed. The fetch thread might have terminated unexpectedly. Foreground thread ID: xxxx. Background thread ID: yyyy...

5 More Replies
pankaj92
by New Contributor II
  • 4346 Views
  • 4 replies
  • 0 kudos

Extract latest files from an ADLS Gen2 mount point in Databricks using PySpark

Hi Team, I am trying to get the latest files from an ADLS mount point directory. I am not sure how to extract the latest files by last modified date using PySpark from an ADLS Gen2 storage account. Please let me know. Thanks! I am looking forward to your re...

Latest Reply
Sha_1890
New Contributor III
  • 0 kudos

Hi @pankaj92, I wrote Python code to pick the latest file from the mnt location:

import os

path = "/dbfs/mnt/xxxx"
filelist = []
for file_item in os.listdir(path):
    filelist.append(file_item)
file = len(filelist)
print(filelist[file - 1])

Thanks
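
Note that os.listdir order is not guaranteed to track modification time; a sketch that sorts by the actual timestamp (the path is a placeholder):

import os

# Sketch: pick the most recently modified file under a mount point.
# The path is a placeholder; /dbfs is the FUSE mount visible to plain Python.
path = "/dbfs/mnt/xxxx"
files = [os.path.join(path, f) for f in os.listdir(path)]
print(max(files, key=os.path.getmtime))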

3 More Replies
ivanychev
by Contributor II
  • 7463 Views
  • 5 replies
  • 2 kudos

Resolved! How to find out why the cluster is in PENDING state for so long?

I'm using Databricks on AWS. Our clusters are typically in the PENDING state for 5-8 minutes after they are created. I would like to find out why (EC2 instance provisioning? Slow Docker image download? ...?). The cluster logs are not helpful enough be...

Latest Reply
Prabakar
Databricks Employee
  • 2 kudos

Hi @Sergey Ivanychev, while the cluster is starting you can see its status on the compute page. Hover the mouse pointer over the green rotating circle to the left of the cluster name; it will give a notification of what is happening on the cluster. Wh...
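
The same events can also be pulled programmatically from the Clusters API (a sketch; host, token, and cluster ID are placeholders):

import requests

# Sketch: list recent lifecycle events for a cluster via the REST API.
# HOST, TOKEN, and the cluster ID are placeholders.
HOST = "https://<workspace-host>"
TOKEN = "<pat-token>"
resp = requests.post(
    f"{HOST}/api/2.0/clusters/events",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"cluster_id": "<cluster-id>", "limit": 25},
)
for event in resp.json().get("events", []):
    print(event["timestamp"], event["type"])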

4 More Replies
118004
by New Contributor II
  • 1848 Views
  • 1 reply
  • 2 kudos

Resolved! Installing pdpbox plugin on cluster

Hello, we are having issues installing the pdpbox library on a fresh cluster, including trying to upload and install a whl file, and using pip in a notebook. I have attached an example of an error received. Can anybody assist with installing the...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

PDPbox is updated rarely, and it requires an older version of matplotlib (3.1.1): https://github.com/SauceCat/PDPbox. It tries to install but fails because matplotlib requires pkgconfig. The solution is to use the Machine Learning runtime. There it will...
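
If you do need it on a standard runtime, one workaround sketch is to pin the matplotlib dependency first (the pin follows the version mentioned above; not guaranteed to work on every runtime):

%pip install matplotlib==3.1.1
%pip install pdpbox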

PSY
by New Contributor III
  • 4527 Views
  • 5 replies
  • 2 kudos

Resolved! Updating git token fails

When updating an expired Azure DevOps personal access token (PAT) for Git integration, I get the error message "Failed to save. Please try again." The error persists with different tokens. Previously (months ago), updating the token did not result i...

Latest Reply
Atanu
Databricks Employee
  • 2 kudos

Is this happening for all users, @Pencho Yordanov?

4 More Replies
al_joe
by Contributor
  • 4949 Views
  • 3 replies
  • 4 kudos

Resolved! Can I use Databricks CLI with community edition?

I installed the CLI but am unable to configure it to connect to my instance, as I cannot find the "Generate Access tokens" option under the User Settings page. The documentation does not say whether this feature is disabled for Community Edition.

Latest Reply
Prabakar
Databricks Employee
  • 4 kudos

Hi @Al Jo, we understand your interest in learning Databricks. However, the Community Edition is limited in features; certain features are available only in the paid version. If you are interested in using the full features, then I would suggest you g...

2 More Replies
Ryan512
by New Contributor III
  • 1548 Views
  • 2 replies
  • 2 kudos

Autoloader (GCP) Custom PubSub Queue

I want to know if what I describe below is possible with Auto Loader on the Google Cloud Platform. Problem description: we have GCS buckets for every client/account. Inside these buckets is a path/blob for each client's instances of our platform. A clie...
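
For context, file-notification mode on GCP is switched on like this (a sketch; the bucket path is a placeholder, and whether a pre-existing custom Pub/Sub queue can be plugged in is exactly what the question asks):

# Sketch: Auto Loader in file-notification mode on GCP.
# The path is a placeholder; by default Auto Loader provisions
# its own Pub/Sub resources when notifications are enabled.
df = (spark.readStream.format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.useNotifications", "true")
      .load("gs://my-bucket/client-a/"))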

Latest Reply
Noopur_Nigam
Databricks Employee
  • 2 kudos

Hello @Ryan Ebanks, please let us know if more help is needed on this.

1 More Replies
laus
by New Contributor III
  • 7986 Views
  • 6 replies
  • 3 kudos

Resolved! How to load a JSON file in PySpark with a colon character in the file name

Hi, I'm trying to load this JSON file, which contains a colon character in its name: file_name.2022-03-05_11:30:00.json, but I get the error in the screenshot below saying that there is a relative path in an absolute URI. Any idea how to read this file...
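
Hadoop treats the colon as a scheme separator in paths, so one workaround sketch is to copy the file to a colon-free name through the /dbfs FUSE mount before reading it (the paths are placeholders):

import shutil

# Sketch: copy around the colon via the /dbfs FUSE mount, then read.
# Paths are placeholders; plain Python I/O avoids Hadoop path parsing.
shutil.copy("/dbfs/mnt/data/file_name.2022-03-05_11:30:00.json",
            "/dbfs/mnt/data/file_name.2022-03-05_113000.json")
df = spark.read.json("dbfs:/mnt/data/file_name.2022-03-05_113000.json")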

Latest Reply
Noopur_Nigam
Databricks Employee
  • 3 kudos

Hi @Laura Blancarte, I hope that @Pearl Ubaru's answer helped you in resolving your issue. Please let us know if you need more help on this.

5 More Replies
AP
by New Contributor III
  • 2416 Views
  • 2 replies
  • 2 kudos

How can we connect to the Databricks-managed metastore?

Hi, I am trying to take advantage of the treasure trove of information that the metastore contains and take some actions to improve performance. In my case, the metastore is managed by Databricks; we don't use an external metastore. How can I connect to ...
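
Short of connecting to the metastore database itself, much of that metadata is already queryable from a notebook (a sketch; the table name is a placeholder, and DESCRIBE DETAIL applies to Delta tables):

# Sketch: inspect table metadata through Spark SQL.
spark.sql("SHOW TABLES IN default").show()
spark.sql("DESCRIBE DETAIL default.my_table").show(truncate=False)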

Latest Reply
Prabakar
Databricks Employee
  • 2 kudos

@AKSHAY PALLERLA you can get the JDBC/ODBC information from the cluster configuration. On the cluster configuration page, under Advanced Options, there is a JDBC/ODBC tab. Click on that tab and it should give you the details you are looking ...

1 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.
