Data Engineering

Forum Posts

by wyzer • Contributor II

11-18-2022 8:25:08 AM

2589 Views
2 replies
12 kudos

Resolved! Add the creation date of a parquet file into a DataFrame

Currently I load multiple parquet files with this code: df = spark.read.parquet("/mnt/dev/bronze/Voucher/*/*") (Inside the Voucher folder, there is one folder per date, each containing one parquet file.) How can I add a column into this DataFrame, that...

Latest Reply

wyzer
Contributor II

11-18-2022 12:46:00 PM

12 kudos

Thanks @Michail Karamanos

1 More Replies

by Yaswanth • New Contributor III

11-13-2022 3:34:14 PM

3388 Views
5 replies
18 kudos

Resolved! How can Delta table protocol version be downgraded from higher version to lower version the table properties minReader from 2 to 1 and MaxWriter from 5 to 3.

Is there a possibility to downgrade the Delta Table protocol versions minReader from 2 to 1 and maxWriter from 5 to 3? I have set the TBL properties to 2 and 5 and columnmapping mode to rename the columns in the DeltaTable but the other users are rea...

Latest Reply

Kaniz
Community Manager

11-18-2022 12:26:21 PM

18 kudos

Hi @Yaswanth velkur, we haven’t heard from you since the last response from @Youssef Mrini and me, and I was checking back to see if our suggestions helped you. Or else, if you have any solution, please share it with the community, as it can be hel...

4 More Replies

by chiragnayyar • Contributor

11-14-2022 6:03:14 PM

589 Views
1 replies
3 kudos

Resolved! In person Databricks meetup in Singapore?

Hi, I would like to know if anyone is interested in volunteering for an in-person Databricks meetup. Please share your thoughts, and we can talk further about the logistics. Thank you.

Latest Reply

Kaniz
Community Manager

11-18-2022 12:25:07 PM

3 kudos

Nice initiative @Chirag Nayyar!

by jeffgreen813 • New Contributor

11-18-2022 10:21:03 AM

532 Views
0 replies
0 kudos

How are you managing your DLT pipelines to maintain graph readability?

I've been building out a few pipelines in DLT and noticed that the usefulness of the user interface has started breaking down at a glance. I've attached a screenshot of one of my pipelines. It's not very far along and it's already pretty rough. You c...

by rgb • New Contributor

11-18-2022 8:49:39 AM

369 Views
0 replies
0 kudos

Migration_pipeline.py failing to get default credentials

cat ~/.databrickscfg looks like this (with the correct token/host values in place of xxxxxx):
[DEFAULT]
host = xxxxxx
token = xxxxxx
jobs-api-version = 2.0
The command I run to start the pipeline with default configured credentials is: sudo python3 migrati...
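As a rough illustration of the file layout quoted in this post, the sketch below parses the same three keys with Python's standard `configparser`. It is only a sketch: `read_profile` and `SAMPLE_CFG` are hypothetical names introduced here, the `xxxxxx` values are the post's own placeholders, and nothing in it is taken from migration_pipeline.py or claims to reflect how that script loads credentials.

```python
import configparser

# Sample text mirroring the layout quoted in the post.
# "xxxxxx" are the post's placeholders, not real values.
SAMPLE_CFG = """\
[DEFAULT]
host = xxxxxx
token = xxxxxx
jobs-api-version = 2.0
"""


def read_profile(text, profile="DEFAULT"):
    """Hypothetical helper: return one profile's keys as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # Raises KeyError if the requested profile does not exist.
    return dict(parser[profile])


profile = read_profile(SAMPLE_CFG)

# Report keys that are absent or empty, assuming host and token are required.
missing = [key for key in ("host", "token") if not profile.get(key)]
print(sorted(profile), missing)
```

A check like this can at least confirm that the file is well-formed and that the expected keys sit under the expected profile name.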

by 693872 • New Contributor II

11-11-2022 5:36:38 PM

1460 Views
5 replies
2 kudos

Here I am getting this error when i execute left join on two data frame: PythonException: 'pyspark.serializers.SerializationError: Caused by Traceback (most recent call last): going to post full traceback:

I simply do a left join on two data frames, and I was able to print the content of both. Here is what the code looks like: df_silver = spark.sql("select ds.PropertyID,\ ds.* from dfsilver as ds LEFT JOIN dfaddmaster as dm \ ...

Latest Reply

Dooley
Valued Contributor

11-18-2022 8:39:41 AM

2 kudos

Did that answer your question? Did it work?

4 More Replies

by jurbschat • New Contributor III

11-18-2022 8:28:43 AM

451 Views
0 replies
6 kudos

Is Azure Database for MySQL - Flexible Server supported as external metastore.

In the docs it's mentioned that "if you use Azure Database for MySQL as an external metastore, you must change the value of the lower_case_table_names property from 1 (the default) to 2 in the server-side database configuration." However "lower_case_tab...

by marcus1 • New Contributor III

11-18-2022 8:23:46 AM

241 Views
0 replies
0 kudos

Why does databricks https://docs.databricks.com/dev-tools/api/latest/scim/scim-users.html#get-users take so long

I've observed that, as we added more workspaces and users to those workspaces, fetching users per workspace now takes 11 minutes or more. Our automation to provision group access is now unacceptably slow. I've noted that the UI doesn't suffer...

by J_M_W • Contributor

10-11-2022 3:26:13 AM

1446 Views
3 replies
5 kudos

Resolved! Databricks is automatically creating a _apply_changes_storage table in the database when using apply_changes for Delta Live Tables

Hi there, I am using apply_changes (aka Delta Live Tables Change Data Capture) and it works fine. However, it seems to automatically create a secondary table in the database metastore called _apply_storage_changes_{tableName}. So for every table I use ...

Latest Reply

J_M_W
Contributor

11-18-2022 6:56:48 AM

5 kudos

Hi - Thanks @Hubert Dudek I will look into disabling access for the users!

2 More Replies

by berserkersap • Contributor

08-13-2022 12:32:58 PM

2720 Views
1 replies
0 kudos

How to deal with Decimal data type arithmetic operations ?

I am dealing with values ranging from 10^9 to 10^-9; the sum of values can go up to 10^20, and I need accuracy. So I wanted to use the Decimal data type [using SQL in the Data Science & Engineering workspace]. However, I got to know the peculiar behavior of D...
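The digit budget implied by these ranges can be worked out by hand: a sum of 10^20 needs 21 integer digits and a value of 10^-9 needs 9 fractional digits, so 30 significant digits in total, which fits within the 38-digit maximum precision of Spark SQL's DECIMAL type. The sketch below checks that arithmetic in plain Python with the standard `decimal` module, not in Spark SQL; the sample values are hypothetical and chosen only to hit both ends of the range.

```python
from decimal import Decimal, getcontext

# Assumptions taken from the post: the smallest value is 10^-9
# and sums can reach 10^20.
INTEGER_DIGITS = 21    # 10^20 is a 1 followed by 20 zeros: 21 digits
FRACTION_DIGITS = 9    # 10^-9 needs 9 digits after the decimal point
PRECISION = INTEGER_DIGITS + FRACTION_DIGITS  # 30 significant digits

# Give Python's decimal arithmetic exactly that many significant digits.
getcontext().prec = PRECISION

# Hypothetical sample values spanning the stated range.
values = ["1e20", "1e9", "1e-9"]

exact_total = sum((Decimal(v) for v in values), Decimal(0))
float_total = sum(float(v) for v in values)

print(exact_total)
# Binary floating point cannot hold the 10^-9 term next to 10^20.
print(Decimal(float_total) == exact_total)
```

This only shows that 30 digits are enough to hold such a sum exactly; it says nothing about how Spark derives the precision and scale of intermediate results.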

Latest Reply

berserkersap
Contributor

11-18-2022 6:18:59 AM

0 kudos

Hello everyone, I understand that there is no best answer for this question. So, I could only do the same thing I found when I surfed the net. The method I found works when: if you know the range of values you deal with (not just the input data but also ...

by 190809 • Contributor

11-18-2022 4:51:47 AM

640 Views
2 replies
0 kudos

Invalid port error when trying to read from PlanetScale MySQL database

Using the code below I am attempting to connect to a PlanetScale MySQL database. I get the following error: java.sql.SQLException: error parsing url : Incorrect port value. However the port is the default 3306, and I have used the correct url based o...

Latest Reply

Pat
Honored Contributor III

11-18-2022 5:15:26 AM

0 kudos

Hi @Rachel Cunningham, maybe you can share your `driver` and `url` values (masked)?

1 More Replies

by eques_99 • New Contributor II

11-17-2022 11:11:15 AM

788 Views
2 replies
0 kudos

Remove a category (slice) from a Pie Chart

I added a grand total row to a "Count" in SQL, which I needed for some counter visualisations. I used the "ROLL UP" command to get the grand total. However, I have a pie chart which references the same count, and so the grand total row has been added...

Latest Reply

eques_99
New Contributor II

11-18-2022 1:32:14 AM

0 kudos

Hi, as per the picture above, the slice disappears but the name ("null" in this case) remains on the legend.

1 More Replies

by Jayanth746 • New Contributor III

11-17-2022 9:36:53 AM

2899 Views
2 replies
2 kudos

Databricks <-> Kafka - SSL handshake failed

I am receiving an SSL handshake error even though the trust-store I have created is based on the server certificate and the fingerprint in the certificate matches the trust-store fingerprint. kafkashaded.org.apache.kafka.common.errors.SslAuthenticationExcept...

Latest Reply

Debayan
Esteemed Contributor III

11-17-2022 11:18:44 PM

2 kudos

Hi @Jayanth Goulla, worth a try: https://stackoverflow.com/questions/54903381/kafka-failed-authentication-due-to-ssl-handshake-failed
Did you follow: https://docs.microsoft.com/en-us/azure/databricks/spark/latest/structured-streaming/kafka?

1 More Replies

by elgeo • Valued Contributor II

11-16-2022 2:27:20 AM

818 Views
3 replies
2 kudos

Resolved! Disable auto-complete (tab button)

Hello. How could we disable autocomplete that appears with tab button? Thank you

Latest Reply

elgeo
Valued Contributor II

11-18-2022 12:30:22 AM

2 kudos

Thank you @Kaniz Fatma

2 More Replies

by sharonbjehome • New Contributor

11-16-2022 4:17:29 AM

732 Views
1 replies
1 kudos

Structured Streaming from MongoDB Atlas not parsing JSON correctly

Hi all, I have a table in MongoDB Atlas that I am trying to read continuously to memory and then will write that file out eventually. However, when I look at the in-memory table it doesn't have the correct schema. Code here: from pyspark.sql.types impo...

Latest Reply

Debayan
Esteemed Contributor III

11-17-2022 11:36:04 PM

1 kudos

Hi @sharonbjehome, this has to be checked thoroughly via a support ticket. Did you follow: https://docs.databricks.com/external-data/mongodb.html Also, could you please check with MongoDB support? Was this working before?
