Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

User16826994223
by Honored Contributor III
  • 4249 Views
  • 2 replies
  • 1 kudos

AssertionError: assertion failed: Unable to delete the record but I am able to select it though

Is there any reason this command works well: %sql SELECT * FROM database.table WHERE salary > 1000, returning 2 rows, while the one below: %sql DELETE FROM database.table WHERE salary > 1000 fails with: Error in SQL statement: AssertionError: assertion failed:...

Latest Reply
User16826994223
Honored Contributor III

DELETE FROM (and similarly UPDATE) isn't supported on Parquet files - right now on Databricks, it's supported for the Delta format. You can convert your Parquet files to Delta using CONVERT TO DELTA, and then this command will work for you.
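For reference, a minimal sketch of that workaround in a notebook cell (the table name is taken from the question and may differ in your workspace):

    # Convert the Parquet table's metadata to Delta in place, then DELETE works.
    spark.sql("CONVERT TO DELTA database.table")
    spark.sql("DELETE FROM database.table WHERE salary > 1000")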

1 More Replies
dataslicer
by Contributor
  • 7133 Views
  • 4 replies
  • 4 kudos

Resolved! Unable to save Spark Dataframe to driver node's local file system as CSV file

Running Azure Databricks Enterprise DBR 8.3 ML on a single node, with a Python notebook. I have 2 small Spark dataframes that I am able to source via credential passthrough, reading from ADLSgen2 via the `abfss://` method, and display the full content ...

Latest Reply
Dan_Z
Honored Contributor

Modern Spark operates by a design choice to separate storage and compute. So saving a CSV to the driver's local disk doesn't make sense for a few reasons: the worker nodes don't have access to the driver's disk. They would need to send the data over to...
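For a small dataframe, one hedged workaround is to collect it to the driver and write with pandas (assuming df fits in driver memory; the output path is illustrative):

    # toPandas() pulls all rows to the driver, so the write happens on the
    # driver's local file system rather than on the workers.
    df.toPandas().to_csv("/tmp/output.csv", index=False)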

3 More Replies
PaulHernandez
by New Contributor II
  • 18116 Views
  • 7 replies
  • 0 kudos

Resolved! How to show an image in a notebook using html?

Hi everyone, I am just learning how to personalize Databricks notebooks and would like to show a logo in a cell. I installed the Databricks CLI and was able to upload the image file to dbfs:. I try to display it like this: displayHTML("<im...

Latest Reply
_robschaper
New Contributor II

@Paul Hernandez​ @Sean Owen​ @Navneet Tuteja​ I solved this after I also ran into the same issue, where my notebook suddenly wouldn't show an image sitting on the driver in an accessible folder - no matter what I was trying in the notebook, the display...
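One common approach, sketched here under the assumption that the image was uploaded to DBFS under /FileStore (which the workspace serves at the /files/ URL path; the file name is hypothetical):

    # Reference the uploaded file via the workspace's /files/ route.
    displayHTML('<img src="/files/logo.png" width="200"/>')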

6 More Replies
daindana
by New Contributor III
  • 8038 Views
  • 4 replies
  • 4 kudos

Resolved! Why doesn't my notebook display widgets when I use 'dbutils' while it is displayed with '%sql CREATE WIDGET'?

The widget is not shown when I use dbutils, while it works perfectly with SQL. For example, %sql CREATE WIDGET TEXT state DEFAULT "CA" - this one shows me the widget. dbutils.widgets.text("name", "Brickster", "Name") dbutils.widgets.multiselect("colors", "oran...
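For reference, a minimal sketch of the Python API using the widget names from the question:

    # Should render a text box at the top of the notebook, just like the SQL version.
    dbutils.widgets.text("name", "Brickster", "Name")
    print(dbutils.widgets.get("name"))  # reads the current value ("Brickster" by default)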

Tags: dbutils get info from widget, dbutils widget creation
Latest Reply
daindana
New Contributor III

Hello, Ryan! For some reason the problem is solved, and now it is working perfectly! I did nothing new, but it is just working now. Thank you! :)

3 More Replies
BorislavBlagoev
by Valued Contributor III
  • 3915 Views
  • 5 replies
  • 4 kudos

Resolved! Databricks writeStream checkpoint

I'm trying to execute this writeStream:
data_frame.writeStream.format("delta") \
    .option("checkpointLocation", checkpoint_path) \
    .trigger(processingTime="1 second") \
    .option("mergeSchema", "true") \
    .o...

Latest Reply
Hubert-Dudek
Esteemed Contributor III

You can remove that folder so it will be recreated automatically. Additionally, every new job run should have a new (or just empty) checkpoint location. You can add this to your code before running the stream: dbutils.fs.rm(checkpoint_path, True). Additionally you...
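Putting that together, a minimal sketch (data_frame and checkpoint_path are as in the question; output_path is a hypothetical target):

    # Clear the old checkpoint so the stream starts fresh, then start the stream.
    dbutils.fs.rm(checkpoint_path, True)

    (data_frame.writeStream.format("delta")
        .option("checkpointLocation", checkpoint_path)
        .trigger(processingTime="1 second")
        .option("mergeSchema", "true")
        .start(output_path))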

4 More Replies
halfwind22
by New Contributor III
  • 8356 Views
  • 11 replies
  • 12 kudos

Resolved! Unable to write csv files to Azure BLOB using pandas to_csv ()

I am using a Python function to read some data from a GET endpoint and write it as a CSV file to an Azure Blob location. My GET endpoint takes 2 query parameters, param1 and param2. So initially, I have a dataframe paramDf that has two columns, param1 and ...

Latest Reply
halfwind22
New Contributor III

@Hubert Dudek​ I can't issue a Spark command on an executor node - it throws an error, because foreach distributes the processing.
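One hedged workaround is to use plain Python inside the foreach function instead of Spark APIs, since executors cannot call the SparkSession. A sketch using azure-storage-blob (conn_str, the container name, and fetch_csv are all hypothetical):

    from azure.storage.blob import BlobClient

    def write_row_to_blob(row):
        csv_text = fetch_csv(row.param1, row.param2)  # call the GET endpoint; returns CSV text
        blob = BlobClient.from_connection_string(
            conn_str, container_name="mycontainer",
            blob_name=f"out_{row.param1}_{row.param2}.csv")
        blob.upload_blob(csv_text, overwrite=True)  # plain HTTP upload, no Spark on the executor

    paramDf.foreach(write_row_to_blob)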

10 More Replies
ItsMe
by New Contributor II
  • 2533 Views
  • 4 replies
  • 7 kudos

Resolved! Run Pyspark job of Python egg package using spark submit on databricks

Error: missing application resource. Getting this error while running a job with spark-submit. I have given the following parameters while creating the job:
--conf spark.yarn.appMasterEnv.PYSAPRK_PYTHON=databricks/path/python3
--py-files dbfs/path/to/.egg job_m...

Latest Reply
User16752246494
Contributor

Hi, we tried to simulate the question on our end, and what we did was package a module inside a whl file. Then, to access the wheel file, we created another Python file, test_whl_locally.py. Inside test_whl_locally.py, to access the content of the wheel file...
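A minimal sketch of what such a test_whl_locally.py might look like (the wheel path and the module/function names are hypothetical; this works for pure-Python wheels, which Python can import directly from sys.path):

    import sys

    # Wheels are zip archives, so a pure-Python wheel can go on sys.path directly.
    sys.path.append("/dbfs/path/to/my_package-0.1-py3-none-any.whl")

    from my_module import my_func

    my_func()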

3 More Replies
afshinR
by New Contributor III
  • 681 Views
  • 1 reply
  • 1 kudos

Hi, could you please help me with my question? I have not gotten any answers.

Hi, could you please help me with my question? I have not gotten any answers.

Latest Reply
Kaniz_Fatma
Community Manager

Hi @afshin riahi​, yes, definitely I can help you with it. Please wait while I or someone from the community gets back with a response. Thank you for your patience.

User16868770416
by Contributor
  • 3584 Views
  • 1 reply
  • 0 kudos

What is the best way to decode protobuf using pyspark?

I am using Spark Structured Streaming to read a protobuf-encoded message from the Event Hub. We use a lot of Delta tables, but there isn't a simple way to integrate this. We are currently using K-SQL to transform into Avro on the fly and then use Dat...

Latest Reply
jose_gonzalez
Moderator

Hi @Will Block​, I think a related question was asked in the past; I think it was this one. I also found this library, I hope it helps.
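The usual pattern is to decode the binary payload in a Python UDF with protoc-generated classes. A minimal sketch (my_pb2.MyMessage, raw_df, and the "body" column are hypothetical):

    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    def decode(payload):
        from my_pb2 import MyMessage         # class generated by protoc
        msg = MyMessage()
        msg.ParseFromString(bytes(payload))  # deserialize the protobuf bytes
        return msg.some_field                # extract whichever fields you need

    decode_udf = udf(decode, StringType())
    decoded_df = raw_df.withColumn("decoded", decode_udf("body"))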

marchello
by New Contributor III
  • 4700 Views
  • 9 replies
  • 3 kudos

Resolved! error on connecting to Snowflake

Hi team, I'm getting a weird error in one of my jobs when connecting to Snowflake. All my other jobs (I've got plenty) work fine. The current one also works fine when I have only one coding step (except installing needed libraries in my very first step...

Latest Reply
Dan_Z
Honored Contributor

@marchello​ I suggest you contact Snowflake to move forward on this one.

8 More Replies
William_Scardua
by Valued Contributor
  • 2559 Views
  • 5 replies
  • 4 kudos

Resolved! Small/big file problem, how do you fix it ?

How do you work on fixing the small/big file problem? What do you suggest?

Latest Reply
-werners-
Esteemed Contributor III

What Jose said. If you cannot use Delta or do not want to: the use of coalesce and repartition/partitioning is the way to define the file size. There is no one ideal file size. It all depends on the use case, available cluster size, data flow downstrea...
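A minimal sketch of steering the output file count on write (df, the target path, and the choice of 8 partitions are all illustrative):

    # repartition(8) does a full shuffle into exactly 8 partitions -> 8 output files;
    # coalesce(8) avoids the shuffle but can only reduce the partition count.
    (df.repartition(8)
       .write.mode("overwrite")
       .parquet("/mnt/output/table"))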

4 More Replies
Kaniz_Fatma
by Community Manager
  • 1012 Views
  • 1 reply
  • 1 kudos
Latest Reply
Ryan_Chynoweth
Honored Contributor III

Hi Kaniz, if you want to use Databricks to read data from one database and write to another database, I would imagine that you would want to use the MongoDB connector. Check out our docs here.
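A minimal sketch of that read-from-one, write-to-another pattern, assuming the MongoDB Spark connector (v10+) is installed on the cluster (URIs, database, and collection names are hypothetical; check the connector docs for the exact option keys):

    df = (spark.read.format("mongodb")
          .option("connection.uri", "mongodb://source-host:27017")
          .option("database", "src_db")
          .option("collection", "src_coll")
          .load())

    (df.write.format("mongodb")
       .option("connection.uri", "mongodb://target-host:27017")
       .option("database", "dst_db")
       .option("collection", "dst_coll")
       .mode("append")
       .save())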

Kaniz_Fatma
by Community Manager
  • 1517 Views
  • 1 reply
  • 0 kudos
Latest Reply
shan_chandra
Esteemed Contributor

Please run the below steps in an isolated notebook to connect to Athena:
1. Install boto3: %sh pip install boto3
2. Check if the boto3 library is installed: %python import boto3; boto3.__version__
3. Run the below code: %python import boto3 client = boto3.cli...
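A minimal sketch of step 3, assuming AWS credentials are available to the cluster (the region, database, table, and S3 output location are hypothetical):

    import boto3

    client = boto3.client("athena", region_name="us-east-1")
    resp = client.start_query_execution(
        QueryString="SELECT * FROM my_table LIMIT 10",
        QueryExecutionContext={"Database": "my_db"},
        ResultConfiguration={"OutputLocation": "s3://my-bucket/athena-results/"},
    )
    print(resp["QueryExecutionId"])  # poll get_query_execution with this id for status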

shan_chandra
by Esteemed Contributor
  • 6701 Views
  • 1 reply
  • 3 kudos

Resolved! Cannot reserve additional contiguous bytes in the vectorized reader (requested xxxxxxxxx bytes).

I got the below error when running a streaming workload from a source Delta table: Caused by: java.lang.RuntimeException: Cannot reserve additional contiguous bytes in the vectorized reader (requested xxxxxxxxx bytes). As a workaround, you can reduce ...

Latest Reply
shan_chandra
Esteemed Contributor

This is happening because the delta/parquet source has one or more of the following:
  • a huge number of columns
  • huge strings in one or more columns
  • huge arrays/maps, possibly nested in each other
In order to mitigate this issue, could you please reduce spar...
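The truncated advice appears to point at the vectorized reader's batch size. A minimal sketch, assuming that is the intended config (the default is 4096 rows; the value below is illustrative):

    # Smaller batches reserve less contiguous memory per column batch.
    spark.conf.set("spark.sql.parquet.columnarReaderBatchSize", 1024)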

jsaddam28
by New Contributor III
  • 42038 Views
  • 24 replies
  • 15 kudos

How to import local python file in notebook?

For example, I have one.py and two.py in Databricks and I want to use one of the modules from one.py in two.py. Usually I do this on my local machine with an import statement like below in two.py: from one import module1 . . . How to do this in Databricks???...

Latest Reply
StephAlbaRivera
Valued Contributor II

USE REPOS! Repos is able to call a function that is in a file in the same GitHub repo, as long as Files is enabled in the admin panel. So if I have utils.py with:
import pandas as pd

def clean_data():
    # Load wine data
    data = pd.read_csv("/dbfs/da...
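A minimal sketch of the notebook side of that pattern (assuming clean_data returns the cleaned dataframe; the call is illustrative):

    # In a notebook in the same Repo, the sibling file imports directly.
    from utils import clean_data

    df = clean_data()
    display(df)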

23 More Replies