Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

yliu
by New Contributor III
  • 14748 Views
  • 2 replies
  • 1 kudos

Z-ordering optimization with multithreading

Hi, I am wondering if multithreading will help with the performance of z-ordering optimization on multiple delta tables. We are periodically running optimization on thousands of tables, and it easily takes a few days to finish the job. So we are looking...
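
A minimal sketch of the multithreading idea, assuming this runs in a Databricks notebook where spark is available (table names, z-order column, and thread count are all illustrative):

    from concurrent.futures import ThreadPoolExecutor

    tables = ["db.events", "db.orders"]  # illustrative table names

    def optimize(table):
        # each call submits an independent OPTIMIZE job through the shared SparkSession
        spark.sql(f"OPTIMIZE {table} ZORDER BY (id)")  # illustrative z-order column

    # threads overlap job submission and scheduling; the cluster still bounds total throughput
    with ThreadPoolExecutor(max_workers=8) as pool:
        list(pool.map(optimize, tables))

Whether this helps depends on whether the cluster has idle capacity while a single OPTIMIZE runs; if one job already saturates the workers, parallel submission mostly reorders the queue.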

Eeg
by New Contributor III
  • 23835 Views
  • 4 replies
  • 5 kudos

Pyflakes errors when using %run

I am using the %run command to import shared resources for each of my processes, because it was the easiest way to import my common libraries. However, done that way, pyflakes can't resolve the dependencies very well, and I end up working in code with ma...

Latest Reply
btafur
Databricks Employee
  • 5 kudos

You could use something like flake8 and customize the rules in the .flake8 file, or ignore specific lines with # noqa. https://flake8.pycqa.org/en/latest/user/configuration.html
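
For example, a line-level suppression might look like this (module name hypothetical):

    from shared_helpers import *  # noqa: F401,F403  (silences unused-import/star-import warnings)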

turagittech
by Contributor
  • 7569 Views
  • 0 replies
  • 0 kudos

Pandas 2.x availability

Hi All, I am wondering if Pandas 2.x will be available soon, or whether it is an available option to install. I have a small job I built to manipulate some strings from a database table which technically did the job, but doesn't scale with older versions of pan...
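
Until the runtime ships it, a notebook-scoped install is one way to try a newer pandas; a minimal sketch (the version pin is illustrative, and the install may restart the Python process):

    %pip install "pandas>=2.0"   # notebook magic; run in its own cell

    # in a following cell, confirm the notebook sees the upgraded version
    import pandas as pd
    print(pd.__version__)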

melodiesd
by New Contributor
  • 8667 Views
  • 0 replies
  • 0 kudos

Parse_Syntax_Error Help

Hello all, I'm new to Databricks and can't figure out why I'm getting an error in my SQL code. Error in SQL statement: ParseException: [PARSE_SYNTAX_ERROR] Syntax error at or near 'if'.(line 1, pos 0) == SQL == if OBJECT_ID('tempdb.#InitialData') IS N...
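
Spark SQL does not accept T-SQL control flow such as if OBJECT_ID(...); a rough Spark-flavored equivalent of the existence check, sketched with a hypothetical table name:

    # IF EXISTS replaces the T-SQL OBJECT_ID probe
    spark.sql("DROP TABLE IF EXISTS initial_data")
    spark.sql("CREATE TABLE initial_data AS SELECT 1 AS id")  # stand-in for the real query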

pygreg
by New Contributor
  • 3762 Views
  • 1 reply
  • 1 kudos

Resolved! Workflows: pass parameters to a "run job" task

Hi folks! I would like to know if there is a way to pass parameters to a "run job" task. For example: let's have a Job A with: a notebook task A.1 that takes as input a parameter year-month in the format yyyymm; a "run job" task A.2 that calls a Job B. I wou...

Latest Reply
btafur
Databricks Employee
  • 1 kudos

This feature will be available soon as part of Job Parameters. Right now it is not possible to easily pass parameters to a child job.
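
As an interim workaround, one hedged option is to have task A.2 call the child job's entry notebook directly so the parameter can be passed (path and parameter name are hypothetical):

    # runs the child notebook inline, forwarding the year-month value
    result = dbutils.notebook.run("/Repos/etl/job_b_entry", 3600, {"year_month": "202309"})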

peterwishart
by New Contributor III
  • 7226 Views
  • 4 replies
  • 0 kudos

Resolved! Programmatically updating the “run_as_user_name” parameter for jobs

I am trying to write a process that will programmatically update the "run_as_user_name" parameter for all jobs in an Azure Databricks workspace, using PowerShell to interact with the Jobs API. I have been trying to do this with a test job without suc...

Latest Reply
baubleglue
New Contributor II
  • 0 kudos

The solution you've submitted addresses a different topic (permission to run a job; the job still runs as the user in the run_as_user_name field). Here is an example of changing "run_as_user_name". Docs: https://docs.databricks.com/api/azure/workspace/job...
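
A minimal sketch of that idea in Python against the Jobs 2.1 update endpoint (host, job_id, and user name are placeholders; verify the exact payload shape against the docs above):

    import os
    import requests

    host = "https://adb-0000000000000000.0.azuredatabricks.net"  # placeholder workspace URL
    resp = requests.post(
        f"{host}/api/2.1/jobs/update",
        headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
        json={"job_id": 123, "new_settings": {"run_as": {"user_name": "etl-owner@example.com"}}},
    )
    resp.raise_for_status()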

Hubert-Dudek
by Databricks MVP
  • 3193 Views
  • 1 reply
  • 0 kudos

Spark Configuration Parameter for Cluster Downscaling

spark.databricks.aggressiveWindowDownS This parameter determines the frequency, in seconds, at which the cluster decides to downscale. By adjusting this setting, you can fine-tune how rapidly clusters release workers. A higher value will...
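
Because it governs cluster behavior, the setting belongs in the cluster's Spark config rather than in notebook code; a sketch of the relevant fragment of a Clusters API payload (the 600-second value is illustrative):

    # fragment of a clusters create/edit request body
    cluster_spec = {
        "spark_conf": {
            "spark.databricks.aggressiveWindowDownS": "600",  # seconds, as a string
        }
    }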

Latest Reply
Haiyangl104
New Contributor III
  • 0 kudos

I wish there were a configuration to toggle upscaling behavior. I want the clusters to scale up only if the bottleneck is approaching 70% memory usage. Currently the autoscaling is based only on CPU, not memory (RAM).

SaraCorralLou
by New Contributor III
  • 2181 Views
  • 1 reply
  • 0 kudos

Clear driver memory during notebook execution

Is there any way to clear the driver's memory during the execution of my notebook? I have several functions that are executed on the driver and that generate different dataframes on it that are not needed (these dataframes are created just to do som...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

Since Spark uses lazy execution, those dataframes you do not need are never materialized unless you actually use them (why define them otherwise?). So when an action runs, Spark will execute all the code that is necessary. If you run into memory issues, you can do...
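
A minimal sketch of the usual cleanup steps on the driver, with df standing in for one of the intermediate dataframes:

    import gc

    df.unpersist()  # drop any cached/persisted partitions for this DataFrame
    del df          # release the Python reference on the driver
    gc.collect()    # nudge the garbage collector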

alesventus
by Contributor
  • 1870 Views
  • 0 replies
  • 0 kudos

Performance issue: Running 50 notebooks from ADF

I have a process in Data Factory that loads CDC changes from SQL Server and then triggers a notebook with a merge into the bronze and silver zones. A single notebook takes about 1 minute to run, but when all 50 notebooks are fired at once the whole process takes 25 ...

Data Engineering
performance issue
Greg
by New Contributor III
  • 2465 Views
  • 1 reply
  • 4 kudos

How to reduce storage space consumed by delta with many updates

I have one delta table that I continuously append events into, and a second delta table that I continuously merge into (streamed from the first table) that has unique IDs whose properties are updated from the events (an ID represents a unique thing that ge...
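
The usual levers for this are tightening the table's retention properties and vacuuming old file versions; a hedged sketch (table name and retention values are illustrative, and shorter retention reduces the time-travel window):

    spark.sql("""
        ALTER TABLE ids SET TBLPROPERTIES (
            'delta.deletedFileRetentionDuration' = 'interval 7 days',
            'delta.logRetentionDuration' = 'interval 7 days'
        )
    """)
    spark.sql("VACUUM ids RETAIN 168 HOURS")  # 7 days, matching the retention above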

Latest Reply
Jb11
New Contributor II
  • 4 kudos

Did you already solve this problem?

bfridley
by New Contributor II
  • 4530 Views
  • 2 replies
  • 0 kudos

DLT Pipeline Out Of Memory Errors

I have a DLT pipeline that has been running for weeks. Now, trying to rerun the pipeline with the same code and same data fails. I've even tried updating the compute on the cluster to about 3x of what was previously working and it still fails with ou...

(Attached screenshots: bfridley_1-1695328329708.png, bfridley_2-1695328372419.png)
Latest Reply
rajib_bahar_ptg
New Contributor III
  • 0 kudos

I'd focus on understanding the codebase first. It'll help you decide what logic or data asset to keep or not keep when you try to optimize it. If you share the architecture of the application, the problem it solves, and some sample code here, it'll h...

gkrilis
by New Contributor
  • 9386 Views
  • 1 reply
  • 0 kudos

How to stop SparkSession within notebook without error

I want to run an ETL job, and when the job ends I would like to stop the SparkSession to free my cluster's resources; by doing this I could avoid restarting the cluster. But when calling spark.stop(), the job returns with status failed even though it has f...

Data Engineering
cluster
SparkSession
Latest Reply
PremadasV
New Contributor II
  • 0 kudos

Please refer to this article: Job fails, but Apache Spark tasks finish - Databricks
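
If the goal is simply to end the run cleanly rather than tear down the shared SparkSession, a hedged alternative is to exit the notebook instead:

    # ends this notebook run with a success status; leaves Spark running for the cluster
    dbutils.notebook.exit("ETL finished")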

Gilg
by Contributor II
  • 1392 Views
  • 0 replies
  • 0 kudos

Add data manually to DLT

Hi Team, is there a way that we can add data manually to the tables that are generated by DLT? We have done a PoC using DLT for Sep 15 to current data. Now that they are happy, they want the previous data from Synapse put into Databricks. I can e...

