Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

erigaud
by Honored Contributor
  • 5868 Views
  • 2 replies
  • 3 kudos

Get total number of files of a Delta table

I'm looking to know programmatically how many files a Delta table is made of. I know I can do %sql DESCRIBE DETAIL my_table, but that would only give me the number of files of the current version. I am looking to know the total number of files (basically ...

Latest Reply
ADavid
New Contributor II
  • 3 kudos

What was the solution?

1 More Reply
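A possible approach for the question above (not the thread's confirmed answer): DESCRIBE DETAIL reports numFiles for the current version only, so one way to count every file, including those retained for older versions, is to list the table's storage location recursively. The table path below is hypothetical.

```python
# Minimal sketch: recursively count parquet data files under a Delta table's
# location. This includes files from historical versions that VACUUM has not
# yet removed; the _delta_log directory is skipped.
def count_data_files(path):
    total = 0
    for entry in dbutils.fs.ls(path):
        if entry.name.endswith("/"):               # a directory (e.g. a partition)
            if not entry.name.startswith("_delta_log"):
                total += count_data_files(entry.path)
        elif entry.name.endswith(".parquet"):      # a data file
            total += 1
    return total

print(count_data_files("dbfs:/path/to/my_table"))  # hypothetical table path
```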
Brian-Nowak
by New Contributor II
  • 1368 Views
  • 3 replies
  • 5 kudos

DBR 15.4 LTS Beta Unable to Write Files to Azure Storage Account

Hi there! I believe I might have identified a bug with DBR 15.4 LTS Beta. The basic task of saving data to a Delta table, as well as the even more basic operation of saving a file to cloud storage, is failing on 15.4 but working perfectly fine on 15.3...

Latest Reply
Ricklen
New Contributor III
  • 5 kudos

We have had the same issue since yesterday (6/8/2024), running on DBR 15.3 or 15.4 LTS Beta. It does indeed seem to have something to do with large tables. Tried with multiple partition sizes.

2 More Replies
Ricklen
by New Contributor III
  • 653 Views
  • 1 reply
  • 1 kudos

VSCode Databricks Extension Performance

Hello everyone! I've been using the Databricks extension in VSCode for a while now, and I'm syncing my repository to my Databricks workspace. In the beginning, syncing files to my workspace was basically instant. But now it is starting to take a lot of...

alm
by New Contributor III
  • 513 Views
  • 1 reply
  • 0 kudos

Define SQL table name using Python

I want to control which schema a notebook writes to, and I want it to depend on the user that runs the notebook. For now, the scope is to support the languages Python and SQL. I have written a Python function, `get_path`, that returns the full path of the destina...

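One possible pattern for the question above (an assumption, not the thread's answer): compute the schema name in Python from the current user, expose it as a widget, and reference the widget from SQL cells. The naming rule below is hypothetical.

```python
# Minimal sketch: derive a per-user schema name in Python and publish it as a
# widget so SQL cells in the same notebook can reference it.
user = spark.sql("SELECT current_user()").first()[0]
schema = "dev_" + user.split("@")[0].replace(".", "_")   # hypothetical rule
dbutils.widgets.text("target_schema", schema)
```

A SQL cell can then pick up the widget, for example with `CREATE TABLE IF NOT EXISTS ${target_schema}.my_table (...)`; on newer runtimes, a named parameter marker combined with the IDENTIFIER clause is an alternative.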
rajeevk
by New Contributor
  • 559 Views
  • 1 reply
  • 0 kudos

Is there a %%capture or equivalent possible in Databricks notebooks?

I want to suppress all output of a cell, including text and chart plots. Is it possible to do in Databricks? I am able to do the same in other notebook environments, but exactly the same does not work in Databricks. Any insight or even understandab...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @rajeevk, one way is to use cell hiding: Databricks notebook interface and controls | Databricks on AWS

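Beyond cell hiding, a workaround sketch (not from the reply): capture a cell's printed output with the standard library, since IPython's %%capture may not behave identically across Databricks runtimes. Note this only captures text; charts rendered via display() are a separate mechanism.

```python
# Minimal sketch: swallow stdout/stderr produced inside the block.
import io
from contextlib import redirect_stdout, redirect_stderr

sink = io.StringIO()
with redirect_stdout(sink), redirect_stderr(sink):
    print("this line is captured, not displayed")
    result = 40 + 2        # stand-in for the real, noisy computation

print(result)              # only the output you choose to show
```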
Pawanukey12
by New Contributor
  • 346 Views
  • 1 reply
  • 0 kudos

How to get the details of a notebook, i.e. who is the owner of a notebook?

I am using Azure Databricks. We have the Git version control system along with it. How do I find out who created or owns a particular notebook?

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @Pawanukey12, there is no direct API to get the owner of a notebook using the notebook path in Databricks. However, you can manually check the owner of the notebook by the notebook name. You can manually go to the folder where the notebook is loca...

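Since the poster mentions the workspace is backed by Git, one additional workaround (an assumption, not from the reply): the repository history records who added each file, e.g. `git log --diff-filter=A -- path/to/notebook.py` prints the commit, author, and date of the commit that first added that notebook file.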
ruoyuqian
by New Contributor II
  • 612 Views
  • 1 reply
  • 0 kudos

Resolved! Delta Live Table run outside of pipeline

I have created a notebook for my Delta Live Table pipeline and it runs without errors. However, if I run the notebook alone on my cluster, it says not allowed and shows this error. Does it mean I can only run Delta Live Tables in the pipeline and canno...

Latest Reply
Rishabh-Pandey
Esteemed Contributor
  • 0 kudos

Hi @ruoyuqian, Delta Live Tables (DLT) have specific execution contexts and dependencies that are managed within their pipeline environment. This is why the code runs successfully only when executed within the pipeline, as DLT creates its own job clus...

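A common structuring sketch for this situation (an assumption, not the reply's code): keep the transformation logic in plain functions and guard the dlt-specific parts, so the same notebook can be exercised interactively even though the DLT tables only materialize inside a pipeline. Whether `import dlt` fails outside a pipeline varies by runtime, so treat this as illustrative.

```python
# Minimal sketch: guard the DLT-specific code so the notebook can still be
# imported and tested outside a pipeline.
try:
    import dlt
    IN_PIPELINE = True
except ImportError:
    IN_PIPELINE = False

def clean(df):
    # Plain function holding the logic; testable on any cluster.
    return df.where("value IS NOT NULL")

if IN_PIPELINE:
    @dlt.table(name="clean_events")                     # hypothetical name
    def clean_events():
        return clean(spark.read.table("raw_events"))    # hypothetical source
```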
ShankarM
by Contributor
  • 808 Views
  • 2 replies
  • 0 kudos

Intelligent source to target mapping

I want to implement source-to-target mapping in such a way that source and target columns are auto-mapped using intelligent AI mapping, reducing mapping effort, especially when there are 100+ columns in a table. Metadata information o...

Latest Reply
ShankarM
Contributor
  • 0 kudos

Can you please reply to my latest follow-up question?

1 More Reply
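As a baseline for the auto-mapping idea above (a sketch, not the thread's solution): name-similarity matching with the standard library gets surprisingly far on 100+ column tables, and the scoring function is the natural place to swap in an LLM or embedding model. The column names below are hypothetical.

```python
# Minimal sketch: map each source column to the closest-named target column.
import difflib

source_cols = ["cust_id", "cust_name", "order_dt"]
target_cols = ["customer_id", "customer_name", "order_date"]

mapping = {}
for src in source_cols:
    match = difflib.get_close_matches(src, target_cols, n=1, cutoff=0.4)
    if match:
        mapping[src] = match[0]

print(mapping)   # {'cust_id': 'customer_id', 'cust_name': 'customer_name', ...}
```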
thiagoawstest
by Contributor
  • 702 Views
  • 1 reply
  • 0 kudos

Add or change roles

Hello, I have a Databricks environment provisioned on AWS. I would like to know if it is possible to add new roles or change existing roles. In my environment, Admin and User appear. I have the following need: how can I have a group, but the users th...

SeyedA
by New Contributor
  • 358 Views
  • 0 replies
  • 0 kudos

Debug UDFs using VSCode extension

I am trying to debug my Python script using the Databricks VSCode extension. I am using udf and pandas_udf in my script. Everything works fine except when the execution gets to the udf and pandas_udf usages. It then complains that "SparkContext or SparkS...

John_Rotenstein
by New Contributor II
  • 16092 Views
  • 8 replies
  • 3 kudos

Retrieve job-level parameters in Python

Parameters can be passed to Tasks and the values can be retrieved with dbutils.widgets.get("parameter_name"). More recently, we have been given the ability to add parameters to Jobs. However, the parameters cannot be retrieved like Task parameters. Quest...

Latest Reply
lprevost
Contributor
  • 3 kudos

The only thing that has worked for me consistently in Python is params = dbutils.widgets.getAll(), where an empty dictionary is returned if I'm in interactive mode and the job/task params are returned if they are present.

7 More Replies
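A small sketch built on the reply above (the parameter names are hypothetical): dbutils.widgets.getAll() returns a dict, so interactive runs can fall back to defaults.

```python
# Minimal sketch: read job-level parameters with defaults for interactive use.
params = dbutils.widgets.getAll()      # {} when run interactively

env = params.get("env", "dev")
batch_date = params.get("batch_date", "1900-01-01")
print(f"env={env}, batch_date={batch_date}")
```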
vadi
by New Contributor
  • 334 Views
  • 1 reply
  • 0 kudos

CSV file processing

What's the best possible solution to process CSV files in Databricks? Please consider scalability, optimization, and QA, and give me the best solution...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @vadi, in my opinion the best way is to use Auto Loader. For performance reasons, it's also beneficial to provide the schema upfront. Assigning the schema manually will also improve performance because it avoids doing schema inference over a huge se...

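A sketch of the reply's suggestion (paths, schema, and table names are hypothetical): Auto Loader over CSV with an explicit schema, so no inference pass over the files is needed.

```python
# Minimal sketch: incremental CSV ingestion with Auto Loader.
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

schema = StructType([
    StructField("id", StringType()),
    StructField("amount", DoubleType()),
])

(spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "csv")
    .option("header", "true")
    .schema(schema)                                    # skip schema inference
    .load("s3://my-bucket/landing/")                   # hypothetical source
    .writeStream
    .option("checkpointLocation", "s3://my-bucket/_chk/csv_ingest")
    .trigger(availableNow=True)
    .toTable("bronze.csv_ingest"))                     # hypothetical target
```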
Rajdeepak
by New Contributor
  • 927 Views
  • 0 replies
  • 0 kudos

How to restart failed spark stream job from the failure point

I am setting up an ETL process using PySpark. My input is a Kafka stream and I am writing output to multiple sinks (one to Kafka and another to cloud storage). I am writing checkpoints to cloud storage. The issue I am facing is that whenever m...

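General Structured Streaming guidance for the scenario above (a sketch, not the poster's code): give each sink its own streaming query with its own checkpoint location; on restart, each query then resumes from its own last committed offsets. The broker, topics, and paths are hypothetical.

```python
# Minimal sketch: one source, two sinks, two independent checkpoints.
events = (spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load())

# Sink 1: back to Kafka, with a dedicated checkpoint.
(events.selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")
    .writeStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("topic", "events_out")
    .option("checkpointLocation", "s3://bucket/_chk/kafka_sink")
    .start())

# Sink 2: cloud storage, restartable independently of sink 1.
(events.writeStream
    .format("delta")
    .option("checkpointLocation", "s3://bucket/_chk/storage_sink")
    .start("s3://bucket/events_delta"))
```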
reachrishav
by New Contributor II
  • 924 Views
  • 0 replies
  • 0 kudos

What is the equivalent of "if exists()" in Databricks SQL?

What is the equivalent of the below SQL Server syntax in Databricks SQL? There are cases where I need to execute a block of SQL code on certain conditions. I know this can be achieved with spark.sql, but the problem with spark.sql() is it does not p...

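One workaround sketch (an assumption, not a confirmed answer from the thread): evaluate the condition in Python and run the SQL block only when it holds. The table and statements are hypothetical.

```python
# Minimal sketch: conditional execution of a SQL block from Python.
if spark.catalog.tableExists("main.sales.orders"):
    spark.sql("DELETE FROM main.sales.orders WHERE order_date < '2020-01-01'")
else:
    spark.sql("CREATE TABLE main.sales.orders (order_id INT, order_date DATE)")
```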

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group