Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

cszczotka
by New Contributor III
  • 1279 Views
  • 0 replies
  • 0 kudos

Ephemeral storage: how to create/mount

Hi, I'm looking for information on how to create/mount ephemeral storage on the Databricks driver node in Azure. Does anyone have experience working with ephemeral storage? Thanks,

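For orientation: on Azure Databricks, the VM's ephemeral (local SSD) storage is typically already exposed on the driver's local filesystem, so there is usually nothing to mount yourself. A minimal sketch, assuming the common /local_disk0 path (the exact path and size depend on the VM type):

```python
# A sketch only: /local_disk0 is an assumption, not a guaranteed path.
import os
import shutil

scratch_dir = "/local_disk0/tmp/my_scratch"  # hypothetical scratch directory
os.makedirs(scratch_dir, exist_ok=True)

total, used, free = shutil.disk_usage("/local_disk0")
print(f"Free ephemeral space: {free / 1e9:.1f} GB")

# Anything written here is lost when the cluster terminates.
with open(os.path.join(scratch_dir, "scratch.bin"), "wb") as f:
    f.write(b"\x00" * 1024)
```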
174817
by New Contributor III
  • 1631 Views
  • 2 replies
  • 3 kudos

Databricks Rust client and/or OpenAPI spec

Hi, I'm looking for a Databricks client for Rust. I could only find these SDK implementations. Alternatively, I would be very happy with the OpenAPI spec. Clearly one exists: the Go SDK implementation contains code to generate itself from such a spec...

Data Engineering
openapi
rust
sdk
unity
Latest Reply
feiyun0112
Honored Contributor
  • 3 kudos

Databricks REST API reference: This reference contains information about the Databricks application programming interfaces (APIs). Each API reference page is presented primarily from a representational state transfer (REST) perspective. Databricks REST...

1 More Replies
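Building on the reply: until an official Rust SDK exists, any language can call the documented REST endpoints directly over HTTP. A minimal sketch in Python (host, token, and endpoint are placeholders; the same request maps one-to-one onto a Rust HTTP client such as reqwest):

```python
# Hedged sketch: call a documented REST endpoint directly when no official
# SDK exists for your language.
import os
import requests

host = os.environ["DATABRICKS_HOST"]    # e.g. https://adb-123.azuredatabricks.net
token = os.environ["DATABRICKS_TOKEN"]  # personal access token

resp = requests.get(
    f"{host}/api/2.0/clusters/list",
    headers={"Authorization": f"Bearer {token}"},
    timeout=30,
)
resp.raise_for_status()
for cluster in resp.json().get("clusters", []):
    print(cluster["cluster_id"], cluster["state"])
```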
my_super_name
by New Contributor II
  • 1566 Views
  • 2 replies
  • 2 kudos

Auto Loader Schema Hint Behavior: Addressing Nested Field Errors

Hello, I'm using Auto Loader to stream a table of data and have added schema hints to specify field values. I've observed that when my initial data file is missing fields specified in the schema hint, Auto Loader correctly identifies this and ad...

Latest Reply
Mathias_Peters
Contributor
  • 2 kudos

Hi, we are having similar issues with schema hints formulated in fully qualified DDL, e.g. "a STRUCT<b INT>" etc. Did you find a solution? Also, did you specify the schema hint using the dot-notation, e.g. "a.b INT" before ingesting any data or after...

1 More Replies
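For illustration, a minimal sketch of the dot-notation hint discussed above, using Auto Loader's cloudFiles.schemaHints option (paths are hypothetical; assumes a notebook where `spark` is defined):

```python
# "a.b INT" targets just the nested field, whereas a DDL hint like
# "a STRUCT<b INT>" pins the whole struct.
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/tmp/schemas/events")  # hypothetical
    .option("cloudFiles.schemaHints", "a.b INT")
    .load("/tmp/landing/events")                                 # hypothetical
)
```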
RiyazAli
by Valued Contributor II
  • 1268 Views
  • 1 reply
  • 0 kudos

Unable to create a record_id column via DLT Auto Loader

Hi Community, I'm trying to load data from the landing zone to the bronze layer via DLT Auto Loader. I want to add a record_id column to the bronze table while I fetch my data. I'm also using a file arrival trigger in the workflow to update my table inc...

Latest Reply
RiyazAli
Valued Contributor II
  • 0 kudos

Hey @Retired_mod - could you or anybody from the community team help me here, please? I've been stuck for quite some time now.

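One common pattern for this (a sketch, not a confirmed resolution of the thread): generate the column inside the DLT table function. Table and path names are hypothetical:

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(name="bronze_events")
def bronze_events():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/landing/events")  # hypothetical landing path
        # uuid() assigns a fresh id per ingested row; use a hash of stable
        # columns instead if the id must survive reprocessing.
        .withColumn("record_id", F.expr("uuid()"))
    )
```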
Aidonis
by New Contributor III
  • 18355 Views
  • 3 replies
  • 2 kudos

Resolved! Flatten Deep Nested Struct

Hi All, I have a deeply nested Spark DataFrame struct, something similar to below:
 |-- id: integer (nullable = true)
 |-- lower: struct (nullable = true)
 |    |-- field_a: integer (nullable = true)
 |    |-- upper: struct (containsNull = true)
 |    |    ...

Latest Reply
Praveen-bpk21
New Contributor II
  • 2 kudos

@Aidonis You can try this as well: flatten-spark-dataframe · PyPI. This also allows flattening to a specific level.

2 More Replies
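For reference, a self-contained sketch of the recursive flattening approach this thread resolves (the naming scheme is a choice, not the accepted answer verbatim):

```python
# Arrays of structs are left untouched; they need explode() before this helps.
from pyspark.sql import DataFrame, functions as F
from pyspark.sql.types import StructType

def flatten(df: DataFrame, sep: str = "_") -> DataFrame:
    # Expand one level of structs per pass until none remain.
    while any(isinstance(f.dataType, StructType) for f in df.schema.fields):
        cols = []
        for field in df.schema.fields:
            if isinstance(field.dataType, StructType):
                cols.extend(
                    F.col(f"{field.name}.{c.name}").alias(f"{field.name}{sep}{c.name}")
                    for c in field.dataType.fields
                )
            else:
                cols.append(F.col(field.name))
        df = df.select(cols)
    return df
```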
SPres
by New Contributor
  • 1179 Views
  • 1 reply
  • 0 kudos

Passing Parameters from Azure Synapse

Hey Community! Just curious if anyone has tried using Azure Synapse for orchestration and passing parameters from Synapse to a Databricks Notebook. My team is testing out Databricks, and I'm replacing Synapse Notebooks with Databricks Notebooks, but I...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 0 kudos

Hi @SPres, you can definitely pass these parameters to a Databricks notebook as well. Please refer to the docs below: Run a Databricks Notebook with the activity - Azure Data Factory | Microsoft Learn

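A minimal sketch of the receiving side referenced in the reply: the Synapse/ADF Notebook activity passes values as base parameters, which surface in Databricks as widgets ("run_date" is a hypothetical parameter name):

```python
dbutils.widgets.text("run_date", "")        # default used when not passed
run_date = dbutils.widgets.get("run_date")
print(f"run_date = {run_date}")
```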
Chengzhu
by New Contributor
  • 737 Views
  • 0 replies
  • 0 kudos

Databricks Model Registry Notification

Hi community, currently I am training models on a Databricks cluster and using MLflow to log and register models. My goal is to receive a notification when a new version of a registered model appears (if the new run achieves some model performance baselin...

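One way to approximate this (a hedged sketch, not a confirmed answer from the thread): poll the registry with the MLflow client and notify when a version clears the baseline. Model name, metric, and webhook URL are placeholders; Databricks model registry webhooks are a push-based alternative:

```python
# A real job would also remember the last version it alerted on.
import requests
from mlflow.tracking import MlflowClient

MODEL, THRESHOLD = "my_model", 0.9
SLACK_WEBHOOK = "https://hooks.slack.com/services/..."  # placeholder

client = MlflowClient()
for mv in client.search_model_versions(f"name = '{MODEL}'"):
    metrics = client.get_run(mv.run_id).data.metrics
    acc = metrics.get("val_accuracy")  # hypothetical metric name
    if acc is not None and acc >= THRESHOLD:
        requests.post(SLACK_WEBHOOK, json={
            "text": f"{MODEL} v{mv.version} registered, val_accuracy={acc:.3f}"
        })
```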
dilkushpatel
by New Contributor II
  • 1906 Views
  • 2 replies
  • 0 kudos

Databricks connecting to Azure SQL DW - confused between PolyBase and COPY INTO

I see two articles in the Databricks documentation:
https://docs.databricks.com/en/archive/azure/synapse-polybase.html#language-python
https://docs.databricks.com/en/connect/external-systems/synapse-analytics.html#service-principal
The PolyBase one is legacy o...

Data Engineering
azure
Copy
help
Polybase
Synapse
Abhi0607
by New Contributor II
  • 1278 Views
  • 2 replies
  • 0 kudos

Variables passed from ADF to a Databricks notebook are not accessible in try-catch

Dear Members, I need your help with the scenario below. I am passing a few parameters from an ADF pipeline to a Databricks notebook. If I execute the ADF pipeline to run my Databricks notebook and use these variables as-is in my code (Python), it works fine. But as s...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 0 kudos

Hi @Abhi0607, can you clarify whether you are reading or defining these parameter values outside the try-catch or inside it?

1 More Replies
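A sketch of the distinction the reply is asking about ("env" is a hypothetical parameter name): reading the widget before the try block makes a missing or misnamed ADF parameter fail loudly instead of being swallowed by a broad except:

```python
env = dbutils.widgets.get("env")

try:
    df = spark.read.table(f"bronze_{env}.events")  # hypothetical table
except Exception as err:
    print(f"Processing failed for env={env}: {err}")
    raise  # re-raise so ADF marks the activity as failed
```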
fuselessmatt
by Contributor
  • 8245 Views
  • 4 replies
  • 1 kudos

Accidentally removing the service principal that owns the view seems to put the Unity Catalog in an illegal state. Can you fix this?

I renamed our service principal in Terraform, which forces a replacement where the old service principal is removed and a new principal with the same permissions is recreated. The Terraform apply succeeds, but when I try to run dbt, which creates tab...

Latest Reply
fuselessmatt
Contributor
  • 1 kudos

This is also true for removing groups before unassigning them (removing and unassigning in Terraform):
│ Error: cannot update grants: Could not find principal with name <My Group Name>

3 More Replies
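A possible recovery sketch for this situation, assuming a metastore admin and that reassigning ownership is acceptable (object and principal names are placeholders; not a confirmed fix from the thread):

```python
# Hand the orphaned object to an existing principal, then re-grant.
spark.sql("ALTER VIEW main.reporting.my_view SET OWNER TO `new-service-principal`")
spark.sql("GRANT SELECT ON VIEW main.reporting.my_view TO `analysts`")
```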
JeanT
by New Contributor
  • 2476 Views
  • 1 reply
  • 0 kudos

Help with Identifying and Parsing Varying Date Formats in Spark DataFrame

Hello Spark Community, I'm encountering an issue with parsing dates in a Spark DataFrame due to inconsistent date formats across my datasets. I need to identify and parse dates correctly, irrespective of their format. Below is a brief outline of my p...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

How about not specifying the format? This will already match common formats. When you still have nulls, you can use your list with known exotic formats. Another solution is working with regular expressions, looking for 2-digit numbers not larger than...

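A sketch of the "known formats" approach from the reply: to_date returns NULL for strings that don't match the given pattern, so coalesce keeps the first successful parse (the format list is illustrative; ambiguous values match the first format that parses):

```python
from pyspark.sql import functions as F

formats = ["yyyy-MM-dd", "dd/MM/yyyy", "MM-dd-yyyy"]
df = spark.createDataFrame(
    [("2024-04-17",), ("17/04/2024",), ("04-17-2024",)], ["raw"]
)
df = df.withColumn(
    "parsed", F.coalesce(*[F.to_date("raw", fmt) for fmt in formats])
)
df.show()
```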
AnkithP
by New Contributor
  • 2181 Views
  • 1 reply
  • 1 kudos

Infer schema eliminating leading zeros.

Upon reading a CSV file with schema inference enabled, I've noticed that a column originally designated as string datatype contains numeric values with leading zeros. However, upon reading the data into a PySpark DataFrame, it undergoes automatic conver...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

If you set .option("inferSchema", "false"), all columns will be read as strings. You will have to cast all the other columns to their appropriate types, though. So passing a schema seems easier to me.

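A sketch of the suggested fix, passing an explicit schema so numeric-looking codes stay strings (column names and path are hypothetical):

```python
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

schema = StructType([
    StructField("account_code", StringType()),  # leading zeros preserved
    StructField("amount", IntegerType()),
])
df = spark.read.csv("/tmp/input.csv", header=True, schema=schema)
```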
PrebenOlsen
by New Contributor III
  • 1655 Views
  • 2 replies
  • 0 kudos

Job stuck while utilizing all workers

Hi! I started a job yesterday. It was iterating over data, two months at a time, and writing to a table. It did this successfully for 4 out of 6 time periods. The 5th time period, however, got stuck 5 hours in. I can find one failed stage that reads ...

Data Engineering
job failed
Job froze
need help
Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

As Spark is lazily evaluated, using only small clusters for reads and large ones for writes is not something that will happen. The data is read when you apply an action (e.g. write). That being said: I have no knowledge of a bug in Databricks on clusters...

1 More Replies
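A small illustration of the lazy-evaluation point in the reply (placeholder paths and names): nothing is read until the action at the end, so the scan and the write run on the same cluster:

```python
df = spark.read.parquet("/tmp/source")   # builds a plan, no I/O yet
filtered = df.where("amount > 100")      # still no I/O

# Both the scan and the filter execute only when the action below fires:
filtered.write.mode("overwrite").saveAsTable("my_table")
```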
laurenskuiper97
by New Contributor
  • 1388 Views
  • 1 reply
  • 0 kudos

JDBC / SSH-tunnel to connect to PostgreSQL not working on multi-node clusters

Hi everybody, I'm trying to set up a connection between Databricks notebooks and an external PostgreSQL database through an SSH tunnel. On a single-node cluster, this works perfectly fine. However, when this is run on a multi-node cluster, this co...

Data Engineering
clusters
JDBC
spark
SSH
Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

I doubt it is possible. The driver runs the program and sends tasks to the executors. But since creating the SSH tunnel is not a Spark task, I don't think it will be established on any executor.

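Consistent with the reply, a hedged sketch that keeps the whole round trip on the driver instead of using spark.read.jdbc (requires the third-party sshtunnel and psycopg2 packages; every host, port, and credential below is a placeholder):

```python
from sshtunnel import SSHTunnelForwarder
import psycopg2

with SSHTunnelForwarder(
    ("bastion.example.com", 22),
    ssh_username="user",
    ssh_pkey="/dbfs/keys/id_rsa",
    remote_bind_address=("postgres.internal", 5432),
) as tunnel:
    conn = psycopg2.connect(
        host="127.0.0.1", port=tunnel.local_bind_port,
        dbname="mydb", user="dbuser", password="<password>",
    )
    # Executors cannot see this tunnel, so fetch on the driver and only then
    # hand small result sets to Spark, e.g. spark.createDataFrame(rows).
    with conn, conn.cursor() as cur:
        cur.execute("SELECT id, name FROM customers LIMIT 1000")
        rows = cur.fetchall()
```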
Jotav93
by New Contributor II
  • 1456 Views
  • 2 replies
  • 1 kudos

Move a Delta table from a non-UC metastore to a UC metastore preserving history

Hi, I am using Azure Databricks and we recently enabled UC in our workspace. We have some tables in our non-UC metastore that we want to move to a UC-enabled metastore. Is there any way we can move these tables without losing the Delta table history...

Data Engineering
delta
unity
Latest Reply
ThomazRossito
Contributor
  • 1 kudos

Hello, it is possible to get the expected result with dbutils.fs.cp("origin location", "destination location", True) and then creating the table with the LOCATION set to the destination location. Hope this helps.

1 More Replies
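A sketch of the approach in the reply (paths and table names are placeholders): copying the whole table directory brings _delta_log along, which is where the history lives; DEEP CLONE, by contrast, starts a fresh history:

```python
dbutils.fs.cp(
    "dbfs:/mnt/legacy/tables/sales",                             # origin
    "abfss://data@<account>.dfs.core.windows.net/tables/sales",  # destination
    True,  # recurse
)
spark.sql("""
    CREATE TABLE main.silver.sales
    USING DELTA
    LOCATION 'abfss://data@<account>.dfs.core.windows.net/tables/sales'
""")
```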

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group