Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

IonFreeman_Pace
by New Contributor III
  • 6467 Views
  • 4 replies
  • 1 kudos

Resolved! First notebook in ML course fails with wrong runtime

Help! I'm trying to run the first notebook in the Scalable MachIne LEarning (SMILE) course: https://github.com/databricks-academy/scalable-machine-learning-with-apache-spark-english/blob/published/ML%2000a%20-%20Spark%20Review.py It fails on the first...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

It means your cluster type has to be an ML runtime. When you create a cluster in Databricks, you can choose between different runtimes. These have different versions (Spark version), but also different types: for your case you need to select the ML menu o...

3 More Replies
Hoping
by New Contributor
  • 3358 Views
  • 0 replies
  • 0 kudos

Size of each partitioned file (partitioned by default)

When I try a DESCRIBE DETAIL I get the number of files the delta table is partitioned into. How can I check the size of each of the files that make up my entire table? Will I be able to query each partitioned file to understand how they have b...

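The question above asks how to see the size of each underlying data file of a Delta table, not just the file count that DESCRIBE DETAIL reports. On Databricks one would typically list the table's storage directory (e.g. with dbutils.fs.ls); as a minimal, hedged sketch, the same idea in plain Python for a locally accessible table path (Delta stores its data as Parquet files; the helper name is ours, not an API):

```python
import os

def list_file_sizes(table_path):
    """Walk a Delta table's directory and return {relative path: size in bytes}
    for each Parquet data file. Skips the _delta_log and other non-data files."""
    sizes = {}
    for root, _dirs, files in os.walk(table_path):
        for name in files:
            if name.endswith(".parquet"):  # Delta data files are Parquet
                full = os.path.join(root, name)
                sizes[os.path.relpath(full, table_path)] = os.path.getsize(full)
    return sizes
```

For a partitioned table the relative paths include the partition directories (e.g. `year=2023/part-....parquet`), so this also shows how rows were distributed across partitions.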
eric-cordeiro
by Databricks Partner
  • 2167 Views
  • 0 replies
  • 0 kudos

Insufficient Permission when writing to AWS Redshift

I'm trying to write a table in AWS Redshift using the following code: try: (df_source.write .format("redshift") .option("dbtable", f"{redshift_schema}.{table_name}") .option("tempdir", tempdir) .option("url", url) ...

pgruetter
by Contributor
  • 2643 Views
  • 1 replies
  • 0 kudos

Streaming problems after Vacuum

Hi all, To read from a large Delta table, I'm using readStream but with trigger(availableNow=True) as I only want to run it daily. This worked well for an initial load and then incremental loads after that. At some point though, I received an error fro...

param_sen
by New Contributor II
  • 14701 Views
  • 1 replies
  • 1 kudos

Maintain camelCase column names in the bronze layer, or is it advisable to rename columns?

I am utilizing the Databricks autoloader to ingest files from Google Cloud Storage (GCS) into Delta tables in the bronze layer of a Medallion architecture. According to lakehouse principles, the bronze layer should store raw data. Hi dear community, I...

Data Engineering
dataengineering
delta_table
Latest Reply
Dribka
New Contributor III
  • 1 kudos

Hey @param_sen, Navigating the nuances of naming conventions, especially when dealing with different layers in a lakehouse architecture, can be a bit of a puzzle. Your considerations are on point. If consistency across layers is a priority and downst...

eimis_pacheco
by Contributor
  • 11703 Views
  • 3 replies
  • 1 kudos

Resolved! What are the best practices in bronze layer regarding the column data types?

Hi dear community, When I used to work in the Hadoop ecosystem with HDFS, the landing zone was our raw layer, and we used to use the AVRO format for the serialization of this raw data (for the schema evolution feature), only assigning names to columns but n...

Latest Reply
param_sen
New Contributor II
  • 1 kudos

Hi dear community, I am utilizing the Databricks autoloader to ingest files from Google Cloud Storage (GCS) into Delta tables in the bronze layer of a Medallion architecture. According to lakehouse principles, the bronze layer should store raw data wi...

2 More Replies
Karo
by New Contributor
  • 1722 Views
  • 0 replies
  • 0 kudos

Function in Jupyter notebook 12x faster than in Python script

Hello dear community, I wrote some ETL functions, e.g. to count the sessions until a conversion (see below). Therefore I load the data and then execute several small functions for the feature generation. When I run the function feat_session_unitl_conver...

Erik
by Valued Contributor III
  • 4444 Views
  • 1 replies
  • 0 kudos

Run driver on spot instance

The traditional advice seems to be to run the driver "on demand", and optionally the workers on spot. And this is indeed what happens if one chooses to run with spot instances in Databricks. But I am interested in what happens if we run with a dr...

Latest Reply
Erik
Valued Contributor III
  • 0 kudos

Thanks for your answer @Retired_mod! Good overview, and I understand that "driver on-demand and the rest on spot" is good general advice. But I am still considering using spot instances for both, and I am left with two concrete questions: 1: Can w...

Faisal
by Contributor
  • 9606 Views
  • 2 replies
  • 1 kudos

Error while creating delta table with partitions

Hi All, I am unable to create a delta table with the partitioning option. Can someone please point out what I am missing and help me with an updated query? CREATE OR REPLACE TABLE invoice USING DELTA PARTITION BY (year(shp_dt), month(shp_dt)) LOCATION '/ta...

Latest Reply
Emil_Kaminski
Contributor II
  • 1 kudos

@Retired_mod Hi. Is that not exactly what I suggested before? Sorry for the stupid question, but I am learning the rules of earning kudos and getting solutions approved, so suggestions from your end would be appreciated. Thank you.

1 More Replies
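The DDL in the post above fails because it partitions directly on expressions like year(shp_dt); Delta's CREATE TABLE syntax uses PARTITIONED BY over plain columns, and date-part partitioning is usually achieved with generated columns. A hedged sketch of a corrected statement, written as a Python string as it might be submitted via spark.sql (shp_dt and invoice come from the post; the generated column names and the omission of the table's other columns are our assumptions):

```python
# Sketch of a corrected DDL for the post's error: partition on generated
# columns derived from shp_dt instead of on year()/month() expressions.
# Column list is abbreviated; the real table would declare all its columns.
ddl = """
CREATE OR REPLACE TABLE invoice (
  shp_dt DATE,
  shp_yr INT GENERATED ALWAYS AS (year(shp_dt)),
  shp_mo INT GENERATED ALWAYS AS (month(shp_dt))
)
USING DELTA
PARTITIONED BY (shp_yr, shp_mo)
"""
# On Databricks: spark.sql(ddl)
```

With generated columns, Delta populates shp_yr and shp_mo automatically on write, and queries filtering on year(shp_dt)/month(shp_dt) can still benefit from partition pruning.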
hold_my_samosa
by New Contributor II
  • 9438 Views
  • 1 replies
  • 0 kudos

Delta Partition File on Azure ADLS Gen2 Migration

Hello, I am working on a migration project and I am facing an issue while migrating delta tables from Azure ADLS Gen1 to Gen2. So, as per the Microsoft migration pre-requisites: file or directory names with only spaces or tabs, ending with a ., containing ...

Data Engineering
azure
datalake
delta
dtabricks
BWong
by New Contributor III
  • 10773 Views
  • 8 replies
  • 6 kudos

Resolved! Cannot spin up a cluster

Hi, When I try to spin up a cluster, it gives me a bootstrap timeout error: { "reason": { "code": "BOOTSTRAP_TIMEOUT", "parameters": { "databricks_error_message": "[id: InstanceId(i-00b2b7acdd82e5fde), status: INSTANCE_INITIALIZING, workerEnv...

Latest Reply
BWong
New Contributor III
  • 6 kudos

Thanks guys. It's indeed a network issue on the AWS side. It's resolved now

7 More Replies
geertvanhove
by New Contributor III
  • 8151 Views
  • 3 replies
  • 0 kudos

Transform a dataframe column into a concatenated string

Hello, I have a single-column dataframe and I want to transform the content into a string. E.g. a df with the values abc, def, xyz should become "abc, def, xyz". Thanks

Latest Reply
geertvanhove
New Contributor III
  • 0 kudos

sure: %python from pyspark.sql.functions import from_json, col, concat_ws; from pyspark.sql.types import *; schema = StructType([StructField('meterDateTime', StringType(), True), StructField('meterId', LongType(), True), StructField('meteringState', Strin...

2 More Replies
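In PySpark the usual approach for the question above is concat_ws over collect_list, e.g. something like df.agg(concat_ws(", ", collect_list("col"))) (a sketch, not verified here). The core joining step can be shown in plain Python, assuming the single column's values have already been collected into a list (the helper name is ours):

```python
def join_column_values(values, sep=", "):
    """Join collected single-column values into one separator-delimited string,
    mirroring what concat_ws(sep, collect_list(col)) would produce in Spark."""
    return sep.join(str(v) for v in values)
```

So for collected values ["abc", "def", "xyz"] this yields the requested "abc, def, xyz".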
Daniel3
by New Contributor II
  • 12705 Views
  • 2 replies
  • 0 kudos

Resolved! How to use a variable holding a set of values in spark.sql?

Hi, I have a set of values to be searched from a table, for which I was trying to assign them to a variable first and then use the variable in spark.sql, but I'm unable to fetch the records. Please see the image attached and correct my code...

Latest Reply
brockb
Databricks Employee
  • 0 kudos

Hi, One way to address the example in your screenshot is to combine a Python f-string with a Common Table Expression, as shown below. This assumes that in reality the two tables are different, unlike in the provided screens...

1 More Replies
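The f-string idea from the reply above can be sketched as building an IN (...) clause from a Python collection and passing the result to spark.sql. This is a minimal sketch with naive quoting (the function, table, and column names are illustrative, not from the post); for untrusted input, parameterized queries would be the safer route:

```python
def build_in_query(table, column, values):
    """Build a SELECT with an IN (...) clause from a Python collection.

    NOTE: simple single-quote doubling for illustration only; production code
    should prefer parameterized queries to avoid SQL injection.
    """
    quoted = ", ".join("'{}'".format(str(v).replace("'", "''")) for v in values)
    return f"SELECT * FROM {table} WHERE {column} IN ({quoted})"

# Usage on Databricks would look roughly like:
#   query = build_in_query("sales", "region", ["east", "west"])
#   spark.sql(query)
```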
erigaud
by Honored Contributor
  • 3138 Views
  • 3 replies
  • 1 kudos

Incorrect dropped rows count in DLT Event log

Hello, I'm using a DLT pipeline with expectations: expect_or_drop(...). To test it, I added files that contain records that should be dropped, and indeed when running the pipeline I can see some rows were dropped. However, when looking at the DLT Event lo...

Latest Reply
Priyanka_Biswas
Databricks Employee
  • 1 kudos

Hello @erigaud, The issue appears to be related to the details.flow_progress.data_quality.dropped_records field always being 0, despite records being dropped. This might be because the expect_or_drop operator isn't updating the dropped_records field ...

2 More Replies
ekar-databricks
by New Contributor II
  • 13846 Views
  • 3 replies
  • 0 kudos

BigQuery - Databricks integration issue.

I am trying to get BigQuery data into Databricks using notebooks, following the steps in https://docs.databricks.com/external-data/bigquery.html. I believe I am making some mistake with this step and getting the below error. I tried givi...

Latest Reply
Wundermobility
New Contributor II
  • 0 kudos

Hi! Did you get the problem solved? I am facing the same issue, please guide.

2 More Replies