Get Started Discussions

by NewContributor • New Contributor III

07-10-2023 1:11:12 AM

5742 Views
5 replies
3 kudos

Resolved! Databricks Certified Data Engineer Associate (Version 2) Exam got suspended

Hi Team, my Databricks Certified Data Engineer Associate (Version 2) exam was suspended today and is still in a suspended state. I was in front of the camera the whole time when an alert suddenly appeared and the support person asked me to show the full tab...

Latest Reply

Rob_79
New Contributor II

07-30-2023 9:12:20 PM

3 kudos

Hi @Retired_mod, I've been in the same situation as Shifa and I've also raised a ticket with Databricks, but no feedback yet! Can you please help with that? Cheers, Rabie

4 More Replies

by agar08 • New Contributor

07-28-2023 10:38:02 AM

932 Views
0 replies
0 kudos

java.net.SocketTimeoutException at java.net.SocketInputStream.socketRead

A Databricks notebook is configured with ADLS Gen2 using service principal authentication and is able to read/write files to ADLS Gen2. However, we occasionally see the errors below in the production environment: java.net.SocketTimeoutException at j...

by Prototype998 • New Contributor III

07-28-2023 9:08:40 AM

1124 Views
0 replies
0 kudos

Spark English SDK in Databricks Community edition

Feel free to read an article on how you can use the English SDK for Apache Spark in Databricks Community Edition. Link: English_SDK_For_Apache_Spark

by nadishancosta • New Contributor II

07-27-2023 12:44:46 PM

1621 Views
2 replies
0 kudos

Cannot access community account

Resetting my password does not work. After I enter my new password, it just keeps processing. I waited for over 10 minutes, tried different browsers, and tried a VPN; nothing works. Also, this happened at random. I didn't forget my password, just the sys...

Latest Reply

nadishancosta
New Contributor II

07-28-2023 2:03:07 AM

0 kudos

It's for the Community Edition.

1 More Replies

by aupres • New Contributor III

07-25-2023 4:06:54 AM

4183 Views
2 replies
0 kudos

Resolved! How to generate schema with org.apache.spark.sql.functions.schema_of_csv?

Hello, I use Spark 3.4.1 (Hadoop 3) on Windows 11, and I am struggling to generate the schema of CSV data with the schema_of_csv function. Below is my Java code: Map<String, String> kafkaParams = new HashMap<>(); kafkaParams.put("kafka.bootstrap.servers"...

schema_of_csv

spark-java

Latest Reply

aupres
New Contributor III

07-28-2023 1:30:46 AM

0 kudos

I used the org.apache.spark.sql.functions.lit method and solved this issue. Thank you anyway.

1 More Replies

by royourboat • New Contributor

07-27-2023 9:15:51 PM

690 Views
0 replies
0 kudos

"Something went wrong"

I've made two fresh accounts on Databricks and am stuck here on both when I try to log in. I've never used Databricks before. This problem occurs on 3 different browsers across 2 PCs. This is not the place to post such a question, sorry! But, I haven'...

by zyang • Contributor II

07-17-2023 6:13:01 AM

9877 Views
4 replies
2 kudos

Sync the production data in environment into test environment

Hello, I have a database called sales, which contains several Delta tables and views, in both the production and test workspaces. But the data is not in sync because some people develop code in the test workspace. As time passed, both the data and the tables i...

Latest Reply

Anonymous
Not applicable

07-17-2023 8:53:02 PM

2 kudos

Hi @zyang, hope everything is going great. Just wanted to check in to see whether you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can...

3 More Replies

by Oliver_Angelil • Valued Contributor II

07-27-2023 4:12:46 AM

3961 Views
2 replies
2 kudos

Resolved! Confirmation that Ingestion Time Clustering is applied

The article on Ingestion Time Clustering mentions that "Ingestion Time Clustering is enabled by default on Databricks Runtime 11.2". However, how can I confirm it is active for my table? For example, is there a: True/False "Ingestion Time Clustered" fl...

Latest Reply

Oliver_Angelil
Valued Contributor II

07-27-2023 7:52:45 AM

2 kudos

Thanks @NandiniN, that was very helpful. I have 3 follow-up questions: If I already have a table (350 GB) that has been partitioned by 3 columns (Year, Month, Day) and stored Hive-style with subdirectories Year=X/Month=Y/Day=Z, can I read it in...

1 More Replies

by Ramana • Valued Contributor

07-24-2023 3:07:32 PM

9303 Views
3 replies
6 kudos

Resolved! What is the alternative for sys.exit(0) in Databricks

Hi, we are working on a migration project from Cloudera to Databricks. All our code is in .py files, and we decided to keep it that way in Databricks and execute it from Git through Databricks workflows. We have two kinds of exit functi...

Latest Reply

Ramana
Valued Contributor

07-27-2023 7:31:50 AM

6 kudos

I tested with different levels of nesting and it is working as expected. Here is the sample code: import sys bucket_name = "prod" # str(sys.argv[1]).lower() def main(): i, j = 0, 0 while j <= 2: print(f"while loop iteration: {j}") f...
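
The sample code in this preview is cut off, so the following is not Ramana's code. It is a minimal sketch of one common way to stop nested loops without calling sys.exit(0): put the work in a function and return from it. The `stop_at` parameter and the returned messages are hypothetical, chosen only for illustration.

```python
def main(stop_at: int = 1) -> str:
    """Walk nested loops and stop early by returning instead of calling sys.exit(0)."""
    j = 0
    while j <= 2:
        print(f"while loop iteration: {j}")
        for i in range(3):
            if j == stop_at and i == 1:
                # Returning unwinds every enclosing loop at once and lets the
                # caller decide what to do next, without raising SystemExit.
                return f"stopped at j={j}, i={i}"
        j += 1
    return "completed"


if __name__ == "__main__":
    print(main())
```

Returning a value rather than exiting the interpreter keeps the early stop visible to the caller, which may matter when the script runs as a task inside a larger job.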

2 More Replies

by Dekova • New Contributor II

07-27-2023 5:05:30 AM

3139 Views
1 reply
1 kudos

Resolved! Photon and UDF efficiency

When using a JVM engine, Scala UDFs have an advantage over Python UDFs because data doesn't have to be shifted out to the Python environment for processing. If I understand the implications of using the Photon C++ engine, any processing that needs to...

Latest Reply

-werners-
Esteemed Contributor III

07-27-2023 6:55:19 AM

1 kudos

Photon does not support UDFs: https://learn.microsoft.com/en-us/azure/databricks/runtime/photon#limitations So when you create a UDF, Photon will not be used.

by Dekova • New Contributor II

07-27-2023 5:52:42 AM

975 Views
0 replies
0 kudos

Structured Streaming and Workplace Max Jobs

From the documentation: A workspace is limited to 1000 concurrent task runs. A 429 Too Many Requests response is returned when you request a run that cannot start immediately. The number of jobs a workspace can create in an hour is limited to 10000 (i...
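
A client that may hit the 429 response quoted above usually retries after a pause. Here is a minimal sketch of exponential backoff with jitter, assuming a hypothetical `submit` callable that returns a `(status_code, body)` tuple. It does not call any real Databricks API, and the function name, parameters, and delays are illustrative assumptions.

```python
import random
import time
from typing import Callable, Tuple

Response = Tuple[int, str]  # (HTTP status code, response body)


def submit_with_backoff(
    submit: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `submit` until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response: Response = (429, "")
    for attempt in range(max_attempts):
        response = submit()
        if response[0] != 429:
            return response
        if attempt < max_attempts - 1:
            # Wait 1x, 2x, 4x ... the base delay, plus random jitter so that
            # many clients do not all retry at the same moment.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
    return response  # still 429 after the final attempt


# Demo with a fake submitter (hypothetical): rejected twice, then accepted.
replies = iter([(429, "Too Many Requests"), (429, "Too Many Requests"), (200, "run started")])
waits = []  # records the delays instead of actually sleeping
print(submit_with_backoff(lambda: next(replies), sleep=waits.append))
```

Passing `sleep` as a parameter keeps the retry logic testable without real waiting.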

by SSV_dataeng • New Contributor II

07-24-2023 3:25:13 AM

1650 Views
2 replies
0 kudos

Plot number of abandoned cart items by product

abandoned_carts_df = (email_carts_df.filter(col('converted') == False).filter(col('cart').isNotNull()))
display(abandoned_carts_df)
abandoned_items_df = (abandoned_carts_df.select(col("cart").alias("items")).groupBy("items").count())
display(abandoned_...

Latest Reply

NandiniN
Databricks Employee

07-27-2023 5:16:42 AM

0 kudos

Hi @SSV_dataeng, try: abandoned_items_df = (abandoned_carts_df.withColumn("items", explode("cart")).groupBy("items").count().sort("items"))
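
To see what flattening each cart and then counting per item computes, here is a plain-Python sketch (not PySpark) on made-up rows. The item names and the `abandoned_carts` list are hypothetical. Grouping on the whole `cart` array counts identical carts, whereas flattening first counts individual items.

```python
from collections import Counter

# Hypothetical abandoned carts: each row holds a list of item ids.
abandoned_carts = [
    ["item_a", "item_b"],
    ["item_a"],
    ["item_b", "item_c", "item_a"],
]

# Grouping by the whole list counts identical carts, not products.
carts_grouped = Counter(tuple(cart) for cart in abandoned_carts)

# Flattening first (one entry per item, like explode) counts each product.
items_grouped = Counter(item for cart in abandoned_carts for item in cart)

# Sort by item name, mirroring the .sort("items") step in the reply.
abandoned_items = sorted(items_grouped.items())
print(abandoned_items)
```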

1 More Replies

by SSV_dataeng • New Contributor II

07-23-2023 8:36:24 AM

2433 Views
4 replies
0 kudos

write to Delta

spark.conf.set("spark.databricks.delta.properties.defaults.columnMapping.mode","name")
products_output_path = DA.paths.working_dir + "/delta/products"
products_df.write.format("delta").save(products_output_path)
verify_files = dbutils.fs.ls(products_ou...

Latest Reply

NandiniN
Databricks Employee

07-27-2023 5:06:30 AM

0 kudos

Hi @SSV_dataeng, please check with this (you would have to indent it correctly for Python):
productsOutputPath = DA.workingDir + "/delta/products"
(productsDF.write.format("delta").mode("overwrite").save(productsOutputPath))
verify_files = dbutils.fs.ls(...

3 More Replies

by marchino • New Contributor II

07-26-2023 3:53:43 AM

6411 Views
3 replies
1 kudos

Can I change Service Principal's OAuth token's expiration date?

Hi, since I have to read a Databricks table from an external API, I created a Service Principal that starts a cluster and performs the operation. To authenticate the request on behalf of the Service Principal, I generate the OAuth token followi...

Latest Reply

NandiniN
Databricks Employee

07-27-2023 4:22:32 AM

1 kudos

Hello @marchino, please check whether this is of interest: https://kb.databricks.com/en_US/security/set-an-unlimited-lifetime-for-service-principal-access-token

2 More Replies

by Phani1 • Valued Contributor II

07-21-2023 12:30:39 AM

1344 Views
1 reply
0 kudos

Reserved VM/DBU's

Some VM/DBU reservations were purchased but are underutilized. How can we improve their utilization? Are there any guidelines or best practices?

Latest Reply

Phani1
Valued Contributor II

07-27-2023 2:11:58 AM

0 kudos

We have 5 reserved instances of Azure VMs to run the Databricks cluster jobs, and they are not being utilized efficiently (as per the usage metrics, one of the reservations is 10-15% utilized and another is 30-40% utilized). Could you please help...
