Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Forum Posts

DineshKumar
by New Contributor III
  • 4060 Views
  • 6 replies
  • 0 kudos

Databricks Cluster is going down after installing the external library

I have created a Databricks cluster with the configuration below. Databricks Runtime Version: 13.2 ML (includes Apache Spark 3.4.0, Scala 2.12). Node type: i3.xlarge (30.5 GB Memory, 4 Cores). I created a notebook and am trying to load the MySQL table which resides i...

Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi, the error below indicates an issue connecting to the host from Databricks. You can find more details about the network configurations at https://docs.databricks.com/administration-guide/cloud-configurations/aws/customer-managed...

5 More Replies
KoenZandvliet
by New Contributor III
  • 1311 Views
  • 1 reply
  • 2 kudos

Resolved! Setting cluster permissions with API

I would like to update the permissions of a cluster using the API. The documentation mentions the following: PATCH api/2.0/permissions/{request_object_type}/{request_object_id}. Which {request_object_type} should I use? ‘cluster’, ‘cluster’ and ‘compute’ are no...

Latest Reply
daniel_sahal
Esteemed Contributor
  • 2 kudos

@KoenZandvliet clusters is the one you should be looking for.
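As a minimal sketch of the answer above, the request below targets `api/2.0/permissions/clusters/{cluster_id}` with the PATCH method. The host, token, cluster ID, and user name are hypothetical placeholders, and the request is only built here, not sent. The body shape (`access_control_list` with `user_name` and `permission_level`) follows the Permissions API as I understand it; verify it against the current documentation before relying on it.

```python
import json
import urllib.request


def build_cluster_permissions_request(host, token, cluster_id, user_name, permission_level):
    """Build (but do not send) a PATCH request that updates cluster permissions."""
    # "clusters" is the request_object_type named in the reply above.
    url = f"{host.rstrip('/')}/api/2.0/permissions/clusters/{cluster_id}"
    body = {
        "access_control_list": [
            {"user_name": user_name, "permission_level": permission_level}
        ]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


# All values below are hypothetical placeholders.
req = build_cluster_permissions_request(
    host="https://example.cloud.databricks.com",
    token="dapi-EXAMPLE",
    cluster_id="0123-456789-abcdefgh",
    user_name="someone@example.com",
    permission_level="CAN_RESTART",
)
print(req.get_method(), req.full_url)

# To actually send the request, one would call:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```

PATCH is used here because the original question quotes it; it adds to or modifies the listed entries rather than replacing the whole access control list.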

NewContributor
by New Contributor III
  • 4305 Views
  • 5 replies
  • 3 kudos

Resolved! Databricks Certified Data Engineer Associate (Version 2) Exam got suspended

Hi Team, my Databricks Certified Data Engineer Associate (Version 2) exam was suspended today and it is still in a suspended state. I was continuously in front of the camera when suddenly an alert appeared and the support person asked me to show the full tab...

Latest Reply
Rob_79
New Contributor II
  • 3 kudos

Hi @Retired_mod, I've been in the same situation as Shifa and I've also raised a ticket with Databricks but have had no feedback yet! Can you please help with that? Cheers, Rabie

4 More Replies
nadishancosta
by New Contributor II
  • 1106 Views
  • 2 replies
  • 0 kudos

Cannot access community account

Resetting my password does not work. After I enter my new password, it just keeps processing. I waited for over 10 minutes, tried different browsers, tried a VPN; nothing works. Also, this happened randomly. I didn't forget my password, just the sys...

Latest Reply
nadishancosta
New Contributor II
  • 0 kudos

It's for the Community Edition.

1 More Replies
aupres
by New Contributor III
  • 2848 Views
  • 2 replies
  • 0 kudos

Resolved! How to generate schema with org.apache.spark.sql.functions.schema_of_csv?

Hello, I use Spark 3.4.1 (Hadoop 3) on Windows 11, and I am struggling to generate the schema of CSV data with the schema_of_csv function. Below is my Java code. Map<String, String> kafkaParams = new HashMap<>(); kafkaParams.put("kafka.bootstrap.servers"...

schema_of_csv
spark-java
Latest Reply
aupres
New Contributor III
  • 0 kudos

I used the org.apache.spark.sql.functions.lit method and solved this issue. Thank you anyway.

1 More Replies
zyang
by Contributor
  • 7813 Views
  • 4 replies
  • 2 kudos

Sync the production data in environment into test environment

Hello, I have a database called sales which contains several Delta tables and views in both the production and test workspaces. But the data is not synced because some people develop code in the test workspace. As time passed, both the data and the tables i...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @zyang, hope everything is going great. Just wanted to check in to see if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can...

3 More Replies
Oliver_Angelil
by Valued Contributor II
  • 2789 Views
  • 2 replies
  • 2 kudos

Resolved! Confirmation that Ingestion Time Clustering is applied

The article on Ingestion Time Clustering mentions that "Ingestion Time Clustering is enabled by default on Databricks Runtime 11.2"; however, how can I confirm it is active for my table? For example, is there a: True/False "Ingestion Time Clustered" fl...

Latest Reply
Oliver_Angelil
Valued Contributor II
  • 2 kudos

Thanks @NandiniN, that was very helpful. I have 3 follow-up questions: If I already have a table (350 GB) that has been partitioned by 3 columns (Year, Month, Day) and stored Hive-style with subdirectories Year=X/Month=Y/Day=Z, can I read it in...

1 More Replies
Dekova
by New Contributor II
  • 2046 Views
  • 1 reply
  • 1 kudos

Resolved! Photon and UDF efficiency

When using a JVM engine, Scala UDFs have an advantage over Python UDFs because data doesn't have to be shifted out to the Python environment for processing. If I understand the implications of using the Photon C++ engine, any processing that needs to...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

Photon does not support UDFs: https://learn.microsoft.com/en-us/azure/databricks/runtime/photon#limitations So when creating a UDF, Photon will not be used.

Dekova
by New Contributor II
  • 564 Views
  • 0 replies
  • 0 kudos

Structured Streaming and Workplace Max Jobs

From the documentation: A workspace is limited to 1000 concurrent task runs. A 429 Too Many Requests response is returned when you request a run that cannot start immediately. The number of jobs a workspace can create in an hour is limited to 10000 (i...
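One common way a client might cope with the 429 response described above is to retry with exponential backoff. The sketch below is an illustration under stated assumptions, not something the post or the documentation prescribes: `submit_with_backoff`, `fake_submit`, and the delay values are hypothetical, and `submit` stands in for whatever call actually requests the run.

```python
import time


def submit_with_backoff(submit, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call submit() until it stops returning HTTP 429 or attempts run out.

    submit is any zero-argument callable returning (status_code, payload).
    """
    status, payload = None, None
    for attempt in range(max_attempts):
        status, payload = submit()
        if status != 429:
            return status, payload
        # Wait 1x, 2x, 4x ... base_delay before the next attempt.
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return status, payload


# Hypothetical stand-in for a run request: two 429s, then success.
responses = iter([(429, None), (429, None), (200, {"run_id": 1})])


def fake_submit():
    return next(responses)


delays = []  # record requested delays instead of really sleeping
result = submit_with_backoff(fake_submit, sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the sketch testable without waiting; real code would normally leave the default `time.sleep`.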

SSV_dataeng
by New Contributor II
  • 1182 Views
  • 2 replies
  • 0 kudos

Plot number of abandoned cart items by product

abandoned_carts_df = (email_carts_df.filter(col('converted') == False).filter(col('cart').isNotNull()))
display(abandoned_carts_df)
abandoned_items_df = (abandoned_carts_df.select(col("cart").alias("items")).groupBy("items").count())
display(abandoned_...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi @SSV_dataeng, try:
abandoned_items_df = (abandoned_carts_df.withColumn("items", explode("cart")).groupBy("items").count().sort("items"))
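To illustrate why the reply uses explode, here is a plain-Python analogue (not PySpark, and the sample carts are made up): grouping by the whole cart counts distinct carts, while flattening each cart first counts individual items, which is what a per-product plot needs.

```python
from collections import Counter

# Hypothetical sample data: each inner list is one abandoned cart.
carts = [["A", "B"], ["A"], ["B", "C", "A"]]

# Grouping on the whole array: every distinct cart is its own group.
whole_cart_counts = Counter(tuple(cart) for cart in carts)

# "Exploding" first: one entry per item, then count per item.
item_counts = Counter(item for cart in carts for item in cart)

# Sorted by item, mirroring the .sort("items") in the reply.
sorted_items = sorted(item_counts.items())
print(sorted_items)
```

In this toy data every cart is distinct, so the whole-cart grouping yields three groups of size one, whereas the per-item count shows A appearing three times.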

1 More Replies
SSV_dataeng
by New Contributor II
  • 1626 Views
  • 4 replies
  • 0 kudos

write to Delta

spark.conf.set("spark.databricks.delta.properties.defaults.columnMapping.mode","name")
products_output_path = DA.paths.working_dir + "/delta/products"
products_df.write.format("delta").save(products_output_path)
verify_files = dbutils.fs.ls(products_ou...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi @SSV_dataeng, please check with this (you would have to indent it correctly for Python):
productsOutputPath = DA.workingDir + "/delta/products"
(productsDF.write.format("delta").mode("overwrite").save(productsOutputPath))
verify_files = dbutils.fs.ls(...

3 More Replies
marchino
by New Contributor II
  • 4519 Views
  • 3 replies
  • 1 kudos

Can I change Service Principal's OAuth token's expiration date?

Hi, since I have to read from a Databricks table from an external API, I created a Service Principal that would start a cluster and perform the operation. To authenticate the request on behalf of the Service Principal, I generate the OAuth token followi...

Latest Reply
NandiniN
Databricks Employee
  • 1 kudos

Hello @marchino, please check if this is of interest: https://kb.databricks.com/en_US/security/set-an-unlimited-lifetime-for-service-principal-access-token

2 More Replies
Henrik
by New Contributor III
  • 2658 Views
  • 2 replies
  • 1 kudos

Data lineage on views

I do not know if this is the intended behavior of data lineage, but to me it is weird. When I create a view based on two tables, the data lineage upstream looks correct. But when I replace the view to only use one of the tables, then data lineage upstream ...

Latest Reply
Henrik
New Contributor III
  • 1 kudos

After some thought, I have come to this conclusion: Data lineage on views is working as one should expect. I strongly recommend that this feature be redesigned so it shows the result of the latest view.

1 More Replies
Chalki
by New Contributor III
  • 5107 Views
  • 3 replies
  • 0 kudos

Iterative read and writes cause java.lang.OutOfMemoryError: GC overhead limit exceeded

I have an iterative algorithm which reads and writes a DataFrame, iterating through a list of new partitions, like this:
for p in partitions_list:
    df = spark.read.parquet("adls_storage/p")
    df.write.format("delta").mode("overwrite").option("partitionOver...

Latest Reply
Chalki
New Contributor III
  • 0 kudos

@daniel_sahal I attached the wrong snip. Actually it is FULL GC Ergonomics which was bothering me. Now I am attaching the correct snip. But as you said, I scaled a bit. The thing I forgot to mention is that the table is wide, with more than 300 column...

2 More Replies
