Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Forum Posts

nadishancosta
by New Contributor II
  • 1619 Views
  • 2 replies
  • 0 kudos

Cannot access community account

Resetting my password does not work. After I enter the new password, the page just keeps processing. I waited over 10 minutes and tried different browsers and a VPN, but nothing works. This also happened at random. I didn't forget my password, just the sys...

Latest Reply
nadishancosta
New Contributor II
  • 0 kudos

It's for the Community Edition.

1 More Replies
aupres
by New Contributor III
  • 4175 Views
  • 2 replies
  • 0 kudos

Resolved! How to generate schema with org.apache.spark.sql.functions.schema_of_csv?

Hello, I use Spark 3.4.1 (Hadoop 3 build) on Windows 11, and I am struggling to generate the schema of CSV data with the schema_of_csv function. Below is my Java code. Map<String, String> kafkaParams = new HashMap<>(); kafkaParams.put("kafka.bootstrap.servers"...

Get Started Discussions
schema_of_csv
spark-java
Latest Reply
aupres
New Contributor III
  • 0 kudos

I used the org.apache.spark.sql.functions.lit method and that solved the issue. Thank you anyway.

1 More Replies
royourboat
by New Contributor
  • 689 Views
  • 0 replies
  • 0 kudos

"Something went wrong"

I've made two fresh accounts on Databricks and am stuck here for both when I try to log in. I've never used Databricks before. This problem occurs on 3 different browsers across 2 PCs. This is not the place to post such a question, sorry! But, I haven'...

zyang
by Contributor II
  • 9858 Views
  • 4 replies
  • 2 kudos

Sync production data into the test environment

Hello, I have a database called sales which contains several Delta tables and views in both the production and test workspaces. But the data is not in sync because some people develop code in the test workspace. As time passed, both the data and the tables i...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @zyang  Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can...

3 More Replies
Oliver_Angelil
by Valued Contributor II
  • 3949 Views
  • 2 replies
  • 2 kudos

Resolved! Confirmation that Ingestion Time Clustering is applied

The article on Ingestion Time Clustering mentions that "Ingestion Time Clustering is enabled by default on Databricks Runtime 11.2", but how can I confirm that it is active for my table? For example, is there a True/False "Ingestion Time Clustered" fl...

Latest Reply
Oliver_Angelil
Valued Contributor II
  • 2 kudos

Thanks @NandiniN, that was very helpful. I have 3 follow-up questions: If I already have a table (350GB) that has been partitioned by 3 columns (Year, Month, Day) and stored Hive-style with subdirectories Year=X/Month=Y/Day=Z, can I read it in...

1 More Replies
Ramana
by Valued Contributor
  • 9295 Views
  • 3 replies
  • 6 kudos

Resolved! What is the alternative for sys.exit(0) in Databricks

Hi, we are working on a migration project from Cloudera to Databricks. All our code is in .py files, and we decided to keep it that way in Databricks and execute it from Git through Databricks Workflows. We have two kinds of exit functi...

Latest Reply
Ramana
Valued Contributor
  • 6 kudos

I tested with different levels of nesting and it is working as expected. Here is the sample code:
import sys
bucket_name = "prod"  # str(sys.argv[1]).lower()
def main():
    i, j = 0, 0
    while j <= 2:
        print(f"while loop iteration: {j}")
        f...

2 More Replies
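
For readers following the sys.exit thread above, here is a minimal plain-Python sketch of why an exit call works at any nesting depth: sys.exit raises SystemExit, which unwinds every enclosing loop and function until something catches it. The function names (process, main, run) and the exit condition are hypothetical, and the sketch says nothing about how a Databricks job run reports the exit.

```python
import sys


def process(j):
    """Hypothetical inner step that decides the job should stop early."""
    for i in range(3):
        if j == 1 and i == 1:
            # sys.exit raises SystemExit, which propagates up through
            # the for loop, this function, and the caller's while loop.
            sys.exit(0)


def main():
    """Outer loop; returns the iterations that finished."""
    completed = []
    j = 0
    while j <= 2:
        process(j)
        completed.append(j)
        j += 1
    return completed


def run():
    """Run main() and translate SystemExit into (exit code, status)."""
    try:
        main()
    except SystemExit as exc:
        if exc.code is None:
            code = 0
        elif isinstance(exc.code, int):
            code = exc.code
        else:
            code = 1  # a non-integer exit argument is treated as failure
        return code, "stopped early"
    return 0, "ran to completion"
```

Calling run() here stops during the second outer iteration. A wrapper like this is only needed when the caller wants to inspect the exit code instead of letting the interpreter terminate.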
Dekova
by New Contributor II
  • 3128 Views
  • 1 replies
  • 1 kudos

Resolved! Photon and UDF efficiency

When using a JVM engine, Scala UDFs have an advantage over Python UDFs because data doesn't have to be shifted out to the Python environment for processing. If I understand the implications of using the Photon C++ engine, any processing that needs to...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

Photon does not support UDFs (see https://learn.microsoft.com/en-us/azure/databricks/runtime/photon#limitations), so when you create a UDF, Photon will not be used.

Dekova
by New Contributor II
  • 973 Views
  • 0 replies
  • 0 kudos

Structured Streaming and Workspace Max Jobs

From the documentation: A workspace is limited to 1000 concurrent task runs. A 429 Too Many Requests response is returned when you request a run that cannot start immediately. The number of jobs a workspace can create in an hour is limited to 10000 (i...
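
A client that hits the 429 response quoted above typically retries with a growing delay. The sketch below is a generic exponential backoff with jitter, not a Databricks API: submit_run is a hypothetical callable that returns an HTTP status code, and max_attempts and base_delay are illustrative defaults.

```python
import random
import time


def submit_with_backoff(submit_run, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call submit_run until it stops returning 429 or attempts run out.

    submit_run is a hypothetical zero-argument callable returning an
    HTTP status code. The last status seen is returned.
    """
    status = 429
    for attempt in range(max_attempts):
        status = submit_run()
        if status != 429:
            return status
        if attempt < max_attempts - 1:
            # Double the wait on each retry and add jitter so that many
            # clients do not retry at the same instant.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    return status
```

The sleep parameter is injectable so the function can be exercised without actually waiting.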

SSV_dataeng
by New Contributor II
  • 1641 Views
  • 2 replies
  • 0 kudos

Plot number of abandoned cart items by product

abandoned_carts_df = (email_carts_df.filter(col('converted') == False).filter(col('cart').isNotNull()))
display(abandoned_carts_df)
abandoned_items_df = (abandoned_carts_df.select(col("cart").alias("items")).groupBy("items").count())
display(abandoned_...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi @SSV_dataeng, try this (explode comes from pyspark.sql.functions):
from pyspark.sql.functions import explode
abandoned_items_df = (abandoned_carts_df.withColumn("items", explode("cart")).groupBy("items").count().sort("items"))

1 More Replies
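
To see what the suggested explode, groupBy, count, and sort chain computes, here is a plain-Python analogue that does not use Spark. The cart data and the count_items name are made up for illustration. Each cart is assumed to be a list of product ids, which is why grouping on the whole cart column (as in the question) counts distinct carts and not individual items.

```python
from collections import Counter

# Hypothetical rows: each cart holds a list of product ids, or None.
carts = [["P1", "P2"], ["P2"], None, ["P3", "P2", "P1"]]


def count_items(carts):
    """Flatten every non-null cart into items, count them, sort by item."""
    counts = Counter(
        item
        for cart in carts
        if cart is not None  # mirrors the isNotNull() filter
        for item in cart     # mirrors explode("cart")
    )
    return sorted(counts.items())  # mirrors groupBy/count/sort


print(count_items(carts))
```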
SSV_dataeng
by New Contributor II
  • 2425 Views
  • 4 replies
  • 0 kudos

write to Delta

spark.conf.set("spark.databricks.delta.properties.defaults.columnMapping.mode","name")
products_output_path = DA.paths.working_dir + "/delta/products"
products_df.write.format("delta").save(products_output_path)
verify_files = dbutils.fs.ls(products_ou...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi @SSV_dataeng, please check with this (you would have to indent it correctly for Python):
productsOutputPath = DA.workingDir + "/delta/products"
(productsDF.write.format("delta").mode("overwrite").save(productsOutputPath))
verify_files = dbutils.fs.ls(...

3 More Replies
marchino
by New Contributor II
  • 6398 Views
  • 3 replies
  • 1 kudos

Can I change Service Principal's OAuth token's expiration date?

Hi, since I have to read from a Databricks table from an external API, I created a Service Principal that would start a cluster and perform the operation. To authenticate the request on behalf of the Service Principal, I generate the OAuth token followi...

Latest Reply
NandiniN
Databricks Employee
  • 1 kudos

Hello @marchino, please check whether this is of interest to you: https://kb.databricks.com/en_US/security/set-an-unlimited-lifetime-for-service-principal-access-token

2 More Replies
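
Separately from the linked article, a caller that cannot extend a token's lifetime can cache the token and request a new one shortly before it expires. The sketch below shows that pattern only. It calls no Databricks API: fetch_token is a hypothetical callable returning (token, lifetime in seconds), and the 60-second refresh margin is an arbitrary assumption.

```python
import time


class TokenCache:
    """Reuse a token until it is close to expiry, then fetch a new one."""

    def __init__(self, fetch_token, refresh_margin=60.0, clock=time.monotonic):
        # fetch_token: hypothetical callable returning (token, expires_in_seconds)
        self._fetch_token = fetch_token
        self._refresh_margin = refresh_margin
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return a valid token, refreshing it when inside the margin."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._refresh_margin:
            token, expires_in = self._fetch_token()
            self._token = token
            self._expires_at = now + expires_in
        return self._token
```

The clock is injectable so the refresh logic can be checked without waiting for a real token to expire.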
Phani1
by Valued Contributor II
  • 1340 Views
  • 1 replies
  • 0 kudos

Reserved VMs/DBUs

Some VM/DBU reservations were purchased but are underutilized. How can we address the underutilization? Are there any guidelines or best practices?

Latest Reply
Phani1
Valued Contributor II
  • 0 kudos

We have 5 reserved instances of Azure VMs to run the Databricks cluster jobs, and they are not being utilized efficiently (as per the usage metrics, one of the reservations is 10-15% utilized and another is 30-40% utilized). Could you please help...

Henrik
by New Contributor III
  • 3975 Views
  • 2 replies
  • 1 kudos

Data lineage on views

I do not know if this is the intended behavior of data lineage, but it seems odd to me. When I create a view based on two tables, the upstream data lineage looks correct. But when I replace the view to use only one of the tables, then data lineage upstream ...

Latest Reply
Henrik
New Contributor III
  • 1 kudos

After some thought, I have come to this conclusion: Data lineage on views is working as one should expect. I strongly recommend that this feature be redesigned so it shows the result of the latest view.

1 More Replies
etlundquist
by New Contributor II
  • 4499 Views
  • 3 replies
  • 0 kudos

Unable to Start Clusters on GCP - Clusters Stuck in "CREATING" State

I set up my Databricks Account on GCP via GCP Marketplace and then created my first workspace via the Accounts Console (default Databricks VPC). Everything seemed to be ok until I attempted to create my first cluster. The cluster hangs indefinitely i...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @etlundquist  Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...

2 More Replies
Nitin2
by New Contributor
  • 1350 Views
  • 0 replies
  • 0 kudos

Not able to login or change password

Hi, I am unable to log in to Databricks Community Edition. I have tried changing my password. However, no email is sent to my email address, which is kum.nit7287@gmail.com. Can anyone help?

