Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Forum Posts

Dekova
by New Contributor II
  • 906 Views
  • 0 replies
  • 0 kudos

Structured Streaming and Workplace Max Jobs

From the documentation: A workspace is limited to 1000 concurrent task runs. A 429 Too Many Requests response is returned when you request a run that cannot start immediately. The number of jobs a workspace can create in an hour is limited to 10000 (i...
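
A common way to cope with 429 responses like the one quoted here is to retry with exponential backoff. The following is a minimal sketch in plain Python, not an official Databricks client: `TooManyRequests`, `call_with_backoff`, and `request_run` are hypothetical names, and `request_run` stands for whatever code submits the run and raises `TooManyRequests` when the API answers 429.

```python
import random
import time


class TooManyRequests(Exception):
    """Hypothetical error raised by the caller's code on an HTTP 429 response."""


def call_with_backoff(request_run, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call request_run(); on TooManyRequests, wait and try again.

    The wait doubles after each failed attempt (capped at max_delay) and
    gets up to 10% random jitter so that many clients do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return request_run()
        except TooManyRequests:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 10))
```

The `sleep` parameter exists only so the waiting can be replaced in tests; in normal use the default `time.sleep` applies.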

SSV_dataeng
by New Contributor II
  • 1625 Views
  • 2 replies
  • 0 kudos

Plot number of abandoned cart items by product

abandoned_carts_df = (email_carts_df.filter(col('converted') == False).filter(col('cart').isNotNull()))
display(abandoned_carts_df)
abandoned_items_df = (abandoned_carts_df.select(col("cart").alias("items")).groupBy("items").count())
display(abandoned_...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi @SSV_dataeng, try:
abandoned_items_df = (abandoned_carts_df.withColumn("items", explode("cart")).groupBy("items").count().sort("items"))
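
To see why adding explode changes the result, here is a small plain-Python model (not Spark code) of the two groupings. It assumes `cart` is an array column, which the use of explode implies; the sample carts and the function names are made up for illustration.

```python
from collections import Counter

# Hypothetical sample: each inner list stands for one row's `cart` array.
carts = [["mattress", "pillow"], ["mattress"], ["pillow", "mattress"]]


def count_whole_carts(carts):
    # Models grouping by the array column itself:
    # every distinct array (order included) forms its own group.
    return Counter(tuple(cart) for cart in carts)


def count_exploded_items(carts):
    # Models explode followed by a group-and-count:
    # each array element becomes its own row before counting.
    return Counter(item for cart in carts for item in cart)
```

With the sample above, grouping whole carts yields three groups of one cart each, while the exploded version counts individual products across all carts.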

1 More Replies
SSV_dataeng
by New Contributor II
  • 2399 Views
  • 4 replies
  • 0 kudos

write to Delta

spark.conf.set("spark.databricks.delta.properties.defaults.columnMapping.mode","name")
products_output_path = DA.paths.working_dir + "/delta/products"
products_df.write.format("delta").save(products_output_path)
verify_files = dbutils.fs.ls(products_ou...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi @SSV_dataeng, please check with this (you would have to indent it correctly for Python):
productsOutputPath = DA.workingDir + "/delta/products"
(productsDF.write.format("delta").mode("overwrite").save(productsOutputPath))
verify_files = dbutils.fs.ls(...

3 More Replies
marchino
by New Contributor II
  • 6322 Views
  • 3 replies
  • 1 kudos

Can I change Service Principal's OAuth token's expiration date?

Hi, since I have to read a Databricks table from an external API, I created a Service Principal that would start a cluster and perform the operation. To authenticate the request on behalf of the Service Principal, I generate the OAuth token followi...

Latest Reply
NandiniN
Databricks Employee
  • 1 kudos

Hello @marchino, please check whether this is of interest to you: https://kb.databricks.com/en_US/security/set-an-unlimited-lifetime-for-service-principal-access-token

2 More Replies
Phani1
by Valued Contributor II
  • 1323 Views
  • 1 reply
  • 0 kudos

Reserved VM/DBU's

Some VM/DBU reservations were purchased but are underutilized. How can we reduce the underutilization? Are there any guidelines or best practices?

Latest Reply
Phani1
Valued Contributor II
  • 0 kudos

We have 5 reserved Azure VM instances to run the Databricks cluster jobs, and they are not being utilized efficiently (per the usage metrics, one of the reservations is 10-15% utilized and another is 30-40% utilized). Could you please help...

Henrik
by New Contributor III
  • 3932 Views
  • 2 replies
  • 1 kudos

Data lineage on views

I do not know if this is the intended behavior of data lineage, but to me it is weird. When I create a view based on two tables, the upstream data lineage looks correct. But when I replace the view to only use one of the tables, then data lineage upstream ...

Latest Reply
Henrik
New Contributor III
  • 1 kudos

After some thought, I have come to this conclusion: data lineage on views is working as one should expect. I strongly recommend that this feature be redesigned so it shows the result of the latest view.

1 More Replies
etlundquist
by New Contributor II
  • 4457 Views
  • 3 replies
  • 0 kudos

Unable to Start Clusters on GCP - Clusters Stuck in "CREATING" State

I set up my Databricks Account on GCP via GCP Marketplace and then created my first workspace via the Accounts Console (default Databricks VPC). Everything seemed to be ok until I attempted to create my first cluster. The cluster hangs indefinitely i...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @etlundquist  Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...

2 More Replies
Nitin2
by New Contributor
  • 1340 Views
  • 0 replies
  • 0 kudos

Not able to login or change password

Hi, I am unable to log in to Databricks Community Edition. I have tried changing my password. However, no email is sent to my email address, which is kum.nit7287@gmail.com. Can anyone help?

Chalki
by New Contributor III
  • 7009 Views
  • 3 replies
  • 0 kudos

Iterative read and writes cause java.lang.OutOfMemoryError: GC overhead limit exceeded

I have an iterative algorithm which reads and writes a dataframe, iterating through a list of new partitions, like this:
for p in partitions_list:
    df = spark.read.parquet("adls_storage/p")
    df.write.format("delta").mode("overwrite").option("partitionOver...

Latest Reply
Chalki
New Contributor III
  • 0 kudos

@daniel_sahal I've attached the wrong snip. Actually, it is Full GC Ergonomics that was bothering me. Now I am attaching the correct snip. But as you said, I scaled a bit. The thing I forgot to mention is that the table is wide - more than 300 column...

2 More Replies
Dekova
by New Contributor II
  • 3876 Views
  • 1 reply
  • 3 kudos

Resolved! Using DeltaTable.merge() and generating surrogate keys on insert?

I'm using merge to upsert data into a table:
DeltaTable.forName(DESTINATION_TABLE).as("target")
  .merge(merge_df.as("source"), "source.topic = target.topic and source.key = target.key")
  .whenMatched().updateAll()
  .whenNotMatched().insertAll()
  .execute()
Id ...

Latest Reply
daniel_sahal
Esteemed Contributor
  • 3 kudos

@Dekova
1) uuid() is non-deterministic, meaning that it will give you a different result each time you run this function.
2) Per the documentation: "For Databricks Runtime 9.1 and above, MERGE operations support generated columns when you set spark.databri...
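
As an illustration of the deterministic versus non-deterministic point, the sketch below contrasts a random UUID with a name-based one in plain Python. This is a different technique from the generated-column approach quoted in the reply, and `KEY_NAMESPACE` and `surrogate_key` are hypothetical names; it only shows that deriving the key from the merge keys (topic, key) gives the same value on every run.

```python
import uuid

# Hypothetical namespace: any fixed UUID works, as long as it never changes.
KEY_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.invalid")


def random_key():
    # Non-deterministic: a new value on every call.
    return str(uuid.uuid4())


def surrogate_key(topic, key):
    # Deterministic: the same (topic, key) pair always maps to the same UUID.
    # The unit separator keeps ("ab", "c") and ("a", "bc") from colliding.
    return str(uuid.uuid5(KEY_NAMESPACE, f"{topic}\x1f{key}"))
```

Whether a key derived this way is acceptable depends on the table design; it is only meant to make the difference between the two kinds of UUID concrete.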

102842
by New Contributor II
  • 3359 Views
  • 3 replies
  • 2 kudos

Databricks SQL - Conditional Catalog query

Hi, is there a way to run the following, where `{{ catalog }}` is a template variable extracted/evaluated from either an environment variable, a Databricks secret, or somewhere else? (Note: not a widget.)
%sql
select * from {{ catalog }}.schema.table
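
One possible approach from a Python notebook cell, sketched under assumptions: the catalog name is read from an environment variable (here a hypothetical `CATALOG`), validated as a plain identifier, and substituted into the query text. `build_query` is a made-up helper, and the resulting string would then be passed to `spark.sql(...)`.

```python
import os
import re

# Accept only plain identifiers so the value cannot inject extra SQL.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_query(catalog=None, env_var="CATALOG"):
    """Return the query text with the catalog name filled in.

    If catalog is not given, it is read from the environment variable
    named by env_var (a hypothetical name chosen for this sketch).
    """
    catalog = catalog or os.environ.get(env_var, "")
    if not _IDENTIFIER.fullmatch(catalog):
        raise ValueError(f"unsafe or missing catalog name: {catalog!r}")
    return f"SELECT * FROM `{catalog}`.schema.table"
```

In a notebook this could be used as `df = spark.sql(build_query())`; the validation step matters because the name is spliced into the SQL string rather than bound as a parameter.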

Latest Reply
Tharun-Kumar
Databricks Employee
  • 2 kudos

Hi @102842, you can use query parameters for this: https://docs.databricks.com/sql/user/queries/query-parameters.html
You can define the catalog name as a query parameter. You should declare the catalog name parameter as a drop down list, becau...

2 More Replies
bharath_db
by New Contributor II
  • 1548 Views
  • 1 reply
  • 0 kudos

Activation Email is not coming up in the email

The activation email is not showing up in my inbox. I am not able to start my trial. @Sujitha or @Kaniz, please help!

Latest Reply
bharath_db
New Contributor II
  • 0 kudos

@Sujitha or @Kaniz, I need your help: the validation email needed to activate the trial is not reaching my inbox or spam folder.

kurtrm
by New Contributor III
  • 5031 Views
  • 4 replies
  • 0 kudos

Import dbfs file into workspace using Python SDK

Hello, I am looking to replicate the functionality provided by the databricks_cli Python package using the Python SDK. Previously, using the databricks_cli WorkspaceApi object, I could use the import_workspace or import_workspace_dir methods to move a...

Latest Reply
Kratik
New Contributor III
  • 0 kudos

I am also looking for a way to bring files in S3 into the Workspace programmatically.

3 More Replies
alesventus
by Contributor
  • 1339 Views
  • 0 replies
  • 0 kudos

Big time differences in reading tables

When I read a managed table in Databricks, I see big differences in time spent. A small test table with just 2 records loads in 3 seconds one time and in 30 seconds another. Reading table_change for this tiny table took 15 minutes. Don't know ...

performance issue
