Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Brian61
by New Contributor
  • 839 Views
  • 0 replies
  • 0 kudos
pedroHmdo
by New Contributor II
  • 5783 Views
  • 2 replies
  • 3 kudos

Resolved! Why did I not receive the Databricks Lakehouse Fundamentals accreditation badge?

I have passed the test but did not receive the badge. I also didn't receive any email. Thank you for your attention.

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Pedro Medeiros, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answer...

1 More Reply
Neil
by New Contributor
  • 5516 Views
  • 1 reply
  • 0 kudos

Saving a Spark DataFrame to a Delta table is taking too long

While working on a video analytics task, I need to save image bytes, previously extracted into a Spark DataFrame, to a Delta table. I want to overwrite the same Delta table over the course of the whole task, and the size of the input data differs...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

Can you check the Spark UI to see where the time is spent? It could be a join, a UDF, ...

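For context, the write under discussion is a plain Delta overwrite. A minimal sketch, assuming a hypothetical video_frames table and toy image bytes (the Spark UI stages of this saveAsTable job are where the suggestion above would point first):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Toy stand-in for the image-bytes DataFrame described in the question.
frames = [(1, bytearray(b"\x89PNG...")), (2, bytearray(b"\x89PNG..."))]
df = spark.createDataFrame(frames, ["frame_id", "image_bytes"])

# Overwrite the same Delta table on every pass; repartitioning first keeps
# file counts reasonable when the input volume varies between runs.
(df.repartition(8)
   .write
   .format("delta")
   .mode("overwrite")
   .saveAsTable("video_frames"))  # hypothetical table name
```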
FRG96
by New Contributor III
  • 6043 Views
  • 0 replies
  • 0 kudos

How to set the ABFSS URL for Azure Databricks Init Scripts that have spaces in directory names?

I want to use an init script stored in an ADLS Gen2 location for my Azure Databricks 11.3 and 12.2 clusters. The init_script.sh is placed in a directory that has spaces in it: https://storageaccount1.blob.core.windows.net/container1/directory%20with%20spaces/su...

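For reference, ADLS Gen2 URIs take the form abfss://&lt;container&gt;@&lt;account&gt;.dfs.core.windows.net/&lt;path&gt;. A speculative sketch of building the candidate URL with the spaces percent-encoded; whether the clusters API accepts this encoded form (or instead requires the literal spaces) is exactly the open question here, and the subdirectory below is hypothetical, reconstructed from the truncated URL:

```python
from urllib.parse import quote

# ADLS Gen2 paths take the form:
#   abfss://<container>@<storage-account>.dfs.core.windows.net/<path>
account = "storageaccount1"
container = "container1"
# Hypothetical full path, reconstructed from the truncated URL in the post.
path = "directory with spaces/subdir/init_script.sh"

# Percent-encode the path segments only; '/' must remain a separator.
init_script_uri = (
    f"abfss://{container}@{account}.dfs.core.windows.net/{quote(path, safe='/')}"
)
print(init_script_uri)
# abfss://container1@storageaccount1.dfs.core.windows.net/directory%20with%20spaces/subdir/init_script.sh
```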
Chinu
by New Contributor III
  • 5576 Views
  • 1 reply
  • 1 kudos

Resolved! How to create a raw request (with filter_by) to pull query history from now to 5 minutes ago

Hi Team, is it possible to use the "query_start_time_range" filter in the API call to get query data only from now to 5 minutes ago? I'm using Telegraf to call the Query History API, but it looks like I'm hitting the maximum number of returned results, and I can't find how to use...

Latest Reply
mathan_pillai
Databricks Employee
  • 1 kudos

Have you checked https://docs.databricks.com/api-explorer/workspace/queryhistory/list? You can list the queries based on a time range as well, so you can try passing the fields in the filter_by parameter. Then pass the value as (current time - 5 m...

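A minimal sketch of the approach described above, assuming the GET /api/2.0/sql/history/queries endpoint and the filter_by.query_start_time_range fields from the linked docs (times in epoch milliseconds); the host and token are placeholders:

```python
import time
import requests  # third-party; pip install requests

HOST = "https://<workspace-host>"      # placeholder
TOKEN = "<personal-access-token>"      # placeholder

now_ms = int(time.time() * 1000)
body = {
    "filter_by": {
        "query_start_time_range": {
            "start_time_ms": now_ms - 5 * 60 * 1000,  # five minutes ago
            "end_time_ms": now_ms,
        }
    },
    "max_results": 100,
}

resp = requests.get(
    f"{HOST}/api/2.0/sql/history/queries",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=body,
)
resp.raise_for_status()
for query in resp.json().get("res", []):
    print(query.get("query_id"), query.get("status"))
```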
User16783854357
by New Contributor III
  • 1207 Views
  • 1 reply
  • 0 kudos

Delta Sharing - Who provides the server?

I would like to understand who provides the server when using Delta Sharing. If a customer exposes their Delta table through Delta Sharing, is it the customer who needs to set up a cluster or server to process the incoming requests?

Latest Reply
BigRoux
Databricks Employee
  • 0 kudos

The producer does need a cluster to set up Delta Sharing. However, once the handoff happens, no cluster is needed; the data is delivered via storage services.

Shankar
by New Contributor III
  • 1743 Views
  • 1 reply
  • 2 kudos

Resolved! Is there a Python API for vacuum with dry run?

I have the below SQL command where I am doing a dry run with VACUUM: %sql VACUUM <table_name> RETAIN 500 HOURS DRY RUN; I wanted to check if there is a way to achieve this in the Python API. I tried the below, but I am not sure if there is a parameter that we...

Latest Reply
venkatcrc
New Contributor III
  • 2 kudos

The Python equivalent of the SQL command "VACUUM <table_name> RETAIN 500 HOURS DRY RUN;" is spark.sql("VACUUM <table_name> RETAIN 500 HOURS DRY RUN;").

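The accepted answer in runnable form. Note that the DeltaTable Python API exposes only the retention horizon, with no dry-run flag (at least in Delta Lake releases contemporary with this thread), so the SQL passthrough is the way to get the dry-run file listing; the table name is hypothetical:

```python
from pyspark.sql import SparkSession
from delta.tables import DeltaTable  # delta-spark package

spark = SparkSession.builder.getOrCreate()

# Dry run via SQL passthrough: returns the files that WOULD be removed.
spark.sql("VACUUM my_table RETAIN 500 HOURS DRY RUN").show(truncate=False)

# The Python API has no dry-run parameter; calling it actually deletes files.
DeltaTable.forName(spark, "my_table").vacuum(500)
```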
Retko
by Contributor
  • 8546 Views
  • 3 replies
  • 3 kudos

Resolved! Data tab is not showing any databases or tables even though the cluster is running (Community Edition)

Hi, I have a cluster running, but I don't see anything in the Data tab. As you can see, it shows an error, but the error appeared after I deleted clusters that were terminated. Before that it said something about the cluster not running; I don't remember exactl...

Latest Reply
Rajani
Contributor II
  • 3 kudos

@Retko Okter You need to enable the DBFS File Browser from the Admin Settings. Hope this helps.

2 More Replies
horatiug
by New Contributor III
  • 758 Views
  • 0 replies
  • 0 kudos

Can the databricks_mount timeout be changed?

I am using Terraform to configure a Databricks workspace, and while mounting 6 buckets, if the mount takes longer than 20 minutes I get a timeout. Is it possible to change the timeout? Thanks, Horatiu

cblock
by New Contributor III
  • 2080 Views
  • 3 replies
  • 3 kudos

Unable to run jobs with git notebooks

So, in this case our jobs are deployed from our development workspace to our isolated testing workspace via an automated Azure DevOps pipeline. As such, they are created (and thus run as) a service account user. Recently we made the switch to using gi...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Chris Block, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks...

2 More Replies
swatish0395
by New Contributor III
  • 1033 Views
  • 0 replies
  • 0 kudos

Parquet file column-level encryption and decryption based on user-specific permissions

I am able to encrypt and decrypt the data in multiple ways and to save the encrypted Parquet file, but I want to decrypt the data only if the user has a specific permission; otherwise they will get the encrypted data. Is there any permanent solution to de...

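The thread ends unanswered, but one common pattern on Databricks is to persist the column encrypted with the built-in aes_encrypt function and gate decryption behind a view using is_member(). A sketch with hypothetical table, group, and key names; a real key would come from a secret scope, not a literal:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Toy 16-byte key for illustration only; fetch a real key from a secret scope.
KEY = "0123456789abcdef"

# Persist the sensitive column encrypted at rest (users_raw is hypothetical).
spark.sql(f"""
  CREATE OR REPLACE TABLE users_enc AS
  SELECT id, aes_encrypt(ssn, '{KEY}') AS ssn_enc
  FROM users_raw
""")

# Expose a view that only decrypts for members of a privileged group.
spark.sql(f"""
  CREATE OR REPLACE VIEW users AS
  SELECT id,
         CASE WHEN is_member('pii_readers')
              THEN cast(aes_decrypt(ssn_enc, '{KEY}') AS STRING)
              ELSE 'REDACTED'
         END AS ssn
  FROM users_enc
""")
```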
grazie
by Contributor
  • 2282 Views
  • 2 replies
  • 2 kudos

How to get dbutils in Runtime 13

We're using the following method (generated by dbx) to access dbutils, e.g. to retrieve parameters from secret scopes: @staticmethod def _get_dbutils(spark: SparkSession) -> "dbutils": try: from pyspark.dbutils import...

Latest Reply
colt
New Contributor III
  • 2 kudos

We have something similar in our code. It worked on Runtime 13 until last week. The Machine Learning DBR doesn't work either.

1 More Reply
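The snippet in the post is truncated; below is a reconstruction of the widely circulated dbx-era pattern it appears to follow. It is shown as the starting point the thread reports breaking on DBR 13, not as a fix; the secret scope and key are hypothetical:

```python
from pyspark.sql import SparkSession

def get_dbutils(spark: SparkSession):
    """Return a dbutils handle on a cluster, or from the notebook namespace."""
    try:
        # Available on Databricks clusters with a configured Spark connection.
        from pyspark.dbutils import DBUtils
        return DBUtils(spark)
    except ImportError:
        # In notebooks, dbutils lives in the IPython user namespace instead.
        import IPython
        return IPython.get_ipython().user_ns["dbutils"]

spark = SparkSession.builder.getOrCreate()
dbutils = get_dbutils(spark)
token = dbutils.secrets.get(scope="my-scope", key="my-key")  # hypothetical
```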
96286
by Contributor
  • 6276 Views
  • 4 replies
  • 3 kudos

Resolved! Autoloader works on compute cluster, but does not work within a task in workflows

I feel like I am going crazy with this. I have tested a data pipeline on my standard compute cluster. I am loading new files as a batch from a Google Cloud Storage bucket. Autoloader works exactly as expected from my notebook on my compute cluster. The...

Latest Reply
96286
Contributor
  • 3 kudos

I found the issue. I describe the solution in the following SO post. https://stackoverflow.com/questions/76287095/databricks-autoloader-works-on-compute-cluster-but-does-not-work-within-a-task/76313794#76313794

3 More Replies
g96g
by New Contributor III
  • 982 Views
  • 1 reply
  • 0 kudos

Function in Databricks

I'm having a hard time converting the function below from SSMS to a Databricks function. Any help would be appreciated! CREATE FUNCTION [dbo].[MaxOf5Values] (@D1 [int], @D2 [int], @D3 [int], @D4 [int], @D5 [int]) RETURNS int AS BEGIN DECLARE @Result int ...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 0 kudos

Hi @Givi Salu, please refer to this link, which will help you convert this function.

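Assuming the truncated T-SQL body computes the maximum of its five integer arguments, as the name suggests, a Databricks SQL UDF built on the built-in greatest() function is a direct port; the function and argument names below are illustrative:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# SQL UDF port of dbo.MaxOf5Values: greatest() replaces the manual
# comparisons the original T-SQL presumably spells out.
spark.sql("""
  CREATE OR REPLACE FUNCTION max_of_5_values(d1 INT, d2 INT, d3 INT, d4 INT, d5 INT)
  RETURNS INT
  RETURN greatest(d1, d2, d3, d4, d5)
""")

spark.sql("SELECT max_of_5_values(3, 9, 1, 7, 5) AS result").show()
# +------+
# |result|
# +------+
# |     9|
# +------+
```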
