Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

erigaud
by Honored Contributor
  • 965 Views
  • 2 replies
  • 0 kudos

Speeding up Command Execution API

Hello, I'm using the Databricks Command Execution API to run some Spark computations on a dedicated cluster and return the results. I would ideally want the results quickly, especially since the Spark computations needed take less than 0.1s. However wh...

Latest Reply
Rjdudley
Honored Contributor
  • 0 kudos

My first thought concurs with @Alberto_Umana, the only time I've seen queuing like that is when the cluster is not running.  Make sure you have the correct warehouse_id configured in your API calls.

1 More Replies
KalyaniJaya
by Databricks Partner
  • 1077 Views
  • 1 reply
  • 0 kudos

'dbutils.jobs.taskValues.get' taking debug value in workflow, instead of actual value being set

Hi, I am trying to pass and set values from one wheel into another wheel in a Databricks workflow. I have used 'dbutils.jobs.taskValues.get' and 'dbutils.jobs.taskValues.set'. I have used 'dbutils.jobs.taskValues.get' in the second task and made sure to keep d...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

It seems like the issue you're encountering is related to the debugValue parameter being used instead of the actual value when calling dbutils.jobs.taskValues.get. This behavior is expected when the notebook is run outside of a job context, as the de...

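The reply above can be illustrated with a small stand-in. The class below is a hypothetical shim, not the real `dbutils` object: it only mimics the documented semantics of `dbutils.jobs.taskValues` (set/get with a `debugValue` fallback), to show why the debug value appears in interactive runs but not inside an actual job run.

```python
class TaskValuesStub:
    """Hypothetical local stand-in for dbutils.jobs.taskValues (illustration only)."""

    def __init__(self, in_job_context):
        self.in_job_context = in_job_context
        self._store = {}  # (task_key, key) -> value, populated only inside a job run

    def set(self, key, value, task_key="upstream"):
        if self.in_job_context:
            self._store[(task_key, key)] = value

    def get(self, taskKey, key, default=None, debugValue=None):
        if not self.in_job_context:
            # Outside a job run, the real API falls back to debugValue
            return debugValue
        return self._store.get((taskKey, key), default)


# Interactive/debug run: debugValue is returned because there is no job context.
interactive = TaskValuesStub(in_job_context=False)
debug_result = interactive.get("task_a", "run_date", debugValue="1900-01-01")

# Real job run: the value set by the upstream task wins.
job = TaskValuesStub(in_job_context=True)
job.set("run_date", "2025-01-23", task_key="task_a")
job_result = job.get("task_a", "run_date", debugValue="1900-01-01")
print(debug_result, job_result)  # 1900-01-01 2025-01-23
```

If the debug value shows up in a scheduled run, it usually means the `get` call is executing outside the job's task context (for example, in code imported and run at module load rather than inside the task).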
Sadam97
by New Contributor III
  • 1119 Views
  • 1 reply
  • 0 kudos

How to get a Databricks support contract

We are trying to get a Databricks support contract but have had no luck. After moving here and there we found the email address gtmops@databricks.com for support contracts, but it's been 3 weeks and multiple emails from our side with no reply. What's t...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hello @Sadam97, Please use help@databricks.com: https://docs.databricks.com/en/resources/support.html

Nate-Haines
by New Contributor III
  • 9202 Views
  • 3 replies
  • 8 kudos

Driver memory utilization cleanup

Issue Summary: When running multiple jobs on the same compute cluster, over time I see an increase in memory utilization that is seemingly never fully released, even when jobs finish. This eventually leads to some jobs stalling out as memory hits the...

Latest Reply
KyleGrymonpre
New Contributor III
  • 8 kudos

I'm encountering something similar. Immediately upon starting a cluster and triggering a job run, my memory usage jumps from 0 to about 20GB used and 15GB cached (see the attached screenshot). The data I am working with should be very small (less tha...

2 More Replies
pradeepvatsvk
by New Contributor III
  • 1850 Views
  • 2 replies
  • 3 kudos

Working with Pandas through Abfss

Hi, I am unable to read and write a pandas DataFrame through the abfss protocol. Is there a workaround for this? I do not want to store my files in DBFS.

Latest Reply
Avinash_Narala
Databricks Partner
  • 3 kudos

You can use Volumes: mount the abfss location in Unity Catalog and access the files that live in Azure from Databricks. Regards, Avinash N

1 More Replies
pwtnew32
by New Contributor III
  • 1994 Views
  • 3 replies
  • 3 kudos

Resolved! Lakehouse Federation

I use Lakehouse Federation to connect a Hive metastore (on a local VM) with a MySQL metastore database. It can see the databases and tables in Hive, but when I query data the session keeps running without failing or succeeding. Do I have to migrate the data to ADLS, which...

Latest Reply
Avinash_Narala
Databricks Partner
  • 3 kudos

As for Lakehouse Federation, only some sources are supported as of now, and connecting to Hive metastore data on a local VM is not among them. You can migrate that data to ADLS, mount it as an external location in Unity Catalog, and que...

2 More Replies
weilin0323
by New Contributor III
  • 2184 Views
  • 2 replies
  • 3 kudos

Resolved! How to Apply Encryption Function to a Specific Column

Hello! I would like to apply a function to encrypt a specific column. The UDF is as follows:
DROP FUNCTION IF EXISTS EncryptColumn;
CREATE FUNCTION EncryptColumn (key_name STRING, encryptcolumn STRING) RETURN base64(aes_encrypt(encryptcolumn, key_nam...

Latest Reply
weilin0323
New Contributor III
  • 3 kudos

Hi @MadhuB, the method you provided is feasible, and I later found other ways to apply the UDF:
UPDATE table_name SET column_name = EncryptColumn(key_name, column_name)
Thank you!

1 More Replies
johngabbradley
by New Contributor II
  • 1223 Views
  • 2 replies
  • 0 kudos

Using spark.read.json with a {} literal in my path

I am pulling data from an S3 bucket using spark.read.json like this:
s3_uri = "s3://snowflake-genesys/v2.outbound.campaigns.{id}/2025-01-22/00/"
df = spark.read.json(s3_uri)
My S3 URL has the {id} in the file path. I have used r"s3://snowflake-ge...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @johngabbradley, would the approach below work for you?
s3_uri = "s3://snowflake-genesys/v2.outbound.campaigns.{id}/2025-01-22/00/"
files = dbutils.fs.ls(s3_uri)
file_paths = [file.path for file in files]
df = spark.read.json(file_paths)

1 More Replies
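The listing-then-reading pattern in the reply sidesteps glob interpretation of the literal `{id}` in the path: you enumerate the files yourself and hand the reader explicit paths. A local sketch of the same idea, where a temp directory stands in for the S3 prefix and `os.listdir` stands in for `dbutils.fs.ls` (both stand-ins are assumptions for illustration):

```python
import os
import tempfile

# Stand-in for the S3 prefix: a directory whose name contains literal braces.
root = tempfile.mkdtemp()
prefix = os.path.join(root, "v2.outbound.campaigns.{id}", "2025-01-22", "00")
os.makedirs(prefix)
for name in ("part-0.json", "part-1.json"):
    with open(os.path.join(prefix, name), "w") as f:
        f.write('{"a": 1}\n')

# Instead of handing the braced path to a glob-aware reader, list the directory
# and build explicit file paths (dbutils.fs.ls plays this role on Databricks).
file_paths = [os.path.join(prefix, name) for name in sorted(os.listdir(prefix))]
print(file_paths)
```

With the explicit `file_paths` list in hand, `spark.read.json(file_paths)` never has to interpret the `{id}` segment as a glob pattern.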
Wallace_Selis
by New Contributor
  • 1361 Views
  • 1 reply
  • 0 kudos

HELP

I can't log in. After entering the code received in the email, I remain on this screen  

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

If you try in incognito mode, does it show the same error?

avesel
by New Contributor
  • 1698 Views
  • 1 reply
  • 0 kudos

How to refer to repository directories in Workflows

Hi, I need to refer to a configuration file which resides in a separate directory than the script. The paths and execution within a notebook/Python file work fine. When the script is scheduled and uses a code repository, the directory names look obfuscated ...

Latest Reply
lauraxyz
Contributor
  • 0 kudos

Can you try with a relative path? For example, get your current path within test_script.py with a command like:
cur_path = os.getcwd()
then get the path to config.yaml with a relative path like:
config_path = os.path.abspath(os.path.join(cur_path, f"../config/c...

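One caveat with `os.getcwd()` is that a scheduler's working directory may differ from the script's location, which is exactly when repo paths look "obfuscated". A sketch that anchors the path to the script file itself instead (the `/repo/src` layout below is an assumed example, not taken from the thread):

```python
import os

def resolve_config(script_path, *relative_parts):
    """Resolve a path relative to the script's own directory, so the result
    does not depend on whatever working directory the scheduler happens to use."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    return os.path.abspath(os.path.join(script_dir, *relative_parts))

# e.g. from src/test_script.py, reach ../config/config.yaml
config_path = resolve_config("/repo/src/test_script.py", "..", "config", "config.yaml")
print(config_path)
```

Inside the real test_script.py you would pass `__file__` as `script_path`, so the same code works both interactively and when scheduled.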
sensanjoy
by Contributor II
  • 2490 Views
  • 6 replies
  • 0 kudos

Performance issue when reading data from a view.

Hi All, we are facing a performance issue and I need your help to know what could be the best approach to follow here. Existing: for each region, we have a view (Reg1_View, Reg2_View, ...) to pull data from a table (we don't have direct access to the table). And ...

Latest Reply
SharathAbh93
New Contributor II
  • 0 kudos

Does any table hold data for all regions?
1. If yes, get a materialized view created (replacing all_reg_view).
2. I see you already tried creating a staging table replacing the all_reg_view. Try creating a cluster key along with the partition. Cluster key on the...

5 More Replies
thiagoawstest
by Contributor
  • 4581 Views
  • 1 reply
  • 0 kudos

No access to databricks console

Hello, I have the following situation: when trying to configure SSO, it was enabled to allow login using Microsoft, but the problem is that the sessions expired, and now we cannot access with any email; it says that the account is not enabled. How co...

Data Engineering
AWS
dataengineer
Latest Reply
Miguel_Suarez
Databricks Employee
  • 0 kudos

Hi @thiagoawstest, please reach out to your Account Executive or Solutions Architect. They will be able to help you with the issue you're experiencing while trying to log in. Best

databricks98
by New Contributor
  • 4717 Views
  • 1 reply
  • 0 kudos

Failed to send request to Azure Databricks Cluster

We have scheduled an ADF (Azure Data Factory) pipeline that contains a Lookup activity, which is responsible for fetching the last ingested date from the Databricks catalog (Hive metastore). I attached a screenshot; please find it at https://yourimageshare.com...

Latest Reply
Miguel_Suarez
Databricks Employee
  • 0 kudos

Hi @databricks98,  It seems like there is some issue connecting to your Azure account. Were there any recent changes to firewalls, permissions, or cluster configurations? Could you please check to make sure that the connection between Databricks and ...

jsaddam28
by New Contributor III
  • 61162 Views
  • 25 replies
  • 16 kudos

How to import local python file in notebook?

For example, I have one.py and two.py in Databricks and I want to use one of the modules from one.py in two.py. Usually I do this on my local machine with an import statement, like below in two.py:
from one import module1
...
How to do this in Databricks???...

Latest Reply
PabloCSD
Valued Contributor II
  • 16 kudos

This alternative worked for us: https://community.databricks.com/t5/data-engineering/is-it-possible-to-import-functions-from-a-module-in-workspace/td-p/5199

24 More Replies
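A common pattern behind answers to this question is to put the module's directory on `sys.path` and then import normally. The sketch below builds a throwaway `one.py` in a temp directory to stand in for the workspace file (the module and function names come from the question; the temp-dir setup is purely for illustration):

```python
import importlib
import os
import sys
import tempfile

# Create a stand-in "one.py"; on Databricks this would already exist in your
# workspace or repo directory alongside two.py.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "one.py"), "w") as f:
    f.write("def module1():\n    return 'hello from one'\n")

# Make the directory importable, then import as on a local machine.
sys.path.append(module_dir)
one = importlib.import_module("one")
print(one.module1())  # hello from one
```

In a notebook you would append the workspace path of the folder containing one.py (for example the repo root) and then write `from one import module1` as usual.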
momita_s
by New Contributor II
  • 1244 Views
  • 2 replies
  • 0 kudos

Resolved! How can we fetch application id in serverless compute in databricks?

Hi All, how can we fetch the application ID in serverless compute in Databricks? We are working to use serverless compute for some jobs. The issue is we are not able to fetch the application ID in a notebook. Earlier we were using spark.sparkContext.application...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @momita_s, thanks for your question. As you mentioned, the above parameters/commands are not available on serverless, and I did not find a way to retrieve the applicationId on serverless. I will check internally, but this likely requires a feature request.

1 More Replies
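Until an applicationId equivalent exists on serverless, one hedged workaround (an assumption, not an official Databricks API) is to mint your own run-scoped identifier at the top of the job and attach it to logs or table writes:

```python
import uuid

# Hypothetical substitute for spark.sparkContext.applicationId, which is
# unavailable on serverless compute: a unique id minted once per run.
run_id = f"run-{uuid.uuid4().hex}"
print(run_id)
```

This gives every run a stable correlation key, though unlike the real applicationId it has no meaning to the platform itself.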