Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Rk2
by New Contributor II
  • 2049 Views
  • 2 replies
  • 4 kudos

Resolved! scheduling a job with multiple notebooks using common parameter

I have a practical use case: three notebooks (PySpark) all have one common parameter. I need to schedule all three notebooks in a sequence. Is there any way to run them by setting one parameter value, as it is the same in all? Please suggest the ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 4 kudos

@Ramesh Kotha, in the notebook, get the parameter like this: my_parameter = dbutils.widgets.get("my_parameter") and set it in a task like this:
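The notebook side of this reply can be sketched as below. It is a minimal sketch, not the poster's code: `get_shared_parameter` and `FakeWidgets` are hypothetical names introduced here, and the only Databricks call assumed is `dbutils.widgets.get`, which the reply itself uses. The helper takes the widgets object as an argument so the same function can be exercised outside a Databricks notebook.

```python
def get_shared_parameter(widgets, name, default=None):
    """Return the value of the widget `name`, or `default` if it cannot be read."""
    try:
        return widgets.get(name)
    except Exception:
        # Assumption: reading an undefined widget raises an exception.
        return default


class FakeWidgets:
    """Hypothetical stand-in for dbutils.widgets, only for running outside Databricks."""

    def __init__(self, values):
        self._values = dict(values)

    def get(self, name):
        # Raises KeyError when the widget is not defined.
        return self._values[name]


# Outside Databricks: simulate a job that passes the same parameter to every task.
widgets = FakeWidgets({"my_parameter": "2022-05-01"})
my_parameter = get_shared_parameter(widgets, "my_parameter")
print(my_parameter)
```

Inside each of the three notebooks the call would presumably be `get_shared_parameter(dbutils.widgets, "my_parameter")`, with the value supplied once per task in the job definition.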

1 More Replies
SailajaB
by Valued Contributor III
  • 5430 Views
  • 3 replies
  • 7 kudos

Resolved! how we can use config file to change pysparks dataframe names without hardcoding

Hi, can we use a config file to change PySpark DataFrame attribute names (root and nested, of both struct and array type)? In the input we get attributes in lowercase and we need to convert them into camel case (please note we don't have any separat...
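One possible way to drive such a rename from a config is sketched below. It is an illustration, not the solution posted in the thread: the mapping, the field names, and the helper `rename_schema_fields` are all hypothetical. The function works on a schema expressed as a plain dict in the shape Spark uses for its JSON schema representation, so it runs without a Spark session.

```python
import json

# Hypothetical config: lowercase attribute name -> camelCase name.
CONFIG_TEXT = '{"firstname": "firstName", "zipcode": "zipCode", "phonetype": "phoneType"}'


def rename_schema_fields(node, mapping):
    """Return a copy of a schema dict with struct field names replaced via `mapping`."""
    if isinstance(node, dict):
        renamed = {key: rename_schema_fields(value, mapping) for key, value in node.items()}
        # A struct field entry carries both "name" and "type"; only those get renamed.
        if isinstance(node.get("name"), str) and "type" in node:
            renamed["name"] = mapping.get(node["name"], node["name"])
        return renamed
    if isinstance(node, list):
        return [rename_schema_fields(item, mapping) for item in node]
    return node


def field(name, dtype):
    """Build one struct field entry in the JSON schema layout."""
    return {"name": name, "type": dtype, "nullable": True, "metadata": {}}


# Example schema with a root field, a nested struct, and an array of structs.
schema = {"type": "struct", "fields": [
    field("firstname", "string"),
    field("address", {"type": "struct", "fields": [field("zipcode", "string")]}),
    field("phones", {"type": "array", "containsNull": True,
                     "elementType": {"type": "struct", "fields": [field("phonetype", "string")]}}),
]}

new_schema = rename_schema_fields(schema, json.loads(CONFIG_TEXT))
print(json.dumps(new_schema, indent=2))
```

In PySpark the input dict could presumably come from `df.schema.jsonValue()` and the result be turned back into a schema with `StructType.fromJson(new_schema)`; how the renamed schema is then applied to the DataFrame is left open here.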

Latest Reply
Anonymous
Not applicable
  • 7 kudos

Hi @Sailaja B, this is awesome! Thanks for coming in and posting the solution. We really appreciate it. Cheers!

2 More Replies
Tahseen0354
by Valued Contributor
  • 1781 Views
  • 1 replies
  • 1 kudos

Configure CLI on databricks on GCP

Hi, I have a service account in my GCP project, and the service account is added as a user in my Databricks GCP account. Is it possible to configure the CLI on Databricks on GCP using that service account? Something similar to: databricks configure ---tok...

LukaszJ
by Contributor III
  • 5555 Views
  • 4 replies
  • 4 kudos

Resolved! Terraform: get metastore id without creating new metastore

Hello, I want to create a database (schema) and tables in my Databricks workspace using Terraform. I found this resource: databricks_schema. It requires databricks_catalog, which requires metastore_id. However, I have databricks_workspace and I did not cre...

Latest Reply
Atanu
Databricks Employee
  • 4 kudos

https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs/resources/schema I think this is for UC. https://docs.databricks.com/data-governance/unity-catalog/index.html

3 More Replies
Juniper_AIML
by New Contributor
  • 4879 Views
  • 3 replies
  • 0 kudos

How to access the virtual environment directory where the databricks notebooks are running?

How can we get access to a separate virtual environment space and its storage location on Databricks, so that we can move our created libraries into it without waiting for their installation each time the cluster is brought up? What we want basically is a ...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hey there @Aman Gaurav, thank you for posting your question. Just wanted to check in: were you able to resolve your issue, or do you need more help? We'd love to hear from you. Thanks!

2 More Replies
alejandrofm
by Valued Contributor
  • 5055 Views
  • 4 replies
  • 4 kudos

Resolved! Are there any recommended spark config settings for Delta/Databricks?

Hi! I'm starting to test configs on Databricks, for example, to avoid corrupting data if two processes try to write at the same time: .config('spark.databricks.delta.multiClusterWrites.enabled', 'false') Or if I need more partitions than the default: .confi...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hey there @Alejandro Martinez, hope everything is going well. Just wanted to see if you were able to find an answer to your question. If yes, would you be happy to let us know and mark it as best so that other members can find the solution more quickl...

3 More Replies
DejanSunderic
by New Contributor III
  • 14919 Views
  • 11 replies
  • 3 kudos

is command stuck?

I created some ETL using DataFrames in Python. It used to run in 180 sec, but it is now taking ~1200 sec. I have been changing it, so it could be something that I introduced, or something in the environment. Part of the process is appending results into...

Latest Reply
Carneiro
New Contributor II
  • 3 kudos

I am having a very similar problem. Since yesterday, for no known reason, some commands that used to run daily are now stuck in a "Running command" state. Commands like: dataframe.show(n=1), dataframe.toPandas(), dataframe.description(), dataframe.wr...

10 More Replies
Thefan
by New Contributor II
  • 1205 Views
  • 0 replies
  • 1 kudos

Koalas dropna in DLT

Greetings! I've been trying out DLT for a few days, but I'm running into an unexpected issue when trying to use Koalas dropna in my pipeline. My goal is to drop all columns that contain only null/na values before writing it. Current code is this: @dlt...

shawncao
by New Contributor II
  • 4325 Views
  • 0 replies
  • 0 kudos

REST api to execute SQL query and read output

Hi there, I'm using these two APIs to execute SQL statements and read the output back when it's finished. However, it seems to always return only 1000 rows even though I need all the results (millions of rows). Is there a solution for this? execute SQL: htt...

Jackie
by New Contributor II
  • 6568 Views
  • 3 replies
  • 6 kudos

Resolved! speed up a for loop in python (azure databrick)

Code example:
# a list of file paths
list_files_path = ["/dbfs/mnt/...", ..., "/dbfs/mnt/..."]
# copy all files above to this folder
dest_path = "/dbfs/mnt/..."
for file_path in list_files_path:
    # copy function
    copy_file(file_path, dest_path)
I am runni...
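A sequential copy loop like the one above could be parallelized with a thread pool, on the assumption that the copies are I/O-bound. This is a minimal sketch, not the accepted answer from the thread: `copy_files_parallel` is a hypothetical helper, `shutil.copy` stands in for the poster's unspecified `copy_file`, and the worker count is an arbitrary choice that would need tuning.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_files_parallel(file_paths, dest_dir, max_workers=8):
    """Copy every file in `file_paths` into `dest_dir` using a pool of threads."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    def copy_one(src):
        # shutil.copy stands in for the original copy_file (an assumption).
        return shutil.copy(src, dest / Path(src).name)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() waits for all copies and re-raises the first failure, if any.
        return list(pool.map(copy_one, file_paths))
```

Usage would mirror the original loop: `copy_files_parallel(list_files_path, dest_path)`. Whether this is actually faster depends on the storage backend, so it is worth timing against the plain loop on a small batch first.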

Latest Reply
Hemant
Valued Contributor II
  • 6 kudos

@Jackie Chan, what's the data size you want to copy? If it's bigger, then use ADF.

2 More Replies
818674
by New Contributor III
  • 10483 Views
  • 10 replies
  • 8 kudos

Resolved! How to perform a cross-check for data in multiple columns in same table?

I am trying to check whether a certain datapoint exists in multiple locations. This is what my table looks like: I am checking whether the same datapoint is in two locations. The idea is that this datapoint should exist in BOTH locations, and be counte...

Table Examples of Results for Cross-Checking
Latest Reply
818674
New Contributor III
  • 8 kudos

Hi, thank you very much for following up. I no longer need assistance with this issue.

9 More Replies
deisou
by New Contributor
  • 4367 Views
  • 4 replies
  • 2 kudos

Resolved! What is the best strategy for backing up a large Databricks Delta table that is stored in Azure blob storage?

I have a large Delta table that I would like to back up, and I am wondering what the best practice is for backing it up. The goal is so that if there is any accidental corruption or data loss either at the Azure blob storage level or within Databricks...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @deisou, just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark the answer as best? If not, please tell us so we can help you. Cheers!

3 More Replies
rgrosskopf
by New Contributor II
  • 1255 Views
  • 0 replies
  • 1 kudos

How to use Databricks Feature Store for time series forecasts?

I've seen the Databricks documentation on time series here. I'm using forecasts as a feature and those forecasts have both an as-of timestamp (when the forecast was generated) and a time step label (timestamp indicating the time of the forecasted obs...

