Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

gazzyjuruj
by Contributor II
  • 22006 Views
  • 9 replies
  • 8 kudos

Resolved! Failed to start cluster

Hi, I've run the cluster more than 5-6 times and it has failed to start every time since this past morning (about 11-12 hours that I've been facing this problem). Attaching screenshots below, and also typing the error out in case someone comes to this thread from the web in the future. Pr...

[Attachments: IMG_2152, IMG_2151]
Latest Reply
worthpicker
New Contributor II
  • 8 kudos

The cluster will be able to start and the nodes will automatically obtain the updated cluster configuration data.

8 More Replies
labromb
by Contributor
  • 2545 Views
  • 0 replies
  • 0 kudos

Converting a text widget to a list

Hi, I've been working on some parallel notebook code, which I ported to Python from the example on the Databricks website and added some exception handling; that works fine. What I would like to do is parameterise the input, but I am not succeeding, as the fun...

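For the question above, a minimal sketch of one common approach: treat the text widget as a comma-separated string and split it into a Python list before fanning out notebook runs. The widget name and notebook paths here are illustrative, not from the thread.

# Hypothetical example: a text widget holding a comma-separated list of notebook paths
dbutils.widgets.text("notebook_paths", "/jobs/child_a,/jobs/child_b")
paths = [p.strip() for p in dbutils.widgets.get("notebook_paths").split(",") if p.strip()]
for p in paths:
    dbutils.notebook.run(p, 600)  # 600-second timeout per child notebook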
anmol_deep
by New Contributor III
  • 3894 Views
  • 2 replies
  • 2 kudos

How to restore DatabricksRoot(FileStore) data after Databricks Workspace is decommissioned?

My Azure Databricks workspace was decommissioned. I forgot to copy files stored in the DatabricksRoot storage (dbfs:/FileStore/...). Can the workspace be recommissioned/restored? Is there any way to get my data back? Also, is there any difference betwe...

Latest Reply
User16764241763
Honored Contributor
  • 2 kudos

Hello @Anmol Deep, please submit a support request ASAP so we can restore the deleted workspace. You can then recover artifacts from the workspace.

1 More Replies
zach
by New Contributor III
  • 3540 Views
  • 4 replies
  • 1 kudos

Does Databricks have a google cloud Big Query equivalent of --dry_run to estimate costs before executing?

Databricks uses DBUs as a costing unit, whether it runs on top of AWS, Azure, or GCP, and I want to know if Databricks has an equivalent of Google Cloud BigQuery's --dry_run for estimating costs: https://cloud.google.com/bigquery/docs/estimate-costs

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

Not that I know of. Google uses the number of bytes read to determine the cost. Databricks uses DBUs. The number of DBUs spent is not only dependent on the amount of bytes read (the more you read, the longer the program will probably run), but also the typ...

3 More Replies
BeginnerBob
by New Contributor III
  • 2994 Views
  • 3 replies
  • 1 kudos

Loading Dimensions including SCDType2

I have a customer dimension, and for every incremental load I am applying Type 2 or Type 1 changes to the dimension. This dimension is based off a silver table in my delta lake where I am applying a merge statement. What happens if I need to go back and track ad...

Latest Reply
BeginnerBob
New Contributor III
  • 1 kudos

Thanks werners, I was informed you could essentially recreate a Type 2 dimension from scratch, without reading the files one by one, using Delta Lake time travel. However, this doesn't seem to be the case, and the only way to create this is to incremen...

2 More Replies
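As a sidebar to the thread above, a simplified Delta MERGE sketch of the Type 2 pattern being described. The table and column names (dim_customer, row_hash, is_current) and the staged updates DataFrame are hypothetical, and a production job would also filter unchanged rows before the append.

from pyspark.sql import functions as F
from delta.tables import DeltaTable

dim = DeltaTable.forName(spark, "silver.dim_customer")  # hypothetical dimension table
# Step 1: close out current rows whose attributes changed in this load
(dim.alias("t")
    .merge(updates.alias("s"), "t.customer_id = s.customer_id AND t.is_current = true")
    .whenMatchedUpdate(
        condition="t.row_hash <> s.row_hash",
        set={"is_current": "false", "end_date": "s.load_date"})
    .execute())
# Step 2: append the incoming rows as the new current versions
(updates
    .withColumn("is_current", F.lit(True))
    .withColumn("end_date", F.lit(None).cast("date"))
    .write.format("delta").mode("append").saveAsTable("silver.dim_customer"))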
alonisser
by Contributor II
  • 10103 Views
  • 8 replies
  • 6 kudos

Failing to install a library from dbfs mounted storage (adls2) with pass through credentials cluster

We've set up a premium workspace with a credential-passthrough cluster. While it does work and can access my ADLS Gen2 storage, I can't make it install a library on the cluster from there, and I keep getting "Library installation attempted on the driver no...

Latest Reply
alonisser
Contributor II
  • 6 kudos

Sorry, I can't figure this out. The link you've added is irrelevant for passthrough credentials; if we add it, the cluster won't be passthrough. Is there a way to add this just for a specific folder, while keeping passthrough for the rest?

7 More Replies
Taha_Hussain
by Databricks Employee
  • 1239 Views
  • 0 replies
  • 4 kudos

Databricks Office Hours

Register for Office Hours to participate in a live Q&A session with Databricks experts! Our next event is scheduled for June 22nd from 8:00 am to 9:00 am PT. This is your opportunity to connect directly with our experts to ask any...

wyzer
by Contributor II
  • 4668 Views
  • 4 replies
  • 2 kudos

Resolved! Unable to delete a DBFS folder

Hello everyone, I've created by mistake a DBFS folder named ${env]. But when I run this command: dbutils.fs.rm("/mnt/${env]") it returns this error: java.net.URISyntaxException: Illegal character in path at index 12: /mnt/$%7Benv]. What can I do, please ...

Latest Reply
User16764241763
Honored Contributor
  • 2 kudos

Hello @Salah K., can you try the below?

%sh rm -r /dbfs/mnt/$\{env\]

3 More Replies
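If the %sh escaping above is fiddly, an alternative sketch is to go through the /dbfs FUSE mount with plain Python, which sidesteps the URI parsing that rejects characters like ${ and ]. This assumes the cluster exposes the /dbfs local mount.

import shutil

# Plain local path through the FUSE mount: no URI parsing involved,
# so the ${env] characters are treated literally.
shutil.rmtree("/dbfs/mnt/${env]")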
shubhamb
by New Contributor III
  • 4352 Views
  • 2 replies
  • 3 kudos

Why does my Notebook fails if I try to load a function from another Notebook in Repos in Databricks

My function in func.py:

def lower_events(df):
    return df.withColumn("event", f.lower(f.col("event")))

My main notebook:

import pyspark.sql.functions as f
from pyspark.sql.functions import udf, col, lower
import sys

sys.path.append("..")
from folder.func...

Latest Reply
shubhamb
New Contributor III
  • 3 kudos

@Kaniz Fatma, can you please look into this: https://community.databricks.com/s/question/0D58Y00008ouo6xSAA/how-to-fetch-environmental-variables-saved-in-one-notebook-into-another-notebook-in-databricks-repos-and-notebooks

1 More Replies
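One hedged guess from the excerpt above: func.py references f, but judging by the snippet, only the main notebook imports pyspark.sql.functions as f. A Python module only sees its own imports, so func.py needs the import itself:

# func.py -- the module must carry its own imports;
# names imported in the calling notebook are not visible here
from pyspark.sql import functions as f

def lower_events(df):
    return df.withColumn("event", f.lower(f.col("event")))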
mortenhaga
by Contributor
  • 3414 Views
  • 2 replies
  • 2 kudos

Resolved! Importing python function with spark.read.jdbc in to Repos

Hi all! Before we used Databricks Repos, we used the %run magic to run various utility Python functions from one notebook inside other notebooks, e.g. reading from a JDBC connection. We now plan to switch to Repos to utilize the fantastic CI/CD pos...

[Attachments: image.png, image]
Latest Reply
mortenhaga
Contributor
  • 2 kudos

That's odd. I was sure I had tried that, but now it works somehow. I guess the difference is that this time I used double quotation marks. Thanks anyway! Works like a charm.

1 More Replies
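For readers landing on this thread, a minimal sketch of the pattern being discussed: keep the JDBC read in a plain Python module in the repo and import it, instead of using %run. The module name, function name, and parameters are illustrative, not from the thread.

# utils/jdbc_reader.py (hypothetical module in the repo)
def read_jdbc(spark, url, table, user, password):
    return (spark.read.format("jdbc")
            .option("url", url)
            .option("dbtable", table)
            .option("user", user)
            .option("password", password)
            .load())

# In a notebook:
# from utils.jdbc_reader import read_jdbc
# df = read_jdbc(spark, "jdbc:postgresql://host:5432/db", "public.events", "user", "pw")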
Marra
by New Contributor III
  • 7782 Views
  • 7 replies
  • 2 kudos

Read temporary views in SQL Analytics

I'm having issues trying to read temporary views in the SQL Analytics module. I've managed to create temporary views based on a query, but I don't know how to read from them. Just using the name of the view returns "Table or view not found".

Latest Reply
Marra
New Contributor III
  • 2 kudos

No, I'm actually having issues reading from the view in the same session that created it. Using the same view name, I get a "table or view not found" error.

6 More Replies
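Background sketch for the scoping issue discussed above, shown in notebook terms: a temporary view is visible only in the Spark session that created it, while a global temporary view lives under the global_temp schema and is visible to other sessions on the same cluster. Whether this helps in SQL Analytics depends on how the endpoint manages sessions.

df = spark.range(3)
df.createOrReplaceTempView("v_local")         # visible only in this session
df.createOrReplaceGlobalTempView("v_shared")  # visible cluster-wide via global_temp
spark.sql("SELECT * FROM v_local").show()
spark.sql("SELECT * FROM global_temp.v_shared").show()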
AndriusVitkausk
by New Contributor III
  • 1877 Views
  • 1 reply
  • 1 kudos

Autoloader event vs directory ingestion

For a production workload containing around 15k gzip-compressed JSON files per hour, all in a YYYY/MM/DD/HH/id/timestamp.json.gz directory structure: what would be the better approach for ingesting this into a Delta table, in terms of not only the incremental load...

Latest Reply
AndriusVitkausk
New Contributor III
  • 1 kudos

@Kaniz Fatma So I've not found a fix for the small-file problem using Auto Loader; it seems to struggle really badly against large directories. I had a cluster running for 8h stuck on the "listing directory" part with no end in sight, and the cluster seemed completely idle to...

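For context on the trade-off in this thread, a sketch of Auto Loader in file-notification mode, which avoids the directory listing that stalled the cluster above. The cloudFiles options are the documented ones, but the paths are placeholders, and notification mode needs cloud-side permissions to set up the event queue.

df = (spark.readStream.format("cloudFiles")
      .option("cloudFiles.format", "json")            # .json.gz files are decompressed automatically
      .option("cloudFiles.useNotifications", "true")  # event-driven discovery instead of listing
      .load("/mnt/landing/"))
(df.writeStream
   .format("delta")
   .option("checkpointLocation", "/mnt/checkpoints/landing")
   .trigger(once=True)
   .start("/mnt/delta/bronze_events"))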
karthikeyanr
by New Contributor II
  • 6210 Views
  • 4 replies
  • 6 kudos

Unable to import .dbc files in Databricks for "Databricks Developer Foundation Capstone"

Hi, I am not able to import the .dbc file into the Databricks workspace for the "Databricks Developer Foundation Capstone". When I click import, an error message is displayed. Secondly, when I click the GitHub link in 'Download the capstone', a 404 error is displ...

[Attachments: error, git import error]
Latest Reply
Atanu
Databricks Employee
  • 6 kudos

Hello @Karthikeyan.r3@cognizant.com R, I agree with Hubert. Please write to https://help.databricks.com

3 More Replies
Constantine
by Contributor III
  • 3560 Views
  • 1 reply
  • 1 kudos

Resolved! Can we reuse checkpoints in Spark Streaming?

I am reading data from a Kafka topic, say topic_a. I have an application, app_one, which uses Spark Streaming to read data from topic_a. I have a checkpoint location, loc_a, to store the checkpoint. Now, app_one has read data till offset 90. Can I creat...

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Hi @John Constantine, it is not recommended to share a checkpoint between your queries; every streaming query should have its own checkpoint. If you want to start at offset 90 in another query, you can define that when starting your job. You can ...

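To make the reply concrete, a sketch of a second query started at offset 90 with its own checkpoint, as suggested. The bootstrap servers, partition number, and paths are placeholders, and startingOffsets only applies when the new checkpoint location is empty.

df = (spark.readStream.format("kafka")
      .option("kafka.bootstrap.servers", "host:9092")
      .option("subscribe", "topic_a")
      .option("startingOffsets", '{"topic_a": {"0": 90}}')  # partition 0 from offset 90
      .load())
(df.writeStream
   .format("delta")
   .option("checkpointLocation", "/mnt/checkpoints/app_two")  # separate from loc_a
   .start("/mnt/delta/topic_a_app_two"))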
