Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

venkad
by Contributor
  • 2040 Views
  • 0 replies
  • 4 kudos

Default location for Schema/Database in Unity

Hello Bricksters,We organize the delta lake in multiple storage accounts. One storage account per data domain and one container per database. This helps us to isolate the resources and cost on the business domain level.Earlier, when a schema/database...

vizoso
by Databricks Partner
  • 2137 Views
  • 1 replies
  • 3 kudos

Cluster list in Microsoft.Azure.Databricks.Client fails because ClusterSource enum does not include MODELS

Cluster list in Microsoft.Azure.Databricks.Client fails because ClusterSource enum does not include MODELS. When you have a model serving cluster, the ClustersApiClient.List method fails to deserialize the API response because that cluster has MODELS as C...

saurabh12521
by Databricks Partner
  • 4913 Views
  • 3 replies
  • 4 kudos

Unity through terraform

I am working on automating Unity Catalog through Terraform. I referred to the link below to get started: https://registry.terraform.io/providers/databricks/databricks/latest/docs/guides/unity-catalog-azure. I am facing an issue when I create the metastore using...

Latest Reply
Pat
Esteemed Contributor
  • 4 kudos

Not sure if you got this working, but I noticed you are using the provider `databrickslabs/databricks`, which is why this is not available. You should be using the new provider `databricks/databricks`: https://registry.terraform.io/providers/databricks/datab...

  • 4 kudos
2 More Replies
DataBricks_2022
by New Contributor III
  • 2347 Views
  • 1 replies
  • 1 kudos

Resolved! How to get started with Auto Loader using partner academy portal? Are there any videos and step by step material

Need Video and step by step documentation on Auto Loader as well as how to build end-to-end data pipeline

Latest Reply
karthik_p
Databricks Partner
  • 1 kudos

@raja iqbal​ the course below will provide an overview of Auto Loader. Course name: How to Use Databricks' Auto Loader for Incremental ETL with the Databricks Data Science and Data Engineering Workspace. If you register for the Data Engineer Catalog, then you ...

  • 1 kudos
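Beyond the course named above, the basic Auto Loader pattern itself is small. A minimal sketch, assuming a live SparkSession; the directory names are invented:

```python
def build_autoloader_stream(spark, source_dir, schema_dir):
    """Return a streaming DataFrame that incrementally ingests new CSV files.

    Auto Loader (the "cloudFiles" source) tracks which files it has already
    processed, so each trigger only reads new arrivals in source_dir.
    """
    return (
        spark.readStream
        .format("cloudFiles")                             # Auto Loader source
        .option("cloudFiles.format", "csv")               # ingest CSV files
        .option("cloudFiles.schemaLocation", schema_dir)  # persist inferred schema
        .load(source_dir)
    )
```

The returned stream can then be written out with `writeStream` plus a checkpoint location to form an end-to-end incremental pipeline.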
cvantassel
by New Contributor III
  • 15834 Views
  • 7 replies
  • 10 kudos

Is there any way to propagate errors from dbutils?

I have a master notebook that runs a few different notebooks on a schedule using the dbutils.notebook.run() function. Occasionally, these child notebooks will fail (due to API connections or whatever). My issue is, when I attempt to catch the errors ...

Latest Reply
wdphilli
Databricks Partner
  • 10 kudos

I have the same issue. I see no reason that Databricks couldn't propagate the internal exception back through their WorkflowException

  • 10 kudos
6 More Replies
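Until Databricks propagates the internal exception, a common workaround for the thread above is to have each child notebook report its own failure through dbutils.notebook.exit() and let the master notebook re-raise it. A sketch of the pattern — `run_notebook` stands in for dbutils.notebook.run, and all names here are made up:

```python
import json

def run_child(run_notebook, path, timeout=600, args=None):
    """Call a child notebook and re-raise any error it reported via its exit value.

    dbutils.notebook.run only surfaces an opaque WorkflowException, so the child
    must serialize its own error details, e.g. inside its top-level try/except:
        dbutils.notebook.exit(json.dumps({"status": "error", "message": str(e)}))
    """
    result = json.loads(run_notebook(path, timeout, args or {}))
    if result.get("status") == "error":
        raise RuntimeError(f"{path} failed: {result.get('message')}")
    return result

# Simulated child that reports a failure the way a notebook would via exit():
def fake_child(path, timeout, args):
    return json.dumps({"status": "error", "message": "API connection refused"})
```

In a real workspace the call would be `run_child(dbutils.notebook.run, "/path/to/child")`, giving the master notebook the child's actual error message instead of the generic WorkflowException.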
parulpaul
by New Contributor III
  • 5540 Views
  • 1 replies
  • 2 kudos

AnalysisException: Multiple sources found for bigquery (com.google.cloud.spark.bigquery.BigQueryRelationProvider, com.google.cloud.spark.bigquery.v2.BigQueryTableProvider), please specify the fully qualified class name.

While reading data from BigQuery into Databricks, I am getting the error: AnalysisException: Multiple sources found for bigquery (com.google.cloud.spark.bigquery.BigQueryRelationProvider, com.google.cloud.spark.bigquery.v2.BigQueryTableProvider), please spe...

Latest Reply
Debayan
Databricks Employee
  • 2 kudos

Hi @Parul Paul​ , could you please check if this is the scenario: https://stackoverflow.com/questions/68623803/load-to-bigquery-via-spark-job-fails-with-an-exception-for-multiple-sources-foun Also, you can refer: https://github.com/GoogleCloudDatapro...

  • 2 kudos
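The usual fix for this ambiguity is to name one provider class explicitly in format() instead of the short name "bigquery". A sketch, assuming a live SparkSession; choosing the v1 class here is an assumption — either fully qualified name from the error message resolves the conflict:

```python
# Fully qualified provider class, taken from the error message itself.
BIGQUERY_PROVIDER = "com.google.cloud.spark.bigquery.BigQueryRelationProvider"

def read_bigquery(spark, table):
    """Read a BigQuery table while disambiguating the 'bigquery' short name."""
    return spark.read.format(BIGQUERY_PROVIDER).option("table", table).load()
```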
740209
by New Contributor II
  • 3379 Views
  • 4 replies
  • 1 kudos

Bug in dbutils.fs

When using dbutils.fs on an S3 bucket titled "${sometext}.${sometext}.${somenumber}${sometext}-${sometext}-${sometext}" we receive an error. PLEASE understand this is an issue with how it encodes the .${somenumber}, because we verified with boto3 that...

Latest Reply
740209
New Contributor II
  • 1 kudos

@Debayan Mukherjee​ All the information is there, please read carefully. I am not going to give you the actual bucket name I am using on a public forum. As I said above, here is the command: dbutils.fs.ls("s3a://${bucket_name_here_follow_above_format}"...

  • 1 kudos
3 More Replies
ramankr48
by Databricks Partner
  • 13564 Views
  • 3 replies
  • 6 kudos
Latest Reply
Anonymous
Not applicable
  • 6 kudos

Hi @Raman Gupta​ Hope all is well! Just wanted to check in if you were able to resolve your issue. Would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks...

  • 6 kudos
2 More Replies
parulpaul
by New Contributor III
  • 4480 Views
  • 2 replies
  • 7 kudos
Latest Reply
parulpaul
New Contributor III
  • 7 kudos

No solution found

  • 7 kudos
1 More Replies
gud4eve
by New Contributor III
  • 8415 Views
  • 5 replies
  • 5 kudos

Resolved! Why is Databricks on AWS cluster start time less than 5 mins and EMR cluster start time is 15 mins?

We are migrating from AWS EMR to Databricks. One thing that we have noticed during the POCs is that a Databricks cluster of the same size and instance type takes much less time to start compared to EMR. My understanding is Databricks also would be request...

Latest Reply
karthik_p
Databricks Partner
  • 5 kudos

@gud4eve​ what kind of cluster are you using, and have you configured pools? If not, as @Werner Stinckens​ said, it may simply be that Databricks has worked hard to make instance provisioning faster.

  • 5 kudos
4 More Replies
Raagavi
by New Contributor
  • 8275 Views
  • 1 replies
  • 1 kudos

Is there a way to read the CSV files automatically from on-premises network locations and write back to the same from Databricks?

Is there a way to read the CSV files automatically from on-premises network locations and write back to the same from Databricks? 

Latest Reply
Debayan
Databricks Employee
  • 1 kudos

Hi @Raagavi Rajagopal​ , you can access files on mounted object storage (just an example); please refer to: https://docs.databricks.com/files/index.html#access-files-on-mounted-object-storage. And in DBFS, CSV files can be read and written fr...

  • 1 kudos
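Building on the docs linked above, once the on-premises location is exposed as a mount, the round trip is only a few lines. A hypothetical sketch — the mount point and file names are invented, and the mount itself must already exist:

```python
def roundtrip_csv(spark, mount_point, in_name, out_name):
    """Read a CSV from a mounted location and write results back to the same mount."""
    df = spark.read.option("header", True).csv(f"{mount_point}/{in_name}")
    # ... transform df as needed ...
    df.write.mode("overwrite").option("header", True).csv(f"{mount_point}/{out_name}")
```

For example, `roundtrip_csv(spark, "/mnt/onprem_share", "in.csv", "out")` would read and write against the same mounted share.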
mattjones
by Databricks Employee
  • 1532 Views
  • 0 replies
  • 1 kudos

Hi all - Matt Jones here, I’m on the Data Streaming team at Databricks and wanted to share a few takeaways from last week’s Current 2022 data streaming event

Hi all - Matt Jones here, I’m on the Data Streaming team at Databricks and wanted to share a few takeaways from last week’s Current 2022 data streaming event (formerly Kafka Summit) in Austin. By far the most common question we got at the booth was ho...

Ross
by New Contributor II
  • 3053 Views
  • 1 replies
  • 0 kudos

Failed R install package of survminer in Databricks 10.4 LTS

I am trying to install the survminer package but I get a non-zero exit status. It may be due to the jpeg package, which is a prerequisite, but this also fails when installing independently. install.packages("survminer", repos = "https://cran.microsoft....

Latest Reply
shan_chandra
Databricks Employee
  • 0 kudos

@Ross Hamilton​ - Please follow the below steps in the given order: run the below init script in an isolated notebook and add the init script to the issue cluster > Advanced options > Init Scripts. %python dbutils.fs.put("/tmp/test/init_script.sh",""" #...

  • 0 kudos
pratik21
by New Contributor II
  • 1016 Views
  • 0 replies
  • 0 kudos

Business requirement to Schedule single job on different timings

Hi Everyone, my business requirement is to schedule a single job from the 1st to the 10th of the month at 12 AM, 3 AM, 12 PM, 4 PM, 8 PM, and from the 10th to month end at 1 AM, 12 PM, 4 PM, 8 PM. Right now we have created 2 schedulers to meet the requirement and using c...

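For reference, the two schedules described above can be expressed as a pair of Quartz cron expressions, the syntax Databricks job schedules use (fields: seconds, minutes, hours, day-of-month, month, day-of-week). The exact expressions are an assumption — in particular, placing the boundary day (the 10th) in the first window:

```python
# 1st-10th of the month at 12 AM, 3 AM, 12 PM, 4 PM, 8 PM
EARLY_MONTH = "0 0 0,3,12,16,20 1-10 * ?"
# 11th through month end at 1 AM, 12 PM, 4 PM, 8 PM
LATE_MONTH = "0 0 1,12,16,20 11-31 * ?"

def firing_hours(expr):
    """Extract the set of hours (third field) a Quartz expression fires at."""
    return {int(h) for h in expr.split()[2].split(",")}
```

Two job schedules with these expressions would cover the requirement without overlapping each other.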
Dave_Nithio
by Contributor II
  • 3251 Views
  • 3 replies
  • 0 kudos

Resolved! Data Engineering with Databricks Module 6.3L Error: Autoload CSV

I am currently taking the Data Engineering with Databricks course and have run into an error. I have also attempted this with my own data and had a similar error. In the lab, we are using autoloader to read a spark stream of csv files saved in the DB...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

As a small aside, you don't need the third argument in the StructField definitions (nullable defaults to True)

  • 0 kudos
2 More Replies