Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

tigger
by New Contributor III
  • 4788 Views
  • 3 replies
  • 2 kudos

Resolved! Is it possible to disable retryWrites using .option()?

Hello everyone, I'm trying to write to DocumentDB using org.mongodb.spark:mongo-spark-connector_2.12:3.0.1. The DocDB is version 4, which doesn't support Retryable Writes, so I disabled the feature by setting the option "retryWrites" to "false" (also tried wit...
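
A hedged PySpark sketch of the approach in question (the DataFrame, host, and credentials are placeholders): with connector 3.0.x, retryable writes can also be disabled in the connection string itself, which some versions honor more reliably than a bare .option("retryWrites", "false").

# Sketch, untested against DocumentDB: embed retryWrites=false in the URI
# so the setting cannot be dropped by the connector's option parsing.
(df.write
    .format("mongo")
    .option("uri", "mongodb://user:pass@docdb-host:27017/mydb?retryWrites=false")
    .option("database", "mydb")
    .option("collection", "events")
    .mode("append")
    .save())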

Latest Reply
Anonymous
Not applicable
  • 2 kudos

@Hugh Vo​ - If Sajehs's answer resolved the issue, would you be happy to mark their answer as best?

2 More Replies
Anonymous
by Not applicable
  • 2622 Views
  • 2 replies
  • 2 kudos

Resolved! Are notebooks encrypted even if no CMK is provided?

This document (https://docs.databricks.com/security/keys/customer-managed-keys-managed-services-aws.html) describes how to use a customer-managed key to encrypt notebooks in the control plane. We would like to verify: if no CMK is provided, are...

Latest Reply
Filippo-DB
Databricks Employee
  • 2 kudos

Hello @Nathan Buesgens​, from a high-level point of view, by default, notebook source code and metadata in the control plane are encrypted at rest in AWS RDS using AWS KMS with a Databricks-managed key. But there is other data related to notebooks ...

1 More Replies
kpendergast
by Contributor
  • 4094 Views
  • 3 replies
  • 3 kudos

Resolved! How do I create a job for a notebook not in the /Users/ directory?

I am setting up a job to load data from S3 into Delta using Auto Loader. I can do this fine in interactive mode. When trying to create a job in the UI, I can select the notebook in the root directory I created for the project within the create jo...
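
For illustration, a hedged sketch of creating such a job through the Jobs 2.1 REST API instead of the UI; the workspace URL, token, cluster ID, and notebook path below are all placeholders.

import requests

HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"                        # placeholder

payload = {
    "name": "s3-to-delta-autoloader",
    "tasks": [{
        "task_key": "ingest",
        "existing_cluster_id": "<cluster-id>",           # placeholder
        # Any workspace path can be referenced here, not just /Users/...
        "notebook_task": {"notebook_path": "/MyProject/ingest_notebook"},
    }],
}
resp = requests.post(f"{HOST}/api/2.1/jobs/create",
                     headers={"Authorization": f"Bearer {TOKEN}"},
                     json=payload)
resp.raise_for_status()
print("Created job:", resp.json()["job_id"])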

Latest Reply
User16844513407
Databricks Employee
  • 3 kudos

Hi @Ken Pendergast​, you are supposed to be able to reference any notebook you have the right permissions on, so it looks like you are running into a bug. Can you please reach out to support or email me directly with your workspace ID? My email is jan...

2 More Replies
yadsmc
by New Contributor II
  • 2676 Views
  • 3 replies
  • 0 kudos

Resolved! SQL Issues with 10.0 runtime

I was testing my SQL queries with the new 10.0 runtime and found something interesting/weird. The same SQL with the explode function fails for some scenarios in 10.0! I could not figure out the reason yet.
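
A hypothetical minimal repro along these lines (column names invented) can help pin down which explode scenarios changed on the 10.0 runtime, e.g. NULL versus populated arrays:

# Sketch: compare explode() and explode_outer() behaviour across runtimes.
from pyspark.sql import functions as F

df = spark.createDataFrame([(1, [1, 2, 3]), (2, None)], "id INT, values ARRAY<INT>")
df.select("id", F.explode("values").alias("v")).show()        # drops id=2
df.select("id", F.explode_outer("values").alias("v")).show()  # keeps id=2 as NULL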

Latest Reply
BilalAslamDbrx
Databricks Employee
  • 0 kudos

@Yadhuram MC​  if the issue persists, please email me at bilal dot aslam at databricks dot com. I would like to get to the root of this issue. It

2 More Replies
Anonymous
by Not applicable
  • 3513 Views
  • 2 replies
  • 2 kudos

Resolved! OPTIMIZE

I have been testing OPTIMIZE on a huge set of data (about 775 million rows) and getting mixed results. When I tried it on a 'string' column, the query returned in 2.5 minutes; using the same column as 'integer' with the same query, it returned in 9.7 seconds. Pl...
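
For reference, a hedged sketch of the command shape being compared; the table and column names are placeholders.

# Sketch: compact the table and Z-order by the column used in filters;
# timings can differ noticeably between a wide string column and an integer.
spark.sql("OPTIMIZE events ZORDER BY (event_id)")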

Latest Reply
Anonymous
Not applicable
  • 2 kudos

@Werner Stinckens​  Thanks for your explanation.

1 More Replies
Atharva_Salunke
by Contributor
  • 16926 Views
  • 20 replies
  • 10 kudos

Resolved! Haven't received Badge for Apache Spark 3.0 Associate Dev certification

I took my exam on 8/10/2021 and passed it. Subsequently I received my certificate but haven't received the badge associated with it yet. It's been almost 2 weeks since I received the certificate, and I have raised 2 requests through the main c...

Latest Reply
Anonymous
Not applicable
  • 10 kudos

@Atharva Salunke​ - If you're referring to @Maithreyi Pagar​, they're all set.

19 More Replies
Anonymous
by Not applicable
  • 2680 Views
  • 0 replies
  • 0 kudos

Is the "patch"/update method of the repos API synchronous?

The Repos API has a patch method to update a repo in the workspace (to do a git pull). We would like to verify: is this method fully synchronous? Is it guaranteed to only return a 200 after the update is complete? Or would immediately referenc...
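
One way to probe this, sketched with a placeholder host, token, and repo ID: call the PATCH endpoint and read the workspace immediately after the 200 comes back.

import requests

HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"                        # placeholder
REPO_ID = "<repo-id>"                                    # placeholder

resp = requests.patch(f"{HOST}/api/2.0/repos/{REPO_ID}",
                      headers={"Authorization": f"Bearer {TOKEN}"},
                      json={"branch": "main"})
resp.raise_for_status()  # the question: is a 200 here a completion guarantee?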

RantoB
by Valued Contributor
  • 8841 Views
  • 8 replies
  • 3 kudos

Resolved! How to export a Databricks repo in DBC format with the Databricks CLI

Hi, how can I export a Databricks repository in DBC format with the Databricks CLI? It is possible to run "databricks workspace export_dir path/to/dir ." but not "databricks repos export_dir path/to/dir .". Thanks for your answers.
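
As a workaround sketch (host, token, and repo path are placeholders): the Workspace export API accepts format=DBC and should also accept /Repos paths, so a repo can be pulled down as a .dbc archive even though the repos CLI has no export_dir.

import base64
import requests

HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"                        # placeholder

resp = requests.get(f"{HOST}/api/2.0/workspace/export",
                    headers={"Authorization": f"Bearer {TOKEN}"},
                    params={"path": "/Repos/me/my-repo", "format": "DBC"})
resp.raise_for_status()
with open("my-repo.dbc", "wb") as f:
    f.write(base64.b64decode(resp.json()["content"]))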

Latest Reply
Prabakar
Databricks Employee
  • 3 kudos

@Bertrand BURCKER​ is your requirement to do it only from the CLI, or just to export the repos? If it is to export the repos, you can export them in DBC format from the UI.

7 More Replies
JK2021
by New Contributor III
  • 10740 Views
  • 10 replies
  • 5 kudos

Resolved! An unidentified special character is added to the outbound file when transformed in Databricks. Please help with suggestions?

Data from an external source is copied to ADLS, which is then picked up by Databricks; this massaged data is put in the outbound file. A special character ? (question mark in a black diamond) is seen in some fields in the outbound file, which may br...

Latest Reply
-werners-
Esteemed Contributor III
  • 5 kudos

Are you sure it is Databricks that puts the special character in place? It could also have happened during the copy from the external system to ADLS. If you use Azure Data Factory, for example, you have to define the encoding (UTF-8 or UTF-16, ...)
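
A hedged sketch of that check on the Databricks side, with placeholder paths: U+FFFD usually means bytes were decoded with the wrong charset, so re-reading with an explicit encoding is a quick test.

# Sketch: read the ADLS file with an explicit charset; if the black-diamond
# character disappears, the mismatch happened upstream of this read.
df = (spark.read
      .option("encoding", "UTF-8")     # try "UTF-16" or "ISO-8859-1" as well
      .option("header", "true")
      .csv("abfss://container@account.dfs.core.windows.net/inbound/file.csv"))
df.show(5, truncate=False)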

9 More Replies
MartinB
by Contributor III
  • 14736 Views
  • 8 replies
  • 9 kudos

Resolved! Is there a way to create a non-temporary Spark View with PySpark?

Hi, when creating a Spark view using SparkSQL ("CREATE VIEW AS SELECT ..."), by default this view is non-temporary: the view definition will survive the Spark session as well as the Spark cluster. In PySpark I can use DataFrame.createOrReplaceTempView...
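
For context, a sketch of the usual workaround (database/table/view names are placeholders): route the same DDL through spark.sql over a persisted table, since a permanent view cannot reference a temp view.

# Sketch: persist the DataFrame, then create a non-temporary view over it.
df.write.mode("overwrite").saveAsTable("my_db.base_table")
spark.sql("""
    CREATE OR REPLACE VIEW my_db.persistent_view AS
    SELECT * FROM my_db.base_table
""")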

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 9 kudos

Why not create a managed table?

dataframe.write.mode("overwrite").saveAsTable("<example-table>")
# later, when we need the data
resultDf = spark.read.table("<example-table>")

7 More Replies
Nick_Hughes
by New Contributor III
  • 6692 Views
  • 9 replies
  • 4 kudos

Resolved! Does Databricks persist the alert results history?

Hi, this seems like a really basic feature to have: an alert is generated and sends out the email, but the URL doesn't take you to the list of events that happened at that time, just to the query in the editor (via the alert config screen). We're really...

Latest Reply
Prabakar
Databricks Employee
  • 4 kudos

@Nick Hughes​ I was looking at our ideas portal and could see an API feature was requested for the Alerts (DB-I-4289). To create an API for the alerts and pull data, we might need to keep the alert history persistent. So this feature should suffice y...

8 More Replies
Sandeep
by Databricks Employee
  • 975 Views
  • 0 replies
  • 4 kudos

spark.apache.org

Per the API docs on StreamingQuery.stop(), https://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/streaming/StreamingQuery.html, it says this stops the execution of the query if it is running and waits until the termination of the query...
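
A short sketch of the semantics the docs describe, using the built-in rate source and memory sink so it is self-contained:

# Sketch: stop() halts the running query and blocks until its execution
# thread has terminated, so code after it can assume the stream is down.
query = (spark.readStream.format("rate").load()
         .writeStream.format("memory").queryName("rate_demo").start())
query.stop()            # returns only after the query has terminated
assert not query.isActive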

dimoobraznii
by New Contributor III
  • 8803 Views
  • 5 replies
  • 6 kudos

Resolved! Autoloader failed

I used Auto Loader with TriggerOnce = true and ran it for weeks on a schedule. Today it broke: The metadata file in the streaming source checkpoint directory is missing. This metadata file contains important default options for the stream, so the stream...

Latest Reply
Deepak_Bhutada
Databricks Employee
  • 6 kudos

Hi dimoobraznii (Customer), this error comes up in streaming when someone makes changes to the streaming checkpoint directory manually or points one streaming type at the checkpoint of some other streaming type. Please check if any changes were made to t...
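
For orientation, a hedged Trigger-Once Auto Loader sketch with placeholder paths; the checkpointLocation directory is where the metadata file from the error lives, and it must stay private to this one stream.

(spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")
    .load("s3://bucket/landing/")
    .writeStream
    .format("delta")
    .option("checkpointLocation", "s3://bucket/_checkpoints/ingest")  # do not edit or share
    .trigger(once=True)
    .start("s3://bucket/delta/ingest"))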

4 More Replies
Anonymous
by Not applicable
  • 3793 Views
  • 3 replies
  • 7 kudos

Resolved! How does 73% of the data go unused for analytics or decision-making?

Is Lakehouse the answer? Here's a good resource that was just published: https://dbricks.co/3q3471X

Latest Reply
Anonymous
Not applicable
  • 7 kudos

@Alexis Lopez​ - If @Dan Zafar​ 's or @Harikrishnan Kunhumveettil​'s answers solved the issue, would you be happy to mark one of their answers as best so other members can find the solution more easily?

2 More Replies
Gapy
by New Contributor II
  • 2067 Views
  • 1 replies
  • 1 kudos

Auto Loader Schema-Inference and Evolution for parquet files

Dear all, will (and when will) Auto Loader also support schema inference and evolution for Parquet files? At this point it is only supported for JSON and CSV, if I am not mistaken. Thanks and regards, Gapy

Latest Reply
Sandeep
Databricks Employee
  • 1 kudos

@Gasper Zerak​, this will be available in the near future (DBR 10.3 or later). Unfortunately, we don't have an SLA at this moment.
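
For reference, a sketch of the schema-inference and evolution options as they exist for JSON/CSV today (paths are placeholders); per the reply above, Parquet support was expected in a later runtime.

(spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "csv")                      # JSON/CSV today
    .option("cloudFiles.schemaLocation", "/mnt/chk/schema")  # inferred schema tracked here
    .option("cloudFiles.schemaEvolutionMode", "addNewColumns")
    .load("/mnt/landing/csv"))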

