Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

GCera
by New Contributor II
  • 2584 Views
  • 1 reply
  • 0 kudos

import * from ../my/relative/path

I have the following repository structure: /Repos/main/MyRepo/ -> run (folder) -> setup (folder) -> src (folder) -> main (notebook). Now, starting from a notebook in the "src" folder I need to run/import all variables defined in all notebooks stored in ...

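A common workaround for this kind of layout (a sketch; the repo root below is a placeholder, and the exact workspace path depends on your setup) is to put the sibling folder on sys.path so that plain .py modules in it become importable; notebooks themselves still have to be executed with %run:

```python
import os
import sys

def add_repo_folder_to_path(repo_root: str, subfolder: str) -> str:
    """Prepend a folder inside a Databricks Repo to sys.path so that
    .py modules in it can be imported with a normal `import` statement."""
    folder = os.path.join(repo_root, subfolder)
    if folder not in sys.path:
        sys.path.insert(0, folder)
    return folder

# Hypothetical layout based on the post (paths are placeholders):
added = add_repo_folder_to_path("/Workspace/Repos/main/MyRepo", "setup")
# After this, `from my_module import *` works for setup/my_module.py.
# Notebooks are not .py files, so they still need %run ../setup/notebook_name.
```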
sanjay
by Valued Contributor II
  • 14816 Views
  • 8 replies
  • 0 kudos

error after updating delta table com.databricks.sql.transaction.tahoe.DeltaUnsupportedOperationException: Detected a data update

Hi, I have a pipeline running. I have updated one file in a Delta table which is already processed. Now I am getting the error com.databricks.sql.transaction.tahoe.DeltaUnsupportedOperationException: Detected a data update. This is currently not supported. If ...

Latest Reply
Sanjeev_Chauhan
New Contributor II
  • 0 kudos

Hi Sanjay, you can try adding .option("overwriteSchema", "true") to your write.

7 More Replies
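For the streaming error itself, the message usually points at a read-side option rather than a write-side one. A minimal sketch (the applicable option name depends on the DBR version: older runtimes use ignoreChanges, newer ones also accept skipChangeCommits; verify against your runtime before relying on it):

```python
def change_tolerant_read_options(use_skip_change_commits: bool = False) -> dict:
    """Build the Delta readStream options the 'Detected a data update'
    error suggests. Which option applies depends on the runtime, so
    treat this as a sketch, not a definitive fix."""
    if use_skip_change_commits:
        return {"skipChangeCommits": "true"}
    return {"ignoreChanges": "true"}

# Usage sketch (placeholder path, not from the thread):
# spark.readStream.format("delta") \
#      .options(**change_tolerant_read_options()) \
#      .load("/mnt/delta/source_table")
```

Note that ignoreChanges re-delivers rewritten rows downstream, so the consumer must tolerate duplicates.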
abhijitnag
by New Contributor II
  • 1869 Views
  • 2 replies
  • 0 kudos

Materialized view creation not supported from DLT pipeline

Hi Team, I have a very basic scenario where I am using my custom catalog and want a materialized view to be created from a DLT table at the end of the pipeline. The SQL used is shown below, where "loom_data_transform" is a streaming table. But the pipeline...

Data Engineering
Delta Live Table
dlt
Unity Catalog
Latest Reply
warsamebashir
New Contributor II
  • 0 kudos

Hey @abhijitnag, are you sure your loom_data_transform was created as a STREAMING table? Docs: https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-create-streaming-table.html

1 More Replies
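For reference, the rough shape of the DLT SQL involved can be sketched as a statement builder (the view name below is a placeholder; only loom_data_transform comes from the thread):

```python
def materialized_view_ddl(view_name: str, source_table: str) -> str:
    """Build the CREATE ... MATERIALIZED VIEW statement a DLT pipeline
    would run over a streaming table (names are illustrative)."""
    return (
        f"CREATE OR REFRESH MATERIALIZED VIEW {view_name} AS\n"
        f"SELECT * FROM {source_table}"
    )

ddl = materialized_view_ddl("daily_summary", "loom_data_transform")
```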
Pavani22
by New Contributor
  • 801 Views
  • 0 replies
  • 0 kudos

Registered for Wrong certification

Hi, I registered for a Databricks certification today at 23:00 and accidentally chose the Databricks Data Analyst certification instead of the Databricks Data Engineer Associate. I have sent several emails and submitted several forms to try to fix th...

naveenprasanth
by New Contributor
  • 2429 Views
  • 1 reply
  • 1 kudos

Issue with Reading MongoDB Data in Unity Catalog Cluster

I am encountering an issue while trying to read data from MongoDB in a Unity Catalog Cluster using PySpark. I have shared my code below: from pyspark.sql import SparkSession database = "cloud" collection = "data" Scope = "XXXXXXXX" Key = "XXXXXX-YYY...

Data Engineering
mongodb
spark config
Spark Connector package
Unity Catalog
Latest Reply
Wojciech_BUK
Valued Contributor III
  • 1 kudos

A few points: 1. Check that you installed exactly the same driver version you are pointing to in code (2.12:3.2.0); it has to match 100 percent: org.mongodb.spark:mongo-spark-connector_2.12:3.2.0. 2. I have seen people configuring the connection to Atlas in two way...

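The reply's second point (connection configuration) can be sketched as a plain dict of Spark conf keys for the 3.x connector; the host and names below are placeholders, and in practice the credentials should come from dbutils.secrets rather than literals:

```python
def mongo_spark_read_conf(user: str, password: str, host: str,
                          database: str, collection: str) -> dict:
    """Spark conf entries for mongo-spark-connector_2.12:3.2.0 (the
    version mentioned in the reply; the 10.x connector uses different
    key names). All values here are illustrative."""
    uri = f"mongodb+srv://{user}:{password}@{host}/{database}"
    return {
        "spark.mongodb.input.uri": uri,
        "spark.mongodb.input.database": database,
        "spark.mongodb.input.collection": collection,
    }

# Usage sketch:
# conf = mongo_spark_read_conf(user, pwd, "cluster0.example.net", "cloud", "data")
# df = spark.read.format("mongo").options(**{"uri": conf["spark.mongodb.input.uri"]}).load()
```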
Learnit
by New Contributor II
  • 2028 Views
  • 1 reply
  • 0 kudos

Unable to locate "Configure Databricks" in Visual Studio Code

Need help locating the "Configure Databricks" option to connect to a cluster in Visual Studio Code. Not sure which settings need to be enabled to view/access the configuration. Below is a screenshot of it.

Latest Reply
Learnit
New Contributor II
  • 0 kudos

NivRen
by New Contributor
  • 1567 Views
  • 0 replies
  • 0 kudos

Dashboard HTML Export: DisplayHTML Formatting Lost

I am building an analytics dashboard that we plan to export into HTML to provide to clients. We are running into major issues specifically with DisplayHTML visualizations. We are using DisplayHTML for help text, sub-headings, and for a KPI banner at ...

Data Engineering
dashboard
DisplayHTML
dzmitry_tt
by New Contributor
  • 3097 Views
  • 1 reply
  • 0 kudos

DeltaRuntimeException: Keeping the source of the MERGE statement materialized has failed repeatedly.

I'm using Autoloader (in Azure Databricks) to read parquet files and write their data into a Delta table. schemaEvolutionMode is set to 'rescue'. In foreach_batch I do: 1) transform of the read dataframe; 2) create a temp view based on the read dataframe and merg...

Data Engineering
autoloader
MERGE
streaming
Latest Reply
Wojciech_BUK
Valued Contributor III
  • 0 kudos

Hmm, you can't have duplicated data in the source dataframe/batch, but that should error out with a different error, like "Cannot perform Merge as multiple source rows matched and attempted to modify the same target row...". Also, this behaviour after rerun is str...

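One way to guard a foreachBatch merge against duplicate keys (a sketch; the table and column names are placeholders, not from the thread) is to drop duplicates on the merge keys before issuing the MERGE:

```python
def merge_sql(target: str, source_view: str, key_cols: list) -> str:
    """Build the MERGE statement issued from foreachBatch. The batch
    dataframe should be deduplicated on key_cols first, e.g.
    batch_df.dropDuplicates(key_cols), to avoid 'multiple source rows
    matched' failures."""
    on_clause = " AND ".join(f"t.{c} = s.{c}" for c in key_cols)
    return (
        f"MERGE INTO {target} t USING {source_view} s ON {on_clause} "
        "WHEN MATCHED THEN UPDATE SET * "
        "WHEN NOT MATCHED THEN INSERT *"
    )

# Usage sketch inside foreach_batch (placeholders):
# batch_df.dropDuplicates(["id"]).createOrReplaceTempView("batch_src")
# spark.sql(merge_sql("target_table", "batch_src", ["id"]))
```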
EDDatabricks
by Contributor
  • 2386 Views
  • 1 reply
  • 0 kudos

Slow stream static join in Spark Structured Streaming

Situation: Records are streamed from an input Delta table via a Spark Structured Streaming job. The streaming job performs the following: read from input Delta table (readStream); static join on small JSON; static join on big Delta table; write to three Delta...

Data Engineering
Azure Databricks
optimization
Spark Structured Streaming
Stream static join
Latest Reply
Wojciech_BUK
Valued Contributor III
  • 0 kudos

The machines you are using are quite small; please take into consideration that a lot of a machine's memory is occupied by other processes: https://kb.databricks.com/clusters/spark-shows-less-memory. It is not a good idea to broadcast a huge data fra...

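The broadcast concern can be made concrete with a conf sketch: disable automatic broadcast so Spark never ships the big Delta table to every executor, and broadcast only the small JSON side explicitly (the setting and usage below are a sketch, not tuned values):

```python
# Conf sketch for the stream-static join described above.
stream_static_join_conf = {
    # -1 disables automatic broadcast joins entirely; alternatively set a
    # small positive byte threshold so only the tiny JSON lookup qualifies.
    "spark.sql.autoBroadcastJoinThreshold": "-1",
}

# Usage sketch (pyspark, placeholder dataframe names):
# from pyspark.sql.functions import broadcast
# for k, v in stream_static_join_conf.items():
#     spark.conf.set(k, v)
# joined = (stream_df
#           .join(broadcast(small_json_df), "key")   # tiny lookup: broadcast
#           .join(big_delta_df, "key"))              # big table: no hint
```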
Erik
by Valued Contributor III
  • 12986 Views
  • 4 replies
  • 3 kudos

Resolved! How to run code formatting on the notebooks

Has anyone found a nice way to run code formatting (like black) on the notebooks **in the workspace**? My current workflow is to commit the file, pull it locally, format, re-push and pull. It would be nice if there were some relatively easy way to run blac...

Latest Reply
MartinPlay01
New Contributor II
  • 3 kudos

Hi Erik, I don't know if you are aware of this feature: there is currently an option to format the code in your Databricks notebooks using the black code style formatter. You just need to have a DBR version equal to or greater than 11.2 ...

3 More Replies
XClar_40456
by New Contributor
  • 1966 Views
  • 2 replies
  • 1 kudos

Resolved! Are there system tables that are customer accessible for setting up job run health monitoring in GCP Databricks?

Is Overwatch still an active project, is there anything equivalent for GCP Databricks or any plans for Overwatch to be available in GCP? 

Latest Reply
SriramMohanty
Databricks Employee
  • 1 kudos

Yes, Overwatch supports GCP.

1 More Replies
rt-slowth
by Contributor
  • 614 Views
  • 0 replies
  • 0 kudos

Help design my streaming pipeline

Data source: AWS RDS; database migration tasks have been created using AWS DMS; the relevant CDC information is being stored in a specific bucket in S3. Data frequency: once a day (but not sure when, sometime after 6pm). Development environment: d...

RabahO
by New Contributor III
  • 1965 Views
  • 1 reply
  • 0 kudos

Handling data close to SCD2 with Delta tables

Hello, stack used: PySpark and Delta tables. I'm working with some data that looks a bit like SCD2 data. Basically, the data has columns that represent an id, a rank column and other information. Here's an example: login, email, business_timestamp => the...

Latest Reply
Wojciech_BUK
Valued Contributor III
  • 0 kudos

Your problem is exactly SCD2. You just add one more column with a valid-to date (optionally you can also add an is-current flag to tag current records). You can use the DLT APPLY CHANGES syntax, or alternatively a MERGE statement. On top of that table you can bu...

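The valid-to pattern from the reply can be illustrated with a tiny in-memory sketch (column names like login and valid_from are illustrative; on Delta tables this would be a MERGE or DLT APPLY CHANGES, not Python lists):

```python
def apply_scd2(history: list, incoming: dict, key: str,
               ts_field: str = "valid_from") -> list:
    """SCD2 sketch: close the currently open row for the same key by
    setting its valid_to to the incoming row's timestamp, then append
    the incoming row as the new open version (valid_to = None)."""
    out = []
    for row in history:
        if row[key] == incoming[key] and row.get("valid_to") is None:
            out.append(dict(row, valid_to=incoming[ts_field]))  # close it
        else:
            out.append(row)
    out.append(dict(incoming, valid_to=None))  # new current version
    return out
```

Querying the "current" state then reduces to filtering valid_to IS NULL, which is what the is-current flag mentioned in the reply shortcuts.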
