Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

afk
by New Contributor III
  • 6035 Views
  • 2 replies
  • 2 kudos

Change data feed from target tables of APPLY CHANGES

Up until yesterday I was (sort of) able to read changes from target tables of apply changes operations (either through tables_changes() or using readChangeFeed). I say sort of because the meta columns (_change_type, _commit_version, _commit_timestamp...

ElaPG
by New Contributor III
  • 2818 Views
  • 1 reply
  • 0 kudos

DLT concurrent pipeline updates.

Hi! Regarding this info: "An Azure Databricks workspace is limited to 100 concurrent pipeline updates." (Release 2023.16 - Azure Databricks | Microsoft Learn), what counts as an update? Changes in pipeline logic, or each pipeline run?

sher
by Valued Contributor II
  • 7408 Views
  • 1 reply
  • 0 kudos

How to resolve the column name in s3 path saved as UUID format

Our managed Databricks tables are stored in S3 by default. When I read that S3 path directly, I get the column name as a UUID. E.g.: the column is named ID in the Databricks table, but when checking the S3 path, the column name looks like COL- b400af61-9tha-4565-...

Data Engineering
deltatable
managedtables
Latest Reply
sher
Valued Contributor II
  • 0 kudos

Hi @Retired_mod, thank you for your reply, but the issue is I am not able to map ID with COL- b400af61-9tha-4565-89c4-d6ba43f948b7. I use a DESCRIBE TABLE EXTENDED table_name query to get the list of UUID column names, and for the real column name fettin...

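For reference, when Delta column mapping is enabled, the logical-to-physical name pairs are recorded in the table's schema metadata in the _delta_log. A minimal sketch of recovering that mapping from the schema JSON (the field names and UUIDs below are illustrative, not taken from a real table):

```python
import json

# Illustrative schemaString, shaped like the one stored in a Delta table's
# _delta_log commit JSON when column mapping is enabled (UUIDs are made up).
schema_string = json.dumps({
    "type": "struct",
    "fields": [
        {"name": "ID", "type": "long", "nullable": True,
         "metadata": {"delta.columnMapping.id": 1,
                      "delta.columnMapping.physicalName":
                          "COL-b400af61-9tha-4565-89c4-d6ba43f948b7"}},
        {"name": "NAME", "type": "string", "nullable": True,
         "metadata": {"delta.columnMapping.id": 2,
                      "delta.columnMapping.physicalName":
                          "COL-0f2c9f4e-1111-2222-3333-444455556666"}},
    ],
})

def physical_to_logical(schema_string):
    """Map the UUID-style physical names seen in the Parquet files back to
    the logical column names shown by Databricks."""
    fields = json.loads(schema_string)["fields"]
    return {
        f["metadata"]["delta.columnMapping.physicalName"]: f["name"]
        for f in fields
        if "delta.columnMapping.physicalName" in f.get("metadata", {})
    }
```

Reading the Parquet files directly bypasses this metadata, which is why the raw S3 path shows physical names; renaming columns via a mapping like the one above (or simply reading through Databricks) restores the logical names.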
rt-slowth
by Contributor
  • 3170 Views
  • 2 replies
  • 1 kudos

How to call a table created with create_table using dlt in a separate notebook?

I created a separate pipeline notebook to generate the table via DLT, and a separate notebook to write the entire output to Redshift at the end. The table created via DLT is called via spark.read.table("{schema}.{table}"). This way, I can import [MATERIALI...

alejandrofm
by Valued Contributor
  • 10723 Views
  • 10 replies
  • 15 kudos

All-purpose clusters not remembering custom tags

Hi, we have several clusters used with notebooks; we don't delete them, just start and stop them according to the "minutes of inactivity" setting. I'm trying to set a custom tag, so I wait until the cluster shuts down, add a tag, check that the tag is among the "...

Latest Reply
Dribka
New Contributor III
  • 15 kudos

@alejandrofm the behavior you're describing, where the custom tag disappears after the cluster restarts, might be related to the cluster configuration or the specific settings of your Databricks environment. To troubleshoot this, ensure that the cust...

9 More Replies
Daniel20
by New Contributor
  • 1689 Views
  • 0 replies
  • 0 kudos

Flattening a Nested Recursive JSON Structure into a Struct List

This is from Spark Event log on Event SparkListenerSQLExecutionStart.How to flatten the sparkPlanInfo struct into an array of the same struct, then later explode it. Note that the element children is an array containing the parent struct, and the lev...

804082
by New Contributor III
  • 3227 Views
  • 4 replies
  • 1 kudos

Resolved! "Your workspace is hosted on infrastructure that cannot support serverless compute."

Hello, I wanted to try out Lakehouse Monitoring, but I receive the following message during setup: "Your workspace is hosted on infrastructure that cannot support serverless compute." I meet all requirements outlined in the documentation. My workspace ...

Latest Reply
SSundaram
Contributor
  • 1 kudos

Lakehouse Monitoring: this feature is in Public Preview in the following regions: eu-central-1, eu-west-1, us-east-1, us-east-2, us-west-2, ap-southeast-2. Not all workspaces in the listed regions are supported. If you see the error “Your workspace is ...

3 More Replies
Wayne
by New Contributor III
  • 31459 Views
  • 0 replies
  • 0 kudos

How to flatten a nested recursive JSON struct to a list of struct

This is from Spark Event log on Event SparkListenerSQLExecutionStart.How to flatten the sparkPlanInfo struct into an array of the same struct, then later explode it. Note that the element children is an array containing the parent struct, and the lev...

Arnold_Souza
by New Contributor III
  • 7660 Views
  • 1 reply
  • 0 kudos

Delta Live Tables consuming different files from the same path are combining the schema

Summary: I am using Delta Live Tables to create a pipeline in Databricks, and I am facing a problem of merging the schemas of different files that are placed in the same folder in a data lake, even though I am using file patterns to separate the data inge...

Data Engineering
cloud_files
Databricks SQL
Delta Live Tables
read_files
Latest Reply
Arnold_Souza
New Contributor III
  • 0 kudos

Found a solution: never use 'fileNamePattern', '*file_1*'. Instead, put the pattern directly into the path: "abfss://<container>@<storage_account>.dfs.core.windows.net/path/to/folder/*file_1*"

bzh
by New Contributor
  • 4511 Views
  • 3 replies
  • 0 kudos

Question: Delta Live Table, multiple streaming sources to the single target

We are trying to write multiple sources to the same target table using DLT, but are getting the errors below. Not sure what we are missing here in the code. ...File /databricks/spark/python/dlt/api.py:817, in apply_changes(target, source, keys, sequence...

Latest Reply
nag_kanchan
New Contributor III
  • 0 kudos

The solution did not work for me. It was throwing an error stating: raise Py4JError( py4j.protocol.Py4JError: An error occurred while calling o434.readStream. Trace: py4j.Py4JException: Method readStream([class java.util.ArrayList]) does not exist. A...

2 More Replies
Faisal
by Contributor
  • 2901 Views
  • 1 reply
  • 0 kudos

DLT - how to log number of rows read and written

Hi @Retired_mod, how do I log the number of rows read and written in a DLT pipeline? I want to store them in audit tables after the pipeline update completes. Can you give me sample query code?

Latest Reply
Faisal
Contributor
  • 0 kudos

Thanks @Retired_mod, but I asked how to log the number of rows read/written via a Delta Live Tables (DLT) pipeline, not a Delta Lake table, and the solution you gave relates to a Data Factory pipeline, which is not what I need.

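For context, DLT records per-flow metrics in the pipeline event log as flow_progress events whose details field is a JSON string carrying row counts; on Databricks these are typically queried in SQL from the event log table. A minimal Python sketch of pulling the count out of one such event (the event below is a hand-built example of that shape, not captured output):

```python
import json

# Hand-built minimal flow_progress event, mimicking the shape of a DLT
# event log row where `details` is a JSON string with flow metrics.
event = {
    "event_type": "flow_progress",
    "details": json.dumps({
        "flow_progress": {
            "metrics": {"num_output_rows": 1234}
        }
    }),
}

def output_rows(event):
    """Return num_output_rows from a flow_progress event, else None."""
    if event["event_type"] != "flow_progress":
        return None
    details = json.loads(event["details"])
    return details["flow_progress"]["metrics"].get("num_output_rows")
```

The same extraction done in SQL against the event log (filtering on event_type = 'flow_progress' and parsing details) is the usual way to feed such counts into audit tables after an update completes.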
AFox
by Contributor
  • 7587 Views
  • 3 replies
  • 3 kudos

databricks-connect: PandasUDFs importing local packages: ModuleNotFoundError

databricks-connect==14.1.0. Related to other posts: https://community.databricks.com/t5/data-engineering/modulenotfounderror-serializationerror-when-executing-over/td-p/14301 and https://stackoverflow.com/questions/59322622/how-to-use-a-udf-defined-in-a-sub-...

Latest Reply
AFox
Contributor
  • 3 kudos

There is a way to do this: spark.addArtifact(src_zip_path, pyfile=True). Some things of note: this only works on single-user (non-shared) clusters; src_zip_path must be a posixpath-type string (i.e. forward slashes) even on Windows (drop C: and replace t...

2 More Replies
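The posixpath requirement described in the reply above can be handled with pathlib. A small sketch (the path is a made-up example); the resulting string is what would be passed to spark.addArtifact(..., pyfile=True):

```python
from pathlib import PureWindowsPath

def to_posix(win_path: str) -> str:
    """Convert a Windows path to the forward-slash form described above:
    drop the drive letter and join the remaining parts with '/'."""
    p = PureWindowsPath(win_path)
    parts = p.parts[1:] if p.drive else p.parts
    return "/" + "/".join(parts)

# e.g. spark.addArtifact(to_posix(r"C:\work\pkg\src.zip"), pyfile=True)
```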
amitdatabricksc
by New Contributor II
  • 13259 Views
  • 4 replies
  • 2 kudos

how to zip a dataframe

How do I zip a DataFrame so that I get a zipped CSV output file? Please share the command. Only one DataFrame is involved, not multiple.

Latest Reply
-werners-
Esteemed Contributor III
  • 2 kudos

Writing to a local directory does not work. See this topic: https://community.databricks.com/s/feed/0D53f00001M7hNlCAJ

3 More Replies
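Two common routes here: let Spark compress on write with df.write.option("compression", "gzip").csv(path), where each part file comes out gzipped, or, for a single small DataFrame collected to the driver, build the gzipped CSV in plain Python. A minimal sketch of the second route (the sample rows are made up):

```python
import csv
import gzip
import io

rows = [("id", "name"), (1, "a"), (2, "b")]  # made-up sample data

# Render the rows as CSV text, then gzip the bytes in memory.
buf = io.StringIO()
csv.writer(buf).writerows(rows)
compressed = gzip.compress(buf.getvalue().encode("utf-8"))

# `compressed` can now be written out, e.g. to "output.csv.gz".
```

Note that gzip (.csv.gz) is what Spark's compression option produces; a .zip archive is a different format and would need the zipfile module instead.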
