Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

mvmiller
by New Contributor III
  • 3552 Views
  • 1 replies
  • 0 kudos

How to ignore Writestream UnknownFieldException error

I have a parquet file that I am trying to write to a delta table: df.writeStream.format("delta").option("checkpointLocation", f"{targetPath}/delta/{tableName}/__checkpoints").trigger(once=True).foreachBatch(processTable).outputMode("append")...

Latest Reply
shan_chandra
Databricks Employee
  • 0 kudos

@mvmiller - Per the documentation below, the stream will fail with UnknownFieldException; the default schema evolution mode is addNewColumns. Databricks therefore recommends configuring Auto Loader streams with workflows to restart automatically after s...

RTabur
by New Contributor II
  • 2116 Views
  • 2 replies
  • 0 kudos

[Bug] Orphan storage location

Hello, I'm not able to re-create an external location after removing its owner from the Databricks account. I'm getting the following error: Input path url 'abfss://foo@bar.dfs.core.windows.net/' overlaps with an existing external location within 'CreateEx...

Latest Reply
PL_db
Databricks Employee
  • 0 kudos

Your metastore admin can list all external locations and can then drop the external location.

1 More Replies
AxelBrsn
by New Contributor III
  • 8096 Views
  • 2 replies
  • 0 kudos

Resolved! Importing python to DLT - Not working with DLT Pipeline

Hello, we are trying to adapt our developments (notebooks with Delta tables) into Delta Live Tables pipelines. We tried to import Python files that are very useful for data transformations (silver data cleaning, for example): From the Cluster (run man...

Data Engineering
Delta Live Table
import
pipeline
python
Latest Reply
AxelBrsn
New Contributor III
  • 0 kudos

The solution is to import the module in Python and also add the Python file to the pipeline settings, in the list of source code.

1 More Replies
data-engineer-d
by Contributor
  • 4564 Views
  • 3 replies
  • 4 kudos

Parametrize the DLT pipeline for dynamic loading of many tables

I am trying to ingest hundreds of tables with CDC, and I want to create a generic, dynamic pipeline that can accept parameters (e.g. table_name, schema, file path) and run the logic on them. However, I am not able to find a way to pass parameters to p...

Data Engineering
Delta Live Tables
Latest Reply
Gilg
Contributor II
  • 4 kudos

If you have different folders for each of your source tables, you can use Python loops to iterate over the folders. To do this, you need to create a create_pipeline function that has table_name, schema, and path as its parameters. Inside t...
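The loop-plus-factory pattern described above can be sketched in plain Python. This is only an illustration under stated assumptions: the `table` decorator and the `REGISTERED` dictionary are hypothetical stand-ins for a table-registering decorator so the example runs outside Databricks, and the `SOURCES` entries are made-up names. The point it shows is that calling create_pipeline once per source captures table_name, schema, and path separately for each table.

```python
from typing import Callable, Dict

# Hypothetical registry; in a real pipeline the framework tracks the tables.
REGISTERED: Dict[str, Callable[[], dict]] = {}


def table(name: str):
    """Hypothetical stand-in for a decorator that registers a table."""
    def decorator(fn: Callable[[], dict]) -> Callable[[], dict]:
        REGISTERED[name] = fn
        return fn
    return decorator


def create_pipeline(table_name: str, schema: str, path: str) -> None:
    """Define one table; the arguments are bound per call, not per loop."""
    @table(name=f"{schema}.{table_name}")
    def load() -> dict:
        # A real body would read from `path`; here we only describe the load.
        return {"table": table_name, "schema": schema, "source": path}


# Made-up source list: one entry per source folder.
SOURCES = [
    {"table_name": "customers", "schema": "bronze", "path": "/landing/customers"},
    {"table_name": "orders", "schema": "bronze", "path": "/landing/orders"},
]

for src in SOURCES:
    create_pipeline(**src)
```

Wrapping the definition in a function, rather than defining `load` directly in the loop body, avoids the late-binding pitfall where every generated function would see only the last loop value.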

2 More Replies
Ravikumashi
by Contributor
  • 2174 Views
  • 0 replies
  • 0 kudos

Issue with applying ACLs in Unity Catalog enabled workspace

We have been using Hive Metastore in Databricks workspaces and recently enabled Unity Catalog for one of the workspaces. However, we are encountering issues while applying grants on databases. The system is complaining, stating that table access contr...

Data Engineering
Databricks
spark simba
Unity Catalog
Tam
by New Contributor III
  • 12881 Views
  • 1 replies
  • 2 kudos

Delta Table on AWS Glue Catalog

I have set up a Databricks cluster to work with AWS Glue Catalog by setting spark.databricks.hive.metastore.glueCatalog.enabled to true. However, when I create a Delta table on Glue Catalog, the schema reflected in the AWS Glue Catalog is incorrec...

Latest Reply
monometa
New Contributor II
  • 2 kudos

Hi, could you please refer to something or explain in more detail your point about querying Delta Lake files directly instead of through the AWS Glue catalog and why it was highlighted as a best practice?

NDK_1
by New Contributor II
  • 1986 Views
  • 1 replies
  • 0 kudos

I would like to Create a schedule in Databricks that runs a job on 1st working day of every month

I would like to create a schedule in Databricks that runs a job on the first working day of every month (working days referring to Monday through Friday). I tried using Cron syntax but didn't have any luck. Is there any way we can schedule this in Da...

Latest Reply
shan_chandra
Databricks Employee
  • 0 kudos

@NDK_1 - Cron syntax won't allow the combination of day of month and day of week. You can try creating two different schedules - one for the first day, second day of the month and then add custom logic to check if it is a working day and then trigg...
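The "custom logic" check mentioned above could look roughly like the sketch below. It assumes working days are Monday through Friday, as in the question, and does not account for public holidays; the function names are illustrative. Since the first weekday of a month always falls on the 1st, 2nd, or 3rd, a job scheduled on those days could call `is_first_working_day` as its first step and exit early when it returns False.

```python
from datetime import date, timedelta


def first_working_day(year: int, month: int) -> date:
    """Return the first Monday-to-Friday date of the given month."""
    day = date(year, month, 1)
    # weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    while day.weekday() >= 5:
        day += timedelta(days=1)
    return day


def is_first_working_day(today: date) -> bool:
    """True only when `today` is the first working day of its month."""
    return today == first_working_day(today.year, today.month)


if __name__ == "__main__":
    if is_first_working_day(date.today()):
        print("First working day: run the monthly job.")
    else:
        print("Not the first working day: skip.")
```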

Constantine
by Contributor III
  • 16588 Views
  • 2 replies
  • 6 kudos

Resolved! CREATE TEMP TABLE FROM CTE

I have written a CTE in Spark SQL WITH temp_data AS (   ......   )   CREATE VIEW AS temp_view FROM SELECT * FROM temp_view; I get a cryptic error. Is there a way to create a temp view from CTE using Spark SQL in databricks?

Latest Reply
-werners-
Esteemed Contributor III
  • 6 kudos

In the CTE you can't do a CREATE. It expects an expression in the form of expression_name [ ( column_name [ , ... ] ) ] [ AS ] ( query ), where expression_name specifies a name for the common table expression. If you want to create a view from a CTE, y...
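One way to read the grammar above: the CREATE has to wrap the CTE, not follow it, so the WITH clause becomes part of the view's defining query. The sketch below illustrates that statement shape using Python's built-in sqlite3 module as a stand-in engine so it runs anywhere; it is not Spark. The `sales` table and its rows are made up for the example. In Spark SQL the same ordering applies (CREATE ... VIEW name AS WITH ... SELECT ...).

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for a SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10), ("south", 5), ("north", 7)],
)

# The CTE sits inside the view definition, after CREATE ... AS.
conn.execute(
    """
    CREATE TEMP VIEW temp_view AS
    WITH temp_data AS (
        SELECT region, SUM(amount) AS total
        FROM sales
        GROUP BY region
    )
    SELECT * FROM temp_data
    """
)

rows = conn.execute(
    "SELECT region, total FROM temp_view ORDER BY region"
).fetchall()
print(rows)
```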

1 More Replies
test_123
by New Contributor
  • 1356 Views
  • 1 replies
  • 0 kudos

Autoloader not detecting changes/updated values for xml file

If I update a value in the XML file, Auto Loader does not detect the change. The same happens when I delete or remove a column or property in the XML. Please help me fix this issue.

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

It seems that the issue you're experiencing with Autoloader not detecting changes in XML files might be related to how Autoloader handles schema inference and evolution. Autoloader can automatically detect the schema of loaded XML data, allowing you...

SyedGhouri
by New Contributor III
  • 8868 Views
  • 2 replies
  • 0 kudos

Cannot create jobs with jobs api - Azure databricks - private network

Hi, I'm trying to deploy Databricks jobs from the dev environment to the prod environment. I have jobs in the dev environment, and using Azure DevOps I deployed the jobs in code format to the prod environment. Now when I use the POST method to create the job programmatica...

Latest Reply
daniel_sahal
Databricks MVP
  • 0 kudos

@SyedGhouri You need to set up a self-hosted Azure DevOps agent inside your VNet.

1 More Replies
pshuk
by New Contributor III
  • 6032 Views
  • 2 replies
  • 0 kudos

Copying files from dev environment to prod environment

Hi, is there a quick and easy way to copy files between different environments? I have copied a large number of files to my dev environment (Unity Catalog) and want to copy them over to the production environment. Instead of doing it from scratch, can I j...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 0 kudos

If you want to copy files in Azure, ADF is usually the fastest option (for example, terabytes of CSV or Parquet files). If you want to copy tables, just use CLONE. If the files contain code, just use Repos and branches.

1 More Replies
aseufert
by New Contributor III
  • 11466 Views
  • 2 replies
  • 3 kudos

Git Stash

I looked through some previous posts and documentation and couldn't find anything related to the use of Git stash in Databricks Repos. Perhaps I missed it. I also don't see an option in the UI. Does anyone know if there's a way to stash changes either in th...

Latest Reply
javierbg
New Contributor III
  • 3 kudos

This is actually a big hurdle when trying to switch between working in two different branches. It would be a welcome addition to the Databricks IDE.

1 More Replies
