Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

NamrataHindujaS
by New Contributor III
  • 2179 Views
  • 2 replies
  • 3 kudos

Resolved! Namrata Hinduja Geneva, Switzerland (Swiss) - Getting Started with Databricks

Hi everyone, I'm Namrata Hinduja Geneva, Switzerland (Swiss). I come from an ETL background and am looking to get started with Databricks. I'd appreciate your guidance on a clear learning roadmap, as well as any industry-recognized certifications t...

Latest Reply
NamrataHindujaS
New Contributor III
  • 3 kudos

Thanks to Vinay_M_R for your valuable reply; it's a great help. I'll definitely follow the instructions. Regards, Namrata Hinduja Geneva, Switzerland (Swiss)

  • 3 kudos
1 More Replies
turagittech
by Contributor
  • 995 Views
  • 2 replies
  • 0 kudos

DLT pipeline python stop scanning all databases in source

Hi All, following the instructions at https://learn.microsoft.com/en-us/azure/databricks/ingestion/lakeflow-connect/sql-server-pipeline I have set up a DLT pipeline for SQL Server to use CDC. It works in principle; however, it scans all databases a...

Latest Reply
turagittech
Contributor
  • 0 kudos

I thought I would follow up on this after getting it all working with the help of my local Databricks office. As the CDC has been created, it scans metadata for the server that you connect to. This may be changed in a future release; I have no idea as to...

  • 0 kudos
1 More Replies
Splush_
by New Contributor III
  • 2041 Views
  • 2 replies
  • 0 kudos

Error using COPY INTO after changing schema name

Hey guys, I found a weird bug with the COPY INTO command. I copied a folder in Azure Cloud Storage containing a Delta table, and this worked perfectly. But after I changed the name of the schema for this table, it stopped working because it keeps trying to ...

Latest Reply
lingareddy_Alva
Esteemed Contributor
  • 0 kudos

Hi @Splush_, this is a common caching issue in Databricks when working with COPY INTO operations. The system is holding onto metadata about the old schema location even after you've renamed it. Clear the COPY INTO operation history: COPY INTO {new_landin...

  • 0 kudos
1 More Replies
cpatte7372
by New Contributor III
  • 4240 Views
  • 4 replies
  • 1 kudos

Databricks Community Edition or Databricks Free Account Verification Code Not Being Received

Dear Community, My email provider won't allow verification emails from the Databricks email address 'noreply@databricks.com' because of the formatting of the email. Because the verification code is actually in the subject of the email, my email providers ...

Latest Reply
cpatte7372
New Contributor III
  • 1 kudos

Hi Advika, I reached out to Databricks Support and they recommended asking this community; see their message below: Hi, Thank you for reaching out to Databricks Support! Currently, the option to redirect the OTP to a different email address is not available. We ...

  • 1 kudos
3 More Replies
sopon
by New Contributor
  • 1778 Views
  • 1 replies
  • 0 kudos

Cosmos Spark Connector keeps loading

I am trying to connect to Cosmos DB using the Spark Cosmos connector, following this tutorial: https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/tutorial-spark-connector?pivots=programming-language-python The problem is that all Spark Cosmos operations k...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @sopon, could you check the driver logs? Maybe we will find some useful information that helps us pinpoint the root cause. Also, you can check whether the private endpoint of Cosmos DB resolves from the Databricks workspace: %sh telnet cosmos_db_fqdn 443
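
Where telnet is not installed on the cluster, the same reachability test can be sketched in Python from a notebook cell. This is a minimal sketch, not an official diagnostic: `can_connect` is a hypothetical helper, the hostname in the comment is a placeholder, and port 443 is assumed because Cosmos DB endpoints are served over HTTPS.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS resolution failures, refused connections, and timeouts.
        return False


# Placeholder FQDN (assumption): replace with your own account's endpoint host.
# print(can_connect("your-account.documents.azure.com"))
```

A `False` result does not say which step failed; a separate `socket.gethostbyname(host)` call can help tell a DNS problem apart from a blocked port.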

  • 0 kudos
databricks_use2
by New Contributor II
  • 5321 Views
  • 7 replies
  • 3 kudos

Autoloader and files with invalid path

I'm encountering an issue with Autoloader where it fails to process certain files due to specific characters in their names. For example, files that begin with an underscore (e.g., _data_etc.).json) are ignored and not processed. After some investiga...

Latest Reply
BS_THE_ANALYST
Databricks Partner
  • 3 kudos

@databricks_use2 I'm merely echoing the responses above but it sounds like you should be renaming those files before doing anything. Post here also supports this idea: https://community.databricks.com/t5/data-engineering/how-do-i-read-the-contents-of...
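
Spark's file listing treats names that start with `_` or `.` as hidden and skips them, which is why renaming is the usual workaround. Below is a minimal sketch of that renaming step, assuming the files sit on a path reachable through ordinary file APIs (for example a Unity Catalog volume path). `strip_leading_underscores` is a hypothetical helper, not part of any Databricks library.

```python
from __future__ import annotations

from pathlib import Path


def strip_leading_underscores(directory: str) -> list[Path]:
    """Rename files whose names start with '_' so the name no longer does."""
    renamed = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.name.startswith("_"):
            new_name = path.name.lstrip("_")
            target = path.with_name(new_name)
            # Skip rather than overwrite if the clean name is empty or taken.
            if not new_name or target.exists():
                continue
            path.rename(target)
            renamed.append(target)
    return renamed
```

Running this against the landing directory before the stream starts leaves other files untouched and returns the new paths, which can be logged for traceability.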

  • 3 kudos
6 More Replies
smpa01
by Contributor
  • 4503 Views
  • 2 replies
  • 2 kudos

Resolved! External Table from volume

@szymon_dybczak I am experimenting to see if there is a way for me to create an external table from files written into a Unity Catalog volume. I tried the following, but it did not work. # COMMAND ---------- # DBTITLE 1, Daily Fetch and Write # sample ...

Latest Reply
Pat
Esteemed Contributor
  • 2 kudos

Hi @smpa01, you cannot create an external table on data in a volume. See the documentation: https://docs.databricks.com/gcp/en/volumes

  • 2 kudos
1 More Replies
AR3
by New Contributor
  • 1914 Views
  • 1 replies
  • 1 kudos

Why aren't my Delta Live Tables stored in the expected folder structure in ADLS?

I set up an Azure Data Lake Storage (ADLS) account with containers named metastore, bronze, silver, gold, and source. I created a Unity Catalog metastore in Databricks via the admin console, and I created a container called metastore in my Data Lake....

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @AR3, I think that until recently DLT supported only managed tables. It has now been rebranded as Lakeflow Declarative Pipelines, with a new option called Lakeflow Declarative Pipelines sinks. Lakeflow Declarative Pipelines sinks are targets for Lakeflow ...

  • 1 kudos
iskidet_glenny
by New Contributor
  • 2938 Views
  • 2 replies
  • 0 kudos

Possibility of creating and running concurrent Job Runs from a single job all parameters driven

Hello Community, I hope everyone is doing well. I’ve been exploring the idea of creating multiple instances of a job, which will be job runs with different parameter configurations. Has anyone else considered this approach? Imagine a scenario where you ...
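
One way to picture this is a single job triggered several times, each run carrying its own parameters. The sketch below only builds the request bodies and sends nothing. It assumes the Jobs API `run-now` endpoint with a `job_parameters` field, and that the job's maximum concurrent runs setting is high enough for the runs to overlap. The job ID and parameter names are hypothetical.

```python
import json

RUN_NOW_PATH = "/api/2.1/jobs/run-now"  # assumed Jobs API endpoint


def build_run_requests(job_id: int, parameter_sets: list) -> list:
    """Build one run-now request body per parameter set."""
    return [
        # Job parameter values are sent as strings.
        {"job_id": job_id, "job_parameters": {k: str(v) for k, v in params.items()}}
        for params in parameter_sets
    ]


# Hypothetical job ID and parameters, for illustration only.
requests_to_send = build_run_requests(
    123,
    [
        {"region": "emea", "day": "2025-07-01"},
        {"region": "apac", "day": "2025-07-01"},
    ],
)
for body in requests_to_send:
    # Each body would be POSTed to <workspace-url> + RUN_NOW_PATH with a bearer token.
    print(json.dumps(body))
```

Keeping the payload construction separate from the HTTP call makes it easy to review exactly which parameter combinations will be launched before anything runs.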

Latest Reply
Roshaan
New Contributor II
  • 0 kudos

I have seen a correlation: a bigger cluster configuration seems to let more concurrent job runs succeed. Is that true, and if so, why?

  • 0 kudos
1 More Replies
joshuat
by Contributor
  • 6894 Views
  • 5 replies
  • 0 kudos

How to partition JDBC Oracle read query and cast with TO_DATE on partition date field?

I'm attempting to fetch an Oracle Netsuite table in parallel via JDBC using the Netsuite Connect JAR, already installed on the cluster and set up correctly. I can do this successfully with a single-threaded approach using the `dbtable` option: table = 'Tran...

Latest Reply
joshuat
Contributor
  • 0 kudos

@pavlosskev I did not, and I have to do partitioned reads via the ID.
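
Partitioning by a numeric ID can be sketched as a list of non-overlapping range predicates, one per partition. This is an illustration, not the poster's code: `id_range_predicates`, the column name `id`, and the bounds are assumptions. In PySpark, a list like this can be passed as the `predicates` argument of `spark.read.jdbc`; the `partitionColumn`, `lowerBound`, `upperBound`, and `numPartitions` options are the built-in alternative.

```python
def id_range_predicates(column: str, lower: int, upper: int, num_partitions: int) -> list:
    """Split the inclusive range [lower, upper] into contiguous WHERE predicates."""
    if num_partitions < 1 or upper < lower:
        raise ValueError("need num_partitions >= 1 and upper >= lower")
    total = upper - lower + 1
    # Never create more partitions than there are IDs.
    parts = min(num_partitions, total)
    base, extra = divmod(total, parts)
    predicates = []
    start = lower
    for i in range(parts):
        # The first `extra` partitions take one additional ID each.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        predicates.append(f"{column} >= {start} AND {column} <= {end}")
        start = end + 1
    return predicates


for predicate in id_range_predicates("id", 1, 10, 3):
    print(predicate)
```

Explicit predicates read only rows inside the stated bounds, so rows with a NULL ID or an ID outside the range need a separate predicate if they must be included.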

  • 0 kudos
4 More Replies
Y2DTL
by New Contributor III
  • 6308 Views
  • 5 replies
  • 6 kudos

Resolved! Stream/static Join

Hi all, I would appreciate your help on a topic. When performing a join between a static and a streaming DataFrame, is the latest version of the static table used at the start of the job or within each micro-batch? The documentation doesn’t seem to specifically...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 6 kudos

Hi @Y2DTL, here's an answer from the documentation: A stream-static join joins the latest valid version of a Delta table (the static data) to a data stream using a stateless join. When Databricks processes a micro-batch of data in a stream-static join, ...

  • 6 kudos
4 More Replies
joeyslaptop
by New Contributor II
  • 10912 Views
  • 6 replies
  • 3 kudos

How to add a column to a new table containing the original source filenames in Databricks.

If this isn't the right spot to post this, please move it or refer me to the right area. I recently learned about "_metadata.file_name". It's not quite what I need. I'm creating a new table in Databricks and want to add a USR_File_Name column cont...

Data Engineering
Databricks
filename
import
SharePoint
Upload
Latest Reply
Debayan
Databricks Employee
  • 3 kudos

Hi, Could you please elaborate more on the expectation here? 

  • 3 kudos
5 More Replies
allyallen
by New Contributor III
  • 4844 Views
  • 5 replies
  • 0 kudos

Resolved! Variable Compute clusters within a Job

We have 3 possible compute clusters that we can run a notebook against. They are of varying sizes, and the one that the notebook uses will depend on the size of the data being processed. We "t-shirt size" each tenant based on their data size (S, M, L) and c...

Latest Reply
allyallen
New Contributor III
  • 0 kudos

Hi @eniwoke, that's a great solution, thank you so much! Our process is now as follows: NB1 gets the tenant t-shirt size and sets the cluster_id for each size as a variable. The notebook then loops through each tenant and, using the Databricks API, updates ...

  • 0 kudos
4 More Replies
Steffen
by New Contributor III
  • 4243 Views
  • 4 replies
  • 1 kudos

Resolved! DictionaryFilters Pushdown on Views

Hello, I have a very simple table with time series data with three columns:
id (long): unique id of signal
ts (unix timestamp): timestamp of the event in unix timestamp format
value (double): value of the signal at the given timestamp
For every second ther...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @Steffen, this happens because you're applying functions such as FLOOR, from_unix_timestamp, etc. to the ts attribute, which hides the raw ts from Spark's optimizer, so it can’t push down filters. If you can, try to add an additional attribute to your u...

  • 1 kudos
3 More Replies
ShankarM
by Databricks Partner
  • 3620 Views
  • 3 replies
  • 0 kudos

DBR version 10.4 impact

Hi, for one of our projects in production we are using DBR 10.4, for which EOL was Mar 18th, 2025. I wanted to know whether there will be any impact on existing workloads running in production. If yes, can you let me know the impact and risk...

Latest Reply
Isi
Honored Contributor III
  • 0 kudos

Hello @ShankarM Actually, there is no official End of Life (EoL) date provided by Databricks. If you check the documentation I referenced in my previous message, EoL is the next phase after End of Support (EoS), but Databricks does not announce a spe...

  • 0 kudos
2 More Replies