Hello
Can anyone help with an error I am getting when running an Azure Data Factory (ADF) pipeline? An ingestion pipeline fails, and when I click on the link I am taken to a Databricks error message: "7 duplicates detected in transformed data". However, when I run the transformation cell of the notebook in question on its own, the data it produces looks fine and contains zero duplicate rows.

The error comes from a second notebook, which references the first one and also runs as part of the ADF pipeline. That second notebook contains a check for duplicates, and the check is what causes the ingestion pipeline to fail.

I have not been able to reproduce the error or find any duplicate rows using the SQL that the Databricks notebook runs. Is there anything I can do within Databricks to make it show me which 7 rows are being flagged? Sorry if this request is a bit muddled; I am new to Databricks.
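To make the request concrete, here is a minimal sketch of the kind of query that would list the flagged rows. It assumes the check counts duplicates over a set of key columns rather than over whole rows. The table name `transformed_data` and the columns `customer_id` and `order_date` are hypothetical placeholders, not names from the actual notebooks, and the helper `duplicate_rows_query` is made up for illustration.

```python
def duplicate_rows_query(table: str, key_columns: list[str]) -> str:
    """Build a SQL query listing every key that occurs more than once in `table`."""
    if not key_columns:
        raise ValueError("key_columns must not be empty")
    keys = ", ".join(key_columns)
    return (
        f"SELECT {keys}, COUNT(*) AS row_count "
        f"FROM {table} "
        f"GROUP BY {keys} "
        f"HAVING COUNT(*) > 1 "
        f"ORDER BY row_count DESC"
    )


# Hypothetical table and key columns; replace with the real ones.
query = duplicate_rows_query("transformed_data", ["customer_id", "order_date"])
print(query)

# In a Databricks notebook, where `spark` is already defined, the query
# could be run and inspected with:
#   display(spark.sql(query))
```

If the duplicate check in the second notebook uses different key columns than the ones I tested by hand, that might explain why I see zero duplicates, so the key list passed here would need to match whatever that check uses.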
Thank you