Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Manjula_Ganesap
by Contributor
  • 6407 Views
  • 2 replies
  • 1 kudos

Resolved! Delta Live Table pipeline failure - Table missing

Hi All, I set up a DLT pipeline to create 58 bronze tables and a subsequent DLT live table that joins the 58 bronze tables created in the first step. The pipeline runs successfully most of the time. My issue is that the pipeline fails once every 3/4 runs say...

Latest Reply
Manjula_Ganesap
Contributor
  • 1 kudos

@jose_gonzalez @Retired_mod - I forgot to update the group on the fix. I reached out to Databricks, and they identified that the threads call I was making was causing the issue. After I removed it, I no longer see the failure.

1 More Replies
Manjula_Ganesap
by Contributor
  • 2879 Views
  • 2 replies
  • 1 kudos

Delta Live Table (DLT) Initialization fails frequently

With no change in code, I've noticed that my DLT initialization fails and then an automatic rerun succeeds. Can someone help me understand this behavior? Thank you.

Latest Reply
Manjula_Ganesap
Contributor
  • 1 kudos

@jose_gonzalez - I forgot to update the group on the fix. I reached out to Databricks, and they identified that the threads call I was making was causing the issue. After I removed it, I no longer see the failure.

1 More Replies
Kit
by New Contributor III
  • 6931 Views
  • 2 replies
  • 1 kudos

How to use checkpoint with change data feed

I have a scheduled job (running in continuous mode) with the following code:
```
(
    spark
    .readStream
    .option("checkpointLocation", databricks_checkpoint_location)
    .option("readChangeFeed", "true")
    .option("startingVersion", VERSION + 1)...
```

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Kit Yam Tse, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers y...

1 More Replies
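
For readers landing on the change data feed question: the preview is cut off, so the following is only a sketch of one common arrangement, not the poster's actual job. It assumes the checkpoint is set on the write side of the query (in Structured Streaming, `checkpointLocation` is a `writeStream` option) and that `startingVersion` only matters on the first run, when the checkpoint is still empty. The function name, its parameters, and the table names are hypothetical.

```python
def start_cdf_stream(spark, source_table, target_table,
                     checkpoint_location, starting_version):
    """Stream a Delta table's change data feed into target_table.

    All names here are illustrative; `spark` is an existing SparkSession.
    """
    changes = (
        spark.readStream
        .format("delta")
        .option("readChangeFeed", "true")
        # Assumption: used only when the checkpoint is empty (first run);
        # later runs resume from the offsets stored in the checkpoint.
        .option("startingVersion", starting_version)
        .table(source_table)
    )
    return (
        changes.writeStream
        # The checkpoint is configured on the writer, not the reader.
        .option("checkpointLocation", checkpoint_location)
        .toTable(target_table)  # starts the query, returns a StreamingQuery
    )
```

With this layout the job can be restarted without recomputing the starting version by hand, since progress is tracked in the checkpoint directory.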
editter
by New Contributor II
  • 3252 Views
  • 1 replies
  • 1 kudos

Unable to open a file in dbfs. Trying to move files from Google Bucket to Azure Blob Storage

Background: I am attempting to download the Google Cloud SDK on Databricks. The end goal is to be able to use the SDK to transfer files from a Google Cloud Bucket to Azure Blob Storage using Databricks. (If you have any other ideas for this transfer p...

Data Engineering
dbfs
Google Cloud SDK
pyspark
tarfile
Latest Reply
editter
New Contributor II
  • 1 kudos

Thank you for the response! 2 questions:
1. How would you create a cluster with the custom requirements for the Google Cloud SDK? Is that still possible for a Unity Catalog enabled cluster with Shared Access Mode?
2. Is a script action the same as a cl...

AMadan
by New Contributor III
  • 14597 Views
  • 1 replies
  • 1 kudos

Date difference in Months

Hi Team, I am working on a migration from SQL Server to a Databricks environment. I encountered a challenge where Databricks and SQL Server give different results for the date difference function. Can you please help?
-- SQL SERVER
SELECT DATEDIFF(MONTH , '2007-0...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

While I was pretty sure it has to do with T-SQL not following ANSI standards, I could not actually tell you what exactly the difference is. So I asked ChatGPT and here we go: The difference between DATEDIFF(month, date1, date2) in T-SQL and ANSI SQL ...
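
Since the reply is truncated, here is a small sketch of the kind of mismatch that can produce different month counts. T-SQL's `DATEDIFF(MONTH, ...)` counts the month boundaries crossed between the two dates, whereas a "whole months elapsed" calculation also looks at the day of month. The second function is an assumption about what the other side might be computing, not a statement of what Databricks does; both function names and the sample dates are made up for illustration.

```python
from datetime import date


def month_boundaries_crossed(start: date, end: date) -> int:
    """Count calendar-month boundaries between start and end.

    This mirrors how T-SQL's DATEDIFF(MONTH, start, end) behaves.
    """
    return (end.year - start.year) * 12 + (end.month - start.month)


def whole_months_elapsed(start: date, end: date) -> int:
    """Count complete months between start and end (illustrative only).

    A month only counts once the day of month has been reached again.
    Month-end edge cases (e.g. Jan 31 to Feb 28) are not special-cased.
    """
    months = month_boundaries_crossed(start, end)
    if months > 0 and end.day < start.day:
        months -= 1
    elif months < 0 and end.day > start.day:
        months += 1
    return months


# Hypothetical dates one day apart, straddling a month boundary.
start, end = date(2023, 1, 31), date(2023, 2, 1)
print(month_boundaries_crossed(start, end))  # 1
print(whole_months_elapsed(start, end))      # 0
```

If the migrated query needs the SQL Server result, the boundary-count formula (year difference times 12 plus month difference) is the behavior to reproduce.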

alvaro_databric
by New Contributor III
  • 6577 Views
  • 0 replies
  • 0 kudos

Azure Databricks Spot Cost

Hi all, I started using Azure Spot VMs by switching on the spot option when creating a cluster. However, in the Azure billing dashboard, after some months of using spot instances, I only have OnDemand PurchaseType. Can anyone guess what could be happ...

THIAM_HUATTAN
by Valued Contributor
  • 56480 Views
  • 8 replies
  • 2 kudos

Skip number of rows when reading CSV files

staticDataFrame = spark.read.format("csv") \
    .option("header", "true").option("inferSchema", "true") \
    .load("/FileStore/tables/Consumption_2019/*.csv")

With the code above, I need an option to skip, say, the first 4 lines of each CSV file. How do I do that?

Latest Reply
Michael_Appiah
Databricks Partner
  • 2 kudos

The option... .option("skipRows", <number of rows to skip>) ...works for me as well. However, I am surprised that the official Spark doc does not list it as a CSV Data Source Option: https://spark.apache.org/docs/latest/sql-data-sources-csv.html#data...

7 More Replies
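
The `skipRows` option mentioned in the latest reply is reported to work on Databricks but, as the reply notes, is not listed in the Apache Spark CSV documentation. Where that option is not available, one fallback for files small enough to handle on the driver is to drop the leading lines in plain Python before parsing. This is only a sketch of that idea, not a Spark API; the function name and the default of 4 lines are illustrative.

```python
import csv
from itertools import islice


def read_csv_skipping(path, skip=4):
    """Parse a CSV file after discarding its first `skip` lines.

    The first remaining line is treated as the header row.
    Returns a list of dicts, one per data row.
    """
    with open(path, newline="") as f:
        # islice drops the leading lines lazily, then DictReader
        # consumes the rest of the file as ordinary CSV.
        return list(csv.DictReader(islice(f, skip, None)))
```

The resulting rows could then be handed to `spark.createDataFrame` if a DataFrame is needed, though for large inputs a reader-side option is preferable.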
rsamant07
by New Contributor III
  • 1788 Views
  • 0 replies
  • 0 kudos

TLS Mutual Authentication for Databricks API

Hi, we are exploring the use of the Databricks Statement Execution API for sharing data through an API with different consumer applications. However, we have a security requirement to configure TLS Mutual Authentication to limit the consumer application t...

IvanK
by New Contributor III
  • 4866 Views
  • 1 replies
  • 0 kudos

Register permanent UDF from Python file

Hello, I am trying to create a permanent UDF from a Python file with dependencies that are not part of the standard Python library. How do I make use of CREATE FUNCTION (External) [1] to create a permanent function in Databricks, using a Python file th...

Data Engineering
Create function
python
nikhilkumawat
by Databricks Partner
  • 12159 Views
  • 3 replies
  • 1 kudos

Install maven package on job cluster

I have a single user cluster and I have created a workflow which will read an Excel file from an Azure storage account. For reading the Excel file I am using the com.crealytics:spark-excel_2.13:3.4.1_0.19.0 library on a single user all-purpose cluster. I have alread...

Latest Reply
nikhilkumawat
Databricks Partner
  • 1 kudos

Hi @Retired_mod, can you elaborate on a few more things:
1. When spark-shell installs any Maven package, what is the default location where it downloads the jar file?
2. As far as I know, the default location for jars is "/databricks/jars/" from where spark pic...

2 More Replies
merca
by Valued Contributor II
  • 13364 Views
  • 7 replies
  • 7 kudos

How can I give users permissions to see the objects metadata without access to data

The only permission I can see is SELECT, and this gives access to the data, which is very unwanted. I only want users to see the metadata, such as table/view/column names, descriptions/comments, and location, but not to see any data.

Latest Reply
merca
Valued Contributor II
  • 7 kudos

@Uma Maheswara Rao Desula, @Geeta Sai Boddu and @S S, thank you for the responses. I got an answer from Databricks: it seems this is not currently possible, and it is being investigated as a capability.

6 More Replies
silvadev
by New Contributor III
  • 10932 Views
  • 1 replies
  • 0 kudos

Resolved! MongoDB Spark Connector v10.x read error on Databricks 13.x

I am facing an error when trying to read data from any MongoDB collection using MongoDB Spark Connector v10.x on Databricks v13.x. The error below appears to start at line #113 of the MongoDB Spark Connector library (v10.2.0): java.lang.NoSuchMethod...

Data Engineering
mongodb
spark
Latest Reply
silvadev
New Contributor III
  • 0 kudos

The problem was fixed in Databricks Runtime 13.3 LTS.

jonathan-dufaul
by Valued Contributor
  • 15639 Views
  • 2 replies
  • 0 kudos

Resolved! Error updating workflow, webhook not found?

I have no idea what this error means or what it could mean. When I'm trying to save a workflow I get a popup saying this:

Latest Reply
Robin_LOCHE
New Contributor II
  • 0 kudos

I had the same issue, thanks for the info! Apparently it's also possible to fix it by removing all the actual notifications in the interface (the bugged one is not displayed, but if you remove everything, for some reason it removes the bugged one too)....

1 More Replies