Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Yunky007
by New Contributor
  • 1447 Views
  • 3 replies
  • 0 kudos

ETL pipeline

I have an ETL pipeline in Workflows which I am using to create a materialized view. I want to schedule the pipeline to run for 10 hours only, starting from 10 am. How can I schedule that? I can only see an hourly schedule or cron syntax. I want the compute ...

Latest Reply
KaelaniBraster
New Contributor II
  • 0 kudos

Use cron syntax with a stop condition after 10 hours of runtime.

2 More Replies
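A hedged sketch of one way to implement the reply's suggestion: a Quartz cron expression that fires at 10:00 daily, plus a hard timeout that stops the run after 10 hours. The field names below are assumptions modeled on the Databricks Jobs API, not taken from the thread.

```python
# Sketch (assumed field names, modeled on the Jobs API): a Quartz cron
# that fires at 10:00 every day, plus a timeout that ends the run after
# 10 hours so the compute is released.
job_settings = {
    "schedule": {
        # fields: second minute hour day-of-month month day-of-week
        "quartz_cron_expression": "0 0 10 * * ?",
        "timezone_id": "UTC",
    },
    "timeout_seconds": 10 * 60 * 60,  # 36000 s = 10 hours
}
```

Whether a timeout is acceptable depends on the workload: it cancels the run rather than checkpointing it, so it suits continuously-refreshing pipelines more than single long transactions.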
nito
by New Contributor
  • 502 Views
  • 0 replies
  • 0 kudos

New remote (dbfs) caching python library

I had trouble getting much speedup at all from the Spark or DB disk cache, which I think is essential when developing PySpark code iteratively in notebooks. So I developed a handy caching library for this, which has recently been open sourced; see h...

Prashant2
by New Contributor II
  • 3097 Views
  • 4 replies
  • 0 kudos

import pymssql fails on DLT Serverless

I have a Delta Live Tables pipeline which works fine on a normal DLT job cluster. But as soon as we switch it to use serverless compute, it fails. The failure happens at "import pymssql" after doing pip install pymssql as the first statement of the source code...

Latest Reply
eniwoke
Contributor II
  • 0 kudos

Hi @Prashant2, I am curious how you installed the library in your notebook. Did you use "%pip install pymssql"? If so, could you try using a shell command instead, like "!pip install pymssql"? I've had success using !pip install in serverless compute e...

3 More Replies
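The suggestion above amounts to installing into the interpreter the notebook actually imports from. A minimal sketch of the command a shell-style "!pip install" effectively runs (the helper name is mine, and whether this resolves the serverless failure is the reply's claim, not verified here):

```python
import sys

# Build the command that `!pip install <pkg>` effectively executes:
# the *current* interpreter's pip, so the package lands in the same
# environment the notebook imports from.
def pip_install_cmd(package: str) -> list:
    return [sys.executable, "-m", "pip", "install", package]

cmd = pip_install_cmd("pymssql")
```

Run the list with subprocess.run(cmd) to perform the actual install.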
Takuya-Omi
by Valued Contributor III
  • 2027 Views
  • 3 replies
  • 0 kudos

Limitations When Using Instance Profiles to Connect to Kinesis

I encountered an issue where I couldn’t successfully connect to Kinesis Data Streams using instance profile authentication while working with Delta Live Tables (DLT) in a Unity Catalog (UC)-enabled environment. According to the documentation, instance...

Latest Reply
am1go
New Contributor II
  • 0 kudos

I'm in the same boat - tried every workaround possible; nothing works for me. Databricks is pushing Unity Catalog hard, so I find it unsettling that there is no solution for this issue other than reverting to the Hive metastore.

2 More Replies
israelst
by New Contributor III
  • 6129 Views
  • 8 replies
  • 5 kudos

DLT can't authenticate with kinesis using instance profile

When running my notebook on personal compute with an instance profile, I am indeed able to readStream from Kinesis. But adding it as a DLT with UC, while specifying the same instance profile in the DLT pipeline settings, causes a "MissingAuthenticatio...

Data Engineering
Delta Live Tables
Unity Catalog
Latest Reply
am1go
New Contributor II
  • 5 kudos

Has anyone figured it out? Tried all solutions posted in this thread, nothing works for me...

7 More Replies
Shaurya_greyhou
by New Contributor
  • 2105 Views
  • 3 replies
  • 0 kudos

Inquiry on How to Return Visualizations, Images, or URLs from Databricks Genie API

Question 1) I am currently working on integrating Databricks Genie with Microsoft Teams and am looking for guidance on how to return visualizations, images, or URLs in a format that can be rendered in Teams. Specifically, I am trying to figure out how...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

For Q1 - you can share a Databricks dashboard's URL to allow rendering or interaction in Microsoft Teams: create a dashboard in Databricks, ensuring it contains the required visualizations or interaction widgets. Publish the dashboard and ensure that...

2 More Replies
carlos_tasayco
by Contributor
  • 3348 Views
  • 2 replies
  • 1 kudos

Resolved! Pylint github workflow

I am implementing this in my workflow; however, there is a problem I want to avoid: Enrollment_Profile/mv_person_test_pylint.py:22:9: E0602: Undefined variable 'spark' (undefined-variable). Can someone help avoid that issue with Spark/DLT?

Latest Reply
carlos_tasayco
Contributor
  • 1 kudos

Hi, yes, I found that solution too. The problem after that was dbutils; pylint flags it as a non-imported library. I found a solution, at least for me: in the pylintrc file I added this. Doing this, I avoided the problem.

1 More Replies
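The pylintrc content this reply refers to was lost with the original screenshot. As a hedged guess at the kind of entry that silences E0602 for notebook-injected names (the exact list of names is an assumption, not the poster's actual file):

```ini
# pylintrc sketch: declare names that Databricks injects at runtime
# so pylint stops flagging them as undefined variables.
[VARIABLES]
additional-builtins=spark,dbutils,dlt,display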
SeekingSolution
by New Contributor II
  • 957 Views
  • 2 replies
  • 1 kudos

Dynamic Parameter

I have a query I need to run with two parameters: "Workflow" and "Steps". The dropdown list supplied by "Steps" should change based on the input of the "Workflow" dropdown. When I use the following code, it creates the "Steps" dropdown list based...

Latest Reply
SeekingSolution
New Contributor II
  • 1 kudos

That's a shame it has to be re-instantiated each time! Thank you for letting me know that functionality is not currently supported.

1 More Replies
Johannes_E
by New Contributor III
  • 1342 Views
  • 2 replies
  • 0 kudos

Loguru doesn't save logs to Databricks volume

I've added an external volume named "logs" to my Databricks Unity Catalog. Within a Databricks notebook I can verify that it exists (os.path.exists(path='/Volumes/my_catalog/schema_name/logs')) and can even write a file to it that I can see within the Dat...

Latest Reply
Thomas_Zhang
Databricks Partner
  • 0 kudos

I am having the same problem. I am using a work-around currently, but would definitely love to see a solution. FYI, here is my work-around: logger.add(f"{output_folder_path}/../logging/workflow_job1_{datetime_str}.log", rotation='10 days', retention="10 ...

1 More Replies
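A stdlib-logging analogue of the work-around above, for readers without loguru; the temporary directory below stands in for a /Volumes/&lt;catalog&gt;/&lt;schema&gt;/logs path, which is an assumption of this sketch:

```python
import logging
import os
import tempfile

# Stand-in for a Unity Catalog volume path such as /Volumes/.../logs
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "workflow_job1.log")

# File sink, analogous to loguru's logger.add(path, ...)
logger = logging.getLogger("workflow")
logger.setLevel(logging.INFO)
handler = logging.FileHandler(log_path)
logger.addHandler(handler)

logger.info("pipeline step finished")
handler.flush()  # make sure the line is on disk before the job exits
```

The explicit flush matters on volumes: buffered log lines that are never flushed before the cluster terminates can look like "logs not being saved".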
Fuzail
by New Contributor III
  • 3259 Views
  • 4 replies
  • 2 kudos

Resolved! Databricks JDBC Error while connecting from Datastage JDBC connector

I am reading data from Databricks in DataStage 11.7 on-prem using the DataStage JDBC connector and getting the below error. When I limited the select queries to one row, it was able to read data from the source. JDBC_Connector_0: The connector encoun...

Latest Reply
Louis_Frolio
Databricks Employee
  • 2 kudos

Here are some suggestions; not sure if they fit what you are doing, but they are worth mentioning. The Databricks JDBC driver currently does not support batch updates, which is why your updates appear to process row by row with a batch size of 1...

3 More Replies
Dharinip
by Contributor
  • 3943 Views
  • 4 replies
  • 1 kudos

Resolved! Incremental Load on Materialized Views

Is incremental load possible on materialized views? I would like some tutorials or videos on how to perform incremental refresh on MVs in gold layers. Also, is it mandatory to have PKs for performing incremental loads in MVs?

Latest Reply
Dharinip
Contributor
  • 1 kudos

This is great. Thank you so much.

3 More Replies
_singh_vish
by New Contributor III
  • 1565 Views
  • 2 replies
  • 1 kudos

Resolved! Working of @DLT.table

I am using the @dlt.table decorator to create a table which will store history for my tables. My code works like this:

@dlt.table(name="table name")
def target():
    # custom Spark code to create history

Even though the Spark code creates and prints history ...

Latest Reply
lingareddy_Alva
Esteemed Contributor
  • 1 kudos

@_singh_vish DLT assumes the result of each @dlt.table decorator is the current state of the table at that point in time. So, when you define a DLT table using @dlt.table, whatever DataFrame is returned by that function will replace the previous data...

1 More Replies
Thomas_Zhang
by Databricks Partner
  • 758 Views
  • 1 reply
  • 0 kudos

DLT job failed to parse timestamp string with T and Z.

Hi, I am struggling with converting a timestamp string with T and Z to a timestamp column in my DLT job. Here is the relevant code snippet: trans_rules = {'timestamp_value': '''to_timestamp(timstamp_str, "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'")'''}. In my DLT func...

[Attachment: Screenshot 2025-05-02 at 7.25.03 AM.png]
Latest Reply
Ayushi_Suthar
Databricks Employee
  • 0 kudos

Hi @Thomas_Zhang, good day! Can you give the below code format a try and check if it helps you handle the Z timezone suffix: date_format(to_timestamp(`createdon`, 'yyyy-MM-dd\'T\'HH:mm:ss.SSSSSSSX'), 'yyyy-MM-dd\'T...

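For reference, the same pattern idea in plain Python outside Spark (the sample string below is made up): the literal T and Z are written directly into the format string, just as the quoted 'T' and 'Z' are in the Spark patterns above.

```python
from datetime import datetime

# Parse an ISO-8601-style string with literal T and Z separators.
ts = "2025-05-02T07:25:03.123Z"
parsed = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")
```

Treating Z as a literal discards the "UTC" meaning; in Spark, the X pattern letter instead interprets Z as a zone offset, which is usually what you want for timestamp columns.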
pferreira
by Databricks Partner
  • 4623 Views
  • 5 replies
  • 2 kudos

MongoDB Spark Connector v10.x read error on Databricks 14.3 LTS

I'm facing an error when updating DBR from 13.3 LTS to 14.3 LTS. I'm using spark:mongo-spark-connector:10.2.1 and running the following script:

connectionString = ******
database = *****
collection = *****
spark = SparkSession \
    .builder \
    ...

Latest Reply
Namrata1
New Contributor II
  • 2 kudos

Hi @pmaferreira, can you please share which version you are using, and does it support the ignoreNullValues option?

4 More Replies