Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

janglais
by Visitor
  • 10 Views
  • 0 replies
  • 0 kudos

DLT Pipeline with unknown deleted source data

Hello, I need help. The context is: - ERP data for a company in my group is stored in SQL tables - Currently, once per day, we copy the last 2 months of data (by creation date) from each table into our data lake landing zone (we can however do full cop...

Mits11
by New Contributor
  • 12 Views
  • 0 replies
  • 0 kudos

Community edition cluster - UI shows incorrect cores

Hi, I am a Community Edition user, which gives me a cluster (as per the image below) with 15 GB of memory and 2 cores on a single driver node only. However, when I read a CSV file of 181 MB: 1) it generates 8 partitions. As per default maxPartitionBytes is set to 12...
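
The partition count in the question can be sanity-checked with a rough model of how Spark sizes file splits. The sketch below is a simplification for a single splittable file, not Spark's actual code; it assumes the Spark defaults of 128 MiB for `spark.sql.files.maxPartitionBytes` and 4 MiB for `spark.sql.files.openCostInBytes`, and the function name `estimate_partitions` is made up for illustration.

```python
MIB = 1024 * 1024


def estimate_partitions(file_bytes, parallelism,
                        max_partition_bytes=128 * MIB,
                        open_cost_bytes=4 * MIB):
    """Approximate the number of read partitions for one splittable file."""
    # The file is padded with an "open cost" before dividing by the parallelism.
    bytes_per_core = (file_bytes + open_cost_bytes) // parallelism
    max_split = min(max_partition_bytes, max(open_cost_bytes, bytes_per_core))

    # Cut the file into chunks of at most max_split bytes.
    chunks = [min(max_split, file_bytes - offset)
              for offset in range(0, file_bytes, max_split)]

    # Pack chunks largest-first; close a partition when the next chunk would overflow it.
    partitions = 0
    current_size = 0
    current_chunks = 0
    for size in sorted(chunks, reverse=True):
        if current_chunks and current_size + size > max_split:
            partitions += 1
            current_size = 0
            current_chunks = 0
        current_size += size + open_cost_bytes
        current_chunks += 1
    if current_chunks:
        partitions += 1
    return partitions


for cores in (2, 8):
    print(cores, "cores ->", estimate_partitions(181 * MIB, cores), "partitions")
```

Under this model a 181 MiB file yields 2 partitions at a parallelism of 2 and 8 partitions at a parallelism of 8, so seeing 8 partitions would be consistent with the cluster reporting a default parallelism of 8 rather than 2. That is only a hypothesis; checking `spark.sparkContext.defaultParallelism` on the cluster would confirm or rule it out.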

Rjdudley
by Honored Contributor
  • 265 Views
  • 3 replies
  • 0 kudos

Resolved! AUTO CDC API and sequence column

The docs for AUTO CDC API state: "You must specify a column in the source data on which to sequence records, which Lakeflow Declarative Pipelines interprets as a monotonically increasing representation of the proper ordering of the source data." Can this ...

Latest Reply
Rjdudley
Honored Contributor
  • 0 kudos

Thanks Szymon, I'm familiar with the PostgreSQL implementation and was hoping Databricks would behave the same.

2 More Replies
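
To illustrate what the quoted documentation means by sequencing records on a monotonically increasing column, here is a plain-Python sketch. It is not the AUTO CDC API; the function name `latest_by_sequence` and the sample field names are hypothetical. For each key it keeps the record with the highest sequence value, so a late-arriving record with a lower sequence value does not overwrite a newer one.

```python
def latest_by_sequence(events, key, sequence_by):
    """Return {key value: event with the highest sequence value for that key}."""
    latest = {}
    for event in events:
        k = event[key]
        # Keep the event only if it is newer than what we already hold for this key.
        # Ties keep the first event seen; real systems may resolve ties differently.
        if k not in latest or event[sequence_by] > latest[k][sequence_by]:
            latest[k] = event
    return latest


# Events arrive out of order: seq 3 for customer 1 shows up before seq 2.
events = [
    {"customer_id": 1, "seq": 1, "city": "Oslo"},
    {"customer_id": 1, "seq": 3, "city": "Bergen"},
    {"customer_id": 1, "seq": 2, "city": "Trondheim"},
    {"customer_id": 2, "seq": 1, "city": "Lima"},
]
state = latest_by_sequence(events, key="customer_id", sequence_by="seq")
print(state[1]["city"])
```

Any column type with a total order (timestamps, integers, log sequence numbers) works in this sketch; whether a given type is acceptable to the pipeline itself is a question for the product documentation.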
ankit001mittal
by New Contributor III
  • 1842 Views
  • 1 replies
  • 2 kudos

DLT schema evolution/changes in the logs

Hi all, I want to figure out how to find when schema evolution/changes happen in the objects of DLT pipelines through the DLT logs. Could you please share some sample DLT logs that show schema changes? Thank you for your help.

Latest Reply
mark_ott
Databricks Employee
  • 2 kudos

To find when schema evolution or changes are happening in objects within DLT (Delta Live Table) pipelines, you need to monitor certain entries within the DLT logs or Delta transaction logs that signal modifications to the underlying schema of a table...

minhhung0507
by Valued Contributor
  • 2488 Views
  • 3 replies
  • 0 kudos

DLT Flow Failed Due to Missing Flow Checkpoints Directory When Using Unity Catalog

I’m encountering an issue while running a Delta Live Tables (DLT) pipeline that is managed using Unity Catalog on Databricks. The pipeline has failed and is not restarting, showing the following error: java.lang.IllegalArgumentException: flow checkpoi...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

The best practices for setting up checkpointing in Delta Live Tables (DLT) pipelines when using Unity Catalog are largely centered on leveraging Databricks' managed services, adhering to Unity Catalog's table management conventions, and minimizing th...

2 More Replies
sumitkumar_284
by New Contributor II
  • 306 Views
  • 4 replies
  • 1 kudos

Not able to refresh Power BI dashboard from Databricks jobs

I am trying to refresh a Power BI dashboard using Databricks jobs and constantly get this error, even though I am providing the optional parameters, which include catalog and database. Also note that I am able to refresh from the Power BI UI using both...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @sumitkumar_284, can you provide more details? Are you using Unity Catalog? Which authentication mechanism do you use? In which version of Power BI Desktop did you develop your semantic model/dashboard? Do you meet all of the requirements below? Publish...

3 More Replies
hf-databricks
by Visitor
  • 33 Views
  • 1 replies
  • 0 kudos

Unable to create workspace

Hi Team, we have a challenge creating a workspace in a Databricks account created on top of AWS. Below are the details: Databricks account name: saichaitanya.vaddadhi@healthfirsttech.com's Lakehouse; AWS account ID: 720016114009; Databricks ID: 1ee8765f-b472-4...

Latest Reply
BS_THE_ANALYST
Esteemed Contributor II
  • 0 kudos

@hf-databricks there's a quickstart guide for creating a workspace with AWS: https://docs.databricks.com/aws/en/admin/workspace/quick-start There's a list of requirements: There are more options for creating workspaces. Above, I just listed the recommen...

der
by Contributor
  • 52 Views
  • 8 replies
  • 0 kudos

Rasterio on shared/standard cluster has no access to proj.db

We are trying to use rasterio on a Databricks shared/standard cluster with DBR 17.1. Rasterio is installed directly on the cluster as a library. Code: `import rasterio` then `rasterio.show_versions()`. Output: rasterio info: rasterio: 1.4.3, GDAL: 3.9.3, PROJ: 9.4.1, GEOS: 3.11...

Latest Reply
Chiran-Gajula
New Contributor
  • 0 kudos

Hi @der, can you try adding this to your test script: `import os` followed by `os.environ["PROJ_LIB"] = "/databricks/native/proj-data"`. Hopefully users have access to the path /databricks/native/proj-data.
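
A slightly more defensive version of that suggestion is sketched below. It only sets the variables if the directory actually contains `proj.db`, and it sets both `PROJ_DATA` (the name PROJ reads from version 9.1 onward) and the older `PROJ_LIB`. The helper name `point_proj_at` is hypothetical, and whether `/databricks/native/proj-data` exists or is readable on a shared/standard cluster is an assumption taken from the reply, not something verified here.

```python
import os


def point_proj_at(data_dir, environ=os.environ):
    """Point PROJ at data_dir if it contains proj.db; return True when set."""
    if not os.path.isfile(os.path.join(data_dir, "proj.db")):
        return False  # leave the environment untouched if the database is missing
    environ["PROJ_DATA"] = data_dir  # read by PROJ 9.1 and later
    environ["PROJ_LIB"] = data_dir   # older variable name, kept for compatibility
    return True


# Path taken from the reply above; it may not exist or be accessible on every cluster.
found = point_proj_at("/databricks/native/proj-data")
print("proj.db found:", found)
```

Running this before `import rasterio` is the safer order, since the library may pick up its data directory when it is first loaded. A `False` result would suggest that the path is missing or not readable from the current access mode.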

7 More Replies
maninegi05
by New Contributor
  • 228 Views
  • 3 replies
  • 0 kudos

DLT Pipeline Stopped working

Hello, our DLT pipelines have suddenly started failing with: LookupError: Traceback (most recent call last): result_df = result_df.withColumn("input_file_path", col("_metadata.file_path")).withColumn( ...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

Greetings @maninegi05, I did some digging internally and I believe some recent changes to the DLT image may be to blame. We are aware of the regression and are actively working to address it. TL;DR Why you might see “LookupError: ContextVar 'par...

2 More Replies
devpavan
by New Contributor
  • 107 Views
  • 7 replies
  • 0 kudos

Encountering an error while setting up a single-node cluster on top of AWS

Hi Team, I'm trying to create a single-node cluster in Databricks on AWS, but I'm encountering an error. Could you please assist me with this? { "reason": { "code": "INVALID_ARGUMENT", "type": "CLIENT_ERROR", "parameters": { "databr...

Latest Reply
nayan_wylde
Honored Contributor III
  • 0 kudos

@devpavan Are you using the API or Terraform to create the cluster? Can you please share the JSON config that you are passing?
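
For comparison with whatever config is being passed, here is a sketch of the fields that, to my understanding of the Clusters API documentation, mark a classic cluster as single-node: zero workers, the `singleNode` cluster profile, a local Spark master, and the `SingleNode` resource-class tag. The helper name and the placeholder values are hypothetical, and since the error above is truncated there is no way to tell from this page whether a missing field is the actual cause.

```python
import json


def single_node_cluster_spec(cluster_name, spark_version, node_type_id):
    """Build a create-cluster payload for a single-node (driver-only) cluster."""
    return {
        "cluster_name": cluster_name,
        "spark_version": spark_version,  # placeholder: pick a runtime available in your workspace
        "node_type_id": node_type_id,    # placeholder: an AWS instance type enabled for you
        "num_workers": 0,                # driver only, no worker nodes
        "spark_conf": {
            "spark.databricks.cluster.profile": "singleNode",
            "spark.master": "local[*]",
        },
        "custom_tags": {"ResourceClass": "SingleNode"},
    }


spec = single_node_cluster_spec("single-node-demo", "<spark-version>", "<aws-node-type>")
print(json.dumps(spec, indent=2))
```

Diffing the printed payload against the JSON being sent is a quick way to spot a missing `spark_conf` entry or tag.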

6 More Replies
aravind-ey
by New Contributor II
  • 19389 Views
  • 22 replies
  • 4 kudos

Vocareum lab access

Hi, I am doing a data engineering course in Databricks (Partner labs) and would like access to the Vocareum workspace to practice using the demo sessions. Can you please help me get access to this workspace? Regards, Aravind

Latest Reply
Eicke
Visitor
  • 4 kudos

You can log into databricks, search for "Canada Sales" in the Marketplace and find "Simulated Canada Sales and Opportunities Data". Get free instant access, wait a few seconds for the warehouse to be built for you et voila: the tables for building th...

21 More Replies
lezwon
by Contributor
  • 2320 Views
  • 1 replies
  • 1 kudos

Can't view DAB-deployed pipelines in Databricks UI

I am using the databricks asset pipeline to version control the jobs and pipelines in my workspace. I recently pulled these pipelines from the workspace using the `databricks bundle generate pipeline` command and deployed them back using `databricks ...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Hey @lezwon, thanks for the details and screenshots. This looks like a permissions/ownership issue with your newly deployed Delta Live Tables pipelines. What’s going on: Pipelines run under the pipeline owner’s identity (Databricks recommends a ser...

mahfooz_iiitian
by New Contributor III
  • 30 Views
  • 3 replies
  • 0 kudos

Databricks serverless cluster and Poetry private repository

We are currently evaluating Databricks serverless. It supports a public repository in Poetry as a dependency path, but not a private repository, as we are not sure where to put the credentials for the private repository.

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @mahfooz_iiitian, Databricks supports private repositories only for notebook-scoped libraries. In serverless you can do it using pip install (of course, store your token in a safe place): Notebook-scoped Python libraries - Azure Databricks | Micro...
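
One way to keep the token out of the notebook source is to read it from a secret scope and build the index URL at run time. The sketch below covers only the URL construction, with credentials percent-encoded so that characters such as `@` or `/` in a token do not break the URL. The function name, host, username, and secret names are all hypothetical.

```python
from urllib.parse import quote


def private_index_url(host_and_path, username, token):
    """Return an https index URL with percent-encoded basic-auth credentials."""
    user = quote(username, safe="")
    secret = quote(token, safe="")
    return f"https://{user}:{secret}@{host_and_path}"


# In a notebook the token would typically come from a secret scope, for example:
#   token = dbutils.secrets.get(scope="my-scope", key="pypi-token")   # hypothetical names
# The resulting URL can then be passed to pip's --index-url option.
url = private_index_url("pypi.example.com/simple", "svc-user", "p@ss/word")
print(url.replace("p%40ss%2Fword", "****"))  # avoid printing the real secret
```

Building the URL in code rather than hard-coding it keeps the credential in the secret store, which is the "safe place" the reply refers to.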

2 More Replies
VaDim
by New Contributor III
  • 1765 Views
  • 2 replies
  • 1 kudos

ModuleNotFound error when using transformWithStateInPandas via a class defined outside the notebook

As per the Databricks documentation, when I define the class that extends `StatefulProcessor` in a notebook everything works OK; however, execution fails with a ModuleNotFound error as soon as the class definition is moved to a file (module) of its own in a...

Data Engineering
transformWithState
Latest Reply
VaDim
New Contributor III
  • 1 kudos

This is no longer an issue; some patch version of DBX Runtime 16.4 must have fixed it, and it now works without any changes to the original code. Thanks.

1 More Replies
