Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

AbkSpl
by New Contributor III
  • 16287 Views
  • 8 replies
  • 6 kudos

Resolved! Making a connection to the tables in Dynamics app through the dataverse TDS endpoint

I wish to do some analysis in Databricks on tables that are stored in Dataverse. I know that Power BI uses its Dataverse connector to fetch the data through Dataverse's TDS endpoint. The tables that we import into Power BI using this connector are nearly ...

  • 16287 Views
  • 8 replies
  • 6 kudos
Latest Reply
NavinW
New Contributor II
  • 6 kudos

Did you manage to connect to Dataverse from Databricks? I am trying to do the same but no luck.

  • 6 kudos
7 More Replies
jeremy98
by Honored Contributor
  • 11362 Views
  • 6 replies
  • 3 kudos

Best practice for creating configuration YAML files for each workspace environment?

Hi Community, My team and I are working on refactoring our DAB repository, and we’re considering creating a configuration folder based on our environments: Dev, Staging, and Production workspaces. What would be a common and best practice for structuring...

  • 11362 Views
  • 6 replies
  • 3 kudos
Latest Reply
koji_kawamura
Databricks Employee
  • 3 kudos

Hi @jeremy98 and all, I agree with @saurabh18cs . Having configuration files for each deployment target is a very convenient and manageable solution. Since I couldn't find a plain example showing the project structure, I created one here. https://git...

  • 3 kudos
5 More Replies
jeremy98
by Honored Contributor
  • 2319 Views
  • 2 replies
  • 1 kudos

Resolved! How to get schedule information about a job in Databricks?

Hi community, I was reading the Databricks API documentation and I want to find out whether a job's schedule has the status PAUSED or UNPAUSED. I saw that there is this API call: https://docs.databricks.com/api/workspace/jobs...

  • 2319 Views
  • 2 replies
  • 1 kudos
Latest Reply
KaranamS
Contributor III
  • 1 kudos

Hi @jeremy98, It looks like the access token is incorrect or not valid. Can you please verify the following?
1. Validate your access token - if you get 403 forbidden error, your access token is invalid.
curl -X GET "https://<workspace_host>/api/2.2/jo...
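Once the token is valid, the schedule state can be read from the job's settings. The following is a minimal sketch, assuming the Jobs API `jobs/get` response carries the schedule under `settings.schedule` with a `pause_status` field. The helper names `fetch_job` and `schedule_pause_status` and the sample payload are illustrative and do not come from the thread.

```python
import json
import urllib.request
from typing import Optional


def fetch_job(host: str, token: str, job_id: int) -> dict:
    """Call GET /api/2.2/jobs/get for one job and return the parsed JSON body."""
    url = f"https://{host}/api/2.2/jobs/get?job_id={job_id}"
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def schedule_pause_status(job: dict) -> Optional[str]:
    """Return the schedule's pause_status, or None if the job has no schedule."""
    schedule = job.get("settings", {}).get("schedule")
    if schedule is None:
        return None
    return schedule.get("pause_status")


# Hypothetical response body, trimmed to the fields used above.
sample_job = {
    "job_id": 123,
    "settings": {
        "name": "nightly-load",
        "schedule": {
            "quartz_cron_expression": "0 35 2 * * ?",
            "timezone_id": "UTC",
            "pause_status": "PAUSED",
        },
    },
}

print(schedule_pause_status(sample_job))
```

In a real workspace, `schedule_pause_status(fetch_job(host, token, job_id))` would combine the two helpers; a job without a cron schedule yields `None`.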

  • 1 kudos
1 More Replies
antr
by Databricks Partner
  • 3285 Views
  • 3 replies
  • 0 kudos

DLT full refresh and resets

When doing a full refresh in DLT, the tables seem to be in a reset/empty state until they're populated again. This can break downstream dependencies if they try to use the data during pipeline execution. How should such a case be handled properly?

  • 3285 Views
  • 3 replies
  • 0 kudos
Latest Reply
Advika
Community Manager
  • 0 kudos

Hello @antr! In DLT, a full refresh on a streaming table resets state processing and checkpoint data, potentially disrupting downstream processes that rely on it. To avoid this, use incremental updates (default) or append mode instead of full refresh...

  • 0 kudos
2 More Replies
Shivap
by New Contributor III
  • 1116 Views
  • 3 replies
  • 0 kudos

Need to extract data from Delta tables and move it to on-prem: what's the best approach?

I want to extract data from Databricks Delta tables and move it to on-prem. What's the best way to accomplish this?

  • 1116 Views
  • 3 replies
  • 0 kudos
Latest Reply
Stefan-Koch
Databricks Partner
  • 0 kudos

An easy way to do this is to use Airbyte. You can run Airbyte locally, connect to Databricks, and copy the data to your on-prem location. https://docs.airbyte.com/integrations/destinations/databricks

  • 0 kudos
2 More Replies
Rasputin312
by Databricks Partner
  • 2235 Views
  • 1 replies
  • 1 kudos

Resolved! Widgets Not Displaying

I am trying to run this attention visualization in my Databricks notebook. This is my code and this is the error I get: ```from IPython.display import display, Javascript
import ipywidgets as widgets
from ipywidgets import interact
from transformers im...

  • 2235 Views
  • 1 replies
  • 1 kudos
Latest Reply
koji_kawamura
Databricks Employee
  • 1 kudos

Hi @Rasputin312! I was able to render the visualization with the bertviz library. The default model_view html_action is view, which does not work with Databricks notebooks. Instead, using the returned HTML, we can visualize the model. display(model_view(a...

  • 1 kudos
Kayla
by Valued Contributor II
  • 1749 Views
  • 2 replies
  • 2 kudos

Resolved! Scheduled Workflow options and DST Change

So, I have a workflow that runs at 2:35 am daily. Is there really no way to configure that so it isn't completely skipped during the spring time change?

  • 1749 Views
  • 2 replies
  • 2 kudos
Latest Reply
ashraf1395
Honored Contributor
  • 2 kudos

Hi @Kayla, I suggest the best solution would be to use UTC; even Databricks recommends that. Alternatively, shift the job by 30 minutes to 1 hour.
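To see why a 2:35 am local schedule can be skipped, the sketch below checks whether a wall-clock time exists in a given time zone by round-tripping it through UTC. The zone `America/New_York` and the dates are assumptions chosen for illustration (the thread does not name a time zone), and `zoneinfo` may need the `tzdata` package on systems without a time zone database.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def is_skipped_local_time(naive: datetime, tz_name: str) -> bool:
    """Return True if the wall-clock time does not exist in tz_name (DST gap)."""
    tz = ZoneInfo(tz_name)
    local = naive.replace(tzinfo=tz)
    # A nonexistent local time does not survive a round trip through UTC:
    # it comes back shifted past the gap.
    round_trip = local.astimezone(timezone.utc).astimezone(tz)
    return round_trip.replace(tzinfo=None) != naive


# Assumed example: US clocks jumped from 02:00 to 03:00 on 2025-03-09.
spring_forward = datetime(2025, 3, 9, 2, 35)
ordinary_day = datetime(2025, 3, 10, 2, 35)

print(is_skipped_local_time(spring_forward, "America/New_York"))
print(is_skipped_local_time(ordinary_day, "America/New_York"))
print(is_skipped_local_time(spring_forward, "UTC"))
```

UTC has no daylight saving transitions, so a schedule expressed in UTC never lands in such a gap; moving the local run time outside the skipped hour has the same effect.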

  • 2 kudos
1 More Replies
ramy
by New Contributor II
  • 9281 Views
  • 5 replies
  • 2 kudos

Getting JOB-ID dynamically to create another job to refer as job-task

I am trying to create a new job in Databricks Asset Bundles which refers to another job-task and passes parameters to it. However, the previous job is not created yet (or will be created using Databricks Asset Bundles in higher envs when deploying t...

  • 9281 Views
  • 5 replies
  • 2 kudos
Latest Reply
priya12
New Contributor II
  • 2 kudos

The lookup works. Here is how it can be used for a job existing outside the asset bundle:
variables:
  my_jobid:
    description: Enter the Databricks Job name you want to refer.
    lookup:
      job: 'My Job1'
In the resources section, refer...

  • 2 kudos
4 More Replies
asisaarav
by New Contributor
  • 1145 Views
  • 1 replies
  • 0 kudos

Error : The spark driver has stopped unexpectedly and is restarting

Hi community, Getting an error in the code: Error : The spark driver has stopped unexpectedly and is restarting. Your notebook will be automatically restarted. Can you help here in understanding what methods we can use to get it fixed? I tried look...

  • 1145 Views
  • 1 replies
  • 0 kudos
Latest Reply
saurabh18cs
Honored Contributor III
  • 0 kudos

The error message indicates an issue with the Spark driver in your Databricks environment. This can be caused by various factors such as:
Check Cluster Configuration: Ensure that your Databricks cluster has sufficient resources (CPU, memory) to handle...

  • 0 kudos
giladba
by New Contributor III
  • 10295 Views
  • 12 replies
  • 11 kudos

access to event_log TVF

Hi, According to the documentation: https://docs.databricks.com/en/delta-live-tables/observability.html "The event_log TVF can be called only by the pipeline owner and a view created over the event_log TVF can be queried only by the pipeline owner. The...

  • 10295 Views
  • 12 replies
  • 11 kudos
Latest Reply
larsbbb
Databricks Partner
  • 11 kudos

@LakehouseGuy @mkEngineer @hcjp @neha_ayodhya I just saw the following option in DLT pipelines! I haven't tested it yet, but it looks promising. This also looks like new documentation: https://learn.microsoft.com/en-us/azure/databricks/dlt/observability#q...

  • 11 kudos
11 More Replies
sandeepmankikar
by Databricks Partner
  • 1176 Views
  • 1 replies
  • 0 kudos

Complex Embedded Workflows

Can complex embedded workflows be created using Databricks Bundle, where multiple workflows are interconnected in a parent-child format? If Databricks Bundle doesn't support this, what would be the best alternative for creating and deploying such wor...

  • 1176 Views
  • 1 replies
  • 0 kudos
Latest Reply
ashraf1395
Honored Contributor
  • 0 kudos

Yup, you can create complex workflows in Databricks bundles as well. Some examples: you can have all of them defined from child to parent and call those child workflows as workflow tasks in the parent workflows, all different kinds of tasks...

  • 0 kudos
ShivangiB
by New Contributor III
  • 1530 Views
  • 3 replies
  • 0 kudos

Liquid Clustering limitation clustering on write does not support source queries that include filter

I have a query:
%sql
insert into ucdata.brz.liquidcluster_table_data select sum(col1) as col1, col2, sum(col3) as col3 from ucdata.brz.liquidcluster_table_data group by col2
I am running this query with runtime version 13.3 and it is still working. But...

  • 1530 Views
  • 3 replies
  • 0 kudos
Latest Reply
ShivangiB
New Contributor III
  • 0 kudos

Hey Team, can you please help on this

  • 0 kudos
2 More Replies
Datanoob123
by New Contributor II
  • 6456 Views
  • 6 replies
  • 1 kudos

Query to show column names in common between multiple tables

Hi all, I have a large number of tables, and I would like a query that pulls the column names that are common to all of them. I know about show columns, but can't seem to use this or another method to achieve this. This ...

Data Engineering
comparing tables
show columns
sql
  • 6456 Views
  • 6 replies
  • 1 kudos
Latest Reply
KaranamS
Contributor III
  • 1 kudos

Hi @Datanoob123, I agree with @Stefan-Koch! It could be that you don't have access to the system tables. Please reach out to your Databricks Admin to grant you the required permissions on system tables. You can then use the query I shared to get the requ...
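The query referred to above is not shown in this listing. As a separate, minimal sketch of the underlying idea, the snippet below takes column metadata as `(table_name, column_name)` pairs, such as could be collected from a column catalog or from `SHOW COLUMNS` per table, and returns the names present in every table. The function name `common_columns` and the sample table and column names are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, Set, Tuple


def common_columns(rows: Iterable[Tuple[str, str]]) -> Set[str]:
    """Return the column names that appear in every table in rows."""
    columns_by_table = defaultdict(set)
    for table_name, column_name in rows:
        # Compare names case-insensitively.
        columns_by_table[table_name].add(column_name.lower())
    if not columns_by_table:
        return set()
    return set.intersection(*columns_by_table.values())


# Hypothetical metadata rows: (table_name, column_name)
rows = [
    ("orders", "id"),
    ("orders", "customer_id"),
    ("orders", "updated_at"),
    ("customers", "id"),
    ("customers", "name"),
    ("customers", "updated_at"),
    ("invoices", "ID"),
    ("invoices", "amount"),
    ("invoices", "updated_at"),
]

print(sorted(common_columns(rows)))
```

Grouping by table first and intersecting the sets avoids having to know the number of tables in advance.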

  • 1 kudos
5 More Replies
HarryRichard08
by New Contributor II
  • 2302 Views
  • 3 replies
  • 0 kudos

Unable to Access S3 from Serverless but Works on Cluster

Hi everyone, I am trying to access data from S3 using an access key and secret. When I run the code through Databricks clusters, it works fine. However, when I try to do the same from a serverless cluster, I am unable to access the data. I have alread...

  • 2302 Views
  • 3 replies
  • 0 kudos
Latest Reply
KaranamS
Contributor III
  • 0 kudos

Hi @HarryRichard08, Databricks recommends using instance profiles (IAM roles) to connect to AWS S3 as they provide a secure and scalable method without embedding credentials in a notebook. Have you tried this approach? https://docs.databricks.com/aws/...

  • 0 kudos
2 More Replies
HoussemBL
by New Contributor III
  • 1866 Views
  • 2 replies
  • 0 kudos

Databricks asset bundle deploys DLT pipelines as duplicate resources

Dear Community, I have a deployment issue after restructuring my project. Previously, our project was organized with the following structure:
project/src/
project/resources/
project/databricks.yml
As part of an optimization effort, we have transitioned to ...

  • 1866 Views
  • 2 replies
  • 0 kudos
Latest Reply
HoussemBL
New Contributor III
  • 0 kudos

Hi @ashraf1395, I am creating two separate databricks.yml files for each sub-project.

  • 0 kudos
1 More Replies