Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

Smu_Tan
by New Contributor
  • 2477 Views
  • 4 replies
  • 1 kudos

Resolved! Does Databricks support PyTorch Distributed Training on multiple devices?

Hi, I'm trying to use the Databricks platform for PyTorch distributed training, but I didn't find any info about this. What I expected is to use multiple clusters to run a common job using PyTorch Distributed Data Parallel (DDP) with the code belo...

Latest Reply
adarsh8304
New Contributor II
  • 1 kudos

Hey, so we can't even use TorchDistributor and Distributed Data Parallel to achieve distributed training in my code, and `TorchDistributor` is a distribution library written for Spark, because with this setup I am not able to get the requir...

3 More Replies
meghana_tulla
by New Contributor II
  • 17 Views
  • 2 replies
  • 0 kudos

How to Set Expiration Time for Delta Sharing URL in Databricks Using Terraform?

 I am automating Delta Sharing from Databricks to non-Databricks recipients using Terraform. I can successfully create shares and recipients with my Terraform code, retrieve the sharing URL after creating the recipient, and see that the URL gets a de...

Latest Reply
jeremy98
New Contributor II
  • 0 kudos

Hello, I did it yesterday through the account console (I don't know whether you can do it using Terraform). If you are an admin at a higher level, you can go to that window and enable your metastore to set a token with an expiration date. I hope this answers your question.

1 More Replies
issa
by Visitor
  • 11 Views
  • 0 replies
  • 0 kudos

How to access bronze DLT in silver DLT

I have a job in Workflows that runs two DLT pipelines, one for Bronze_Transaction and one for Silver_Transaction. The reason for two DLT pipelines is that I want the tables to be created in bronze catalog and erp schema, and silver catalog and erp...

Data Engineering
dlt
DLT pipeline
Medallion
Workflows
jeremy98
by New Contributor II
  • 19 Views
  • 2 replies
  • 0 kudos

Using the VSCode extension to interact with Databricks

Hello community, I want to understand whether it is possible to use Databricks Connect inside the VSCode IDE to work with notebooks locally and interactively, as in a Databricks notebook. Is it possible? Because now I can only use the cluster and wait after t...

Latest Reply
carolgrey98
  • 0 kudos

@jeremy98 wrote: Hello community, I want to understand if it is possible to use Databricks Connect inside VSCode IDE to interact with Notebooks in local interactively like in Databricks Notebook, Is it possible? Because now I can only use the cluster ...

1 More Replies
s3
by New Contributor II
  • 12 Views
  • 1 replies
  • 0 kudos

Extracting attachments from Outlook

Can we fetch attachments from Outlook in Databricks?

Latest Reply
Stefan-Koch
Contributor II
  • 0 kudos

Hi s3, you could use Microsoft Graph for that. Here is an example: https://learn.microsoft.com/en-us/answers/questions/1631663/using-graph-api-to-retrieve-email Another way I have always done this is through a Logic App. It is pretty easy to set up an...
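To make the Microsoft Graph suggestion a bit more concrete, here is a minimal Python sketch, not a complete solution. It assumes you already have an access token and have fetched the JSON returned by the message attachments endpoint. The helper names `attachments_url` and `decode_file_attachments` are made up for illustration; the only Graph details relied on are the `/messages/{id}/attachments` path and the base64 `contentBytes` field of file attachments.

```python
import base64

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def attachments_url(message_id, user="me"):
    """Build the Graph URL that lists the attachments of one message."""
    owner = "me" if user == "me" else f"users/{user}"
    return f"{GRAPH_BASE}/{owner}/messages/{message_id}/attachments"


def decode_file_attachments(response_json):
    """Return {file name: raw bytes} for each file attachment in a Graph response."""
    files = {}
    for item in response_json.get("value", []):
        # Only file attachments carry base64 content in `contentBytes`;
        # item and reference attachments are skipped here.
        if item.get("@odata.type") == "#microsoft.graph.fileAttachment":
            files[item["name"]] = base64.b64decode(item["contentBytes"])
    return files
```

In a notebook you would call the URL with an HTTP client of your choice and a bearer token, pass the parsed JSON to `decode_file_attachments`, and write the bytes to a volume or cloud storage path.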

anantkharat
by New Contributor
  • 53 Views
  • 1 replies
  • 0 kudos

Getting

payload = {"clusters": [{"num_workers": 4}], "pipeline_id": pipeline_id}
update_url = f"{workspace_url}/api/2.0/pipelines/{pipeline_id}"
response = requests.put(update_url, headers=headers, json=payload)

For this, I'm getting the below output with status cod...
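The request above sends only `clusters` and `pipeline_id`. The post is cut off, so the actual error is unknown, but one thing worth ruling out is that the edit endpoint may expect the complete pipeline settings rather than a partial patch. The sketch below works under that assumption: it starts from the current settings and changes only the worker count. `build_pipeline_update` is a hypothetical helper, and reading the current settings from the `spec` field of the GET response is also an assumption to check against your workspace.

```python
import copy


def build_pipeline_update(current_spec, num_workers):
    """Return a copy of the full pipeline settings with each cluster set to num_workers."""
    updated = copy.deepcopy(current_spec)  # never mutate the caller's dict
    clusters = updated.get("clusters") or [{}]
    for cluster in clusters:
        cluster["num_workers"] = num_workers
    updated["clusters"] = clusters
    return updated


# Hypothetical usage with the variables from the post (not executed here):
# current = requests.get(update_url, headers=headers).json()["spec"]
# payload = build_pipeline_update(current, 4)
# response = requests.put(update_url, headers=headers, json=payload)
```

Building the payload in a pure function keeps the part that can go wrong (dropping existing settings) easy to inspect before anything is sent.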

Data Engineering
Databricks
Delta Live Tables
Latest Reply
merry867
New Contributor
  • 0 kudos

Hello, thanks for this post. Best regards, Statistics for Spotify

skarpeck
by New Contributor III
  • 145 Views
  • 2 replies
  • 0 kudos

Update set in foreachBatch

I need to track the codes of records that were ingested in a foreachBatch function and pass them as a task value, so downstream tasks can take actions based on this output. What would be the best approach to achieve that? Now, I have the following solution, b...

Latest Reply
raphaelblg
Databricks Employee
  • 0 kudos

@skarpeck does your input df contain any filters? The empty codes variable could be due to empty microbatches. Please check numInputRows in your query's Stream Monitoring Metrics. I recommend checking whether there are input rows for the b...

1 More Replies
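For the foreachBatch thread above, one way to collect codes across micro-batches can be sketched as follows. This is not the poster's solution (it is truncated in the listing). It assumes the handler runs in the same Python process as the notebook, which may not hold on every compute type, and that the list of codes is small enough to collect on the driver. `make_code_tracker` and the column name `code` are hypothetical.

```python
def make_code_tracker(column="code"):
    """Return (handler, seen): a foreachBatch handler and the set it fills."""
    seen = set()

    def handler(batch_df, batch_id):
        # Pull the distinct values of `column` to the driver and remember them.
        # An empty micro-batch returns no rows, so `seen` is left unchanged.
        rows = batch_df.select(column).distinct().collect()
        seen.update(row[0] for row in rows)
        # Write batch_df to its target table here.

    return handler, seen


# Hypothetical usage in a Databricks job task (not executed here):
# handler, seen = make_code_tracker("code")
# query = df.writeStream.foreachBatch(handler).trigger(availableNow=True).start()
# query.awaitTermination()
# dbutils.jobs.taskValues.set(key="codes", value=sorted(seen))
```

Setting the task value only after `awaitTermination()` returns matters: reading `seen` earlier would show whatever batches happened to finish, which can look like an empty result.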
Data_Analytics1
by Contributor III
  • 29035 Views
  • 10 replies
  • 10 kudos

Failure starting repl. How to resolve this error? I got this error in a job which is running.

Failure starting repl. Try detaching and re-attaching the notebook.
java.lang.Exception: Python repl did not start in 30 seconds.
    at com.databricks.backend.daemon.driver.IpykernelUtils$.startIpyKernel(JupyterDriverLocal.scala:1442)
    at com.databricks.b...

Latest Reply
PabloCSD
Contributor II
  • 10 kudos

I have had this problem many times. Today I made a copy of the cluster and it got "de-saturated". It could help someone in the future.

9 More Replies
Shreyash_Gupta
by New Contributor II
  • 40 Views
  • 1 replies
  • 0 kudos

Resolved! How do Databricks notebooks differ from traditional Jupyter notebooks?

Can someone please explain the key differences between a Databricks notebook and a Jupyter notebook?

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

The key differences between a Databricks notebook and a Jupyter notebook are as follows:

Integration and Collaboration:
  • Databricks Notebooks: These are integrated within the Databricks platform, providing a unified experience for data science and ma...

Harsha777
by New Contributor III
  • 115 Views
  • 5 replies
  • 1 kudos

Resolved! Does column masking work with job clusters?

Hi, we are trying to experiment with the column masking feature. Here is our use case:
  • We have added a masking function to one of the columns of a table
  • The table is part of a notebook with some transformation logic
  • The notebook is executed as part of a w...

Latest Reply
Walter_C
Databricks Employee
  • 1 kudos

Hello, a shared cluster on a job acts the same as an all-purpose cluster, which basically means that the cluster is available to any user with permissions on it. In a job there are not many actions to be done, but when an action you are r...

4 More Replies
MarkD
by New Contributor II
  • 3244 Views
  • 9 replies
  • 0 kudos

SET configuration in SQL DLT pipeline does not work

Hi, I'm trying to set a dynamic value to use in a DLT query, and the code from the example documentation does not work.

SET startDate='2020-01-01';
CREATE OR REFRESH LIVE TABLE filtered AS SELECT * FROM my_table WHERE created_at > ${startDate};

It is g...

Data Engineering
Delta Live Tables
dlt
sql
Latest Reply
smit_tw
New Contributor
  • 0 kudos

@anardinelli Can you please help with a solution? I am having an issue with setting a variable in a Delta Live Tables pipeline and using it with the APPLY CHANGES INTO syntax.

8 More Replies
