Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

dwiltse12
by New Contributor II
  • 8282 Views
  • 2 replies
  • 1 kudos

Tableau Delta Sharing

Does anyone have any recent examples of using Tableau and Delta Sharing? The video below mentions using the web data connector, but this connector has been deprecated in Tableau 2023.1. https://www.youtube.com/watch?v=Yg-5LXH9K1I&t=913s https://help.tableau.co...

Latest Reply
JohnMT
New Contributor II
  • 1 kudos

Hi, I am still trying to figure out how to use Delta Sharing with Tableau. I've been looking for information for a month without any success. As mentioned before, the web data connector is deprecated. Any help would be appreciated. Thanks, Johnattan

1 More Replies
ElaPG
by New Contributor III
  • 4469 Views
  • 1 replies
  • 1 kudos

notebooks naming convention

I have read the information about object names, but are there any best practices for notebook naming conventions?

Latest Reply
Rajeev45
Databricks Employee
  • 1 kudos

It is recommended to name notebooks descriptively so that it is easy to understand their purpose and content. A good practice is to follow a consistent naming convention to help keep notebooks organized. These are some general practices: Use naming...

AG2
by New Contributor III
  • 1469 Views
  • 1 replies
  • 0 kudos

Orchestration

Is it possible to use Redwood orchestration over Databricks?

Latest Reply
Miguel_Suarez
Databricks Employee
  • 0 kudos

Hi @AG2, we don't currently support Redwood Orchestration over Databricks. Best, Miguel

Hubert-Dudek
by Esteemed Contributor III
  • 4837 Views
  • 1 replies
  • 0 kudos

dlt append_flow = multiple streams into a single Delta table

With the append_flow method in Delta Live Tables, you can effortlessly combine data from multiple streams into a single Delta table.

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Thank you for sharing this information @Hubert-Dudek 

Hubert-Dudek
by Esteemed Contributor III
  • 9097 Views
  • 1 replies
  • 3 kudos

row-level concurrency

Databricks Runtime 14.2 now has row-level concurrency generally available and enabled by default for Delta tables with deletion vectors. This feature dramatically reduces conflicts between concurrent write operations.

Latest Reply
jose_gonzalez
Databricks Employee
  • 3 kudos

Thank you for sharing this @Hubert-Dudek !!!

grazie
by Contributor
  • 1167 Views
  • 0 replies
  • 0 kudos

Run a job as different service principals

We currently have several workflows that are basically copies with the only difference being that they run with different service principals and so have different permissions and configuration based on who is running. The way this is managed today is...

reshmir18
by New Contributor II
  • 1427 Views
  • 1 replies
  • 0 kudos

Unable to setCheckpointDir in Unity Catalog enabled workspace

I have a Unity Catalog enabled workspace where I am trying to call setCheckpointDir at runtime. The method appears to authenticate using fs.azure.account.key instead of storage credentials. I am using a Databricks access connector which has "Storage Blob ...

Data Engineering
autoloader
Databricks
storagecredentials
streaming
unitycatalog
Latest Reply
reshmir18
New Contributor II
  • 0 kudos

@Retired_mod I have provided all the necessary permissions and was able to browse through the folders of the container added as an external location. I don't understand why the setCheckpointDir method looks for an account key when the access is already ...

Anup
by New Contributor III
  • 8257 Views
  • 1 replies
  • 1 kudos

Resolved! Copy Into : Pattern for sub-folders

While trying to ingest data from an S3 bucket, we are running into a situation where the data is in sub-folders of multiple depths. Is there a good way of specifying patterns for this case? We tried using the following for a depth o...

MinMin
by New Contributor II
  • 2734 Views
  • 3 replies
  • 0 kudos

Extra underscore behind ".xlsm" and ".xlsx" after exporting excel files from Databricks

Hi all, I tried to export several Excel files from Databricks, but there is always one extra underscore after ".xlsm" and ".xlsx" when I export them and try to open the files on my local system. I have to manually remove the underscore from the fil...

Latest Reply
DH_Fable
New Contributor II
  • 0 kudos

Hi, did you find a solution to this? I have the same or a similar problem: when I save a dataframe from a Databricks notebook using to_excel(), the file is saved with the extension ".xlsx_" rather than ".xlsx", meaning to open I have to manually download and ...

2 More Replies
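
Until the cause of the extra underscore in the thread above is identified, one possible stopgap is to batch-rename the files after download instead of fixing each one by hand. The sketch below is only that: it assumes the files are already in a local folder and that the sole problem is a single trailing "_" after ".xlsx" or ".xlsm". The helper name strip_trailing_underscore and the folder argument are hypothetical, and the code does not explain or prevent the underscore being added during export.

```python
from pathlib import Path

# Suffixes reported in the thread; the trailing underscore is the part to drop.
EXCEL_SUFFIXES = (".xlsx_", ".xlsm_")


def strip_trailing_underscore(folder):
    """Rename files such as 'report.xlsx_' to 'report.xlsx'.

    Hypothetical helper: returns the list of new paths. Files whose
    corrected name already exists are skipped rather than overwritten.
    """
    renamed = []
    # sorted() materializes the listing before any file is renamed
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.name.endswith(EXCEL_SUFFIXES):
            target = path.with_name(path.name[:-1])  # drop the final "_"
            if target.exists():
                continue
            path.rename(target)
            renamed.append(target)
    return renamed
```

Calling strip_trailing_underscore on the download folder leaves other files untouched and never overwrites an existing workbook.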
Kira
by New Contributor
  • 827 Views
  • 0 replies
  • 0 kudos

FeatureStoreClient speed up create_training_set

I am trying to create a training set with 10 Feature Lookups (about 1200 features total).

    # all args for create_training_set
    df = fs.create_training_set(args).load_df()

I must store this data in a Delta table for further analysis. Writing this returned da...

Data Engineering
Feature Store
MachineLearning
williamwjs
by New Contributor II
  • 8076 Views
  • 2 replies
  • 1 kudos

Issue with Could not initialize class $linec4a1686037264c21b0e58b369fab8f2d59.$read$

Our job is written in Scala on Databricks. It used to have the same problem, but we managed to get it working by putting all case classes in a separate cell. However, it has lately started to fail again with the same error: Could not initialize class $linec4a1...

Latest Reply
williamwjs
New Contributor II
  • 1 kudos

Hi @Retired_mod, may I ask if there are any updates on this issue? Thank you!

1 More Replies
fijoy
by Contributor
  • 14181 Views
  • 6 replies
  • 11 kudos

How to remove widgets from a notebook dashboard?

I'm creating a dashboard from the output of a notebook cell, but I've noticed that the dashboard displays the notebook's widgets in addition to the cell output. How can I remove the widgets from the dashboard?

Latest Reply
Nico2
New Contributor II
  • 11 kudos

Did you find any solution for this? I am facing a similar issue: I want to create multiple dashboards from a single notebook where not all widgets are relevant for both dashboards. This makes it difficult for users to understand the dashboard.

5 More Replies
Dp15
by Contributor
  • 8545 Views
  • 1 replies
  • 1 kudos

Schema Deletion - Structured Streaming

Hi, I have a structured stream which reads data from my silver layer and creates a gold layer using foreachBatch. The stream has been working fine, but now I have a change where there are deletions to the schema and some of the columns from the silver l...

Latest Reply
Dp15
Contributor
  • 1 kudos

@Retired_mod Thank you so much for a detailed explanation 

Phani1
by Valued Contributor II
  • 12287 Views
  • 2 replies
  • 2 kudos

encryption

Hi Databricks, could you please guide me on the scenario below? Here is the use case we are trying to solve for: currently the environment is using “Voltage” as an encryption tool for encrypting the data in S3 in conjunction with business-provided ...

Latest Reply
AliaCollier
New Contributor II
  • 2 kudos

To replace "Voltage" with Databricks encryption, follow these steps: set up a Customer Managed Key in AWS, configure the S3 bucket, read data in Databricks, and implement custom UDFs for AES encryption/decryption.

1 More Replies
TiagoMag
by New Contributor III
  • 9091 Views
  • 1 replies
  • 2 kudos

Resolved! DLT pipeline evolution schema error

Hello everyone, I am currently working on my first DLT pipeline, and I stumbled on a problem which I am struggling to solve. I am working on several tables where I have a column called "my_column" with an array of JSON with two keys: 1 key: score, 2n...

