Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

Ullsokk
by New Contributor III
  • 1867 Views
  • 2 replies
  • 0 kudos

What is a good way to implement unit tests using GitHub Actions for Databricks?

I am trying to use a Git template for unit tests on a Databricks project. The framework uses pylint, pytest, and black to check the code, but I am having a lot of trouble getting the GitHub Actions VM to run the code without issues. I have had issues ...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Stian Arntsen, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so...
1 More Replies
najmead
by Contributor
  • 3285 Views
  • 2 replies
  • 0 kudos

Creating an external table reference vs creating a view

In a practical sense, what is the difference between creating an external table:
create table my_catalog.my_schema.my_favourite_table location 'abfss://path/to/my/data'
versus creating a view that references the same dataset:
create view my_catalog.my_sc...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Nicholas Mead, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedbac...
1 More Replies
param3sh
by New Contributor
  • 1500 Views
  • 3 replies
  • 0 kudos

Performance between managed and unmanaged tables

I am using Databricks in Azure. I want to mount ADLS Gen2 on Databricks and create unmanaged (external) tables on the mount point. But before that, I want to know which will give the best performance: a managed table (stores data in DBFS root) or Un-ma...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Paramesh Malla, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedba...
2 More Replies
suman_gypsy
by New Contributor
  • 947 Views
  • 2 replies
  • 0 kudos

GitHub to ADB workspace linkage

I have a workspace in my ADB, and in that workspace I have a folder that contains lots of notebooks. For backup purposes, I want to copy all the notebooks to my GitHub repo. How can I do that in one shot?

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @suman mukherjee, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedb...
1 More Replies
alexlod
by New Contributor III
  • 5839 Views
  • 2 replies
  • 3 kudos

Getting error "User is not an owner of Account" when creating a storage credential in Azure Databricks

I'm using Azure Databricks. I've followed this guide to create an Azure Storage Account and an Access Connector for Azure Databricks. I've given the `Storage Blob Data Contributor` role to the Access Connector in the Storage Account. When I go to the ...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Alex Loddengaard, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us...
1 More Replies
RLH
by New Contributor
  • 1872 Views
  • 2 replies
  • 0 kudos

Delta Live Table Merge/Upserts

Hello, I am trying to create a basic DLT pipeline which does an incremental load. The first time, it runs perfectly without any issues. However, when there are records to be updated, the pipeline fails with the following error: "Flow silver has FAILED fatall...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Ram LH, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedback will ...
1 More Replies
Leodatabricks
by Contributor
  • 6195 Views
  • 14 replies
  • 23 kudos

How to secure all clusters and then start running the code

When there are slow nodes, sometimes a job needs to resize its number of clusters to reach the required number of nodes. Is there any way to make sure no code is running before all nodes are secured? Thank you!

Latest Reply
Anonymous
Not applicable
  • 23 kudos

Hi @Leo Bao, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we ca...
13 More Replies
Upendra_Kumar
by New Contributor
  • 1436 Views
  • 3 replies
  • 0 kudos

Not able to perform update in delta table in databricks using 3 tables

Hi, I am able to perform a merge from 2 tables, but I have a requirement to update a table based on 3 tables, like the following query:
update a set a.name=b.name
from table1 a
inner join table2 b on a.id=b.id
inner join table3 c on a.id=c.id
Thanks in advance.

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @upendra kumar sharma, help us build a vibrant and resourceful community by recognizing and highlighting insightful contributions. Mark the best answers and show your appreciation! Thanks and regards
2 More Replies
pc
by New Contributor II
  • 2491 Views
  • 4 replies
  • 0 kudos

Error in SQL statement: AnalysisException: The query operator `UpdateCommandEdge` contains one or more unsupported expression types Aggregate, Window or Generate.

com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.AnalysisException: The query operator `UpdateCommandEdge` contains one or more unsupported expression types Aggregate, Window or Generate. Invalid expres...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Pradeep Chauhan, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answe...
3 More Replies
pradeepgadkari
by New Contributor II
  • 2087 Views
  • 4 replies
  • 3 kudos

Is the Model tab not available in Databricks Community cloud edition?

I am currently doing the Scalable Machine Learning Course, and I observed that the menu options available in the videos are a bit different from what I have on my edition. Is it because I am using the Community Edition, or is there some setting to make ...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Pradeep Gadkari, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answe...
3 More Replies
SS0201
by New Contributor II
  • 3276 Views
  • 4 replies
  • 0 kudos

Slow updates/upserts in Delta tables

When using Delta tables with DBR jobs or even with DLT pipelines, the upserts (especially updates, on key and timestamp) are taking much longer than expected to update the files/tables data (~2 mins for even 1 record poll). (Inserts are lightni...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Surya Agarwal, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so...
3 More Replies
Ajay-Pandey
by Esteemed Contributor III
  • 1851 Views
  • 3 replies
  • 6 kudos

Databricks Web Terminal

You can now access the web terminal directly from the View menu, which is very handy while working on the terminal.

Latest Reply
Anonymous
Not applicable
  • 6 kudos

Hi @Ajay Pandey, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so w...
2 More Replies
Sagacious
by New Contributor II
  • 13885 Views
  • 5 replies
  • 0 kudos

How to upload large files to Databricks, and how to unzip files successfully?

I have two JSON files, one ~3 GB and one ~5 GB. I am unable to upload them to Databricks Community Edition as they exceed the maximum allowed upload file size (~2 GB). If I zip them I am able to upload them, but I am also having issues figuring out ...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Sage Olson, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we...
4 More Replies
Data_Analytics1
by Contributor III
  • 23908 Views
  • 9 replies
  • 10 kudos

Failure starting repl. How to resolve this error? I got this error in a job which is running.

Failure starting repl. Try detaching and re-attaching the notebook.
java.lang.Exception: Python repl did not start in 30 seconds.
    at com.databricks.backend.daemon.driver.IpykernelUtils$.startIpyKernel(JupyterDriverLocal.scala:1442)
    at com.databricks.b...

Latest Reply
Anonymous
Not applicable
  • 10 kudos

Hi @Mahesh Chahare, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answer...
8 More Replies
DataBricks_Use1
by New Contributor
  • 1612 Views
  • 2 replies
  • 0 kudos

DLT live Table-Incremental Refresh

Hi All, in our ETL framework, we have four layers: Raw, Foundation, Trusted, and Unified. In Raw, we are copying the file in JSON format from a source using an ADF pipeline. In the next layer (i.e. Foundation), we are flattening the JSON files and converting t...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @DataBricks_User9 c, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best an...
1 More Replies
