Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

theanhdo
by New Contributor III
  • 2943 Views
  • 1 reply
  • 0 kudos

Resolved! Databricks Asset Bundles library dependencies - JAR file

Hi there, I have used Databricks Asset Bundles (DAB) to deploy workflows. For each job, I create a job cluster and install external libraries by specifying libraries in each task, for example: - task_key: my-task  job_cluster_key: my-cluster  note...

Latest Reply
theanhdo
New Contributor III
  • 0 kudos

Thanks very much @Retired_mod for your thorough answer.

thackman
by Databricks Partner
  • 2068 Views
  • 1 reply
  • 0 kudos

Python udfs, Spark Connect, included modules. Compatibility issues with shared compute

Our current system uses Databricks notebooks, and we have some shared notebooks that define Python UDFs. This worked for us until we tried to switch from single-user clusters to shared clusters. Shared clusters and serverless now use Spark C...

Latest Reply
thackman
Databricks Partner
  • 0 kudos

I'm not sure what you mean by "Ensure the Python binary's location is correctly set to resolve runtime issues". We aren't using any binaries. Everything is just Databricks notebooks. In our case, if we define a Python UDF function in the root notebo...

YS1
by Contributor
  • 6379 Views
  • 4 replies
  • 5 kudos

Resolved! SQL Server To Databricks Table Migration

Hello, is there equivalent SQL code for the following PySpark code? I'm trying to copy a table from SQL Server to Databricks and save it as a managed Delta table. jdbcHostname = "your_sql_server_hostname" jdbcPort = 1433 jdbcDatabase = "your_databas...

Latest Reply
jacovangelder
Databricks MVP
  • 5 kudos

The only option for doing this in Databricks SQL is Lakehouse Federation with a SQL Server connection.

3 More Replies
operryman
by New Contributor
  • 1227 Views
  • 0 replies
  • 0 kudos

Performance drop from databricks 12.2 to 14.3 LTS - solved with checkpoint(), looking for root cause

On Databricks 12.2, a piece of code that includes an action takes a minute to run. On Databricks 14.3, the same unchanged code with the same inputs now takes 10 minutes. Attempting to debug using explain() shows the plan is huge (150k-plus rows of output). Replacin...
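The post title mentions checkpoint() as the workaround. In Spark, DataFrame.checkpoint() materializes the data and truncates the logical plan, which is one reason it can help when explain() output grows very large. The toy model below is only a sketch of that idea in plain Python: `PlanNode`, `transform`, and `build` are hypothetical names made up for illustration, not Spark APIs, and it says nothing about the actual root cause of the 12.2 to 14.3 regression.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PlanNode:
    """Hypothetical stand-in for one node in a lazily built query plan."""
    op: str
    parent: Optional["PlanNode"] = None

    def depth(self) -> int:
        # Walk the lineage back to the source and count the nodes.
        node, count = self, 0
        while node is not None:
            count += 1
            node = node.parent
        return count

    def transform(self, op: str) -> "PlanNode":
        # Every transformation wraps the previous plan, so lineage grows.
        return PlanNode(op, self)

    def checkpoint(self) -> "PlanNode":
        # A checkpoint replaces the whole lineage with a single scan node.
        return PlanNode("scan(checkpoint)")


def build(steps: int, checkpoint_every: Optional[int] = None) -> PlanNode:
    """Apply `steps` transformations, optionally checkpointing periodically."""
    plan = PlanNode("scan(source)")
    for i in range(1, steps + 1):
        plan = plan.transform(f"step_{i}")
        if checkpoint_every and i % checkpoint_every == 0:
            plan = plan.checkpoint()
    return plan


without_checkpoint = build(105)
with_checkpoint = build(105, checkpoint_every=10)
print(without_checkpoint.depth(), with_checkpoint.depth())
```

Without checkpoints the plan depth grows with every step; with a checkpoint every 10 steps it never exceeds the number of steps since the last checkpoint plus the scan node.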

VenkateswarluAd
by New Contributor
  • 2581 Views
  • 2 replies
  • 1 kudos

DLT- apply_changes() SCD2 - Rename_columns : __START_AT & __END_AT

Renaming the __START_AT and __END_AT columns created by the dlt.apply_changes() method when performing SCD Type 2 updates.

Latest Reply
Ravivarma
Databricks Employee
  • 1 kudos

Hello @VenkateswarluAd, greetings! The columns __START_AT and __END_AT are used to track the validity period of each record for SCD Type 2 updates. Please be aware that renaming these columns could disrupt the functionality of the SCD Typ...
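To make the "validity period" idea concrete, here is a minimal plain-Python sketch of SCD Type 2 bookkeeping. It is not the DLT implementation: `VersionedRow`, `apply_scd2`, and `value_as_of` are hypothetical names, and the `start_at` / `end_at` fields merely play the role of __START_AT and __END_AT, assuming an integer sequence column and a half-open interval [start_at, end_at).

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class VersionedRow:
    key: str
    value: Any
    start_at: int           # plays the role of __START_AT
    end_at: Optional[int]   # plays the role of __END_AT; None means current


def apply_scd2(changes: List[Tuple[str, Any, int]]) -> List[VersionedRow]:
    """Fold (key, value, sequence) changes into an SCD Type 2 history."""
    history: List[VersionedRow] = []
    current: Dict[str, VersionedRow] = {}
    for key, value, seq in sorted(changes, key=lambda c: c[2]):
        open_row = current.get(key)
        if open_row is not None:
            if open_row.value == value:
                continue           # nothing changed, keep the open version
            open_row.end_at = seq  # close the previous version
        row = VersionedRow(key, value, seq, None)
        history.append(row)
        current[key] = row
    return history


def value_as_of(history: List[VersionedRow], key: str, seq: int) -> Any:
    """Return the value that was valid for `key` at sequence `seq`."""
    for row in history:
        if row.key == key and row.start_at <= seq and (
            row.end_at is None or seq < row.end_at
        ):
            return row.value
    return None


history = apply_scd2([("a", "x", 1), ("a", "y", 5), ("b", "z", 3)])
print(value_as_of(history, "a", 4), value_as_of(history, "a", 5))
```

Point-in-time lookups like `value_as_of` depend on both boundary columns, which is why downstream logic tends to break if they are renamed without updating every consumer.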

1 More Replies
Magesh2798
by New Contributor II
  • 1499 Views
  • 0 replies
  • 1 kudos

Query execution after establishing Databricks to Information Design Tool JDBC Connection

Hello all, I have created a JDBC connection from Databricks to Information Design Tool using an access token generated for a Databricks service principal. But it throws the error below while running a query on top of Databricks data in Information Design Bu...

Laltu_singh
by New Contributor II
  • 3284 Views
  • 3 replies
  • 1 kudos

Accessing Private API in databricks notebook

Hello, I am trying to access an API from a Databricks Python notebook; the API is available only within a restricted network. When I try to access that API, it's not able to find the URL used to access the API and throws an HTTP error (max retries exceeded). d...

Latest Reply
pjv
New Contributor III
  • 1 kudos

Hi! Could you recommend a way to set up a proxy server that can reroute all HTTP traffic according to the above advice? Thank you! Kind regards, Pim

2 More Replies
Nisharunnisa
by New Contributor II
  • 2526 Views
  • 0 replies
  • 1 kudos

Error: cannot create job: 'SERVICE_PRINCIPAL_NAME' cannot be set as run_as_username

Hi Team, I am trying to deploy workflows to a Databricks workspace via Databricks Asset Bundles (DAB) using an Azure service principal. Below is the databricks.yml file which I am using for DAB. I am replacing the "SERVICE_PRINCIPAL_NAME" variable in my Jenk...

yalei
by New Contributor
  • 6861 Views
  • 1 reply
  • 0 kudos

leaflet not works in notebook(R language)

I saw this notebook: htmlwidgets-azure - Databricks (microsoft.com). However, it is not reproducible. I got a lot of errors: "there is no package called ‘R.utils’". This is easy to fix: just install the package "R.utils". "can not be unloaded". This is not ...

Latest Reply
KAdamatzky
New Contributor III
  • 0 kudos

Hi yalei, did you have any luck fixing this issue? I am also trying to replicate the htmlwidgets notebook and am running into the same error. Unfortunately, the suggestions provided by Kaniz_Fatma below did not work.

ksenija
by Contributor
  • 2913 Views
  • 3 replies
  • 1 kudos

Resolved! DLT pipeline - silver table, joining streaming data

Hello! I'm trying to do my modeling in DLT pipelines. For bronze, I created 3 streaming views. When I try to join them to create a silver table, I get an error that I can't join stream and stream without watermarks. I tried adding them but then I got no...

Latest Reply
Ravivarma
Databricks Employee
  • 1 kudos

Hello @ksenija, greetings! Streaming uses watermarks to control the threshold for how long to continue processing updates for a given state entity. Common examples of state entities include: aggregations over a time window; unique keys in a join b...
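The threshold described above can be sketched in a few lines of plain Python. This is a simplified model, not Spark's implementation: `run_with_watermark` is a hypothetical helper, it advances the watermark after every event (Structured Streaming does so per micro-batch), and it assumes the watermark is simply the maximum event time seen so far minus a fixed delay.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def run_with_watermark(
    events: Iterable[Tuple[str, int]], delay: int
) -> Tuple[Dict[str, int], List[Tuple[str, int]]]:
    """Process (key, event_time) pairs with a watermark of max_time - delay.

    Returns the state still held at the end and the late events that were
    dropped because they arrived behind the watermark.
    """
    state: Dict[str, int] = {}
    dropped: List[Tuple[str, int]] = []
    max_event_time: Optional[int] = None

    for key, event_time in events:
        # Events older than the current watermark are too late to process.
        if max_event_time is not None and event_time < max_event_time - delay:
            dropped.append((key, event_time))
            continue

        state[key] = event_time
        if max_event_time is None or event_time > max_event_time:
            max_event_time = event_time

        # Evict state that can no longer be matched or updated.
        watermark = max_event_time - delay
        state = {k: t for k, t in state.items() if t >= watermark}

    return state, dropped


final_state, late = run_with_watermark(
    [("a", 10), ("b", 12), ("c", 30), ("a", 11), ("d", 31)], delay=10
)
print(final_state, late)
```

Without a watermark, the state for keys "a" and "b" would have to be kept indefinitely in case a matching row arrived later, which is why stream-stream joins require one on each side.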

2 More Replies
ShankarM
by Databricks Partner
  • 1128 Views
  • 1 reply
  • 1 kudos

Resolved! Serverless feature audit in data engg.

As recently announced at the summit, notebooks, jobs, and workflows will run in serverless mode. How do we track and debug compute cluster metrics in this case, especially when there are performance issues while running jobs or workflows?

Latest Reply
imsabarinath
New Contributor III
  • 1 kudos

Databricks is planning to enable some system tables to capture some of these metrics, and in my view these can be leveraged as a starting point for troubleshooting.

vkumar
by New Contributor
  • 1153 Views
  • 0 replies
  • 0 kudos

Receiving Null values from Eventhub streaming.

Hi, I am new to PySpark and am facing an issue while consuming data from Azure Event Hubs. I am unable to deserialize the consumed data. I see only null values upon deserializing data using the schema. Please find below the schema, Event Hubs message, ...

Oliver_Angelil
by Valued Contributor II
  • 14453 Views
  • 9 replies
  • 6 kudos

Resolved! Confusion about Data storage: Data Asset within Databricks vs Hive Metastore vs Delta Lake vs Lakehouse vs DBFS vs Unity Catalogue vs Azure Blob

Hi there. It seems there are many different ways to store / manage data in Databricks. This is the Data asset in Databricks: However, data can also be stored (hyperlinks included to relevant pages): in a Lakehouse, in Delta Lake, on Azure Blob storage, in the D...

Screenshot 2023-05-09 at 17.02.04
Latest Reply
Rahul_S
New Contributor II
  • 6 kudos

Informative.

8 More Replies
jwilliam
by Contributor
  • 6427 Views
  • 3 replies
  • 6 kudos

Resolved! Has Unity Catalog been available in Azure Gov Cloud?

We are using the Databricks Premium tier in Azure Gov Cloud. We checked the Data section but don't see any option to create a metastore.

Latest Reply
User16672493709
Databricks Employee
  • 6 kudos

Azure.gov does not have Unity Catalog (as of July 2024). I think previous responses missed the context of government cloud in OP's question. UC has been open sourced since this question was asked, and is a more comprehensive solution in commercial cl...

2 More Replies