Data Engineering
Forum Posts

Immassive
by New Contributor II
  • 794 Views
  • 1 replies
  • 0 kudos

Reading information_schema tables through JDBC connection

Hi, I am using Unity Catalog as storage for data. I have an external system that establishes a connection to Unity Catalog via JDBC using the Databricks driver: Configure the Databricks ODBC and JDBC drivers - Azure Databricks | Microsoft L...

Latest Reply
Immassive
New Contributor II
  • 0 kudos

Note: I can see the tables of the system.information_schema in the Databricks UI and read them there.

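A minimal sketch of querying `information_schema` from an external system. The hostname, HTTP path, and token shown in the comments are placeholders, and the `databricks-sql-connector` package is one option for issuing the query; the query itself matches what the Databricks UI reads.

```python
# Sketch: reading Unity Catalog information_schema from outside Databricks.
# Connection details below are placeholders, not real values.
from typing import Optional


def information_schema_query(catalog: str, schema_filter: Optional[str] = None) -> str:
    """Build a query against a catalog's information_schema.tables view."""
    query = (
        "SELECT table_catalog, table_schema, table_name "
        f"FROM {catalog}.information_schema.tables"
    )
    if schema_filter:
        query += f" WHERE table_schema = '{schema_filter}'"
    return query


query = information_schema_query("system", "information_schema")
print(query)

# With the databricks-sql-connector package (pip install databricks-sql-connector),
# the same query can be issued from an external system:
#
# from databricks import sql
# with sql.connect(server_hostname="<workspace-host>",   # placeholder
#                  http_path="<warehouse-http-path>",    # placeholder
#                  access_token="<token>") as conn:      # placeholder
#     with conn.cursor() as cur:
#         cur.execute(query)
#         rows = cur.fetchall()
```

If the JDBC driver can read regular tables but not `information_schema`, checking that the connected principal has `USE CATALOG`/`USE SCHEMA` privileges is a reasonable first step.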
alj_a
by New Contributor III
  • 563 Views
  • 2 replies
  • 0 kudos

source db and target db in DLT

Hi, thanks in advance. I am new to DLT. The scenario: I need to read data from cloud storage (ADLS) and load it into my bronze table, then read from the bronze table, do some DQ checks, and load the cleaned data into my silver table. Finally, populat...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @alj_a, In Delta Live Tables (DLT), the database or schema name is specified in the table name itself. You can specify the database name in the @dlt.table decorator by using the format database_name.table_name.

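A sketch of the reply's suggestion. The catalog/schema/table names and the ADLS path are hypothetical, and `dlt` is only importable inside a Databricks DLT pipeline, so the import is guarded here:

```python
# Sketch: a DLT table that writes to an explicit schema via a qualified name,
# as the reply suggests. All names and paths below are hypothetical.
BRONZE_TABLE = "my_db.bronze_events"

try:
    import dlt  # available only inside a Databricks DLT pipeline

    @dlt.table(name=BRONZE_TABLE, comment="Raw events loaded from ADLS")
    def bronze_events():
        # `spark` is provided by the pipeline runtime
        return (
            spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("abfss://container@account.dfs.core.windows.net/events/")  # hypothetical path
        )
except ImportError:
    pass  # running outside a DLT pipeline
```

Note that newer DLT pipelines can also set a default catalog and target schema at the pipeline level, in which case an unqualified `name` lands in that schema.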
1 More Replies
marianopenn
by New Contributor III
  • 1405 Views
  • 2 replies
  • 1 kudos

Databricks VSCode Extension Sync Timeout

I am using the Databricks VSCode extension to sync my local repository to Databricks Workspaces. I have everything configured such that smaller syncs work fine, but a full sync of my repository leads to the following error: Sync Error: Post "https://<...

Data Engineering
dbx sync
Repos
VSCode
Workspaces
Latest Reply
kimongrigorakis
New Contributor II
  • 1 kudos

Same issue here..... Can someone please help??

1 More Replies
278875
by New Contributor
  • 3412 Views
  • 4 replies
  • 1 kudos

How do I figure out the cost breakdown for Databricks

I'm trying to figure out the cost breakdown of Databricks usage for my team. When I go into the Databricks administration console, click Usage, and select to show the usage by SKU, it just displays the type of cluster but not the name of it. ...

Latest Reply
MuthuLakshmi
New Contributor III
  • 1 kudos

Please check the docs below for usage-related information. Billable usage logs: https://docs.databricks.com/en/administration-guide/account-settings/usage.html You can filter them using tags for the more precise information you are looking for...

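Beyond the downloadable usage logs, workspaces with system tables enabled can query billing data directly. A sketch, assuming system tables are enabled; the tag key `team` is a hypothetical example of a cluster tag you might filter on:

```python
# Sketch: aggregating DBU usage by SKU and a custom cluster tag from the
# system billing table. The tag key "team" is hypothetical.

def usage_by_sku_query(tag_key: str) -> str:
    return (
        f"SELECT sku_name, custom_tags['{tag_key}'] AS tag, "
        "SUM(usage_quantity) AS dbus "
        "FROM system.billing.usage "
        "GROUP BY sku_name, tag"
    )


query = usage_by_sku_query("team")
# spark.sql(query).display()  # run inside a Databricks notebook
```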
3 More Replies
dave_d
by New Contributor II
  • 1809 Views
  • 3 replies
  • 1 kudos

What is the "Columnar To Row" node in this simple Databricks SQL query profile?

I am running a relatively simple SQL query that writes back to a table on a Databricks serverless SQL warehouse, and I'm trying to understand why there is a "Columnar To Row" node in the query profile that is consuming the vast majority of the time s...

Latest Reply
Annapurna_Hiriy
New Contributor III
  • 1 kudos

@dave_d We do not have a document listing the operations that bring up the ColumnarToRow node. This node provides a common executor to translate an RDD of ColumnarBatch into an RDD of InternalRow. It is inserted whenever such a transition is de...

2 More Replies
erigaud
by Honored Contributor
  • 1620 Views
  • 3 replies
  • 0 kudos

Merge DLT with Delta Table

Is there any way to accomplish this? I have an existing Delta Table and a separate Delta Live Tables pipeline, and I would like to merge data from a DLT into my existing Delta Table. Is this doable or completely impossible?

Latest Reply
LeifBruen
New Contributor II
  • 0 kudos

Merging data from a Delta Live Tables (DLT) pipeline's output into an existing Delta Table is possible with careful planning: run a downstream batch job that reads the DLT output table and merges it into the target, ensuring schema compatibility.

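One way to sketch this: since a DLT pipeline materializes its output as a regular table, a separate job can `MERGE` that table into the pre-existing Delta table. All table and column names below are hypothetical:

```python
# Sketch: MERGE the DLT pipeline's output table into an existing Delta table.
# Table and column names are hypothetical.

def build_merge_sql(target: str, source: str, key: str) -> str:
    return (
        f"MERGE INTO {target} AS t "
        f"USING {source} AS s "
        f"ON t.{key} = s.{key} "
        "WHEN MATCHED THEN UPDATE SET * "
        "WHEN NOT MATCHED THEN INSERT *"
    )


merge_sql = build_merge_sql("main.sales.customers", "main.sales.dlt_customers", "customer_id")
# spark.sql(merge_sql)  # run as a separate job after the DLT pipeline completes
```

Scheduling this merge as a task that depends on the DLT pipeline task in a Databricks Job keeps the two steps ordered.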
2 More Replies
NotARobot
by New Contributor III
  • 766 Views
  • 0 replies
  • 1 kudos

Force DBR/Spark Version in Delta Live Tables Cluster Policy

Is there a way to use Compute Policies to force Delta Live Tables to use specific Databricks Runtime and PySpark versions? While trying to leverage some of the functions in PySpark 3.5.0, I don't seem to be able to get Delta Live Tables to use Databr...

Data Engineering
Compute Policies
Delta Live Tables
Graphframes
pyspark
Akshith_Rajesh
by New Contributor III
  • 7440 Views
  • 4 replies
  • 5 kudos

Resolved! Call a Stored Procedure in Azure Synapse with input and output Params

driver_manager = spark._sc._gateway.jvm.java.sql.DriverManager
connection = driver_manager.getConnection(mssql_url, mssql_user, mssql_pass)
connection.prepareCall("EXEC sys.sp_tables").execute()
connection.close()
The above code works fine but however...

Latest Reply
sivaram_bh
New Contributor II
  • 5 kudos

statement = "EXEC procedurename  imputparametre , ? "
driver_manager = spark._sc._gateway.jvm.java.sql.DriverManager
con = driver_manager.getConnection(jdbcUrl, username, pwd)
exec_statement = con.prepareCall(statement)
exec_statement.registerOutParameter(...

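Expanding the reply above into a fuller sketch of the same JVM-bridge technique. The procedure name, parameter positions, and types here are hypothetical; the `jdbc_url`, `user`, and `pwd` values must come from your own Synapse connection:

```python
# Sketch: calling a Synapse stored procedure with one input and one output
# parameter through Spark's JVM JDBC bridge. Names are hypothetical.

CALL_SQL = "{call dbo.my_proc(?, ?)}"  # 1st ?: input param, 2nd ?: output param


def call_proc(spark, jdbc_url, user, pwd, input_value):
    jvm = spark._sc._gateway.jvm
    conn = jvm.java.sql.DriverManager.getConnection(jdbc_url, user, pwd)
    try:
        stmt = conn.prepareCall(CALL_SQL)
        stmt.setString(1, input_value)
        # Register the output parameter's SQL type before executing
        stmt.registerOutParameter(2, jvm.java.sql.Types.INTEGER)
        stmt.execute()
        return stmt.getInt(2)
    finally:
        conn.close()
```

The `{call ...}` escape syntax is standard JDBC for callable statements; `registerOutParameter` must be invoked for every output parameter before `execute()`.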
3 More Replies
databicky
by Contributor II
  • 5541 Views
  • 4 replies
  • 0 kudos

BLOCK_OFFSET_INSIDE_BLOCK ROW_OFFSET_INSIDE_BLOCK is not working

The BLOCK_OFFSET_INSIDE_BLOCK and ROW_OFFSET_INSIDE_BLOCK commands are not working in Spark; they run in Hive, but in Spark they fail with an "invalid column" error.

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @databicky, Could you paste your code stack here instead of the screenshot?

3 More Replies
databicky
by Contributor II
  • 2706 Views
  • 5 replies
  • 1 kudos

Resolved! No handler for udf/udaf/udtf for function

I created a function using a JAR file present in the cluster location, but when executing the Hive query it shows the error "no handler for udf/udaf/udtf". These queries run fine on HDInsight clusters, but when running on Databricks...

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @databicky ,  The error message "No handler for UDF/UDAF/UDTF" typically occurs when Spark cannot locate the UDF/UDAF/UDTF you registered. This can happen if the JAR file containing the UDF/UDAF/UDTF is not correctly loaded into Spark or the func...

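A sketch of registering the UDF so Spark can find a handler for it. The class name and JAR path are hypothetical; the key point is the `CREATE FUNCTION ... USING JAR` form, which makes Spark load the JAR and bind the handler class:

```python
# Sketch: registering a Hive UDF from a JAR so Spark can resolve a handler.
# Class name and JAR path below are hypothetical.

CREATE_FN_SQL = (
    "CREATE OR REPLACE FUNCTION my_upper "
    "AS 'com.example.udf.MyUpper' "
    "USING JAR 'dbfs:/FileStore/jars/my_udfs.jar'"
)
# spark.sql(CREATE_FN_SQL)
# spark.sql("SELECT my_upper('abc')").display()
```

If the JAR is only installed as a cluster library but the function was never registered with `USING JAR`, Spark can fail to locate the handler even though the class is on the classpath.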
4 More Replies
BST
by New Contributor
  • 584 Views
  • 1 replies
  • 0 kudos

Resolved! Spark - Cluster Mode - Driver

When running a Spark Job in Cluster Mode, how does Spark decide which worker node to place the driver resources ? 

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @BST, When running a Spark job in cluster mode, it involves a central manager (e.g., YARN, Mesos, Kubernetes), a driver program, and worker nodes. The driver program is submitted to the central manager, which allocates resources and decides where ...

anirudh_a
by New Contributor II
  • 6016 Views
  • 8 replies
  • 3 kudos

Resolved! 'No file or Directory' error when using pandas.read_excel in Databricks

I am baffled by the behaviour of Databricks: below you can see the contents of the directory using dbutils in Databricks. It shows the `test.xlsx` file clearly in the directory (and I can even open it using `dbutils.fs.head`). But when I go to use panda.re...

Data Engineering
dbfs
panda
spark
spark config
Latest Reply
DamnKush
New Contributor II
  • 3 kudos

Hey, I encountered it recently. I can see you are using a shared cluster; try switching to a single-user cluster and it will fix it. Can someone let me know why it wasn't working with a shared cluster? Thanks.

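A common cause behind this symptom: `dbutils` understands `dbfs:/` URIs, but pandas opens files through the local `/dbfs` FUSE mount, and that mount is restricted on some shared-access-mode clusters. A small sketch of the path translation; the file path is hypothetical:

```python
# Sketch: pandas needs the local FUSE path (/dbfs/...), not the dbfs:/ URI
# that dbutils displays. The example path is hypothetical.

def to_fuse_path(dbfs_path: str) -> str:
    """Translate a dbfs:/ URI into the /dbfs mount path pandas can open."""
    if dbfs_path.startswith("dbfs:/"):
        return "/dbfs/" + dbfs_path[len("dbfs:/"):]
    return dbfs_path


path = to_fuse_path("dbfs:/FileStore/test.xlsx")
print(path)  # /dbfs/FileStore/test.xlsx
# import pandas as pd
# df = pd.read_excel(path)  # FUSE access may be restricted on shared clusters
```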
7 More Replies
priyanananthram
by New Contributor II
  • 4868 Views
  • 4 replies
  • 1 kudos

Delta live tables for large number of tables

Hi there, I am hoping for some guidance. I have some 850 tables that I need to ingest using a DLT pipeline. When I do this, my event log shows that the driver node becomes unresponsive, likely due to GC. Can DLT be used to ingest a large number of tables? I...

Latest Reply
Sidhant07
New Contributor III
  • 1 kudos

Delta Live Tables (DLT) can indeed be used to ingest a large number of tables. However, if you're experiencing issues with the driver node becoming unresponsive due to garbage collection (GC), it might be a sign that the resources allocated to the dr...

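When defining hundreds of tables, generating the table definitions in a loop (rather than hand-writing 850 functions) keeps the pipeline manageable; the driver pressure itself is usually addressed by a larger driver or by splitting tables across several pipelines. A sketch of the loop pattern, with hypothetical source names, written so the DLT-specific part is injected:

```python
# Sketch: generating many DLT table definitions in a loop. Source database
# and table names are hypothetical; `dlt` and `spark` come from the pipeline.

TABLES = [f"source_table_{i}" for i in range(3)]  # in practice, ~850 names


def make_definitions(dlt, spark, tables):
    for t in tables:
        # Default argument binds the current name into each closure
        def define(table_name=t):
            return spark.read.table(f"source_db.{table_name}")

        # Equivalent to decorating `define` with @dlt.table(name=...)
        dlt.table(name=f"bronze_{t}")(define)


# Inside a DLT pipeline notebook:
# import dlt
# make_definitions(dlt, spark, TABLES)
```

The `table_name=t` default argument is needed so each generated function captures its own table name instead of the loop's final value.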
3 More Replies
Databricks143
by New Contributor III
  • 745 Views
  • 1 replies
  • 0 kudos

Failure to initialize configuration

Hi team, when reading a CSV file from Azure Blob using Databricks we do not get any key error and are able to read the data from the blob. But if we try to read an XML file, it fails with a key issue, invalid configuration. Error: Failure to inti...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Databricks143, Please check this link here. Please LMK if that helps.

Graham
by New Contributor III
  • 3519 Views
  • 5 replies
  • 2 kudos

"MERGE" always slower than "CREATE OR REPLACE"

Overview: To update our Data Warehouse tables, we have tried two methods: "CREATE OR REPLACE" and "MERGE". With every query we've tried, "MERGE" is slower. My question is this: has anyone successfully gotten a "MERGE" to perform faster than a "CREATE OR...

Latest Reply
Manisha_Jena
New Contributor III
  • 2 kudos

Hi @Graham, can you please try Low Shuffle Merge (LSM) and see if it helps? LSM is a new MERGE algorithm that aims to maintain the existing data organization (including z-order clustering) for unmodified data, while simultaneously improving performan...

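For reference, on runtimes where Low Shuffle Merge is not already the default it is toggled by a Spark conf; the flag name below is taken from Databricks documentation, but verify it against your DBR version:

```python
# Sketch: enabling Low Shuffle Merge via Spark conf (already the default on
# newer DBR versions; flag name per Databricks docs, verify for your runtime).

LSM_CONF = {"spark.databricks.delta.merge.enableLowShuffle": "true"}
# for key, value in LSM_CONF.items():
#     spark.conf.set(key, value)
```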
4 More Replies