cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

HariharaSam
by Contributor
  • 26574 Views
  • 8 replies
  • 4 kudos

Resolved! To get Number of rows inserted after performing an Insert operation into a table

Consider we have two tables A & B.qry = """INSERT INTO Table ASelect * from Table B where Id is null """spark.sql(qry)I need to get the number of records inserted after running this in databricks.

  • 26574 Views
  • 8 replies
  • 4 kudos
Latest Reply
GRCL
New Contributor III
  • 4 kudos

Almost same advice than Hubert, I use the history of the delta table :df_history.select(F.col('operationMetrics')).collect()[0].operationMetrics['numOutputRows']You can find also other 'operationMetrics' values, like 'numTargetRowsDeleted'.

  • 4 kudos
7 More Replies
Chilangdon
by New Contributor
  • 7708 Views
  • 3 replies
  • 2 kudos

How to connect to a delta table that lives in a blob storage to display in a web app?

Hi, somebody to help me how to connect a delta table with a web app? I search to a delta-rs library but I can't obtain to make the connection.

  • 7708 Views
  • 3 replies
  • 2 kudos
Latest Reply
etsyal1e2r3
Honored Contributor
  • 2 kudos

Without downloading the files directly every time, you have to create a sql warehouse cluster and connect to it via jdbc connection. This way you just use the requests library in python (or an equal one in another language like axios for javascript) ...

  • 2 kudos
2 More Replies
GS2312
by New Contributor II
  • 5402 Views
  • 6 replies
  • 5 kudos

KeyProviderException when trying to create external table on databricks

Hi There,I have been trying to create an external table on Azure Databricks with below statement.df.write.partitionBy("year", "month", "day").format('org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat').option("path",sourcepath).mod...

  • 5402 Views
  • 6 replies
  • 5 kudos
Latest Reply
Anonymous
Not applicable
  • 5 kudos

Hi @Gaurishankar Sakhare​ Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best ...

  • 5 kudos
5 More Replies
PK225
by New Contributor III
  • 3038 Views
  • 4 replies
  • 4 kudos
  • 3038 Views
  • 4 replies
  • 4 kudos
Latest Reply
-werners-
Esteemed Contributor III
  • 4 kudos

If you mean a stream-static join, yes that is possible:https://learn.microsoft.com/en-us/azure/databricks/delta-live-tables/transform#--stream-static-joinsIf not, what exactly do you mean?

  • 4 kudos
3 More Replies
qwerty1
by Contributor
  • 810 Views
  • 0 replies
  • 0 kudos

Why am I not able to view all table properties?

We have a live streaming table created using the commandCREATE OR REFRESH STREAMING LIVE TABLE foo TBLPROPERTIES ( "pipelines.autoOptimize.zOrderCols" = "c1,, c2, c3, c4", "delta.randomizeFilePrefixes" = "true" );But when I run the show table propert...

  • 810 Views
  • 0 replies
  • 0 kudos
avivshafir
by New Contributor II
  • 1003 Views
  • 0 replies
  • 1 kudos

SQL endpoint querying table changes using a greater timestamp or version than the last table commit

Reading table changes using a greater timestamp or version than the last table commit throws an error and can be changed using a flag timestampOutOfRange.enabled,My issue is that I use an SQL endpoint and I don't see any way of providing this spark f...

  • 1003 Views
  • 0 replies
  • 1 kudos
drewtoby
by New Contributor II
  • 9612 Views
  • 2 replies
  • 1 kudos

Resolved! How to Pull Cached SQL Table into Python Dictionary?

Hello,I have been working on this issue as a proof of concept - it would be extremely helpful to iterate through tables via loops in a few scenarios. I have a simple three column dimension that I added to a cached table.cache lazy table hedis_cache s...

Method 1 Method 2
  • 9612 Views
  • 2 replies
  • 1 kudos
Latest Reply
drewtoby
New Contributor II
  • 1 kudos

Got it to work, thank you for the tip! I needed to convert the dataframe over to a pandas dataframehttps://www.geeksforgeeks.org/convert-pyspark-dataframe-to-dictionary-in-python/

  • 1 kudos
1 More Replies
moski
by New Contributor II
  • 1941 Views
  • 3 replies
  • 1 kudos

How to import a data table from SQLQuery2 into Databricks notebook

Can anyone show me a few commands to import a table, say "mytable2 From: Microsoft SQL Server Into: Databricks Notebook using spark dataframe or at least pandas dataframeCheers!

  • 1941 Views
  • 3 replies
  • 1 kudos
Latest Reply
irfanaziz
Contributor II
  • 1 kudos

You can read any table from MSSQL. You would need to authenticate to the db, so your would need the connection string:def dbProps(): return { "user" : "db-user", "password" : "your password", "driver" : "com.microsoft.sqlserver.jdbc.SQLServerD...

  • 1 kudos
2 More Replies
Himanshu_90
by New Contributor III
  • 6219 Views
  • 8 replies
  • 7 kudos

Databricks sql not able to evaluate expression current_user

Hi,I have a table as below:create table default.test_user(ID bigint NOT NULL GENERATED BY DEFAULT AS IDENTITY (START WITH 1 INCREMENT BY 1),usr1 varchar(255) NOT NULL,ts1 timestamp NOT NULL,usr2 varchar(255) NOT NULL,ts2 timestamp NOT NULL) USING Del...

  • 6219 Views
  • 8 replies
  • 7 kudos
Latest Reply
Anonymous
Not applicable
  • 7 kudos

Hi @Himanshu Agrawal​ Hope everything is going great.Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us...

  • 7 kudos
7 More Replies
rohit8491
by New Contributor III
  • 5347 Views
  • 3 replies
  • 8 kudos

Azure Databricks Connectivity with Power BI Cloud - Firewall Whitelisting

Hi Support TeamWe want to connect to tables in Azure Databricks via Power BI. We are able to connect this via Power BI Desktop but when we try to Publish the same, we can see the dataset associated does not refresh and throws error from Powerbi.comIt...

  • 5347 Views
  • 3 replies
  • 8 kudos
Latest Reply
rohit8491
New Contributor III
  • 8 kudos

Hi NoorThank you soo much for your response. Please see the below details for the error message. I just got to know that Power BI are Azure Databricks are in different tenants. Do you think it causes any issues? Do we need VNet peering to be configur...

  • 8 kudos
2 More Replies
Direo
by Contributor II
  • 1422 Views
  • 1 replies
  • 0 kudos

Operations applied when running fs.write_table to overwrite existing feature table in hive metastore

Hi,there was a need to query an older snapshot of a table. Therefore ran:deltaTable = DeltaTable.forPath(spark, 'dbfs:/<path>') display(deltaTable.history())and noticed that every fs.write_table run triggers two operations:Write and CREATE OR REPLACE...

image
  • 1422 Views
  • 1 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Direo Direo​ :When you use deltaTable.write() method to write a DataFrame into a Delta table, it actually triggers the Delta write operation internally. This operation performs two actions:It writes the new data to disk in the Delta format, andIt at...

  • 0 kudos
Anonymous
by Not applicable
  • 2340 Views
  • 4 replies
  • 0 kudos

Objective is to make table unique at ID using group by , concat_ws and collect_list ,combining distinct values in one row.

Objective is to make table unique at ID. Table structure is as in attached image.Query used is : selectID,concat_ws(' & ' , collect_list(Distinct Gender)) as Genderfrom tablegroup by IDIt can be possible if we can order values within collect_list and...

  • 2340 Views
  • 4 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Rishabh Shanker​ Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answe...

  • 0 kudos
3 More Replies
najmead
by Contributor
  • 3700 Views
  • 2 replies
  • 0 kudos

Creating an external table reference vs creating a view

In a practical sense, what is the difference between creating an external table;create table my_catalog.my_schema.my_favourite_table location 'abfss://path/to/my/dataversus creating a view that references the same dataset;create view my_catalog.my_sc...

  • 3700 Views
  • 2 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Nicholas Mead​ Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs.Please help us select the best solution by clicking on "Select As Best" if it does.Your feedbac...

  • 0 kudos
1 More Replies
param3sh
by New Contributor
  • 1753 Views
  • 3 replies
  • 0 kudos

Performance b/w Managed Table and Un-Managed table

I am using Databricks in Azure. I want to mount ADLS Gen2 on Databricks and create unmanged (external) tables on the mount point. But before that I want to know which will give best performance, is it Managed table (stores data in DBFS root)or Un-ma...

  • 1753 Views
  • 3 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Paramesh Malla​ Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs.Please help us select the best solution by clicking on "Select As Best" if it does.Your feedba...

  • 0 kudos
2 More Replies
Upendra_Kumar
by New Contributor
  • 1627 Views
  • 3 replies
  • 0 kudos

Not able to perform update in delta table in databricks using 3 tables

Hi,I am able to perform merge from 2 tables but have requirement to update table based on 3 tables like following query.update a set a.name=b.namefrom table1 a inner join table2 b on a.id=b.idinner join table3 c on a.id=c.idThanks in advance..

  • 1627 Views
  • 3 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @upendra kumar sharma​ Help us build a vibrant and resourceful community by recognizing and highlighting insightful contributions. Mark the best answers and show your appreciation!Thanks and Regards

  • 0 kudos
2 More Replies
Labels