cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

Chris_Konsur
by New Contributor III
  • 11611 Views
  • 4 replies
  • 6 kudos

Resolved! Error: The associated location ... is not empty but it's not a Delta table

I try to create a table but I get this error: AnalysisException: Cannot create table ('`spark_catalog`.`default`.`citation_all_tenants`'). The associated location ('dbfs:/user/hive/warehouse/citation_all_tenants') is not empty but it's not a Delta t...

  • 11611 Views
  • 4 replies
  • 6 kudos
Latest Reply
sachin_tirth
New Contributor II
  • 6 kudos

Hi Team, I am facing the same issue. When we try to load data to table in production batch getting error as table not in delta format. there is no recent change in table. and we are not trying any create or replace table. this is existing table in pr...

  • 6 kudos
3 More Replies
wyzer
by Contributor II
  • 4907 Views
  • 8 replies
  • 4 kudos

Resolved! How to pass parameters in SSRS/Power BI (report builder) ?

Hello,In SSRS/Power BI (report builder), how to query a table in Databricks with parameters please ?Because this code doesn't works :SELECT * FROM TempBase.Customers WHERE Name = {{ @P_Name }}Thanks.

  • 4907 Views
  • 8 replies
  • 4 kudos
Latest Reply
Nj11
New Contributor II
  • 4 kudos

Hi, I am not able to see the data in SSRS while I am using date parameters but with manual dates data is populating fine. The database is pointing to databricks. I am not sure what I am missing here. Please help me in this. ThanksI am trying with que...

  • 4 kudos
7 More Replies
Abbe
by New Contributor II
  • 1315 Views
  • 2 replies
  • 0 kudos

Update data type of a column within a table that has a GENERATED ALWAYS AS IDENTITY-column

I want to cast the data type of a column "X" in a table "A" where column "ID" is defined as GENERATED ALWAYS AS IDENTITY. Databricks refer to overwrite to achieve this: https://docs.databricks.com/delta/update-schema.htmlThe following operation:(spar...

  • 1315 Views
  • 2 replies
  • 0 kudos
Latest Reply
RajuBolla
New Contributor II
  • 0 kudos

Update is not working but delete is when i changed to DEFAULT property AnalysisException: UPDATE on IDENTITY column "XXXX_ID" is not supported.

  • 0 kudos
1 More Replies
MBV3
by New Contributor III
  • 7288 Views
  • 6 replies
  • 7 kudos

Resolved! External table from parquet partition

Hi,I have data in parquet format in GCS buckets partitioned by name eg. gs://mybucket/name=ABCD/I am trying to create a table in Databaricks as followsDROP TABLE IF EXISTS name_test; CREATE TABLE name_testUSING parquetLOCATION "gs://mybucket/name=*/...

  • 7288 Views
  • 6 replies
  • 7 kudos
Latest Reply
Pat
Honored Contributor III
  • 7 kudos

Hi @M Baig​ ,the error doesn't tell me much, but you could try:CREATE TABLE name_test USING parquet PARTITIONED BY ( name STRING) LOCATION "gs://mybucket/";

  • 7 kudos
5 More Replies
AkifCakir
by New Contributor
  • 15322 Views
  • 4 replies
  • 2 kudos

Resolved! Why Spark Save Modes , "overwrite" always drops table although "truncate" is true ?

Hi Dear Team, I am trying to import data from databricks to Exasol DB. I am using following code in below with Spark version is 3.0.1 ,dfw.write \ .format("jdbc") \ .option("driver", exa_driver) \ .option("url", exa_url) \ .option("db...

  • 15322 Views
  • 4 replies
  • 2 kudos
Latest Reply
Gembo
New Contributor II
  • 2 kudos

@AkifCakir , Were you able to find a way to truncate without dropping the table using the .write function as I am facing the same issue as well.

  • 2 kudos
3 More Replies
Graham
by New Contributor III
  • 4182 Views
  • 5 replies
  • 2 kudos

"MERGE" always slower than "CREATE OR REPLACE"

OverviewTo update our Data Warehouse tables, we have tried two methods: "CREATE OR REPLACE" and "MERGE". With every query we've tried, "MERGE" is slower.My question is this: Has anyone successfully gotten a "MERGE" to perform faster than a "CREATE OR...

  • 4182 Views
  • 5 replies
  • 2 kudos
Latest Reply
Manisha_Jena
New Contributor III
  • 2 kudos

Hi @Graham Can you please try Low Shuffle Merge [LSM]  and see if it helps? LSM is a new MERGE algorithm that aims to maintain the existing data organization (including z-order clustering) for unmodified data, while simultaneously improving performan...

  • 2 kudos
4 More Replies
my_community2
by New Contributor III
  • 7999 Views
  • 8 replies
  • 6 kudos

Resolved! dropping a managed table does not remove the underlying files

the documentation states that "drop table":Deletes the table and removes the directory associated with the table from the file system if the table is not EXTERNAL  table. An exception is thrown if the table does not exist.In case of an external table...

image.png
  • 7999 Views
  • 8 replies
  • 6 kudos
Latest Reply
MajdSAAD_7953
New Contributor II
  • 6 kudos

Hi,There is a way to force delete files after drop the table and don't wait 30 days to see size in S3 decrease?Tables that I dropped related to the dev and staging, I don't want to keep there files for 30 days 

  • 6 kudos
7 More Replies
Juha
by New Contributor II
  • 1337 Views
  • 3 replies
  • 2 kudos
  • 1337 Views
  • 3 replies
  • 2 kudos
Latest Reply
lawrence009
Contributor
  • 2 kudos

Have you figured out what the problem was? Could the issue be permission related?

  • 2 kudos
2 More Replies
HariharaSam
by Contributor
  • 51464 Views
  • 6 replies
  • 3 kudos

Resolved! Alter Delta table column datatype

Hi ,I am having a delta table and table contains data and I need to alter the datatype for a particular column.For example :Consider the table name is A and column name is Amount with datatype Decimal(9,4).I need alter the Amount column datatype from...

  • 51464 Views
  • 6 replies
  • 3 kudos
Latest Reply
saipujari_spark
Valued Contributor
  • 3 kudos

Hi @HariharaSam The following documents the info about how to alter a Delta table schema.https://docs.databricks.com/delta/update-schema.html

  • 3 kudos
5 More Replies
Eelke
by New Contributor II
  • 1742 Views
  • 3 replies
  • 0 kudos

I want to perform interpolation on a streaming table in delta live tables.

I have the following code:from pyspark.sql.functions import * !pip install dbl-tempo from tempo import TSDF   from pyspark.sql.functions import *   # interpolate target_cols column linearly for tsdf dataframe def interpolate_tsdf(tsdf_data, target_c...

  • 1742 Views
  • 3 replies
  • 0 kudos
Latest Reply
Eelke
New Contributor II
  • 0 kudos

The issue was not resolved because we were trying to use a streaming table within TSDF which does not work.

  • 0 kudos
2 More Replies
HariharaSam
by Contributor
  • 16590 Views
  • 10 replies
  • 4 kudos

Resolved! To get Number of rows inserted after performing an Insert operation into a table

Consider we have two tables A & B.qry = """INSERT INTO Table ASelect * from Table B where Id is null """spark.sql(qry)I need to get the number of records inserted after running this in databricks.

  • 16590 Views
  • 10 replies
  • 4 kudos
Latest Reply
GRCL
New Contributor III
  • 4 kudos

Almost same advice than Hubert, I use the history of the delta table :df_history.select(F.col('operationMetrics')).collect()[0].operationMetrics['numOutputRows']You can find also other 'operationMetrics' values, like 'numTargetRowsDeleted'.

  • 4 kudos
9 More Replies
tototox
by New Contributor III
  • 6008 Views
  • 3 replies
  • 2 kudos

how to check table size by partition?

I want to check the size of the delta table by partition.As you can see, only the size of the table can be checked, but not by partition.

  • 6008 Views
  • 3 replies
  • 2 kudos
Latest Reply
Anonymous
Not applicable
  • 2 kudos

@jin park​ :You can use the Databricks Delta Lake SHOW TABLE EXTENDED command to get the size of each partition of the table. Here's an example:%sql SHOW TABLE EXTENDED LIKE '<table_name>' PARTITION (<partition_column> = '<partition_value>') SELECT...

  • 2 kudos
2 More Replies
Chilangdon
by New Contributor
  • 4574 Views
  • 3 replies
  • 0 kudos

How to connect to a delta table that lives in a blob storage to display in a web app?

Hi, somebody to help me how to connect a delta table with a web app? I search to a delta-rs library but I can't obtain to make the connection.

  • 4574 Views
  • 3 replies
  • 0 kudos
Latest Reply
etsyal1e2r3
Honored Contributor
  • 0 kudos

Without downloading the files directly every time, you have to create a sql warehouse cluster and connect to it via jdbc connection. This way you just use the requests library in python (or an equal one in another language like axios for javascript) ...

  • 0 kudos
2 More Replies
GS2312
by New Contributor II
  • 3327 Views
  • 6 replies
  • 5 kudos

KeyProviderException when trying to create external table on databricks

Hi There,I have been trying to create an external table on Azure Databricks with below statement.df.write.partitionBy("year", "month", "day").format('org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat').option("path",sourcepath).mod...

  • 3327 Views
  • 6 replies
  • 5 kudos
Latest Reply
Anonymous
Not applicable
  • 5 kudos

Hi @Gaurishankar Sakhare​ Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best ...

  • 5 kudos
5 More Replies
Labels