Machine Learning

Forum Posts

aladda
by Honored Contributor II
  • 1985 Views
  • 2 replies
  • 0 kudos

Resolved! How do I use the Copy Into command to copy data into a Delta Table? Looking for examples where you want to have a pre-defined schema

I've reviewed the COPY INTO docs here - https://docs.databricks.com/spark/latest/spark-sql/language-manual/delta-copy-into.html#examples but there's only one simple example. Looking for some additional examples that show loading data from CSV - with ...

Latest Reply
aladda
Honored Contributor II
  • 0 kudos

Here's an example for a predefined schema. Using COPY INTO with a predefined table schema: the trick here is to CAST the CSV dataset into your desired schema in the SELECT statement of COPY INTO. Example below: %sql CREATE OR REPLACE TABLE copy_into_bronze_te...

1 More Replies
alesventus
by New Contributor III
  • 1529 Views
  • 2 replies
  • 3 kudos

Pyspark Merge parquet and delta file

Is it possible to use the merge command when the source file is Parquet and the destination file is Delta? Or must both files be Delta files? Currently, I'm using this code to transform the Parquet into Delta, and it works. But I want to avoid this transformation. T...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Ales ventus, we haven't heard from you since the last response from @Kaniz Fatma, and I was checking back to see if her suggestions helped you. Or else, if you have any solution, please share it with the community, as it can be helpful to others...

1 More Replies
Ismail1
by New Contributor III
  • 946 Views
  • 2 replies
  • 0 kudos

Can an HMS-managed table be upgraded to Unity Catalog?

As the question states, I am not getting the option to upgrade managed tables to UC. Is that possible? I can't find anything in the documentation.

Latest Reply
Ismail1
New Contributor III
  • 0 kudos

In case anyone else ever faced the same issue

1 More Replies
qwerty1
by Contributor
  • 673 Views
  • 1 reply
  • 1 kudos

Resolved! What is the disadvantage of using multiple Z-Order columns?

The documentation states: "You can specify multiple columns for ZORDER BY as a comma-separated list. However, the effectiveness of the locality drops with each extra column." What does it mean for "effectiveness of the locality to drop" with each extra co...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

@Ashwin Bhaskar: Z-ordering is a technique to improve the performance of queries that involve filtering and grouping on specific columns in a large distributed database. When a table is z-ordered on a certain column or set of columns, the data is so...
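
The locality question in this thread can be illustrated with a toy model. The sketch below is not how Delta Lake implements ZORDER BY; it assumes a plain Morton (bit-interleaving) key, a full grid of 4-bit column values, and 16 equally sized "files", all of which are illustrative choices. The function names are hypothetical. It counts how many distinct values of the first column end up in a single file, which is a rough proxy for how well file-level min/max statistics can skip data when filtering on that column.

```python
from itertools import product

BITS = 4        # assumption: each column takes values 0..15
NUM_FILES = 16  # assumption: sorted rows are cut into 16 equal "files"


def z_value(values, bits=BITS):
    """Interleave the bits of each column value into one Morton (Z-order) key."""
    k = len(values)
    z = 0
    for i in range(bits):                # bit position within a column value
        for j, v in enumerate(values):   # column index
            z |= ((v >> i) & 1) << (i * k + j)
    return z


def distinct_first_column_values_per_file(num_columns):
    """Sort a full grid of rows by Z-value, cut it into files, and return the
    largest number of distinct first-column values found in any one file."""
    rows = sorted(product(range(1 << BITS), repeat=num_columns), key=z_value)
    rows_per_file = len(rows) // NUM_FILES
    files = [rows[i:i + rows_per_file] for i in range(0, len(rows), rows_per_file)]
    return max(len({row[0] for row in f}) for f in files)


for k in (1, 2, 3):
    print(k, distinct_first_column_values_per_file(k))
```

On this toy grid, a file holds 1, 4, and 8 of the 16 possible first-column values when Z-ordering on one, two, and three columns respectively. Each extra column takes a share of the leading bits of the sort key, so every file covers a wider range of each individual column and fewer files can be skipped for a filter on any one of them.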

khh2023
by New Contributor
  • 801 Views
  • 1 reply
  • 0 kudos

Optimize operation with big increase in numRemovedFiles/numRemovedBytes/numAddedFiles/numAddedBytes

Hello, I have a daily loading process for a Delta table that has an 'optimize table' step at the end. The optimize operation used to take about 5 minutes, but now takes about 3.5 hours. One thing I noticed from 'describe history' is the operationMetric...

Latest Reply
mathan_pillai
Valued Contributor
  • 0 kudos

This is most likely because more files became eligible for compaction (optimize). By default there is a limit of 50 files or so per partition, below which the partition doesn't qualify for optimize. Only if there are 50+ files within a partition the...

elgeo
by Valued Contributor II
  • 3012 Views
  • 1 reply
  • 4 kudos

Resolved! Insert into delta table fails

Hello experts. We are trying to execute an insert command with fewer columns than the target table: Insert into table_name (col1, col2, col10) Select col1, col2, col10 from table_name2. However, the above fails with: Error in SQL statement: DeltaAnalysisExce...

Latest Reply
UmaMahesh1
Honored Contributor III
  • 4 kudos

Hi @ELENI GEORGOUSI, yes. When you are doing an insert, the schema you provide should match the target schema, or else it will throw an error. But you can still insert the data using another approach. Create a dataframe with your data having less colu...

elementalM
by New Contributor III
  • 1382 Views
  • 4 replies
  • 4 kudos

Catch-up Structured Stream hangs on last step of write job to delta sync using toTable

I'm running Databricks version 10.4 on GCP. I'm running a structured stream trying to process historical files in a Delta table on GCP cloud storage. This source Delta table is big but maintained with OPTIMIZE. The stream repartitions which seems to b...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Dwight Branscombe​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you....

3 More Replies
Nachappa
by New Contributor III
  • 3041 Views
  • 8 replies
  • 12 kudos

Resolved! Data model tool to connect to Databricks or Data lake?

Hi everyone, for data modeling documentation (dimensional/ER diagram), is there any tool available that can connect to Databricks or a data lake, read the table structure directly, and also update the table structure whenever there is an addition ...

Latest Reply
Nachappa
New Contributor III
  • 12 kudos

Hi @Kaniz Fatma, @Prabakar Ammeappin: Thanks for the reply and information. Yes, I am able to connect via DBeaver to Databricks using JDBC and the link provided (sorry for the delay in updating, as I had to try it on the trial version of Enterprise D...

7 More Replies
MattM
by New Contributor III
  • 974 Views
  • 2 replies
  • 3 kudos

Resolved! Mapping Control data - Maintained by Business User

We are ingesting our data from ADLS into Databricks as Delta tables. After the raw layer we need to refer to a control/mapping layer which defines certain logic and measure definitions. This would be incorporated in the subsequent silver or gold layer. This co...

Latest Reply
MattM
New Contributor III
  • 3 kudos

Thanks for your response. Can a business user modify rows in the table without the help of any script, after loading it one time from CSV files?

1 More Replies