Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

AkankshaGupta
by New Contributor II
  • 1678 Views
  • 0 replies
  • 1 kudos

Target database.table1 must be delta table

I created table1 with some data, then truncated it to load a new dataset. When I run select * from the table, I get a row count of 0. But when I try to load data with the following command, I get an error saying the target table must be a Delta table: COPY INTO...

Srikanth_Gupta_
by Valued Contributor
  • 1424 Views
  • 1 replies
  • 0 kudos

Resolved! Does the size of optimized files after running OPTIMIZE vary between cloud providers (S3, Blob and GCS)?

Are there any other parameters to consider when running OPTIMIZE, depending on the cloud vendor?

Latest Reply
Ryan_Chynoweth
Esteemed Contributor
  • 0 kudos

OPTIMIZE does not depend on the cloud provider at all. It produces the same results regardless of the underlying storage. It is idempotent, meaning that if it is run twice on the same dataset, the second execution has no effect.
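To make the idempotence point concrete, here is a toy model in Python. It is only a sketch: the `compact` function and its merge rule are hypothetical and are not how Delta Lake's OPTIMIZE is implemented. It assumes files are described only by their size in MB and that every file smaller than a target size gets merged into target-sized files.

```python
from typing import List


def compact(file_sizes_mb: List[int], target_mb: int) -> List[int]:
    """Toy compaction (hypothetical, not Delta's algorithm).

    Files smaller than target_mb are merged into files of target_mb,
    plus at most one smaller remainder file.
    """
    large = [s for s in file_sizes_mb if s >= target_mb]
    small = [s for s in file_sizes_mb if s < target_mb]
    if len(small) < 2:
        # Nothing to merge: the layout is already compacted.
        return list(file_sizes_mb)
    total = sum(small)
    merged = [target_mb] * (total // target_mb)
    if total % target_mb:
        merged.append(total % target_mb)
    return large + merged


files = [10, 20, 30, 200, 5]
once = compact(files, target_mb=50)
twice = compact(once, target_mb=50)
print(once)           # layout after the first run
print(once == twice)  # the second run changes nothing
```

In this model the storage backend never appears as an input, which mirrors the claim that the result depends on the data and the target size rather than on the cloud provider.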

asher
by New Contributor II
  • 8753 Views
  • 1 replies
  • 0 kudos

List all files in a Blob Container

I am trying to find a way to list all files, and their sizes, in all folders and subfolders. I guess these are called blobs in the Databricks world. Anyway, I can easily list all files and their sizes in a single folder, but ...

Latest Reply
asher
New Contributor II
  • 0 kudos

from azure.storage.blob import BlockBlobService

block_blob_service = BlockBlobService(account_name='your_acct_name', account_key='your_acct_key')
mylist = []
generator = block_blob_service.list_blobs('rawdata')
for blob in generator:
    mylist.append(...
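The reply above is cut off, so the following is a separate minimal sketch rather than a completion of it. It swaps the legacy `BlockBlobService` for `ContainerClient` from the current `azure-storage-blob` (v12) package, which is an assumption about the installed SDK. The helper names `iter_blob_sizes` and `size_by_folder` and the connection string are hypothetical; `'rawdata'` is the container name from the post. Because blob listings are flat, blobs in sub folders are returned along with everything else, and a "folder" is just the part of the blob name before the last `/`.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def iter_blob_sizes(connection_string: str, container_name: str) -> Iterator[Tuple[str, int]]:
    """Yield (blob_name, size_in_bytes) for every blob in a container.

    Assumes azure-storage-blob v12; imported lazily so the rest of this
    sketch runs without the package installed.
    """
    from azure.storage.blob import ContainerClient

    container = ContainerClient.from_connection_string(connection_string, container_name)
    # list_blobs() walks the whole container, including nested prefixes.
    for blob in container.list_blobs():
        yield blob.name, blob.size


def size_by_folder(blobs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Sum blob sizes per folder prefix ('' means the container root)."""
    totals: Dict[str, int] = defaultdict(int)
    for name, size in blobs:
        folder = name.rsplit("/", 1)[0] if "/" in name else ""
        totals[folder] += size
    return dict(totals)


# Offline example with made-up names and sizes (no Azure account needed).
sample = [
    ("a.csv", 10),
    ("raw/2020/b.csv", 5),
    ("raw/2020/c.csv", 7),
    ("raw/d.csv", 1),
]
print(size_by_folder(sample))
```

Against a real account, the call would look like `size_by_folder(iter_blob_sizes(conn_str, 'rawdata'))`, where `conn_str` is a placeholder for your own storage connection string.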
