Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

by yubin-apollo (New Contributor II)
  • 3132 Views
  • 4 replies
  • 0 kudos

COPY INTO skipRows FORMAT_OPTIONS does not work

Based on the COPY INTO documentation, it seems I can use `skipRows` to skip the first `n` rows. I am trying to load a CSV file where I need to skip a few first rows in the file. I have tried various combinations, e.g. setting header parameter on or ...

Latest Reply
karthik-kobai
New Contributor II
  • 0 kudos

@yubin-apollo: My bad - I had skipRows in COPY_OPTIONS and not in FORMAT_OPTIONS. It works; please ignore my previous comment. Thanks!
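
For reference, a minimal sketch of the working form (table name and source path are hypothetical), with skipRows under FORMAT_OPTIONS where it belongs:

    # Hypothetical target table and landing path; skipRows is a CSV format
    # option, so it must go in FORMAT_OPTIONS rather than COPY_OPTIONS.
    spark.sql("""
        COPY INTO my_table
        FROM '/landing/csv/'
        FILEFORMAT = CSV
        FORMAT_OPTIONS ('skipRows' = '2', 'header' = 'true')
        COPY_OPTIONS ('mergeSchema' = 'true')
    """)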

3 More Replies
by Teja07 (New Contributor II)
  • 6824 Views
  • 2 replies
  • 0 kudos

File copy from local to dbfs

How do I copy a file from a local disk to a Databricks DBFS path? I tried the code below, but it throws an error. Code I tried: dbutils.fs.cp("file://c:/user/file.txt", "dbfs:/data/") and dbutils.fs.cp("file:///c:/user/file.txt", "dbfs:/data/"). Error: File not found ...

Latest Reply
venkatcrc
New Contributor III
  • 0 kudos

I assume you cannot copy files from a local machine to DBFS using dbutils. You can upload files to DBFS using the GUI option: Data --> Browse DBFS --> Upload.
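
For completeness, the Databricks CLI can copy from a local machine, since it runs locally rather than on the cluster. A minimal sketch (legacy CLI syntax; paths are hypothetical), run on the local machine:

    # Assumes the legacy Databricks CLI is installed and configured
    # (databricks configure --token). Source and target paths are examples.
    import subprocess

    subprocess.run(
        ["databricks", "fs", "cp", "C:/user/file.txt", "dbfs:/data/file.txt"],
        check=True,
    )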

1 More Replies
by pt-jake (New Contributor II)
  • 4020 Views
  • 1 reply
  • 1 kudos

Arrays of complex type always evaluate to ARRAY<STRING>?

Arrays of complex types seemingly always evaluate to ARRAY<STRING>. Therefore, casting or attempting to load JSON data with empty array values fails. For example, attempting to cast a JSON value of {"likes": []...} on load to the following table sche...
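
One way around the inference problem is to supply an explicit schema instead of letting Spark infer one from data containing empty arrays. A minimal sketch (element fields and input path are hypothetical; `spark` is the notebook's session):

    # With an explicit schema, an empty "likes" array keeps its declared
    # element type instead of being inferred as ARRAY<STRING>.
    from pyspark.sql.types import ArrayType, StringType, StructField, StructType

    schema = StructType([
        StructField("likes", ArrayType(StructType([
            StructField("id", StringType()),
            StructField("name", StringType()),
        ]))),
    ])

    df = spark.read.schema(schema).json("/path/to/input.json")
    df.printSchema()  # likes: array<struct<id:string,name:string>>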

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Jake Neyer, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers yo...

by _deepak_ (New Contributor II)
  • 1450 Views
  • 1 reply
  • 2 kudos

Resolved! Shallow copy in databricks

Hi, I am new to Databricks. I need to set up a non-prod environment, for which I need prod data to be cloned into non-prod. I explored a bit and learned about shallow copy. Is it possible to do a shallow copy across environments? Or is it possible to d...

Latest Reply
daniel_sahal
Esteemed Contributor
  • 2 kudos

@deepak prasad I'm not sure it's possible to do that. Even with Unity Catalog enabled, you cannot use shallow clone. You can do two things here: Without UC - simply recreate an empty table in your non-prod environment and do SELECT * from prod st...
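
A minimal sketch of the deep-clone route (catalog, schema, and table names are hypothetical):

    # DEEP CLONE copies the data files, so the non-prod table does not
    # depend on prod's underlying storage the way a shallow clone would.
    spark.sql("""
        CREATE OR REPLACE TABLE nonprod.bronze.my_table
        DEEP CLONE prod.bronze.my_table
    """)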

by chanansh (Contributor)
  • 6097 Views
  • 9 replies
  • 9 kudos

Copy files from Azure to S3

I am trying to copy files from Azure to S3. I've created a solution by comparing file lists, copying manually to a temp file, and uploading. However, I just found Auto Loader and I would like to use that: https://docs.databricks.com/ingestion/auto-loader/i...
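
For reference, a minimal Auto Loader sketch (paths are hypothetical; assumes credentials for both clouds are configured on the cluster):

    # Auto Loader incrementally discovers new files in the Azure container;
    # the stream then writes them to an S3 location as a Delta table.
    (spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "parquet")
        .option("cloudFiles.schemaLocation", "s3://my-bucket/_schema/")
        .load("abfss://container@account.dfs.core.windows.net/input/")
        .writeStream
        .format("delta")
        .option("checkpointLocation", "s3://my-bucket/_checkpoint/")
        .start("s3://my-bucket/output/"))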

Latest Reply
Falokun
New Contributor II
  • 9 kudos

Just use tools like GoodSync or GS RichCopy 360 to copy directly from Blob to S3; I think you will never face problems like that.

8 More Replies
by dataexplorer (New Contributor III)
  • 7794 Views
  • 6 replies
  • 5 kudos

Resolved! COPY INTO generating duplicate rows in Delta table

Hello everyone, I'm trying to bulk-load tables from a SQL Server database into ADLS as Parquet files and then load these files into Delta tables (raw/bronze). I had done a one-off history/base load, but my subsequent incremental loads (which had a d...
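
A common fix for restated rows in incremental loads is a MERGE-based upsert instead of a plain append; a minimal sketch (table and key names are hypothetical):

    # Upserting on the business key keeps repeated incremental loads
    # idempotent when extracts restate rows that already exist.
    spark.sql("""
        MERGE INTO bronze.customers AS t
        USING staging_customers AS s
        ON t.customer_id = s.customer_id
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED THEN INSERT *
    """)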

Latest Reply
dataexplorer
New Contributor III
  • 5 kudos

thanks for the guidance!

5 More Replies
by Mec_Mec (New Contributor II)
  • 5043 Views
  • 6 replies
  • 4 kudos

Resolved! Copy a script from the current subscription to new subscription

I would like to check whether there is a process to copy or migrate a script/notebook from the current Azure Databricks subscription to a new Databricks subscription (a new notebook).

Latest Reply
Mec_Mec
New Contributor II
  • 4 kudos

How can I quickly move Databricks notebooks from one account to another?
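
One common approach is the Databricks CLI's workspace export/import. A minimal sketch (legacy CLI syntax; assumes profiles "old" and "new" are configured for the two workspaces, and /Shared is just an example path):

    import subprocess

    # Export notebooks from the old workspace to a local folder...
    subprocess.run(
        ["databricks", "workspace", "export_dir",
         "/Shared", "./backup", "--profile", "old"],
        check=True,
    )
    # ...then import them into the new workspace.
    subprocess.run(
        ["databricks", "workspace", "import_dir",
         "./backup", "/Shared", "--profile", "new"],
        check=True,
    )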

5 More Replies
by hoopla (New Contributor II)
  • 6767 Views
  • 2 replies
  • 1 kudos

Unable to copy multiple files from file:/tmp to dbfs:/tmp

I am downloading multiple files by web scraping, and by default they are stored in /tmp. I can copy a single file by providing the filename and path: %fs cp file:/tmp/2020-12-14_listings.csv.gz dbfs:/tmp. But when I try to copy multiple files I get an ...

Latest Reply
hoopla
New Contributor II
  • 1 kudos

Thanks Deepak. This is what I suspected. Hopefully the wildcard feature will be available in the future. Thanks!
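
Until then, a minimal workaround sketch: since dbutils.fs.cp has no wildcard support, list the source directory and copy matching files one by one:

    # Copy every .csv.gz file from the driver's /tmp into dbfs:/tmp.
    for f in dbutils.fs.ls("file:/tmp/"):
        if f.name.endswith(".csv.gz"):
            dbutils.fs.cp(f.path, "dbfs:/tmp/" + f.name)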

1 More Replies
by User16826992666 (Valued Contributor)
  • 1371 Views
  • 1 reply
  • 0 kudos

Why would I make a deep clone of a Delta table vs reading the table and writing a copy to a new location?

It seems like with both techniques I would end up with a copy of my table. I'm trying to understand when I should use a deep clone.

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

A deep clone is the recommended way, as it retains the history of the table. Also, DEEP CLONE is faster than the read-and-write approach.
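
A minimal sketch (table names are hypothetical):

    # A deep clone copies data and table metadata in one statement;
    # re-running it brings the clone up to date incrementally.
    spark.sql("""
        CREATE OR REPLACE TABLE main.backup.my_table
        DEEP CLONE main.prod.my_table
    """)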
