Data Engineering

Forum Posts

Sorted by:

by mkrish28 • New Contributor II

12-14-2023 11:52:23 PM

2211 Views
2 replies
0 kudos

Resolved! Regarding Exam got suspended

Hello Team,I had a disappointing experience while attempting my first DataBricks certification. Abruptly, the proctor asked me to show my desk, and after complying. Eventually, they suspended my exam, citing excessive eye movement and other practices...

Data Engineering

2211 Views
2 replies
0 kudos

12-14-2023 11:52:23 PM

View Replies

Latest Reply

Cert-Team
Databricks Employee

12-15-2023 7:42:20 AM

0 kudos

@mkrish28 I'm sorry to hear you had this experience. Thank you for logging at ticket with the support team. They have informed me they have rescheduled your exam. Good luck!

0 kudos

12-15-2023 7:42:20 AM

1 More Replies

by samur • New Contributor II

11-09-2023 12:10:33 PM

2071 Views
1 replies
1 kudos

DBR 14.1 - foreachBatch in Spark Connect Shared Clusters are not supported in Unity Catalog.

I am getting this error on DBR 14.1AnalysisException: [UC_COMMAND_NOT_SUPPORTED.WITHOUT_RECOMMENDATION] The command(s): foreachBatch in Spark Connect Shared Clusters are not supported in Unity Catalog.This is the code: wstream = df.writeStream.foreac...

Data Engineering

2071 Views
1 replies
1 kudos

11-09-2023 12:10:33 PM

View Replies

by Iam • New Contributor II

12-14-2023 2:28:06 PM

1945 Views
1 replies
0 kudos

CANNOT_RENAME_ACROSS_SCHEMA message error

Hello...We enabled Unity Catalog and we are migrating schemas. When I ran the command sync schema catalog01.schema01 FROM hive_metastore.schema01 dry run I got the error CANNOT_RENAME_ACROSS_CATALOG, reviewing your documentation it only said CANNO...

Data Engineering

1945 Views
1 replies
0 kudos

12-14-2023 2:28:06 PM

View Replies

by PetitLepton • New Contributor II

12-15-2023 8:15:46 AM

8430 Views
1 replies
0 kudos

List parameter in Python SQL connector 3.0.1

Hi,up to recently in version of the Python SQL connector 2.9.3, I was using a list as a parameter in the cursor.execute(operation, parameters) method without any trouble. It seems that it is not possible anymore in version 3.0.1 as the parsing of par...

Data Engineering

8430 Views
1 replies
0 kudos

12-15-2023 8:15:46 AM

View Replies

Latest Reply

PetitLepton
New Contributor II

12-15-2023 8:45:16 AM

0 kudos

I should better read the documentation : https://github.com/databricks/databricks-sql-python/blob/v3.0.0/docs/parameters.md.

0 kudos

12-15-2023 8:45:16 AM

by dcardenas • New Contributor

12-15-2023 5:35:19 AM

815 Views
0 replies
0 kudos

Retrieving Logs with Job API Get-outputs service

Hello,I would like to retrieve the logs of some job that where launched using the Job Rest Api 2.0. I see in the doc that can be done with the service get-ouputs, however each time I check the service I just get the metadata part of the response but ...

Data Engineering

815 Views
0 replies
0 kudos

12-15-2023 5:35:19 AM

by ken2 • New Contributor II

12-05-2023 4:03:25 AM

2663 Views
3 replies
0 kudos

How to convert entity_id to notebook name or job

Hi, Databricks developers!I use system.access.table_lineage refering to this page.It's difficult for us to recognize which notebook was indicated by the entity_id.How do I get the table to convert entity_ids to Job names or Notebook names?

Data Engineering

2663 Views
3 replies
0 kudos

12-05-2023 4:03:25 AM

View Replies

Latest Reply

mlamairesse
Databricks Employee

12-15-2023 1:52:34 AM

0 kudos

Workflows system tables are coming very soon.

0 kudos

12-15-2023 1:52:34 AM

2 More Replies

by cg3 • New Contributor

12-14-2023 10:40:37 PM

915 Views
0 replies
0 kudos

Define VIEW in Databricks Asset Bundles?

Is it possible to define a Unity Catalog VIEW in a Databricks Asset Bundle, or specify in the bundle that a specific notebook gets run once per deployment?

Data Engineering

915 Views
0 replies
0 kudos

12-14-2023 10:40:37 PM

by Kishan1003 • New Contributor

10-19-2023 8:10:29 PM

3600 Views
1 replies
0 kudos

Merge Operation is very slow for S/4 Table ACDOCA

Hello,we have a scenario in Databricks where every day we get 60-70 million records and it takes a lot of time to merge the data into 28 billion records which is already sitting there . The time taken to rewrite the files which are affected is too ...

Data Engineering

3600 Views
1 replies
0 kudos

10-19-2023 8:10:29 PM

View Replies

Latest Reply

177991
New Contributor II

12-14-2023 3:17:06 PM

0 kudos

Hi @Kishan1003 did you find something helpful? Im dealing with a similar situation, acdoca table on my side is around 300M (fairly smaller), and incoming daily data is usually around 1M. I have try partition using period, like fiscyearper column, zo...

0 kudos

12-14-2023 3:17:06 PM

by costi9992 • New Contributor III

11-01-2023 3:36:35 AM

6327 Views
6 replies
0 kudos

Resolved! Add policy init_scripts.*.volumes.destination for dlt not working

Hi,I tried to create a policy to use it for DLTs that are ran with shared clusters, but when i run the DLT with this policy I have an error. Init-script is added to Allowed JARs/Init Scripts.DLT events error: Cluster scoped init script /Volumes/main/...

Data Engineering

6327 Views
6 replies
0 kudos

11-01-2023 3:36:35 AM

View Replies

Latest Reply

ayush007
New Contributor II

12-14-2023 8:42:05 AM

0 kudos

@costi9992I am facing same issue with UC enabled cluster with 13.3 Databricks Runtime.I have uploaded the init shell script in Volume with particular init script allowed by metastore admin.But I get the same error as you stated .When I looked in clus...

0 kudos

12-14-2023 8:42:05 AM

5 More Replies

by shivam-singh • New Contributor

12-14-2023 12:46:59 AM

1299 Views
1 replies
0 kudos

Databricks-Autoloader-S3-KMS

Hi, I am working on a requirement where I am using autoloader in a DLT pipeline to ingest new files as they come.This flow is working fine. However I am facing an issue, when we have the source bucket an s3 location, since the bucket is having a SSE-...

Data Engineering

1299 Views
1 replies
0 kudos

12-14-2023 12:46:59 AM

View Replies

Latest Reply

kulkpd
Contributor

12-14-2023 10:50:03 AM

0 kudos

Can you please paste the exact errors and check below things:check following if its related to KMS:1. IAM role policy and KMS policy should have allow permissions2. Did you use extraConfig while mounting the source-s3 bucket:If you have used IAM role...

0 kudos

12-14-2023 10:50:03 AM

by esalohs • New Contributor III

11-21-2023 8:28:59 AM

9952 Views
6 replies
4 kudos

Databricks Autoloader - list only new files in an s3 bucket/directory

I have an s3 bucket with a couple of subdirectories/partitions like s3a://Bucket/dir1/ and s3a://Bucket/dir2/. There is currently in the millions of files sitting in bucket in the various subdirectories/partitions. I'm getting new data in near real t...

Data Engineering

9952 Views
6 replies
4 kudos

11-21-2023 8:28:59 AM

View Replies

Latest Reply

kulkpd
Contributor

12-14-2023 10:38:52 AM

4 kudos

below option used while performing spark.readStream:::.option('cloudFiles.format', 'json').option('cloudFiles.inferColumnTypes', 'true').option('cloudFiles.schemaEvolutionMode', 'rescue').option('cloudFiles.useNotifications', True).option('skipChange...

4 kudos

12-14-2023 10:38:52 AM

5 More Replies

by Muhammed • New Contributor III

11-21-2023 6:13:24 AM

27743 Views
13 replies
0 kudos

Filtering files for query

Hi Team,While writing my data to datalake table I am getting 'filtering files for query', it would be stuck at writingHow can I resolve this issue

Data Engineering

27743 Views
13 replies
0 kudos

11-21-2023 6:13:24 AM

View Replies

Latest Reply

kulkpd
Contributor

12-14-2023 10:31:14 AM

0 kudos

My bad, somewhere in the screenshot I saw that but not able to find it now.Which source you are using to load the data, delta table, aws-s3, or azure-storage?

0 kudos

12-14-2023 10:31:14 AM

12 More Replies

by geetha_venkates • New Contributor II

12-08-2021 8:26:47 AM

12423 Views
7 replies
2 kudos

Resolved! How do we add a certificate file in Databricks for sparksubmit type of job?

How do we add a certificate file in Databricks for sparksubmit type of job?

Data Engineering

12423 Views
7 replies
2 kudos

12-08-2021 8:26:47 AM

View Replies

Latest Reply

nicozambelli
New Contributor II

12-13-2023 8:22:45 AM

2 kudos

I have the same problem... when i worked with the hive_metastore in past, i was able tu use file system and also use API certs.Now i'm using the unity catalog and i can't upload a certificate, can somebody help me?

2 kudos

12-13-2023 8:22:45 AM

6 More Replies

by RobinK • Contributor

11-08-2023 4:17:52 AM

17719 Views
5 replies
6 kudos

Resolved! How to set Python rootpath when deploying with DABs

We have structured our code according to the documentation (notebooks-best-practices). We use Jupyter notebooks and have outsourced logic to Python modules. Unfortunately, the example described in the documentation only works if you have checked out ...

Data Engineering

17719 Views
5 replies
6 kudos

11-08-2023 4:17:52 AM

View Replies

Latest Reply

Corbin
Databricks Employee

11-16-2023 6:37:23 AM

6 kudos

Hello Robin, You’ll have to either use wheel files to package your libs and use those (see docs here), to make imports work out of the box. Otherwise, your entry point file needs to add the bundle root directory (or whatever the lib directory is) to ...

6 kudos

11-16-2023 6:37:23 AM

4 More Replies

by Kumarashokjmu • New Contributor II

12-05-2023 2:20:03 PM

5309 Views
4 replies
0 kudos

need to ingest millions of csv files from aws s3

I have a need to ingest millions of csv files from aws s3 bucket. I am facing issue with aws s3 throttling issue and besides notebook process is running for 8 hours plus and sometimes failing. When looking at cluster performance, it is utilized 60%.I...

Data Engineering

5309 Views
4 replies
0 kudos

12-05-2023 2:20:03 PM

View Replies

Latest Reply

kulkpd
Contributor

12-12-2023 5:54:56 PM

0 kudos

If you want to load all the data at once use autoloader or DLT pipeline with directory listing if files are lexically ordered. ORIf you want to perform incremental load, divide the load into two job like historic data load vs live data load:Live data...

0 kudos

12-12-2023 5:54:56 PM

3 More Replies

Databricks Community

Forum Posts

Resolved! Regarding Exam got suspended

DBR 14.1 - foreachBatch in Spark Connect Shared Clusters are not supported in Unity Catalog.

CANNOT_RENAME_ACROSS_SCHEMA message error

List parameter in Python SQL connector 3.0.1

Retrieving Logs with Job API Get-outputs service

How to convert entity_id to notebook name or job

Define VIEW in Databricks Asset Bundles?

Merge Operation is very slow for S/4 Table ACDOCA

Resolved! Add policy init_scripts.*.volumes.destination for dlt not working

Databricks-Autoloader-S3-KMS

Databricks Autoloader - list only new files in an s3 bucket/directory

Filtering files for query

Resolved! How do we add a certificate file in Databricks for sparksubmit type of job?

Resolved! How to set Python rootpath when deploying with DABs

need to ingest millions of csv files from aws s3

Join Us as a Local Community Builder!

Encountering an error while setting up a single-no...

AUTO CDC API and sequence column

when automatic liquid clustering is enabled, how t...

Can't mergeSchema handle int and bigint?

Understanding least common type in databricks