Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

mkrish28
by New Contributor II
  • 2211 Views
  • 2 replies
  • 0 kudos

Resolved! Regarding Exam got suspended

Hello Team, I had a disappointing experience while attempting my first Databricks certification. The proctor abruptly asked me to show my desk, which I did. Eventually, they suspended my exam, citing excessive eye movement and other practices...

Latest Reply
Cert-Team
Databricks Employee
  • 0 kudos

@mkrish28 I'm sorry to hear you had this experience. Thank you for logging a ticket with the support team. They have informed me they have rescheduled your exam. Good luck!

1 More Replies
samur
by New Contributor II
  • 2071 Views
  • 1 reply
  • 1 kudos

DBR 14.1 - foreachBatch in Spark Connect Shared Clusters are not supported in Unity Catalog.

I am getting this error on DBR 14.1: AnalysisException: [UC_COMMAND_NOT_SUPPORTED.WITHOUT_RECOMMENDATION] The command(s): foreachBatch in Spark Connect Shared Clusters are not supported in Unity Catalog. This is the code: wstream = df.writeStream.foreac...
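For context, a minimal sketch (not the poster's actual code, which is truncated above) of the kind of foreachBatch sink that triggers this error on a DBR 14.1 shared cluster; df stands for the streaming DataFrame, and the target table, checkpoint path, and batch logic are placeholders:

```python
# Not the poster's code; a minimal reproduction of a foreachBatch sink that a
# DBR 14.1 shared (Spark Connect) cluster rejects under Unity Catalog.
def upsert_batch(batch_df, batch_id):
    # Placeholder per-batch logic, e.g. an append or MERGE into a Delta table.
    batch_df.write.mode("append").saveAsTable("main.bronze.target")  # placeholder table

wstream = (
    df.writeStream
      .foreachBatch(upsert_batch)  # the call the error points at
      .option("checkpointLocation", "/Volumes/main/bronze/chk/stream")  # placeholder path
      .start()
)
```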

Iam
by New Contributor II
  • 1945 Views
  • 1 reply
  • 0 kudos

CANNOT_RENAME_ACROSS_SCHEMA message error

Hello... We enabled Unity Catalog and we are migrating schemas. When I ran the command SYNC SCHEMA catalog01.schema01 FROM hive_metastore.schema01 DRY RUN, I got the error CANNOT_RENAME_ACROSS_CATALOG. Reviewing your documentation, it only said CANNO...
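For reference, a sketch of the dry-run sync described in the post, run from a notebook; the catalog and schema names are the ones quoted above:

```python
# Dry-run sync of a Hive metastore schema into Unity Catalog; the result set
# has one row per table with a status describing why it would (not) be synced.
result = spark.sql("""
    SYNC SCHEMA catalog01.schema01
    FROM hive_metastore.schema01
    DRY RUN
""")
display(result)
```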

PetitLepton
by New Contributor II
  • 8430 Views
  • 1 reply
  • 0 kudos

List parameter in Python SQL connector 3.0.1

Hi, until recently, with version 2.9.3 of the Python SQL connector, I was using a list as a parameter in the cursor.execute(operation, parameters) method without any trouble. It seems that this is no longer possible in version 3.0.1, as the parsing of par...

Latest Reply
PetitLepton
New Contributor II
  • 0 kudos

I should have read the documentation more carefully: https://github.com/databricks/databricks-sql-python/blob/v3.0.0/docs/parameters.md
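For anyone landing here, a hedged sketch based on that parameters.md page: in connector 3.x, native parameters are named markers bound from a dict, so a Python list (for example, for an IN clause) has to be expanded into one marker per element. The table name and connection details below are placeholders:

```python
from databricks import sql

ids = [1, 2, 3]
# Build ":id0, :id1, :id2" and the matching parameter dict.
markers = ", ".join(f":id{i}" for i in range(len(ids)))
params = {f"id{i}": v for i, v in enumerate(ids)}

with sql.connect(server_hostname="<workspace-host>",
                 http_path="<warehouse-http-path>",
                 access_token="<personal-access-token>") as connection:
    with connection.cursor() as cursor:
        cursor.execute(f"SELECT * FROM my_table WHERE id IN ({markers})", params)
        rows = cursor.fetchall()
```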

dcardenas
by New Contributor
  • 815 Views
  • 0 replies
  • 0 kudos

Retrieving Logs with Job API Get-outputs service

Hello, I would like to retrieve the logs of some jobs that were launched using the Jobs REST API 2.0. I see in the docs that this can be done with the get-output service; however, each time I call the service I just get the metadata part of the response but ...
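A hedged sketch of one common cause: for multi-task jobs, runs/get-output has to be called with the task-level run_id rather than the parent run_id, otherwise only metadata comes back. The host, token, and run_id below are placeholders, and the sketch uses the 2.1 endpoints:

```python
import requests

HOST = "https://<workspace>.cloud.databricks.com"
HEADERS = {"Authorization": "Bearer <personal-access-token>"}

# Look up the parent run to discover its task-level run_ids.
run = requests.get(f"{HOST}/api/2.1/jobs/runs/get",
                   headers=HEADERS, params={"run_id": 123456}).json()

# Ask for the output of each task run; single-task runs can use the run_id directly.
for task in run.get("tasks", []):
    out = requests.get(f"{HOST}/api/2.1/jobs/runs/get-output",
                       headers=HEADERS, params={"run_id": task["run_id"]}).json()
    print(out.get("notebook_output") or out.get("logs") or out.get("metadata"))
```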

ken2
by New Contributor II
  • 2663 Views
  • 3 replies
  • 0 kudos

How to convert entity_id to notebook name or job

Hi, Databricks developers! I use system.access.table_lineage, referring to this page. It's difficult for us to recognize which notebook is indicated by the entity_id. How do I get a table that converts entity_ids to job names or notebook names?
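A hedged sketch of one way to map JOB entity_ids to names, assuming the entity_type/entity_id columns documented for system.access.table_lineage and resolving names through the Jobs API; the host and token are placeholders, and notebook entity_ids are harder to resolve this way:

```python
import requests

HOST = "https://<workspace>.cloud.databricks.com"
HEADERS = {"Authorization": "Bearer <personal-access-token>"}

# Distinct job entity_ids seen in lineage.
job_ids = [r.entity_id for r in spark.sql("""
    SELECT DISTINCT entity_id
    FROM system.access.table_lineage
    WHERE entity_type = 'JOB'
""").collect()]

# Resolve each job_id to its current name via the Jobs API.
job_names = {}
for job_id in job_ids:
    resp = requests.get(f"{HOST}/api/2.1/jobs/get",
                        headers=HEADERS, params={"job_id": job_id}).json()
    job_names[job_id] = resp.get("settings", {}).get("name")

print(job_names)
```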

Latest Reply
mlamairesse
Databricks Employee
  • 0 kudos

Workflows system tables are coming very soon. 

2 More Replies
cg3
by New Contributor
  • 915 Views
  • 0 replies
  • 0 kudos

Define VIEW in Databricks Asset Bundles?

Is it possible to define a Unity Catalog VIEW in a Databricks Asset Bundle, or specify in the bundle that a specific notebook gets run once per deployment?
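One pattern, if the bundle itself cannot declare the view: a small notebook or Python task that the bundle deploys as a job and that recreates the view on each run. A hedged sketch with placeholder catalog/schema/view names:

```python
# Notebook/Python task deployed by the bundle; rerunning it is harmless because
# CREATE OR REPLACE VIEW is idempotent. All names below are placeholders.
spark.sql("""
    CREATE OR REPLACE VIEW main.analytics.orders_v AS
    SELECT order_id, customer_id, amount
    FROM main.analytics.orders
    WHERE amount > 0
""")
```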

Kishan1003
by New Contributor
  • 3600 Views
  • 1 reply
  • 0 kudos

Merge Operation is very slow for S/4 Table ACDOCA

Hello, we have a scenario in Databricks where every day we get 60-70 million records, and it takes a lot of time to merge them into the 28 billion records already sitting there. The time taken to rewrite the affected files is too ...

Latest Reply
177991
New Contributor II
  • 0 kudos

Hi @Kishan1003, did you find something helpful? I'm dealing with a similar situation; the ACDOCA table on my side is around 300M records (fairly smaller), and the incoming daily data is usually around 1M. I have tried partitioning by period, like the fiscyearper column, zo...
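For readers hitting the same thing, a hedged sketch of the mitigation usually discussed for this pattern: constrain the MERGE condition to the periods present in the incoming batch so only those files are rewritten. The target table name and key columns (belnr, buzei) are assumptions, and updates_df stands in for the daily batch:

```python
from delta.tables import DeltaTable

# Periods actually present in today's batch; used to prune the target table.
periods = [r.fiscyearper for r in updates_df.select("fiscyearper").distinct().collect()]
period_list = ", ".join(repr(p) for p in periods)

target = DeltaTable.forName(spark, "main.finance.acdoca")  # placeholder table name
(target.alias("t")
   .merge(updates_df.alias("s"),
          f"t.fiscyearper IN ({period_list}) "
          "AND t.belnr = s.belnr AND t.buzei = s.buzei")   # key columns are assumptions
   .whenMatchedUpdateAll()
   .whenNotMatchedInsertAll()
   .execute())
```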

costi9992
by New Contributor III
  • 6327 Views
  • 6 replies
  • 0 kudos

Resolved! Add policy init_scripts.*.volumes.destination for dlt not working

Hi, I tried to create a policy to use for DLTs that are run with shared clusters, but when I run the DLT with this policy I get an error. The init script is added to Allowed JARs/Init Scripts. DLT events error: Cluster scoped init script /Volumes/main/...

Latest Reply
ayush007
New Contributor II
  • 0 kudos

@costi9992 I am facing the same issue with a UC-enabled cluster on Databricks Runtime 13.3. I have uploaded the init shell script to a Volume, with that particular init script allowed by the metastore admin, but I get the same error as you stated. When I looked in clus...

5 More Replies
shivam-singh
by New Contributor
  • 1299 Views
  • 1 reply
  • 0 kudos

Databricks-Autoloader-S3-KMS

Hi, I am working on a requirement where I am using Auto Loader in a DLT pipeline to ingest new files as they arrive. This flow is working fine. However, I am facing an issue when the source bucket is an S3 location, since the bucket has SSE-...

Latest Reply
kulkpd
Contributor
  • 0 kudos

Can you please paste the exact errors and check the following if it's related to KMS: 1. The IAM role policy and KMS policy should have allow permissions. 2. Did you use extraConfig while mounting the source S3 bucket? If you have used an IAM role...
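A hedged sketch of the KMS-related settings this reply alludes to. These S3A keys are usually set as cluster Spark config (with a spark.hadoop. prefix) or via the mount's extra configs; the key ARN, bucket, and path are placeholders, and the IAM role / KMS key policy still need to allow the relevant kms: actions:

```python
# Point S3A at SSE-KMS and the CMK used by the source bucket (placeholder ARN).
hconf = spark.sparkContext._jsc.hadoopConfiguration()
hconf.set("fs.s3a.server-side-encryption-algorithm", "SSE-KMS")
hconf.set("fs.s3a.server-side-encryption.key",
          "arn:aws:kms:us-east-1:123456789012:key/<key-id>")

# Auto Loader read from the KMS-encrypted bucket (placeholder path).
df = (spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("s3a://my-sse-kms-bucket/raw/"))
```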

esalohs
by New Contributor III
  • 9952 Views
  • 6 replies
  • 4 kudos

Databricks Autoloader - list only new files in an s3 bucket/directory

I have an S3 bucket with a couple of subdirectories/partitions like s3a://Bucket/dir1/ and s3a://Bucket/dir2/. There are currently millions of files sitting in the bucket across the various subdirectories/partitions. I'm getting new data in near real t...

Latest Reply
kulkpd
Contributor
  • 4 kudos

Below are the options used while performing spark.readStream: .option('cloudFiles.format', 'json').option('cloudFiles.inferColumnTypes', 'true').option('cloudFiles.schemaEvolutionMode', 'rescue').option('cloudFiles.useNotifications', True).option('skipChange...
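Assembled into a runnable sketch (the source path, checkpoint location, and target table are placeholders, and the truncated last option above is left out):

```python
# Auto Loader with file notifications, so new files are discovered without
# re-listing the millions of objects already in the bucket.
df = (spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.inferColumnTypes", "true")
        .option("cloudFiles.schemaEvolutionMode", "rescue")
        .option("cloudFiles.useNotifications", True)
        .load("s3a://Bucket/dir1/"))                 # placeholder source path

(df.writeStream
   .option("checkpointLocation", "s3a://Bucket/_checkpoints/dir1/")  # placeholder checkpoint
   .trigger(availableNow=True)
   .toTable("main.bronze.dir1_events"))              # placeholder target table
```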

5 More Replies
Muhammed
by New Contributor III
  • 27743 Views
  • 13 replies
  • 0 kudos

Filtering files for query

Hi Team, while writing my data to a data lake table I get 'filtering files for query' and the write appears stuck. How can I resolve this issue?

Latest Reply
kulkpd
Contributor
  • 0 kudos

My bad, I saw that somewhere in the screenshot but can't find it now. Which source are you using to load the data: a Delta table, AWS S3, or Azure Storage?

12 More Replies
geetha_venkates
by New Contributor II
  • 12423 Views
  • 7 replies
  • 2 kudos

Resolved! How do we add a certificate file in Databricks for sparksubmit type of job?

How do we add a certificate file in Databricks for a spark-submit type of job?

Latest Reply
nicozambelli
New Contributor II
  • 2 kudos

I have the same problem... when I worked with the hive_metastore in the past, I was able to use the file system and also use API certs. Now I'm using Unity Catalog and I can't upload a certificate; can somebody help me?

6 More Replies
RobinK
by Contributor
  • 17719 Views
  • 5 replies
  • 6 kudos

Resolved! How to set Python rootpath when deploying with DABs

We have structured our code according to the documentation (notebooks-best-practices). We use Jupyter notebooks and have moved logic out into Python modules. Unfortunately, the example described in the documentation only works if you have checked out ...

Latest Reply
Corbin
Databricks Employee
  • 6 kudos

Hello Robin, you'll either have to use wheel files to package your libs and use those (see docs here) to make imports work out of the box, or your entry point file needs to add the bundle root directory (or whatever the lib directory is) to ...
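A hedged sketch of that second option, assuming the entry point file sits two levels below the bundle root and the modules live under src/ (both assumptions):

```python
import os
import sys

# Entry point .py file deployed by the bundle: prepend the bundle root so
# "from src...." imports resolve after `databricks bundle deploy`.
bundle_root = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", ".."))
if bundle_root not in sys.path:
    sys.path.insert(0, bundle_root)

from src.my_module import my_function  # hypothetical module layout
```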

4 More Replies
Kumarashokjmu
by New Contributor II
  • 5309 Views
  • 4 replies
  • 0 kudos

need to ingest millions of csv files from aws s3

I need to ingest millions of CSV files from an AWS S3 bucket. I am running into S3 throttling, and the notebook process runs for 8+ hours and sometimes fails. Looking at cluster performance, it is only 60% utilized. I...

Latest Reply
kulkpd
Contributor
  • 0 kudos

If you want to load all the data at once, use Auto Loader or a DLT pipeline with directory listing if the files are lexically ordered. Or, if you want to perform an incremental load, divide the load into two jobs, a historic data load and a live data load: Live data...
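A hedged sketch of that split with Auto Loader; the cutoff timestamp, paths, and table names are placeholders, and using modifiedBefore/modifiedAfter to draw the boundary is an assumption rather than something stated in the thread:

```python
cutoff = "2024-01-01 00:00:00"  # placeholder boundary between historic and live data

# One-off backfill job over the historic files (directory listing).
historic = (spark.readStream.format("cloudFiles")
              .option("cloudFiles.format", "csv")
              .option("modifiedBefore", cutoff)
              .load("s3a://my-bucket/landing/"))
(historic.writeStream
   .trigger(availableNow=True)  # run once, then let the job finish
   .option("checkpointLocation", "s3a://my-bucket/_chk/backfill/")
   .toTable("main.bronze.events"))

# Continuous job for new arrivals, using notifications to avoid re-listing the bucket.
live = (spark.readStream.format("cloudFiles")
          .option("cloudFiles.format", "csv")
          .option("cloudFiles.useNotifications", True)
          .option("modifiedAfter", cutoff)
          .load("s3a://my-bucket/landing/"))
(live.writeStream
   .option("checkpointLocation", "s3a://my-bucket/_chk/live/")
   .toTable("main.bronze.events"))
```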

3 More Replies
