Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

PetitLepton
by New Contributor II
  • 8294 Views
  • 1 reply
  • 0 kudos

List parameter in Python SQL connector 3.0.1

Hi, up until recently, with version 2.9.3 of the Python SQL connector, I was using a list as a parameter in the cursor.execute(operation, parameters) method without any trouble. It seems that this is no longer possible in version 3.0.1, as the parsing of par...

Latest Reply
PetitLepton
New Contributor II
  • 0 kudos

I should have read the documentation more carefully: https://github.com/databricks/databricks-sql-python/blob/v3.0.0/docs/parameters.md

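For reference, a minimal sketch of the workaround that documentation points at: connector 3.x native parameter binding no longer accepts a whole list as one value, so a common approach is to expand the list into individual named parameters. The helper name and the `:name` marker style below are illustrative.

```python
def expand_in_clause(column, values):
    """Build an `IN (...)` SQL fragment plus the matching parameter dict."""
    names = [f"p{i}" for i in range(len(values))]
    fragment = f"{column} IN (" + ", ".join(f":{n}" for n in names) + ")"
    params = dict(zip(names, values))
    return fragment, params

fragment, params = expand_in_clause("id", [1, 2, 3])
# fragment -> "id IN (:p0, :p1, :p2)"
# then: cursor.execute(f"SELECT * FROM t WHERE {fragment}", params)
```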
dcardenas
by New Contributor
  • 781 Views
  • 0 replies
  • 0 kudos

Retrieving Logs with Job API Get-outputs service

Hello, I would like to retrieve the logs of some jobs that were launched using the Jobs REST API 2.0. I see in the docs that this can be done with the get-output service; however, each time I call the service I just get the metadata part of the response but ...

ken2
by New Contributor II
  • 2577 Views
  • 3 replies
  • 0 kudos

How to convert entity_id to notebook name or job

Hi, Databricks developers! I use system.access.table_lineage, referring to this page. It's difficult for us to recognize which notebook is indicated by the entity_id. How do I get a table to convert entity_ids to job names or notebook names?

Latest Reply
mlamairesse
Databricks Employee
  • 0 kudos

Workflows system tables are coming very soon. 

2 More Replies
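Once those workflow system tables ship, the mapping the question asks for becomes a join. A hedged sketch; `system.lakeflow.jobs` and its column names are assumptions to verify against your workspace.

```python
# Hedged sketch: resolve JOB entity_ids in table_lineage to job names.
LINEAGE_TO_JOBS_SQL = """
SELECT l.entity_id,
       j.name AS job_name,
       l.target_table_full_name
FROM system.access.table_lineage AS l
LEFT JOIN system.lakeflow.jobs AS j
  ON l.entity_type = 'JOB'
 AND l.entity_id = j.job_id
"""
# In a notebook: display(spark.sql(LINEAGE_TO_JOBS_SQL))
```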
cg3
by New Contributor
  • 791 Views
  • 0 replies
  • 0 kudos

Define VIEW in Databricks Asset Bundles?

Is it possible to define a Unity Catalog VIEW in a Databricks Asset Bundle, or specify in the bundle that a specific notebook gets run once per deployment?

Kishan1003
by New Contributor
  • 3517 Views
  • 1 reply
  • 0 kudos

Merge Operation is very slow for S/4 Table ACDOCA

Hello, we have a scenario in Databricks where every day we get 60-70 million records, and it takes a lot of time to merge the data into the 28 billion records already sitting there. The time taken to rewrite the affected files is too ...

Latest Reply
177991
New Contributor II
  • 0 kudos

Hi @Kishan1003, did you find anything helpful? I'm dealing with a similar situation; the ACDOCA table on my side is around 300M records (fairly smaller), and the incoming daily data is usually around 1M. I have tried partitioning by period, like the fiscyearper column, zo...

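One technique both posts circle around is pruning the MERGE so only files in the touched partitions are rewritten, by putting the partition column into the ON clause with literal values. A hedged sketch using the fiscyearper column mentioned above; the belnr join key is a placeholder.

```python
def merge_condition(periods):
    """Build a MERGE ON clause restricted to the periods present in the
    incoming batch, so Delta can skip untouched files."""
    in_list = ", ".join(f"'{p}'" for p in sorted(set(periods)))
    return (f"t.fiscyearper IN ({in_list}) "
            "AND t.fiscyearper = s.fiscyearper "
            "AND t.belnr = s.belnr")

cond = merge_condition(["2024001", "2024002"])
# Usage (not executed here):
# target.alias("t").merge(updates.alias("s"), cond) \
#       .whenMatchedUpdateAll().whenNotMatchedInsertAll().execute()
```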
costi9992
by New Contributor III
  • 6100 Views
  • 6 replies
  • 0 kudos

Resolved! Add policy init_scripts.*.volumes.destination for dlt not working

Hi, I tried to create a policy to use for DLT pipelines that run on shared clusters, but when I run the DLT pipeline with this policy I get an error. The init script is added to Allowed JARs/Init Scripts. DLT events error: Cluster-scoped init script /Volumes/main/...

Latest Reply
ayush007
New Contributor II
  • 0 kudos

@costi9992 I am facing the same issue with a UC-enabled cluster on Databricks Runtime 13.3. I have uploaded the init shell script to a Volume, with that particular init script allowed by the metastore admin, but I get the same error as you stated. When I looked in the clus...

5 More Replies
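For comparison, a minimal sketch of the policy rule shape involved; the Volume path is a placeholder and the exact attribute path and index should be checked against the cluster policy documentation.

```json
{
  "init_scripts.0.volumes.destination": {
    "type": "fixed",
    "value": "/Volumes/main/default/scripts/init.sh"
  }
}
```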
shivam-singh
by New Contributor
  • 1226 Views
  • 1 replies
  • 0 kudos

Databricks-Autoloader-S3-KMS

Hi, I am working on a requirement where I am using Auto Loader in a DLT pipeline to ingest new files as they arrive. This flow is working fine. However, I am facing an issue when the source bucket is an S3 location, since the bucket has SSE-...

Latest Reply
kulkpd
Contributor
  • 0 kudos

Can you please paste the exact errors and check the following, in case it is related to KMS:
1. The IAM role policy and the KMS key policy should both have allow permissions.
2. Did you use extraConfigs while mounting the source S3 bucket? If you have used an IAM role...

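A hedged sketch of the kind of encryption settings this reply alludes to for an SSE-KMS source bucket; the property names are the common s3a ones (they can vary by runtime), and the key ARN is a placeholder.

```python
# Spark/Hadoop settings for reading/writing an SSE-KMS encrypted bucket.
kms_conf = {
    "fs.s3a.server-side-encryption-algorithm": "SSE-KMS",
    "fs.s3a.server-side-encryption.key":
        "arn:aws:kms:us-east-1:123456789012:key/EXAMPLE-KEY-ID",
}
# Usage (not executed here):
# for k, v in kms_conf.items():
#     spark.conf.set(k, v)
```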
esalohs
by New Contributor III
  • 9684 Views
  • 6 replies
  • 4 kudos

Databricks Autoloader - list only new files in an s3 bucket/directory

I have an s3 bucket with a couple of subdirectories/partitions like s3a://Bucket/dir1/ and s3a://Bucket/dir2/. There are currently millions of files sitting in the bucket in the various subdirectories/partitions. I'm getting new data in near real t...

Latest Reply
kulkpd
Contributor
  • 4 kudos

The options below are used while performing spark.readStream:
.option('cloudFiles.format', 'json')
.option('cloudFiles.inferColumnTypes', 'true')
.option('cloudFiles.schemaEvolutionMode', 'rescue')
.option('cloudFiles.useNotifications', True)
.option('skipChange...

5 More Replies
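As a complement to the truncated reply above, a hedged sketch of an Auto Loader stream in file-notification mode, which discovers new files from bucket events instead of re-listing millions of objects; bucket path and the exact option set are placeholders.

```python
# Auto Loader options for a notification-driven stream.
autoloader_options = {
    "cloudFiles.format": "json",
    "cloudFiles.inferColumnTypes": "true",
    "cloudFiles.schemaEvolutionMode": "rescue",
    # discover new files from SQS/SNS events instead of listing the bucket
    "cloudFiles.useNotifications": "true",
}
# Usage (not executed here):
# df = (spark.readStream.format("cloudFiles")
#         .options(**autoloader_options)
#         .load("s3a://Bucket/dir1/"))
```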
Muhammed
by New Contributor III
  • 26755 Views
  • 13 replies
  • 0 kudos

Filtering files for query

Hi Team, while writing my data to a data lake table I am getting 'filtering files for query', and the write gets stuck there. How can I resolve this issue?

Latest Reply
kulkpd
Contributor
  • 0 kudos

My bad, somewhere in the screenshot I saw that, but I'm not able to find it now. Which source are you using to load the data: a Delta table, AWS S3, or Azure Storage?

12 More Replies
geetha_venkates
by New Contributor II
  • 12164 Views
  • 7 replies
  • 2 kudos

Resolved! How do we add a certificate file in Databricks for sparksubmit type of job?

How do we add a certificate file in Databricks for sparksubmit type of job? 

Latest Reply
nicozambelli
New Contributor II
  • 2 kudos

I have the same problem... when I worked with the hive_metastore in the past, I was able to use the file system and also use API certs. Now I'm using Unity Catalog and I can't upload a certificate. Can somebody help me?

6 More Replies
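One workaround pattern for UC setups, sketched with placeholder paths: keep the certificate on a Volume and have the job append it to the PEM bundle its Python HTTP client trusts (for JVM-side trust, an init script running keytool would be the analogue).

```python
def append_ca(ca_path, bundle_path):
    """Append a custom CA certificate (PEM) to an existing PEM bundle."""
    with open(ca_path) as ca, open(bundle_path, "a") as bundle:
        bundle.write("\n" + ca.read())

# Usage (paths are placeholders; certifi.where() if using requests):
# append_ca("/Volumes/main/default/certs/my_ca.pem", certifi.where())
```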
RobinK
by Contributor
  • 17306 Views
  • 5 replies
  • 6 kudos

Resolved! How to set Python rootpath when deploying with DABs

We have structured our code according to the documentation (notebooks-best-practices). We use Jupyter notebooks and have outsourced logic to Python modules. Unfortunately, the example described in the documentation only works if you have checked out ...

Latest Reply
Corbin
Databricks Employee
  • 6 kudos

Hello Robin, you'll have to either use wheel files to package your libs and use those (see docs here) to make imports work out of the box, or have your entry point file add the bundle root directory (or whatever the lib directory is) to ...

4 More Replies
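The entry-point fix from the reply can be sketched like this; the deployed bundle root path in the usage comment is a placeholder.

```python
import os
import sys

def add_to_syspath(root):
    """Prepend a directory (e.g. the deployed bundle root) to sys.path
    so `from src.mymodule import ...` resolves the same way locally
    and on the cluster."""
    root = os.path.abspath(root)
    if root not in sys.path:
        sys.path.insert(0, root)
    return root

# add_to_syspath("/Workspace/Users/me@example.com/.bundle/myproj/dev/files")
```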
Kumarashokjmu
by New Contributor II
  • 5096 Views
  • 4 replies
  • 0 kudos

need to ingest millions of csv files from aws s3

I need to ingest millions of CSV files from an AWS S3 bucket. I am facing an S3 throttling issue, and besides that, the notebook process runs for 8+ hours and sometimes fails. Looking at cluster performance, it is only 60% utilized. I...

Latest Reply
kulkpd
Contributor
  • 0 kudos

If you want to load all the data at once, use Auto Loader or a DLT pipeline with directory listing, if the files are lexically ordered. Or, if you want to perform an incremental load, divide the load into two jobs, a historic data load vs. a live data load: Live data...

3 More Replies
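The historic-vs-live split in the reply could look roughly like this; option values, bucket paths, and table names are placeholders to tune.

```python
# Backfill job: bounded micro-batches over the historical prefix.
backfill_options = {
    "cloudFiles.format": "csv",
    "cloudFiles.maxBytesPerTrigger": "10g",  # cap each micro-batch
}
# Live job: notification-driven, avoids repeated full listings of the bucket.
live_options = {
    "cloudFiles.format": "csv",
    "cloudFiles.useNotifications": "true",
}
# Usage (not executed here):
# (spark.readStream.format("cloudFiles").options(**backfill_options)
#    .load("s3://my-bucket/history/")
#    .writeStream.trigger(availableNow=True)
#    .option("checkpointLocation", "s3://my-bucket/checkpoints/history")
#    .toTable("raw.history"))
```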
leelee3000
by Databricks Employee
  • 1118 Views
  • 0 replies
  • 0 kudos

Dynamic Filtering Criteria for Data Streaming

One of the potential uses for DLT is a scenario where I have a large input stream of data and need to create multiple smaller streams based on dynamic and adjustable filtering criteria. The challenge is to allow non-engineering individuals to adjust ...

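A hedged sketch of one way to do this: keep the filter predicates in a plain mapping that non-engineers can edit (it could equally live in a config table), and generate one DLT table per entry. Table and column names are placeholders, and the dlt module only resolves inside a pipeline.

```python
# Editable filter criteria: table name -> SQL predicate.
FILTERS = {
    "orders_us": "country = 'US'",
    "orders_eu": "country IN ('DE', 'FR')",
}

def register_filtered_tables(dlt, source="raw_orders"):
    """Generate one DLT table per filter entry from the shared stream."""
    for name, predicate in FILTERS.items():
        @dlt.table(name=name)
        def _filtered(predicate=predicate):  # default arg binds per iteration
            return dlt.read_stream(source).where(predicate)
```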
leelee3000
by Databricks Employee
  • 1687 Views
  • 0 replies
  • 0 kudos

Parameterizing DLT Jobs

I have observed the use of advanced configuration and creating a map as a way to parameterize notebooks, but these appear to be cluster-wide settings. Is there a recommended best practice for directly passing parameters to notebooks running on a DLT ...

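A hedged sketch of the usual pattern: put key/value pairs in the pipeline's `configuration` block and read them through spark.conf in the notebook. The key name is a placeholder, and the helper accepts any mapping-like conf so it can be exercised anywhere.

```python
def get_param(conf, key, default=None):
    """Read a pipeline parameter; conf is spark.conf inside a DLT
    notebook (dict-like here for illustration)."""
    value = conf.get(key)
    return default if value is None else value

# pipeline settings: "configuration": {"mypipeline.input_path": "s3://my-bucket/in"}
# in the notebook:   path = get_param(spark.conf, "mypipeline.input_path")
```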
Geoff
by New Contributor II
  • 1740 Views
  • 0 replies
  • 1 kudos

Bizarre Delta Tables pipeline error: ModuleNotFound

I received the following error when trying to import a function defined in a .py file into a .ipynb file. I would add code blocks, but the message keeps getting rejected for invalid HTML.

# test_lib.py (same directory, in a subfolder)
def square(x):
    ret...

