Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

jpwp
by New Contributor III
  • 27228 Views
  • 9 replies
  • 9 kudos

Resolved! How to specify entry_point for python_wheel_task?

Can someone provide an example of a python_wheel_task and what the entry_point field should be? The jobs UI help popup says this about "entry_point": "Function to call when starting the wheel, for example: main. If the entry point does not exist in...
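For context, here is a minimal sketch of how the entry_point field typically maps onto the wheel's metadata. The package name my_wheel, module my_wheel.tasks, and function run are hypothetical, and the "packages" entry-point group simply follows the asset bundle template convention rather than anything required by the Jobs API:

```python
# setup.py -- hypothetical package; names are illustrative only.
from setuptools import setup, find_packages

setup(
    name="my_wheel",
    version="0.1.0",
    packages=find_packages(),
    entry_points={
        # The job task would then reference package_name="my_wheel" and
        # entry_point="main", which resolves to my_wheel.tasks:run below.
        "packages": ["main = my_wheel.tasks:run"],
    },
)


# my_wheel/tasks.py -- the function the entry point resolves to.
def run():
    print("Hello from the python_wheel_task entry point")
```

With that wheel attached, the task's python_wheel_task settings would set package_name to my_wheel and entry_point to main.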

Latest Reply
MRMintechGlobal
New Contributor II
  • 9 kudos

Just want to confirm: my project uses PDM, not Poetry, and as such uses [project.entry-points.packages] rather than [tool.poetry.scripts]. The bundle is failing to run on the cluster because it can't find the entry point. Is this expected behavior?

8 More Replies
ggsmith
by Contributor
  • 1821 Views
  • 1 reply
  • 2 kudos

Resolved! DLT create_table vs create_streaming_table

What is the difference between the create_table and create_streaming_table functions in dlt? For example, this is how I have created a table that streams data from Kafka written as JSON files to a volume: @dlt.table( name="raw_orders", table_...

Latest Reply
filipniziol
Esteemed Contributor
  • 2 kudos

Hi @ggsmith, if you check the examples, you will notice that dlt.create_streaming_table is more specialized and you may consider it to be your target. As per the documentation, check this example: https://www.reddit.com/r/databricks/comments/1b9jg3t/dedupin...
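To make the distinction concrete, here is a small sketch under assumed names (raw_orders, orders, order_id, order_ts, and the volume path are all hypothetical): @dlt.table defines a table together with the query that populates it, whereas dlt.create_streaming_table only declares a target streaming table that is then fed by apply_changes or append flows.

```python
import dlt
from pyspark.sql.functions import col

# Source table defined with @dlt.table: the decorated query populates it directly.
@dlt.table(name="raw_orders")
def raw_orders():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/main/bronze/orders_raw")  # hypothetical volume path
    )

# Target declared with create_streaming_table: it has no query of its own and is
# populated by apply_changes, which makes it the natural target for CDC/dedup flows.
dlt.create_streaming_table(name="orders")

dlt.apply_changes(
    target="orders",
    source="raw_orders",
    keys=["order_id"],
    sequence_by=col("order_ts"),
)
```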

Vishalakshi
by New Contributor II
  • 11053 Views
  • 5 replies
  • 0 kudos

Need to automatically rerun failed jobs in Databricks

Hi all, I need to retrigger failed jobs automatically in Databricks. Can you please help me with all the possible ways to make this possible?

Latest Reply
filipniziol
Esteemed Contributor
  • 0 kudos

Hi @Vishalakshi, I responded over the weekend, but it seems the responses were lost. You have the run object here. For example, the current criterion is to return only runs where run["state"]["result_state"] == "FAILED", so basically all failed jobs. W...
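As a rough illustration of that filtering plus an automatic re-run, here is a hedged sketch against the Jobs REST API. The workspace URL, token, and job id are placeholders, and note that per-task retries (max_retries) in the job settings remain the simplest built-in option:

```python
import requests

HOST = "https://<workspace-url>"   # hypothetical workspace URL
TOKEN = "<personal-access-token>"  # hypothetical token
JOB_ID = 123                       # hypothetical job id
headers = {"Authorization": f"Bearer {TOKEN}"}

# List completed runs for the job and keep only the failed ones.
runs = requests.get(
    f"{HOST}/api/2.1/jobs/runs/list",
    headers=headers,
    params={"job_id": JOB_ID, "completed_only": "true"},
).json().get("runs", [])

failed = [r for r in runs if r["state"].get("result_state") == "FAILED"]

# Repair each failed run, re-running only its failed tasks.
for r in failed:
    requests.post(
        f"{HOST}/api/2.1/jobs/runs/repair",
        headers=headers,
        json={"run_id": r["run_id"], "rerun_all_failed_tasks": True},
    )
```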

4 More Replies
cadull
by New Contributor II
  • 1369 Views
  • 2 replies
  • 1 kudos

Permission Issue with IDENTIFIER clause

Hi all, we are parameterizing environment-specific catalog names (like `mycatalog_dev` vs. `mycatalog_prd`) in Lakeview dashboard queries like this: SELECT * FROM IDENTIFIER(:catalog_name || '.myschema.mytable') which works fine in most cases. We have o...
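For reference, the same parameterized lookup can be sketched outside the dashboard with spark.sql named parameters; the catalog, schema, and table names here are hypothetical:

```python
# Parameterized catalog name resolved through IDENTIFIER (names are hypothetical).
df = spark.sql(
    "SELECT * FROM IDENTIFIER(:catalog_name || '.myschema.mytable')",
    args={"catalog_name": "mycatalog_dev"},
)
df.show()
```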

Latest Reply
madams
Contributor II
  • 1 kudos

I've had quite a bit of fun with UC and view permissions. I don't think this is specific to using the IDENTIFIER() function, but I suspect it's related to UC permissions. What you'll need to ensure: the user or group who owns the view on catalog_b h...

1 More Reply
angel_ba
by New Contributor II
  • 4663 Views
  • 2 replies
  • 2 kudos

File trigger using Azure file share in Unity Catalog

Hello, I have Unity Catalog enabled in my workspace. The files are manually copied by customers to an Azure file share (domain-joined account, wasb) on an ad hoc basis. I would like to add a file trigger on the job so that as soon as a file arrives in t...

Latest Reply
adriennn
Valued Contributor
  • 2 kudos

@Diego33 Kaniz is half-bot, half-human, but unfortunately not gracing us with "sorry for the confusion" responses. After a quick search, I thought there might be a possibility to use the web terminal and do a manual mount with a bash script t...

1 More Reply
stevenayers-bge
by Contributor
  • 1922 Views
  • 2 replies
  • 3 kudos

DBUtils from databricks-connect and the runtime are quite different libraries...

If you find yourself using dbutils in any of your code, and you're testing locally vs. running on a cluster, there are a few gotchas to be very careful of when it comes to listing files in Volumes or files on DBFS. The DBUtils you'll use locally, installe...
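A minimal sketch of the two ways dbutils is usually obtained, to make the comparison concrete; the volume path is hypothetical, and exactly how the returned objects differ (for example, the shape of fs.ls results) should be verified against your databricks-connect and SDK versions:

```python
try:
    dbutils  # On a cluster, dbutils is injected into the notebook's global scope.
except NameError:
    # Locally (databricks-connect / databricks-sdk), build one from a WorkspaceClient.
    from databricks.sdk import WorkspaceClient
    dbutils = WorkspaceClient().dbutils

# Listing files in a Unity Catalog volume (hypothetical path).
for entry in dbutils.fs.ls("/Volumes/main/bronze/landing"):
    print(entry.path, entry.size)
```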

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 3 kudos

Hi @stevenayers-bge, thanks for sharing. I didn't know that these interfaces aren't aligned with each other.

1 More Reply
tdk
by New Contributor III
  • 1558 Views
  • 2 replies
  • 0 kudos

Resolved! Cannot install jar to cluster: invalid authority.

Hi all, I want to access on-prem Oracle Database data from Python notebooks. However, installing the jar (ojdbc8.jar) results in an error, which occurs while the cluster is starting up. The error message: "Library installation attempted on the dr...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

The error message suggests that the jar file located at abfss:/jars/ojdbc8.jar has an invalid authority. This could be due to a number of reasons, such as an incorrect file path, insufficient permissions, or network restrictions. Here are a few steps you...
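One common cause of that message is an abfss URI without a container@account authority, so the sketch below shows the expected URI shape plus a JDBC read once the driver is installed; the storage account, Oracle host, and credentials are hypothetical:

```python
# An abfss library path needs a full authority (container@storageaccount), e.g.:
#   abfss://libs@mystorageacct.dfs.core.windows.net/jars/ojdbc8.jar
# Once ojdbc8.jar is installed on the cluster, an on-prem Oracle table can be read:
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:oracle:thin:@//oracle-host:1521/ORCLPDB1")
    .option("dbtable", "MYSCHEMA.MYTABLE")
    .option("user", "my_user")
    .option("password", "my_password")
    .option("driver", "oracle.jdbc.driver.OracleDriver")
    .load()
)
```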

1 More Reply
juan_barreto
by New Contributor III
  • 2636 Views
  • 6 replies
  • 9 kudos

Problem with dropDuplicates in Databricks Runtime 15.4 LTS

Hi, I'm testing the latest version of the Databricks runtime but I'm getting errors doing a simple dropDuplicates. Using the following code: data = spark.read.table("some_table") data.dropDuplicates(subset=['SOME_COLUMN']).count() I'm getting this error...

juan_barreto_0-1726153266526.png
Latest Reply
Witold
Honored Contributor
  • 9 kudos

Unless it was communicated as a breaking change between major updates, it would be OK. But I can't find anything in the release notes, so it's a bug.
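Until the regression is resolved, one hedged workaround worth trying is passing the column list positionally instead of via the subset= keyword, since the two forms exercise slightly different argument handling; the table and column names below come from the post:

```python
data = spark.read.table("some_table")

# Positional form of dropDuplicates as a possible workaround for the keyword form.
deduped = data.dropDuplicates(["SOME_COLUMN"])
print(deduped.count())
```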

5 More Replies
ossinova
by Contributor II
  • 3116 Views
  • 3 replies
  • 2 kudos

Reading data from S3 in Azure Databricks

Is it possible to create an external volume in Azure Databricks that points to an external S3 bucket so that I can read files for processing? Or is it limited to ADLS Gen2?

Latest Reply
Ashley1
Contributor
  • 2 kudos

Yep, I'm keen to see this functionality as well. I think it is reasonable to expect that external locations can be on diverse storage types (at least the big players). I can nicely control access to Azure storage in UC but not S3.
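Until S3 external locations are supported on Azure workspaces, direct s3a access outside Unity Catalog governance is a possible stopgap. This is only a sketch: the bucket and secret names are hypothetical, and depending on the runtime these settings may need to go into the cluster's Spark config as spark.hadoop.fs.s3a.* instead:

```python
# Read AWS keys from a secret scope rather than hard-coding them (names hypothetical).
access_key = dbutils.secrets.get(scope="my_scope", key="aws_access_key_id")
secret_key = dbutils.secrets.get(scope="my_scope", key="aws_secret_access_key")

spark.conf.set("fs.s3a.access.key", access_key)
spark.conf.set("fs.s3a.secret.key", secret_key)

# Plain s3a read, bypassing Unity Catalog volumes/external locations.
df = spark.read.json("s3a://my-bucket/landing/orders/")
display(df)
```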

2 More Replies
del1000
by New Contributor III
  • 21475 Views
  • 8 replies
  • 3 kudos

Resolved! Is it possible to pass a job's parameters through to variables?

Scenario: I tried to run notebook_primary as a job with the same parameters map. This notebook is the orchestrator for notebooks_sec_1, notebooks_sec_2, notebooks_sec_3, and so on. I run them with the dbutils.notebook.run(path, timeout, arguments) function. So ho...
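One straightforward pattern, sketched here with hypothetical parameter and notebook names, is to read the job parameters as widgets in notebook_primary and forward them in the arguments map of each dbutils.notebook.run call:

```python
# In notebook_primary: job parameters arrive as widgets.
run_date = dbutils.widgets.get("run_date")
env = dbutils.widgets.get("env")

# Forward the same values to the child notebooks.
args = {"run_date": run_date, "env": env}
dbutils.notebook.run("./notebooks_sec_1", 3600, args)
dbutils.notebook.run("./notebooks_sec_2", 3600, args)
dbutils.notebook.run("./notebooks_sec_3", 3600, args)
```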

Latest Reply
nnalla
New Contributor II
  • 3 kudos

I am using getCurrentBindings(), but it returns an empty dictionary even though I passed parameters. I am running it in a scheduled workflow job.

7 More Replies
rendorHaevyn
by New Contributor III
  • 12676 Views
  • 5 replies
  • 0 kudos

Databricks SQL Warehouse did not auto stop after the specified 90-minute interval - why not?

In this specific case, we're running a 2X-Small SQL Warehouse on Databricks SQL. Looking at the SQL Warehouse monitoring log for this cluster, we noticed: the final query executed by a user at 10:26 on 2023-06-20; no activity for some time, yet the cluster remai...

Latest Reply
jfid
New Contributor II
  • 0 kudos

Also dealing with the same issue! Does anybody have any idea how to check it? There are no logs of any sort and no actual query happens.

4 More Replies
VeeruK
by New Contributor III
  • 3329 Views
  • 7 replies
  • 0 kudos

Databricks Lakehouse Fundamentals Badge

I have successfully passed the test after completing the course "Databricks Lakehouse Fundamentals", but I haven't received any badge; I have been provided with a certificate only. Please provide me with th...

Latest Reply
data_learner
New Contributor II
  • 0 kudos

I'm having the same issue

6 More Replies
Sangram
by New Contributor III
  • 4774 Views
  • 4 replies
  • 2 kudos

Turn on full screen for Databricks training videos

It seems the full-screen option for Databricks training videos is turned off. How do I turn it on?

Latest Reply
bennner
New Contributor II
  • 2 kudos

It sounds like the full-screen option is disabled by the platform hosting the Databricks training videos. If that's the case, it may be out of your control. However, you could try these workarounds: Browser Zoom: Use the zoom feature (Ctrl + "+" on Wi...

3 More Replies
Mario_D
by New Contributor III
  • 1723 Views
  • 2 replies
  • 1 kudos

Resolved! Foreign key constraint in a dlt pipeline

As primary/foreign key constraints are now supported/available in Databricks, how are foreign key constraints handled in a DLT pipeline? I.e., if a foreign key constraint is violated, is the record logged as a data quality issue and still added to the ...

Latest Reply
RCo
New Contributor III
  • 1 kudos

Hi @Mario_D! While primary & foreign key constraints are generally available in Databricks Runtime 15.2 and above, they are strictly informational only. This means that a primary key will not prevent duplicates from being added to a table and a foreign...
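If the goal is to actually reject or flag violating rows, a common pattern is to validate the key with an expectation instead of relying on the informational constraint. This is only a sketch, and the table and column names are hypothetical:

```python
import dlt

@dlt.table(name="silver_orders")
@dlt.expect_or_drop("valid_customer_fk", "customer_key IS NOT NULL")
def silver_orders():
    orders = dlt.read("bronze_orders")
    customers = dlt.read("dim_customers").selectExpr("customer_id AS customer_key")
    # Left join against the dimension: rows with no matching customer get a NULL
    # customer_key and are dropped (and logged) by the expectation above.
    return orders.join(customers, orders.customer_id == customers.customer_key, "left")
```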

1 More Reply
