Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

confused_dev
by New Contributor II
  • 27918 Views
  • 7 replies
  • 5 kudos

Python mocking dbutils in unittests

I am trying to write some unit tests using pytest, but I am running into the problem of how to mock my dbutils method when dbutils isn't defined in my notebook. Is there a way to do this so that I can unit test individual functions that are uti...

Latest Reply
pavlosskev
New Contributor III
  • 5 kudos

Fermin_vicente's answer is pretty good already. Below is how you can do something similar with conftest.py: # conftest.py import pytest from unittest.mock import MagicMock from pyspark.sql import SparkSession @pytest.fixture(scope="session") def dbuti...
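For readers skimming past the truncation, here is a minimal self-contained sketch of that conftest.py approach (the mocked secrets call is a hypothetical example; stub only what your own code touches):

# conftest.py -- a minimal sketch, assuming tests run locally with pyspark and pytest installed
import pytest
from unittest.mock import MagicMock
from pyspark.sql import SparkSession

@pytest.fixture(scope="session")
def spark():
    # Local SparkSession so functions that need Spark can run off-cluster
    return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()

@pytest.fixture(scope="session")
def dbutils():
    # Stand-in for the Databricks-provided dbutils object
    mock = MagicMock()
    mock.secrets.get.return_value = "fake-secret"  # hypothetical: stub what your code calls
    return mock

A test then takes dbutils as a fixture argument and passes it into the function under test, so production code never reaches for the notebook global.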

6 More Replies
hanspetter
by New Contributor III
  • 50156 Views
  • 19 replies
  • 4 kudos

Resolved! Is it possible to get the Job Run ID of a notebook run by dbutils.notebook.run?

When running a notebook using dbutils.notebook.run from a master notebook, a URL to that running notebook is printed, i.e.: Notebook job #223150 Notebook job #223151 Is there any way to capture that Job Run ID (#223150 or #223151)? We have 50 or ...

Latest Reply
Rodrigo_Mohr
New Contributor II
  • 4 kudos

I know this is an old thread, but sharing what is working well for me in Python now, for retrieving the run_id as well and building the entire link to that job run: job_id = dbutils.notebook.entry_point.getDbutils().notebook().getContext().jobId().get...
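For reference, a hedged sketch of that approach; the context tag keys below (jobId, runId, browserHostName) vary across DBR versions, so treat them as assumptions to verify against your own context JSON:

# Runs inside a Databricks notebook, where dbutils is provided by the runtime
import json

ctx = json.loads(
    dbutils.notebook.entry_point.getDbutils().notebook().getContext().toJson()
)
tags = ctx.get("tags", {})
job_id = tags.get("jobId")          # assumed key: the job definition ID
run_id = tags.get("runId")          # assumed key: this particular run's ID
host = tags.get("browserHostName")  # assumed key: workspace hostname

if job_id and run_id and host:
    print(f"https://{host}/#job/{job_id}/run/{run_id}")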

18 More Replies
mjbobak
by New Contributor III
  • 21072 Views
  • 5 replies
  • 9 kudos

Resolved! How to import a helper module that uses databricks specific modules (dbutils)

I have a main Databricks notebook that runs a handful of functions. In this notebook, I import a helper.py file that is in my same repo, and when I execute the import everything looks fine. Inside my helper.py there's a function that leverages built-i...

Latest Reply
amitca71
Contributor II
  • 9 kudos

Hi, I'm facing a similar issue when deploying via dbx. I have a helper notebook that works fine when executed via jobs (without any includes), while when I deploy it via dbx (to the same cluster), the helper notebook fails with dbutils.fs.ls(path) NameEr...
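A common workaround, sketched below, is to stop referencing the dbutils global inside the helper and have callers pass it in (the function and paths are illustrative, not from the thread):

# helper.py -- sketch: accept dbutils as a parameter so the module also
# works when deployed via dbx or jobs, where the notebook global is absent
def copy_raw_files(dbutils, src: str, dst: str):
    # dbutils comes from the caller (the notebook or job entry point)
    for f in dbutils.fs.ls(src):
        dbutils.fs.cp(f.path, f"{dst}/{f.name}")

In the notebook this becomes: import helper; helper.copy_raw_files(dbutils, "/mnt/raw", "/mnt/bronze").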

4 More Replies
GC-James
by Contributor II
  • 13855 Views
  • 15 replies
  • 5 kudos

Resolved! Lost memory when using dbutils

Why does copying a 9 GB file from a container to /dbfs lose me 50 GB of memory? (Which doesn't come back until I restart the cluster.)

Latest Reply
AdrianP
New Contributor II
  • 5 kudos

Hi James, did you get to the bottom of this? We are experiencing the same issue, and all the suggested solutions don't seem to work. Thanks, Adrian

14 More Replies
Jain
by New Contributor III
  • 4663 Views
  • 4 replies
  • 4 kudos

Unable to use dbutils in Premium

I am unable to use dbutils commands, and mkdir etc. also does not work, after upgrading my Databricks workspace from the Standard tier to the Premium tier. It throws the following error: py4j.security.Py4JSecurityException: Constructor public com.databricks.back...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Abhishek Jain​ Hope all is well! Just wanted to check in if you were able to resolve your issue. Would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Than...

3 More Replies
opt
by New Contributor
  • 1125 Views
  • 1 reply
  • 1 kudos

how to execute "Build your Chat Bot with Dolly Demo" in my own VM?

I am trying to execute the Build your Chat Bot with Dolly Demo using my own VM. In the first steps they execute this command: %run ./_resources/00-init $catalog=hive_metastore $db=dbdemos_llm, which is, as I understand, calling another Python script...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @alaa migdady​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

grazie
by Contributor
  • 2259 Views
  • 2 replies
  • 2 kudos

how to get dbutils in Runtime 13

We're using the following method (generated by using dbx) to access dbutils, e.g. to retrieve parameters from secret scopes: @staticmethod def _get_dbutils(spark: SparkSession) -> "dbutils": try: from pyspark.dbutils import...
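For completeness, a reconstruction of the full pattern the excerpt truncates, based on the widely shared dbx template (treat the body as an assumption, not the poster's exact code):

from pyspark.sql import SparkSession

class Task:
    @staticmethod
    def _get_dbutils(spark: SparkSession):
        try:
            from pyspark.dbutils import DBUtils  # available on Databricks clusters
            return DBUtils(spark)
        except ImportError:
            import IPython  # interactive notebooks expose dbutils in the user namespace
            return IPython.get_ipython().user_ns["dbutils"]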

Latest Reply
colt
New Contributor III
  • 2 kudos

We have something similar in our code. It worked on Runtime 13 until last week. The Machine Learning DBR doesn't work either.

1 More Replies
J_M_W
by Contributor
  • 4383 Views
  • 2 replies
  • 3 kudos

Resolved! Can you use %run or dbutils.notebook.run in a Delta Live Table pipeline?

Hi there, can you use %run or dbutils.notebook.run() in a Delta Live Tables (DLT) pipeline? When I try, I get the following error: "IllegalArgumentException: requirement failed: To enable notebook workflows, please upgrade your Databricks subscriptio...

Latest Reply
J_M_W
Contributor
  • 3 kudos

Hi all. @Kaniz Fatma​ thanks for your answer. I am on the premium pricing tier in Azure. After digging around the logs, it would seem that you cannot run magic commands in a Delta Live Tables pipeline. Therefore, you cannot use %run in a DLT pipeline - w...
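Since magic commands are out, the commonly suggested alternative is to factor shared code into a Python module and import it from the DLT source file; a sketch, where the module and table names are hypothetical:

import dlt
from shared import transforms  # hypothetical module replacing the %run include

@dlt.table(name="bronze_events")
def bronze_events():
    # spark is provided by the DLT runtime; source table name is illustrative
    return transforms.clean(spark.read.table("raw_events"))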

1 More Replies
Josh_Stafford
by New Contributor II
  • 1971 Views
  • 2 replies
  • 1 kudos

Using dbutils.fs.ls on URI with square brackets results in error

Square brackets are accepted in ADLS, so why can't I list the files in the folder? I have tried escaping the square brackets manually, but then the escaped values are re-escaped from %5B to %255B and %5D to %255D. I get: URISyntaxException: Illegal ...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

@Joshua Stafford​: The URISyntaxException error you are encountering is likely due to the fact that square brackets are reserved characters in URIs (Uniform Resource Identifiers) and need to be properly encoded when used in a URL. In this case, it ap...
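To illustrate what "properly encoded" means here (whether this survives the re-escaping dbutils applies is exactly what the thread leaves open, so treat it as a sketch of the concept, not a confirmed fix):

from urllib.parse import quote

# Percent-encode the reserved characters once: '[' -> %5B, ']' -> %5D
print(quote("reports [2023]"))  # -> 'reports%20%5B2023%5D'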

1 More Replies
tessaickx
by New Contributor III
  • 2795 Views
  • 3 replies
  • 3 kudos

Using ipywidgets latest versions

Hello everyone, I upgraded my cluster to DBR 13.0, which comes with ipywidgets version 7.7.2 installed. However, I want to use the TagsInput widget, which is new since version 8.0.4. If I upgrade the ipywidgets package to version 8.0.4, none of the widg...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Tessa Ickx​ Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers yo...

2 More Replies
MetaRossiVinli
by Contributor
  • 4158 Views
  • 1 reply
  • 1 kudos

Resolved! Find root path to Repo for .py file import

I want to import a Python function stored in the following file path: `<repo>/lib/lib_helpers.py`. I want to import the function from any file in my repo, for instance from these: `<repo>/notebooks/etl/bronze/dlt_bronze_elt`, `<repo>/workers/job_worker`. It ...

Latest Reply
MetaRossiVinli
Contributor
  • 1 kudos

OK, I figured it out. If you just make the folder a Python package by adding an empty `__init__.py`, Databricks will load it on start. Then you can just import it.
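Sketch of the resulting layout and import; the helper function name is hypothetical:

# Repo layout (sketch):
#   <repo>/lib/__init__.py      <- empty file; makes `lib` a package
#   <repo>/lib/lib_helpers.py
# From any notebook in the repo (the repo root is on sys.path in Databricks Repos):
from lib.lib_helpers import some_helper  # hypothetical function name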

oriole
by New Contributor III
  • 8773 Views
  • 5 replies
  • 2 kudos

Resolved! Spark Driver Crash Writing Large Text

I'm working with a large text variable, working it into single-line JSON that Spark can process beautifully. I'm using a single-node 256 GB, 32-core Standard_E32d_v4 "cluster", which should be plenty of memory for this dataset (I haven't seen cluster memory u...

Latest Reply
pvignesh92
Honored Contributor
  • 2 kudos

@David Toft​ Hi, the current implementation of dbutils.fs is single-threaded: it performs the initial listing on the driver and subsequently launches a Spark job to perform the per-file operations. So I guess the put operation is running on a single cor...
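Given that, one hedged workaround is to let Spark parallelize the write instead of pushing one huge string through dbutils.fs.put; the variable and path below are illustrative:

# Split the large text into lines and let Spark write them in parallel
lines = [(line,) for line in big_text.splitlines()]  # big_text: the large variable
df = spark.createDataFrame(lines, ["value"])
df.write.mode("overwrite").text("dbfs:/tmp/single_line_json")  # hypothetical path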

4 More Replies
GC-James
by Contributor II
  • 3870 Views
  • 6 replies
  • 10 kudos

Disable dbutils suggestion

I would like to turn off or suppress this message, which is returned from the dbutils library when running %r files <- dbutils.fs.ls("/dbfs/tmp/"): "For prettier results from dbutils.fs.ls(<dir>), please use `%fs ls <dir>`". How can I do this?

Latest Reply
Vidula
Honored Contributor
  • 10 kudos

Hi @James Smith​ Hope all is well! Just wanted to check in if you were able to resolve your issue. Would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks...

5 More Replies
sudhanshu1
by New Contributor III
  • 621 Views
  • 0 replies
  • 0 kudos

Structured Streaming

I need a solution for the problem below. We have a set of JSON files that keep arriving in AWS S3; these files contain details for a property. Please note one property can have 10-12 rows in this JSON file. Attached is a sample JSON file. We need to read...
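No replies yet; for anyone hitting the same problem, a minimal Auto Loader sketch for streaming JSON out of S3 (bucket, schema location, checkpoint, and table names are all assumptions):

df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "dbfs:/tmp/property_schema")  # hypothetical
    .load("s3://my-bucket/properties/")                                # hypothetical
)
(
    df.writeStream
    .option("checkpointLocation", "dbfs:/tmp/property_checkpoint")     # hypothetical
    .trigger(availableNow=True)
    .toTable("bronze_properties")                                      # hypothetical
)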

cvantassel
by New Contributor III
  • 6024 Views
  • 7 replies
  • 8 kudos

Is there any way to propagate errors from dbutils?

I have a master notebook that runs a few different notebooks on a schedule using the dbutils.notebook.run() function. Occasionally, these child notebooks will fail (due to API connections or whatever). My issue is that when I attempt to catch the errors ...

Latest Reply
wdphilli
New Contributor III
  • 8 kudos

I have the same issue. I see no reason that Databricks couldn't propagate the internal exception back through their WorkflowException
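Until that changes, the workaround usually suggested is for the child notebook to report its outcome through dbutils.notebook.exit() and for the parent to parse it; the status/message convention below is my own, not a Databricks API:

import json

# Child notebook: exit with a JSON payload instead of letting the exception escape
#   dbutils.notebook.exit(json.dumps({"status": "error", "message": str(e)}))

# Parent notebook: parse the payload and re-raise with the real message
result = json.loads(dbutils.notebook.run("./child", 3600))  # hypothetical path
if result.get("status") == "error":
    raise RuntimeError(f"child notebook failed: {result.get('message')}")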

6 More Replies