Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

FG
by New Contributor II
  • 8755 Views
  • 5 replies
  • 1 kudos

Running unit tests from a different notebook (using Python unittest package) doesn't produce output (can't discover the test files)

I have a test file (test_transforms.py) with a series of tests written using Python's unittest package. I can successfully run the tests inside the file with the expected output. But when I try to run this test file from a different notebook (run...

Latest Reply
SpaceDC
New Contributor II
  • 1 kudos

Hello, I have exactly the same issue. In my case, using the ipytest library from Databricks clusters, this is the error that occurs when I try to run the tests: EEEEE [100%]============================================== ERRORS =========================...

4 More Replies
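A minimal sketch of one common fix: import the test module explicitly and build the suite with a TestLoader, rather than relying on unittest's file-based discovery, which can fail when the driver notebook's working directory differs from the test file's location. The workspace path and module name below are illustrative.

```python
import sys
import unittest

# Hypothetical workspace path holding test_transforms.py; adjust to your layout.
sys.path.append("/Workspace/Repos/my_repo/tests")

import test_transforms  # module containing the unittest.TestCase classes

# Build the suite from the imported module instead of discover(), which
# depends on the working directory and file naming conventions.
suite = unittest.TestLoader().loadTestsFromModule(test_transforms)

# Stream results to stdout so the driver notebook shows the test output.
runner = unittest.TextTestRunner(stream=sys.stdout, verbosity=2)
runner.run(suite)
```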
epps
by New Contributor
  • 1490 Views
  • 1 replies
  • 0 kudos

400 Unable to load OAuth Config

I've enabled SSO for my Databricks account with Okta as the identity provider and tested that the integration is working. I'm now trying to implement an on-behalf-of token exchange so that my API can make authenticated requests to Databricks's API (e.g. ) ...

Latest Reply
riyadh-ruhr
New Contributor II
  • 0 kudos

Hello, were you able to fix the issue? I'm trying to implement the same thing.

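For anyone attempting the same thing, here is a rough sketch of an RFC 8693 on-behalf-of token exchange using the requests library. The endpoint path, token type, and scope below are assumptions to verify against the Databricks token federation documentation, not a confirmed contract; a 400 "Unable to load OAuth Config" typically points at a mismatch between this request and the workspace's federation configuration.

```python
import requests

WORKSPACE_URL = "https://<your-workspace>.cloud.databricks.com"  # placeholder
okta_token = "<JWT obtained from Okta on behalf of the user>"    # placeholder

resp = requests.post(
    f"{WORKSPACE_URL}/oidc/v1/token",  # assumed token endpoint path
    data={
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": okta_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
        "scope": "all-apis",
    },
)
resp.raise_for_status()
databricks_token = resp.json()["access_token"]
```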
JUPin
by New Contributor II
  • 1304 Views
  • 3 replies
  • 0 kudos

REST API for Pipeline Events does not return all records

I'm using the REST API to retrieve Pipeline Events per the documentation: https://docs.databricks.com/api/workspace/pipelines/listpipelineevents I am able to retrieve some records but the API stops after a call or two. I verified the number of rows us...

Latest Reply
wise_owl
New Contributor III
  • 0 kudos

You can leverage this code base; it works as expected using the "next_page_token" parameter. Don't forget to mark this solution as correct if it helped you. import requests token = 'your token' url = 'your URL' params = {'expand_tasks': 'true'} header...

2 More Replies
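A sketch of the full pagination loop, following the listpipelineevents endpoint linked above: keep requesting pages, passing each response's next_page_token back as page_token, until the API stops returning one. Host, token, and pipeline ID are placeholders.

```python
import requests

HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal access token>"                        # placeholder
PIPELINE_ID = "<pipeline id>"                            # placeholder

url = f"{HOST}/api/2.0/pipelines/{PIPELINE_ID}/events"
headers = {"Authorization": f"Bearer {TOKEN}"}

events, params = [], {"max_results": 100}
while True:
    resp = requests.get(url, headers=headers, params=params)
    resp.raise_for_status()
    payload = resp.json()
    events.extend(payload.get("events", []))
    # The API returns one page per call; without this loop it looks like
    # records stop arriving after "a call or two".
    next_token = payload.get("next_page_token")
    if not next_token:
        break
    params = {"page_token": next_token}

print(f"Retrieved {len(events)} events")
```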
himanshu_k
by New Contributor
  • 1464 Views
  • 1 replies
  • 0 kudos

Clarification Needed: Ensuring Correct Pagination with Offset and Limit in PySpark

Hi community, I hope you're all doing well. I'm currently engaged in a PySpark project where I'm implementing pagination-like functionality using the offset and limit functions. My aim is to retrieve data between a specified starting_index and ending_...

Latest Reply
wise_owl
New Contributor III
  • 0 kudos

You can leverage this code base; it works as expected using the "next_page_token" parameter. Don't forget to mark this solution as correct if it helped you. import requests token = 'your token' url = 'your URL' params = {'expand_tasks': 'true'} header...

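For the DataFrame side of the question, a minimal sketch of offset/limit paging (DataFrame.offset requires Spark 3.4+): note that without an explicit orderBy the row order, and therefore each page's contents, is not deterministic. Table and column names are illustrative.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Deterministic ordering is essential: offset/limit over an unordered
# DataFrame can return overlapping or missing rows across pages.
ordered = spark.table("my_catalog.my_schema.my_table").orderBy("id")

def page(starting_index: int, ending_index: int):
    """Rows [starting_index, ending_index) of the ordered DataFrame."""
    return ordered.offset(starting_index).limit(ending_index - starting_index)

first_page = page(0, 100)
second_page = page(100, 200)
```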
tingwei
by New Contributor II
  • 2452 Views
  • 4 replies
  • 5 kudos

ISOLATION_STARTUP_FAILURE

Hi, I'm getting an error in my data pipeline: [ISOLATION_STARTUP_FAILURE] Failed to start isolated execution environment. Please contact Databricks support. SQLSTATE: XXKSS. It was working fine and suddenly it keeps failing. Please advise.

Latest Reply
SP2
New Contributor II
  • 5 kudos

Hello Team, I'm unable to run a UDF using this DBR. Has the issue been fixed?

3 More Replies
Jaku6
by New Contributor
  • 582 Views
  • 1 replies
  • 1 kudos

Run now with different parameters doesn't pass parameter to pipeline tasks

I have a job with some tasks. Some of the tasks are pipeline_tasks, some are notebook_tasks. When I run the job with "Run now with different parameters" and enter a new key-value, I see that the key-value is available in the notebook_tasks with dbut...

Latest Reply
Walter_C
Honored Contributor
  • 1 kudos

As per the docs, it seems that the pipeline task type currently does not support passing parameters: https://docs.databricks.com/en/jobs/create-run-jobs.html#pass-parameters-to-a-databricks-job-task You could create a notebook task that runs before your pipeli...

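A hedged sketch of that workaround: a notebook_task that runs before the pipeline_task, captures the job parameter, and persists it somewhere the pipeline can read it back. The widget name, schema, and table are illustrative, and `spark`/`dbutils` are assumed to be the objects predefined in a Databricks notebook.

```python
# --- notebook task, runs before the pipeline task ---
my_param = dbutils.widgets.get("my_param")  # job parameters do reach notebook tasks

spark.sql("CREATE TABLE IF NOT EXISTS ops.job_params (key STRING, value STRING)")
spark.sql(f"""
    MERGE INTO ops.job_params t
    USING (SELECT 'my_param' AS key, '{my_param}' AS value) s
    ON t.key = s.key
    WHEN MATCHED THEN UPDATE SET t.value = s.value
    WHEN NOT MATCHED THEN INSERT (key, value) VALUES (s.key, s.value)
""")

# --- inside the pipeline, read the value back ---
# param = (spark.table("ops.job_params")
#          .filter("key = 'my_param'").first()["value"])
```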
Elderion
by New Contributor II
  • 404 Views
  • 3 replies
  • 1 kudos

Resolved! Delta Live Tables + Databricks Assets Bundles

Hi, I'm trying to set up a CI/CD pipeline for Delta Live Tables jobs using Databricks Asset Bundles. I have a problem with the path to the notebook in the pipeline. According to this example, https://docs.databricks.com/en/delta-live-tables/tutorial-bundles.html, the YAML file sho...

Latest Reply
ThierryBa
New Contributor III
  • 1 kudos

I had this error once. You need to specify the extension of your file. If you set the notebook to be Python, then it must end in .py, likewise .sql if you used SQL: libraries: - notebook: path: ${workspace.file_path}/datab...

2 More Replies
Kurtis_R
by New Contributor II
  • 191 Views
  • 2 replies
  • 0 kudos

Excel Formula results

Hi all, just wanted to raise a question regarding Databricks workbooks and viewing the results in the cells. For the example provided in the screenshot, I want to view the results of an Excel formula that has been applied to a cell in our workbooks. Fo...

Latest Reply
User16756723392
New Contributor III
  • 0 kudos

@Kurtis_R, do you want to display the value of 45, or the formula by which 45 is achieved?

1 More Replies
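If the goal is the computed value rather than the formula text, a small openpyxl sketch illustrates the distinction: data_only=True returns the value Excel cached at the last save, data_only=False returns the formula string. The file path and cell are illustrative, and the cached value exists only if the file was last saved by Excel itself.

```python
from openpyxl import load_workbook

path = "/dbfs/FileStore/workbook.xlsx"  # placeholder path

wb_values = load_workbook(path, data_only=True)     # cached computed values
wb_formulas = load_workbook(path, data_only=False)  # raw formula strings

cell = "B2"  # placeholder cell
print("computed value:", wb_values.active[cell].value)  # e.g. 45
print("formula:", wb_formulas.active[cell].value)       # e.g. =SUM(B1:B2)
```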
Sharmila_12
by New Contributor
  • 188 Views
  • 1 replies
  • 0 kudos

I don't have a last name. What should I give in the mandatory last name field?

Hi, I was about to register for the Databricks Certified Data Engineer Associate exam. While registering for the exam, it asks for a last name, which is a mandatory field. But none of my government ID documents have a last name, only a first name. Wha...

Latest Reply
Anushree_Tatode
  • 0 kudos

Hi, to proceed with the registration, please enter a space or a full stop in the last name field. This should allow you to continue with the process. Feel free to reach out if you need any further assistance. Best Regards, Anushree

Enrique1987
by New Contributor III
  • 5005 Views
  • 1 replies
  • 0 kudos

Photon Benchmark

I'm conducting my own comparative study between a cluster with Photon enabled and a cluster without Photon to see what improvements occur. According to Databricks, there should be up to 12x better performance, but I'm only finding about a 20% improve...

Latest Reply
szymon_dybczak
Contributor III
  • 0 kudos

Hi @Enrique1987, you can find more information about Photon in the whitepaper below: https://people.eecs.berkeley.edu/~matei/papers/2022/sigmod_photon.pdf

AmineHY
by Contributor
  • 8772 Views
  • 5 replies
  • 6 kudos

Resolved! How to read JSON files embedded in a list of lists?

Hello, I am trying to read this JSON file but didn't succeed. You can see the head of the file: JSON inside a list of lists. Any idea how to read this file?

Latest Reply
adriennn
Contributor II
  • 6 kudos

The correct way to do this without using open (which works only with local/mounted files) is to read the files as binaryFile; then you get the entire JSON string on each row, and from there you can use from_json() and explode() to extract the ...

4 More Replies
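A sketch of the binaryFile approach described in that reply, assuming an illustrative element schema (the real field list depends on the file): read the bytes, cast to string, parse with from_json using a list-of-lists schema, then explode twice.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql import types as T

spark = SparkSession.builder.getOrCreate()

raw = (spark.read.format("binaryFile")
       .load("/path/to/file.json")  # placeholder path
       .select(F.col("content").cast("string").alias("json_str")))

# Assumed shape: a list of lists of objects, e.g. [[{"id": 1, "name": "a"}], ...]
schema = T.ArrayType(T.ArrayType(T.StructType([
    T.StructField("id", T.LongType()),
    T.StructField("name", T.StringType()),
])))

flat = (raw.select(F.from_json("json_str", schema).alias("outer"))
        .select(F.explode("outer").alias("inner"))   # unwrap the outer list
        .select(F.explode("inner").alias("record"))  # unwrap the inner list
        .select("record.*"))
flat.show()
```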
guangyi
by Contributor II
  • 543 Views
  • 4 replies
  • 0 kudos

Resolved! Unable to call UDF inside the Spark SQL: RuntimeError: SparkSession should be create

Here is how I define the UDF inside the file udf_define.py: from pyspark.sql.functions import length, udf from pyspark.sql.types import IntegerType from pyspark.sql import SparkSession spark = SparkSession.builder.getOrCreate() def strlen(s): ret...

Latest Reply
guangyi
Contributor II
  • 0 kudos

I also tried getActiveSession(), and it is not working.

3 More Replies
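One way to avoid the "SparkSession should be created" error, sketched under the assumption that the failure comes from building a session at import time inside udf_define.py: keep the module session-free and register the UDF against the session the calling notebook already owns. Names are illustrative.

```python
# --- udf_define.py: no SparkSession creation at import time ---
from pyspark.sql import SparkSession
from pyspark.sql.types import IntegerType

def strlen(s):
    return 0 if s is None else len(s)

def register_udfs(spark: SparkSession) -> None:
    # Registering by name makes strlen callable from Spark SQL too.
    spark.udf.register("strlen", strlen, IntegerType())

# --- calling notebook ---
# from udf_define import register_udfs
# register_udfs(spark)  # `spark` is the notebook's existing session
# spark.sql("SELECT strlen('hello') AS n").show()
```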
Personal1
by New Contributor II
  • 397 Views
  • 3 replies
  • 1 kudos

Problems with Azure Databricks

Hi, I want to use Databricks for the first time, and am having many problems and confusions. Please help me resolve them. 1. I created a free Databricks Community account on Azure and get an error when creating the cluster/compute: "Azure Quota Exceeded Excepti...

Latest Reply
ThierryBa
New Contributor III
  • 1 kudos

You must have created some resources with public IP addresses in your Azure subscription, e.g. a storage account, etc. Try to avoid using public IPs as much as possible to secure your tenant/subscription. Try to find which of your Azure resources are us...

  • 1 kudos
2 More Replies
mac08_flo
by New Contributor
  • 174 Views
  • 1 replies
  • 1 kudos

Creation of logs in a file

Good afternoon. I am trying to add logging to my code. The issue is that I haven't yet found a way to write the logs to a separate file rather than having them output to the terminal; I want them to be stored in a file (example.log). I ha...

Latest Reply
filipniziol
New Contributor III
  • 1 kudos

Hi @mac08_flo, use the logging library. You can configure it to log to the terminal, to files, etc.: https://www.highlight.io/blog/5-best-python-logging-libraries

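A minimal sketch of that suggestion using the standard library: attach a FileHandler so records go to example.log, with an optional StreamHandler if terminal output should be kept as well.

```python
import logging

logger = logging.getLogger("my_app")
logger.setLevel(logging.INFO)

# Send records to example.log instead of the terminal.
file_handler = logging.FileHandler("example.log")
file_handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
)
logger.addHandler(file_handler)

# Optional: keep terminal output too.
# logger.addHandler(logging.StreamHandler())

logger.info("pipeline started")          # written to example.log
logger.warning("something to look at")   # also written to example.log
```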

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group