Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

81528
by New Contributor II
  • 2312 Views
  • 2 replies
  • 3 kudos

The workers in the cluster use the old, end-of-life Ubuntu 18.04.

I create a cluster or a pool with runtime version 12.2 LTS, or even with the latest 13.0. According to the documentation, the worker should use an image with Ubuntu 20.04: https://docs.databricks.com/release-notes/runtime/12.2.html#system-environment...

[attachment: ubuntu_ip-10-20-25-228___]
Latest Reply
Priyag1
Honored Contributor II
  • 3 kudos

Contact the support team.

1 More Replies
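
For anyone checking this on their own cluster: the OS release of the runtime image can be read directly from a notebook. A minimal Python sketch (it runs on the driver, but workers use the same base image):

```python
# Minimal sketch: run in a notebook attached to the cluster in question
# to confirm which Ubuntu release the Databricks Runtime image is based on.
with open("/etc/os-release") as f:
    for line in f:
        if line.startswith(("NAME=", "VERSION=")):
            print(line.strip())
```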
DavyN
by New Contributor II
  • 3925 Views
  • 3 replies
  • 3 kudos

Resolved! Unable to take Lakehouse Fundamentals Quiz

Hi, I watched the videos for Lakehouse Fundamentals. However, when I click on "Take the quiz", it opens another tab that says I don't have permission to access the page. I've done all the necessary signing up. Could someone please help? Thanks!

Latest Reply
MandatoryNickna
New Contributor II
  • 3 kudos

This still seems to be unavailable. Very annoying.

2 More Replies
Oliver_Angelil
by Valued Contributor II
  • 2136 Views
  • 2 replies
  • 2 kudos

Automated CI code checks using workflows when PR is raised

I'm familiar with Github Actions workflows that automate code checks whenever a PR is raised to a specified branch. For example, for Python code it's very useful if unit tests (e.g. pytest), syntax checks (flake8), code formatting (the black formatter), and type h...

Latest Reply
Priyag1
Honored Contributor II
  • 2 kudos

In a typical software development workflow (e.g. Github flow), a feature branch is created from the master branch for feature development. A notebook can be synced to the feature branch via Github integration, or a notebook can be exported from D...

1 More Replies
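
For reference, the checks named in the question (pytest, flake8, black) are typically run as separate steps of a workflow triggered on pull requests. Below is a hedged Python sketch of a single entry-point script such a CI job could invoke; the script and paths are illustrative, not from the thread:

```python
# Hypothetical CI entry point: run the checks named in the post and exit
# non-zero if any of them fail, so the PR status check goes red.
import subprocess
import sys

CHECKS = [
    ["pytest", "tests/"],        # unit tests
    ["flake8", "."],             # syntax / lint
    ["black", "--check", "."],   # formatting (verify only, no rewrites)
]

exit_code = 0
for cmd in CHECKS:
    print(f"Running: {' '.join(cmd)}")
    exit_code = exit_code or subprocess.run(cmd).returncode

sys.exit(exit_code)
```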
DeviJaviya
by New Contributor II
  • 3247 Views
  • 2 replies
  • 1 kudos

Trying to build a subquery in a Databricks notebook, similar to SQL with TOP(1), in a data frame

Hello everyone, I am new to Databricks, so I am still at the learning stage. It would be very helpful if someone could help resolve the issue, or help me fix my code. I have built a query that fetches data based on a CASE; in the CASE I have a ...

Latest Reply
DeviJaviya
New Contributor II
  • 1 kudos

Hello Rishabh, thank you for your suggestion. We tried LIMIT 1, but the output values come out the same for all the dates, which is not correct.

1 More Replies
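
Since LIMIT 1 truncates the whole result set rather than picking one row per date, the usual fix is a window function that ranks rows within each group. A hedged PySpark sketch; the column names are placeholders, not from the original post:

```python
# Hedged sketch of "TOP 1 per group": rank rows within each date and keep
# only the first. Column names are illustrative placeholders.
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame(
    [("2023-05-01", 10), ("2023-05-01", 30), ("2023-05-02", 20)],
    ["event_date", "amount"],
)

w = Window.partitionBy("event_date").orderBy(F.col("amount").desc())
top1 = df.withColumn("rn", F.row_number().over(w)).filter("rn = 1").drop("rn")
top1.show()  # one row per event_date: the largest amount
```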
Joey
by New Contributor II
  • 14483 Views
  • 3 replies
  • 0 kudos

How to fix the INVALID_PARAMETER_VALUE error when using MLflow to track YOLO model training?

I'm new to Databricks, and I'm trying to train a YOLO model and use MLflow to track the parameters and log the models. I keep getting this error related to the requirements.txt file path: INVALID_PARAMETER_VALUE: Invalid value '/Shared/YOLOv8/requireme...

Latest Reply
Joey
New Contributor II
  • 0 kudos

Thanks for the reply, @Suteja Kanuri. I tried the proposed solution. This time I got this message: Invalid artifact path: '/Shared/YOLOv8'. Names may be treated as files in certain cases, and must not resolve to other names when treated as such. This ...

2 More Replies
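
The message suggests MLflow received an absolute workspace path where it expects a relative artifact path. A hedged sketch of the distinction; the experiment name and file are illustrative, and the example assumes requirements.txt exists locally:

```python
# Hedged sketch: the experiment may live under a workspace path, but
# artifact_path must be relative; passing "/Shared/YOLOv8" there triggers
# INVALID_PARAMETER_VALUE. Names below are illustrative.
import mlflow

mlflow.set_experiment("/Shared/YOLOv8/yolo-training")  # workspace path is valid here

with mlflow.start_run():
    mlflow.log_param("epochs", 10)
    # relative artifact path, not "/Shared/YOLOv8/requirements.txt"
    mlflow.log_artifact("requirements.txt", artifact_path="env")
```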
Hubert-Dudek
by Esteemed Contributor III
  • 1489 Views
  • 2 replies
  • 6 kudos

Have you ever wondered how to automate your #databricks jobs and workflows without using the UI? If you want to manage your Databricks resources as co...

Have you ever wondered how to automate your #databricks jobs and workflows without using the UI? If you want to manage your Databricks resources as code, you should check out Terraform. Here is a simple example of creating a job that runs a notebook o...

Latest Reply
-werners-
Esteemed Contributor III
  • 6 kudos

Or use ADF. Still waiting for actual added value of Databricks Workflows over ADF.

1 More Replies
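
The Terraform example in the post is truncated above. As a comparison point only (not the post's Terraform code), the same "resources as code" idea can be sketched against the Jobs API 2.1; host, token, notebook path, and cluster settings below are placeholders:

```python
# Hedged sketch: create a job that runs a notebook via the Jobs API 2.1,
# i.e. the resource a Terraform databricks_job block would manage.
import requests

HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"                        # placeholder

job_spec = {
    "name": "notebook-job-as-code",
    "tasks": [{
        "task_key": "run_notebook",
        "notebook_task": {"notebook_path": "/Shared/example_notebook"},
        "new_cluster": {
            "spark_version": "12.2.x-scala2.12",
            "node_type_id": "i3.xlarge",
            "num_workers": 1,
        },
    }],
}

resp = requests.post(
    f"{HOST}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=job_spec,
)
resp.raise_for_status()
print(resp.json())  # {"job_id": ...}
```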
AL1
by Contributor
  • 2247 Views
  • 3 replies
  • 2 kudos

In the spirit of the Holiday season, share a picture of the reward(s) you received from the Databricks Community Rewards Store below!

In the spirit of the Holiday season, share a picture of the reward(s) you received from the Databricks Community Rewards Store below!

[attachment: databricks shirt]
Latest Reply
Priyag1
Honored Contributor II
  • 2 kudos

Your t-shirt is super cool and awesome.

2 More Replies
PriyaAnanthram
by Contributor III
  • 5637 Views
  • 6 replies
  • 0 kudos

Resolved! Change data feed on Delta Live Tables

I have a Delta Live Table where I read CDC data and merge it into silver using APPLY CHANGES. In silver, can I find out what data has changed since the last run, similar to the change data feed table_changes function?

Latest Reply
PriyaAnanthram
Contributor III
  • 0 kudos

I also have a requirement where I write to a live table (materialized view) with CDF enabled. I want to see the changes, but here too I see overwrites happening after the DLT pipeline runs.

5 More Replies
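
For a plain Delta table with CDF enabled, changes since a given version can be read as below. A hedged sketch; the table name and version are placeholders, and as the thread notes, APPLY CHANGES / materialized-view targets may surface changes differently:

```python
# Hedged sketch: read the change data feed of a CDF-enabled Delta table.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

changes = (
    spark.read.format("delta")
    .option("readChangeFeed", "true")
    .option("startingVersion", 2)   # placeholder version
    .table("silver.customers")      # placeholder table name
)
changes.select("_change_type", "_commit_version", "_commit_timestamp").show()
```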
rlink
by New Contributor II
  • 3609 Views
  • 3 replies
  • 2 kudos

Resolved! Data Science & Engineering Dashboard Refresh Issue Using Databricks

Hi everyone, I created a Data Science & Engineering notebook in Databricks to display some visualizations, and also set up a schedule for the notebook to run every hour. I can see that the scheduled run succeeds every hour, but the dashboard I crea...

Latest Reply
luis_herrera
Databricks Employee
  • 2 kudos

To schedule a dashboard to refresh at a specified interval, schedule the notebook that generates the dashboard graphs. PS: check the #DAIS2023 talks.

2 More Replies
Prannu
by New Contributor II
  • 2252 Views
  • 2 replies
  • 1 kudos

Location of files previously uploaded on DBFS

I uploaded a CSV data file and used it in a Spark job three months back. I am now running the same Spark job on a newly created cluster, and the program runs properly. I want to know where I can find the previously uploaded CSV data file.

Latest Reply
karthik_p
Esteemed Contributor
  • 1 kudos

@Pranay Gupta, you can see it in the DBFS root directory, based on the path you provided in the job. Please go to the Data Explorer and select the option shown in the screenshot below.

1 More Replies
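
Files uploaded through the workspace UI commonly land under /FileStore. A hedged sketch for locating them from a notebook; the exact folder depends on how the file was originally uploaded:

```python
# Hedged sketch: list a common DBFS upload location to find the CSV.
# dbutils is only defined inside Databricks notebooks and jobs.
for f in dbutils.fs.ls("dbfs:/FileStore/tables/"):
    print(f.path, f.size)
```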
Gopal269673
by Contributor
  • 2176 Views
  • 2 replies
  • 0 kudos

Calling jobs inside another job

Hi all, I created two job flows: one for the transaction layer and another for the datamart layer. I need to specify the dependency between job1 and job2, and trigger job2 after job1 completes, without using any other orchestration tool o...

Latest Reply
Priyag1
Honored Contributor II
  • 0 kudos

Verify with the documentation.

1 More Replies
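
One workaround, sketched here as an assumption rather than a confirmed answer from the thread: make the last task of job1 trigger job2 through the Jobs API run-now endpoint. Host, token, and job ID are placeholders:

```python
# Hedged sketch: trigger job2 from the final task of job1 via Jobs API 2.1.
import requests

HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"                        # placeholder
JOB2_ID = 123                                            # placeholder

resp = requests.post(
    f"{HOST}/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"job_id": JOB2_ID},
)
resp.raise_for_status()
print(resp.json())  # {"run_id": ...}
```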
SK21
by New Contributor II
  • 2452 Views
  • 3 replies
  • 1 kudos

CI/CD for Jobs in Workflows

I created jobs to trigger the respective notebooks in Databricks Workflows. Now I need to move them to other environments. Could you please help me with a CI/CD process to promote jobs to further environments?

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

Please use Jobs API 2.1. You can get a job and save its JSON to git. In git, set variables defining the Databricks workspaces (URL and token), and on push trigger the API call with your JSON stored in git.

2 More Replies
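
A hedged sketch of the flow described in the reply: export the job's JSON from the source workspace, keep it in git, and recreate it in the target workspace. Hosts, tokens, and the job ID are placeholders:

```python
# Hedged sketch of the Jobs API 2.1 promotion flow described above.
import requests

SRC_HOST, SRC_TOKEN = "https://<dev-workspace>", "<dev-token>"    # placeholders
DST_HOST, DST_TOKEN = "https://<prod-workspace>", "<prod-token>"  # placeholders
JOB_ID = 123                                                      # placeholder

# 1. Export the job definition from the source workspace.
job = requests.get(
    f"{SRC_HOST}/api/2.1/jobs/get",
    headers={"Authorization": f"Bearer {SRC_TOKEN}"},
    params={"job_id": JOB_ID},
).json()

# 2. Commit job["settings"] to git (omitted), then create it in the target.
resp = requests.post(
    f"{DST_HOST}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {DST_TOKEN}"},
    json=job["settings"],
)
resp.raise_for_status()
```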
fijoy
by Contributor
  • 7749 Views
• 1 reply
  • 2 kudos

Resolved! Using widget values in a shell script cell

I have a Databricks notebook containing a mix of SQL, Python, and shell script cells. I know I can retrieve and use widget values in Python cells using dbutils.widgets.get('key') and in SQL cells using ${key}. How can I use widget values in shell ...

Latest Reply
fijoy
Contributor
  • 2 kudos

For those interested, I found and am for now using this workaround: https://stackoverflow.com/questions/54662605/how-to-pass-a-python-variables-to-shell-script-in-azure-databricks-notebookbles while I wait for a more direct method.

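
The linked workaround amounts to reading the widget in Python and running the shell command from Python, since %sh cells don't see notebook variables. A hedged sketch; the widget name is a placeholder:

```python
# Hedged sketch: fetch the widget in Python, expose it as an environment
# variable, and run the shell snippet via subprocess instead of a %sh cell.
# dbutils is only defined inside Databricks notebooks.
import os
import subprocess

os.environ["MY_WIDGET"] = dbutils.widgets.get("key")
subprocess.run(["bash", "-c", 'echo "widget value: $MY_WIDGET"'], check=True)
```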
AmanSehgal
by Honored Contributor III
  • 21556 Views
  • 6 replies
  • 15 kudos

Job cluster vs All purpose cluster

Environment: Azure. I have a workflow that takes approximately a minute to execute, and I want to run the job every 2 minutes. All-purpose cluster: on attaching an all-purpose cluster to the job, it takes approx. 60 seconds to execute. Using a job cluster: on at...

Latest Reply
Priyag1
Honored Contributor II
  • 15 kudos

Thanks for sharing

5 More Replies
Siddu07
by New Contributor II
  • 6078 Views
  • 3 replies
  • 1 kudos

How to change the audit log delivery Service Account?

Hi team, I'm trying to set up audit log delivery based on the documentation: https://docs.gcp.databricks.com/administration-guide/account-settings-gcp/log-delivery.html. As per the document, I've created a multi-region storage bucket; however, I'm not ...

Latest Reply
Priyag1
Honored Contributor II
  • 1 kudos

The documentation helps with many tasks.

2 More Replies
