Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

delta_bravo
by New Contributor
  • 10299 Views
  • 2 replies
  • 0 kudos

Cluster termination issue

I am using Databricks as a Community Edition user with a limited cluster (just 1 Driver: 15.3 GB Memory, 2 Cores, 1 DBU). I am trying to run some custom algorithms for continuous calculations and writing results to the delta table every 15 minutes al...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

If you set the "Terminate after" setting to 0 minutes during the creation of an all-purpose compute, it means that the auto-termination feature will be turned off. This is because the "Terminate after" setting is used to specify an inactivity period ...
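As a minimal sketch of the setting described in this reply: the Clusters API exposes the same behavior through the `autotermination_minutes` field, where 0 disables auto-termination. The helper names, cluster name, runtime version, and node type below are hypothetical placeholders, and whether Community Edition allows changing this value is not covered here.

```python
def build_cluster_spec(terminate_after_minutes: int) -> dict:
    """Build a cluster spec dict; 0 means auto-termination is turned off."""
    if terminate_after_minutes < 0:
        raise ValueError("terminate_after_minutes must be >= 0")
    return {
        "cluster_name": "continuous-calc",    # hypothetical name
        "spark_version": "15.4.x-scala2.12",  # hypothetical runtime version
        "node_type_id": "i3.xlarge",          # hypothetical node type
        "num_workers": 1,
        # 0 disables auto-termination; a positive value is the inactivity
        # period in minutes after which the cluster is terminated.
        "autotermination_minutes": terminate_after_minutes,
    }


def auto_termination_enabled(spec: dict) -> bool:
    """Return True when the spec asks for termination after inactivity."""
    return spec.get("autotermination_minutes", 0) > 0
```

A spec built with `build_cluster_spec(0)` keeps the cluster running until it is stopped manually, which matters for cost on long-running workloads.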

  • 0 kudos
1 More Replies
curiousoctopus
by New Contributor III
  • 6793 Views
  • 4 replies
  • 4 kudos

Run multiple jobs with different source code at the same time with Databricks Asset Bundles

Hi, I am migrating from dbx to Databricks Asset Bundles. Previously, with dbx, I could work on different features in separate branches and launch jobs without one job overwriting the other. Now with Databricks Asset Bundles it seems like I can...

Latest Reply
mo_moattar
New Contributor III
  • 4 kudos

We have the same issue. We might have multiple open PRs on the bundles that deploy the code, pipelines, jobs, etc. to the same workspace before the merge, and they keep overwriting each other in the workspace. The jobs already have a separate ID ...

  • 4 kudos
3 More Replies
narenderkumar53
by Databricks Partner
  • 2177 Views
  • 3 replies
  • 2 kudos

Can we parameterize the tags in the job compute?

I want to monitor the cost of Databricks job computes more closely. I am using cluster tags to monitor cost, but the tag values are static for now. Can we parameterize the job cluster compute so that I can pass the tag values during the runtime a...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Hi, if you're using ADF, you can look at the article below: "Applying Dynamic Tags To Databricks Job Clusters in Azure Data Factory" by Kyle Hale (Medium). If not, I think you can try to write some code that uses the endpoint below. The idea is, before exec...

  • 2 kudos
2 More Replies
Jeewan
by New Contributor
  • 1095 Views
  • 0 replies
  • 0 kudos

Partition in Spark with a subquery that includes UNION

I have a SQL query like this: select ... from table1 where id in (select id from table1 where (some condition) UNION select id from table2 where (some condition)) table1. I have made a partition of 200 where upper bound is 200 and lower bound is 0 and p...

Prashanth24
by New Contributor III
  • 3234 Views
  • 3 replies
  • 3 kudos

Resolved! Databricks workflow each task cost

Suppose we have 4 tasks (3 notebooks and 1 plain Python script) in a workflow. I would like to know the cost incurred by each task in the Databricks workflow. Please let me know if there is any way to find these details.

Latest Reply
Edthehead
Contributor III
  • 3 kudos

If all of the tasks share the same cluster then no, you cannot differentiate the costs between the tasks. However, if you set up each task to have its own job cluster and pass some custom tags, you can then differentiate/report the costs ...
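The approach in this reply can be sketched as a job definition in which every task gets its own job cluster carrying a distinguishing tag. This is only an illustration under assumptions: it uses the Jobs API field names `job_clusters`, `job_cluster_key`, `new_cluster`, and `custom_tags`; the tag key `task_name`, the job name, and the helper itself are hypothetical, and the task payloads (notebook path, Python file, and so on) are omitted.

```python
def build_job_spec(task_names, base_cluster):
    """Give each task a dedicated job cluster tagged with the task's name."""
    job_clusters = []
    tasks = []
    for name in task_names:
        cluster_key = f"{name}_cluster"
        # Copy the shared cluster settings and add a per-task tag so that
        # usage can later be grouped by tag value.
        cluster = dict(base_cluster)
        cluster["custom_tags"] = {
            **base_cluster.get("custom_tags", {}),
            "task_name": name,  # hypothetical tag key
        }
        job_clusters.append(
            {"job_cluster_key": cluster_key, "new_cluster": cluster}
        )
        # Task payload (notebook_task, spark_python_task, ...) left out.
        tasks.append({"task_key": name, "job_cluster_key": cluster_key})
    return {
        "name": "per-task-cost-demo",  # hypothetical job name
        "job_clusters": job_clusters,
        "tasks": tasks,
    }
```

The trade-off is that separate clusters add startup time and lose the sharing benefit, so this fits best when per-task cost attribution matters more than total runtime.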

  • 3 kudos
2 More Replies
guangyi
by Contributor III
  • 909 Views
  • 0 replies
  • 0 kudos

Confused about large memory usage of cluster

We set up a demo DLT pipeline with no data involved:  @dlt.table( name="demo" ) def sample(): df = spark.sql("SELECT 'silver' as Layer") return df However, when we check the metrics of the cluster, it looks like 10GB memory has already be...

DBMIVEN
by Databricks Partner
  • 1302 Views
  • 0 replies
  • 0 kudos

Ingesting data from SQL Server foreign tables

I have created a connection to a SQL Server DB and set up a catalog for it. I can now view all the tables and query them. I want to ingest some of the tables into our ADLS Gen2 that we set up with Unity Catalog. What is the best approach here? Lak...

Data Engineering
Data ingestion
Foreign catalogs
Incremental Data Ingestion
LakeFlow
SQL Server
ayush19
by New Contributor III
  • 1771 Views
  • 1 replies
  • 0 kudos

Running jar on Databricks cluster from Airflow

Hello, I have a jar file which is installed on a cluster. I need to run this jar from Airflow using DatabricksSubmitRunOperator. I followed the standard instructions available in the Airflow docs: https://airflow.apache.org/docs/apache-airflow-providers-...

ruoyuqian
by New Contributor II
  • 3243 Views
  • 0 replies
  • 0 kudos

dbt writing into a different schema

I have a Unity Catalog set up like `catalogname.schemaname1` and `catalogname.schemaname2`, and I am trying to write tables into schemaname2 with dbt. The current setup in the dbt profiles.yml is   prj_dbt_databricks: outputs: dev: cata...

Fernando_Messas
by Databricks Partner
  • 13853 Views
  • 6 replies
  • 3 kudos

Resolved! Error writing data to Google Bigquery

Hello, I'm facing some problems while writing data to Google BigQuery. I'm able to read data from the same table, but when I try to append data I get the following error. Error getting access token from metadata server at: http://169.254.169.254/compu...

Latest Reply
asif5494
New Contributor III
  • 3 kudos

Sometimes this error occurs when your private key or service account key is not sent in the request header. So if you are using Spark or Databricks, you have to configure the JSON key in the Spark config so that it is added to the request header.
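One way to supply the key explicitly, sketched here under assumptions rather than taken from the thread, is to pass it as a write option instead of a cluster-level Spark config. The sketch assumes the open-source spark-bigquery connector, whose `credentials` option accepts the base64-encoded service account JSON and whose `temporaryGcsBucket` option names a staging bucket; the helper name and its arguments are hypothetical.

```python
import base64
import json


def bigquery_write_options(service_account_key: dict, table: str,
                           temp_bucket: str) -> dict:
    """Build connector options carrying the service account key explicitly."""
    key_json = json.dumps(service_account_key)
    # The connector's `credentials` option expects the JSON key base64-encoded.
    encoded_key = base64.b64encode(key_json.encode("utf-8")).decode("ascii")
    return {
        "credentials": encoded_key,
        "table": table,
        "temporaryGcsBucket": temp_bucket,
    }


# Intended use on a cluster with the connector installed (not run here):
#   opts = bigquery_write_options(key_dict, "dataset.table", "my-temp-bucket")
#   df.write.format("bigquery").options(**opts).mode("append").save()
```

In practice the key would be loaded from a secret store rather than embedded in a notebook.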

  • 3 kudos
5 More Replies
colette_chavali
by Databricks Employee
  • 2961 Views
  • 1 replies
  • 6 kudos

Nominations are OPEN for the Databricks Data Team Awards!

Databricks customers - nominate your data team and leaders for one (or more) of the six Data Team Award categories: Data Team Transformation Award, Data Team for Good Award, Data Team Disruptor Award, Data Team Democratization Award, Data Team Visionary Awar...

Data Team Awards
Latest Reply
Sai_Mani
New Contributor II
  • 6 kudos

Hello! Where can I find more details about the award nomination requirements, eligibility criteria, application entry and deadline dates for nominations, and the judging criteria?

  • 6 kudos
tobi
by New Contributor III
  • 2697 Views
  • 3 replies
  • 2 kudos

Deleted workspace

Hello guys, I have a question. We have Databricks on GCP; we forgot to pay for the subscription and they removed our workspace. I had code notebooks in that workspace. Is there any way to recover this code? Or maybe it’s automatically saved this cod...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Hi @tobi, unfortunately, if the data was stored directly within the workspace and not backed up externally, there is not much you can do. Once a Databricks subscription is cancelled, all workspaces associated with that account are deleted and this dele...

  • 2 kudos
2 More Replies
mannepk85
by New Contributor III
  • 892 Views
  • 0 replies
  • 0 kudos

Databricks academy courses are defaulting to hive metastore

So far, I have started 2 Databricks Academy courses. In both courses, the schema is created in hive_metastore by default. In my org, the Hive metastore is blocked and we have been asked to use Unity Catalog. Is there a way the course material in dat...

CaptainJack
by New Contributor III
  • 6824 Views
  • 4 replies
  • 1 kudos

Get a taskValue from a job run as a task, then pass it to the next task

I have a workflow like this. Task 1: a job run as a task. Inside this job there is a task which sets parameter x as a taskValue using dbutils.jobs.taskValues.set. Task 2: a task dependent on the previous job-as-a-task. I would like to access this parameter x. I tried t...
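For context, the basic set/get pattern between two tasks of the same job looks roughly like the sketch below. It uses `dbutils.jobs.taskValues.set` and `dbutils.jobs.taskValues.get` with their `key`, `value`, `taskKey`, and `default` arguments; the wrapper functions and the task key are hypothetical, and `dbutils` is passed in as an argument only so the functions stay importable outside a notebook. Whether the value can be read across a job-as-a-task boundary is exactly what this post asks, and the sketch does not answer that.

```python
def publish_x(dbutils, value):
    """Called in the upstream task: store `value` under the key "x"."""
    dbutils.jobs.taskValues.set(key="x", value=value)


def read_x(dbutils, upstream_task_key, default=None):
    """Called in a downstream task of the same job: read the value of "x".

    `upstream_task_key` is the task_key of the task that called publish_x.
    """
    return dbutils.jobs.taskValues.get(
        taskKey=upstream_task_key,
        key="x",
        default=default,
    )
```

In a notebook, the built-in `dbutils` object would be passed straight through, for example `read_x(dbutils, "set_x_task")` with a hypothetical task key.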

Latest Reply
NandiniN
Databricks Employee
  • 1 kudos

I see, I have requested for someone else to guide you on this. cc: @Retired_mod 

  • 1 kudos
3 More Replies