Hi @gauravmahajan,
The cost is associated with the compute type you choose when running the job in Databricks Workflows; wrapping your code in spark.sql("select * from table") does not by itself incur separate SQL Analytics DBUs.
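For example, if you want to confirm that a specific job's usage shows up only under the Jobs product and its compute SKU (and not under a separate SQL product), a quick check against system.billing.usage like the one below works as a minimal sketch (the job ID 1234567890 is a placeholder, replace it with your own):

%sql
SELECT
  billing_origin_product,
  sku_name,
  SUM(usage_quantity) AS total_dbus
FROM system.billing.usage
WHERE usage_metadata.job_id = '1234567890' -- placeholder job ID
GROUP BY ALL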
To choose the right cluster configuration for your workload based on the data volume and transformations involved, you can refer to the following documentation on the different compute type options: https://docs.databricks.com/aws/en/compute/cluster-config-best-practices
Additionally, you can use the query below to find which SKU your jobs are billed under when they are deployed as Databricks Workflows jobs:
%sql
with job_run_timeline_with_cost as (
  -- Job usage joined to list prices to compute the list cost per usage record
  SELECT
    t1.*,
    t1.usage_metadata.job_id as job_id,
    t1.identity_metadata.run_as as run_as,
    t1.usage_quantity * list_prices.pricing.default AS list_cost
  FROM system.billing.usage t1
  INNER JOIN system.billing.list_prices list_prices
    ON t1.cloud = list_prices.cloud
    AND t1.sku_name = list_prices.sku_name
    AND t1.usage_start_time >= list_prices.price_start_time
    AND (t1.usage_end_time <= list_prices.price_end_time OR list_prices.price_end_time IS NULL)
  WHERE t1.billing_origin_product = 'JOBS'
),
most_recent_jobs as (
  -- Latest definition of each job, used to pick up its current name
  SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY workspace_id, job_id ORDER BY change_time DESC) as rn
  FROM system.lakeflow.jobs
  QUALIFY rn = 1
)
SELECT
  t2.name,
  t1.workspace_id,
  t1.job_id,
  t1.sku_name,
  t1.run_as,
  t1.spend
FROM (
  -- Total list cost per job and SKU
  SELECT
    workspace_id,
    job_id,
    run_as,
    sku_name,
    SUM(list_cost) AS spend
  FROM job_run_timeline_with_cost
  GROUP BY ALL
) t1
LEFT JOIN most_recent_jobs t2 USING (workspace_id, job_id)
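If you only care about recent spend and want to rank jobs by cost, a trimmed-down variant of the same joins can serve as a sketch (the 30-day window below is just an arbitrary example):

%sql
-- Sketch: list cost of JOBS usage over the last 30 days, per job and SKU
SELECT
  u.usage_metadata.job_id AS job_id,
  u.sku_name,
  SUM(u.usage_quantity * lp.pricing.default) AS spend_last_30_days
FROM system.billing.usage u
INNER JOIN system.billing.list_prices lp
  ON u.cloud = lp.cloud
  AND u.sku_name = lp.sku_name
  AND u.usage_start_time >= lp.price_start_time
  AND (u.usage_end_time <= lp.price_end_time OR lp.price_end_time IS NULL)
WHERE u.billing_origin_product = 'JOBS'
  AND u.usage_start_time >= date_sub(current_date(), 30)
GROUP BY ALL
ORDER BY spend_last_30_days DESC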
Please let us know if this is helpful.
Best Regards,
Nivethan V