Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Pragat
by New Contributor
  • 1304 Views
  • 1 replies
  • 0 kudos

Databricks job parameterization

I am configuring a Databricks job using multiple notebooks that depend on each other. All the notebooks are parameterized and use similar parameters. How can I configure the parameters at a global level so that all the notebooks can consume...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 0 kudos

Actually, it is very hard, but as an alternative option you can change your code and use the widgets feature of Databricks. Maybe this is not the right option, but you can still explore this doc for testing purposes: https://docs.databric...
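To make that concrete, here is a minimal sketch of the widgets approach, assuming a hypothetical parameter named "env" (dbutils is available by default in Databricks notebooks):

    # Define a text widget with a default; a job run can override it
    # by passing a notebook parameter with the same name.
    dbutils.widgets.text("env", "dev")  # "env" is a hypothetical parameter name

    # Read the value anywhere in the notebook.
    env = dbutils.widgets.get("env")
    print(f"Running against environment: {env}")

One way to approximate a global parameter is to define the same widget name in every dependent notebook and pass the same value from the job configuration.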

Netty
by New Contributor III
  • 4033 Views
  • 1 replies
  • 2 kudos

What's the crontab notation for every other week for Databricks Workflow scheduling?

Hello, I need to schedule some of my jobs within Databricks Workflows every other week (or every 4 weeks). I've scoured a few forums to find what this notation would be, but I've been unfruitful in my searches. Is this scheduling possible in crontab? I...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

For every seven days starting from Monday, you need to use 2/7. From my experience, that generator works best with Databricks: https://www.freeformatter.com/cron-expression-generator-quartz.html
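For illustration (an assumption, not stated in the thread): Databricks job schedules use Quartz cron syntax, so an expression in that style would be

    0 0 9 ? * 2/7

which effectively fires every Monday at 09:00, since the day-of-week field only spans one week. Quartz cron has no week-of-year field, so a strict every-other-week schedule generally cannot be expressed in a single expression; a common workaround is a weekly trigger plus a check in the job itself that exits on the off weeks.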

Heman2
by Valued Contributor II
  • 2673 Views
  • 6 replies
  • 19 kudos

Can anyone let me know, is there any way we can access another workspace's Delta tables in a workspace where we run the pipelines using Python...

Can anyone let me know, is there any way we can access another workspace's Delta tables in a workspace where we run the pipelines using Python?

Latest Reply
Harish2122
Contributor
  • 19 kudos

@Hemanth A: go to the workspace you want data from; in the warehouse tab you will find the connection details. Copy the host name and HTTP path, and generate a token. With these credentials you can access the data of this workspace from any other workspace.
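As a hedged sketch of that approach using the databricks-sql-connector package (all connection values below are placeholders, not real credentials):

    from databricks import sql

    # Placeholder connection details copied from the source workspace:
    # host name and HTTP path from the warehouse's connection details,
    # plus a personal access token generated in that workspace.
    with sql.connect(
        server_hostname="adb-1234567890123456.7.azuredatabricks.net",
        http_path="/sql/1.0/warehouses/abc123",
        access_token="dapiXXXXXXXXXXXX",
    ) as conn:
        with conn.cursor() as cur:
            cur.execute("SELECT * FROM some_schema.some_table LIMIT 10")
            for row in cur.fetchall():
                print(row)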

5 More Replies
dulu
by New Contributor III
  • 3697 Views
  • 3 replies
  • 15 kudos

Resolved! How to count the number of campaigns per day based on the start and end dates of the campaigns in Spark SQL on Databricks

I need to count the number of campaigns per day based on the start and end dates of the campaigns (the input table and the needed result were shown as images). How do I need to write the SQL command in Databricks to get the above result? Thanks all.

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 15 kudos

Just create an array with sequence, explode it, and then group and count: WITH cte AS (SELECT `campaign name`, explode(sequence(`Start date`, `End date`, interval 1 day)) as `Date` FROM `campaigns`) SELECT Count(`campaign name`) as `count uni...
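A complete version of that pattern, sketched with the table and column names visible in the truncated reply (the output aliases are assumptions):

    # Table and column names are taken from the truncated reply above;
    # the output aliases are assumptions.
    spark.sql("""
        WITH cte AS (
          SELECT `campaign name`,
                 explode(sequence(`Start date`, `End date`, interval 1 day)) AS `Date`
          FROM campaigns
        )
        SELECT `Date`, COUNT(`campaign name`) AS `campaigns per day`
        FROM cte
        GROUP BY `Date`
        ORDER BY `Date`
    """).show()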

2 More Replies
183530
by New Contributor III
  • 539 Views
  • 0 replies
  • 1 kudos

Needed a regex to match (CC)

SELECT '(CC) ABC' REGEXP '\\b\\(CC\\)\\b' AS TEST1,
       'A(CC) ABC' REGEXP '\\b\\(CC\\)\\b' AS TEST2,
       'A (CC)A ABC' REGEXP '\\b\\(CC\\)\\b' AS TEST3,
       'A (CC) A ABC' REGEXP '\\b\\(CC\\)\\b' AS TEST4,
       'A ABC (CC)' REGEXP '\\b\\(CC\\)\\b' AS TES...

seberino
by New Contributor III
  • 1232 Views
  • 0 replies
  • 1 kudos

How to revoke SELECT permissions on a table in Data Explorer when it only lets me revoke new explicit grants I've added myself?

I'm able to make it to the Permission page of the schema and table I'm trying to do access control on within the Data Explorer page. At first you can only grant permissions but not revoke anything. Only after you have made new grants can you revoke w...

andrew0117
by Contributor
  • 1106 Views
  • 1 replies
  • 2 kudos

How to sync the metastore info with the real data for an external Delta table

If I manually delete some parquet files in the location where the real data is stored, the Spark catalog still has the old version. How can I sync them? Thanks!

Latest Reply
youssefmrini
Databricks Employee
  • 2 kudos

You just need to create a new table and specify the location of the data; in your case it's going to be ADLS, S3... Example: CREATE TABLE customer USING DELTA LOCATION '/mnt/data/'

KellenO
by New Contributor II
  • 2388 Views
  • 2 replies
  • 8 kudos

Resolved! How can I use cluster autoscaling with intensive subprocess calls?

I have a custom application/executable that I upload to DBFS and transfer to my cluster's local storage for execution. I want to call multiple instances of this application in parallel, which I've only been able to successfully do with Python's subpr...

Latest Reply
Anonymous
Not applicable
  • 8 kudos

Autoscaling works for Spark jobs only. It works by monitoring the job queue, which Python code won't go into. If it's just Python code, try single node. https://docs.databricks.com/clusters/configure.html#cluster-size-and-autoscaling
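For the "just Python code" case the original poster describes, a single-node cluster plus a thread pool is one way to fan out the subprocess calls; this sketch assumes a hypothetical executable path and argument list:

    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical executable copied to the driver's local storage.
    APP = "/tmp/my_custom_app"

    def run_instance(arg: str) -> subprocess.CompletedProcess:
        # Each call blocks in its own thread; the OS runs the processes in parallel.
        return subprocess.run([APP, arg], capture_output=True, text=True)

    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(run_instance, ["a", "b", "c", "d"]))

    for r in results:
        print(r.returncode, r.stdout[:80])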

1 More Replies
Taha_Hussain
by Databricks Employee
  • 6995 Views
  • 5 replies
  • 5 kudos

Connect a BI Tool: How do I access my lakehouse data from my BI tool?

You can find a rich ecosystem of tools that allow you to work with all your data in-place and deliver real-time business insights faster. This post will help you connect your existing tools like dbt, Fivetran, Power BI, Tableau or SAP to ingest, transf...

Latest Reply
Axserv
New Contributor II
  • 5 kudos

Hello Taha, here is a fairly recent video provided by Databricks on connecting Power BI: Demo Video: Connect to Power BI Desktop from Databricks - YouTube

4 More Replies
ranged_coop
by Valued Contributor II
  • 1912 Views
  • 2 replies
  • 3 kudos

Equivalent Machine Types between Databricks on Azure and GCP

Hi All, hope everyone is doing well. We are currently validating Databricks on GCP and Azure. We have a Python notebook that does some ETL (copy, extract zip files, and process files within the zip files). Our cluster config on Azure: DBX Runtime - 10.4 - Dr...

Latest Reply
ranged_coop
Valued Contributor II
  • 3 kudos

Hi @Tunde Abib, I have gone through the links while updating, but did not see any major documented slowdowns mentioned in them.

1 More Replies
Sujitha
by Databricks Employee
  • 2134 Views
  • 6 replies
  • 5 kudos

KB Feedback Discussion

KB Feedback Discussion In addition to the Databricks Community, we have a Support team that maintains a Knowledge Base (KB). The KB contains answers to common questions about Databricks, as well as information on optimisation and troubleshooting. Thes...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 5 kudos

Thanks for sharing @Sujitha Ramamoorthy​ 

5 More Replies
Netty
by New Contributor III
  • 5117 Views
  • 5 replies
  • 7 kudos

Resolved! What's the easiest way to upsert data into a table? (Azure ADLS Gen2)

I had been trying to upsert rows into a table in Azure Blob Storage (ADLS Gen 2) based on two partitions (sample code below).

INSERT OVERWRITE TABLE new_clicks_table PARTITION(client_id, mm_date)
SELECT click_id
     , user_id
     , click_timestamp_gmt
     , ca...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 7 kudos

Below code might help you.

Python:

(df.write
   .mode("overwrite")
   .option("partitionOverwriteMode", "dynamic")
   .saveAsTable("default.people10m")
)

SQL:

SET spark.sql.sources.partitionOverwriteMode=dynamic;
INSERT OVERWRITE TABLE default.people10m...
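Since the thread is specifically about upserts, a Delta MERGE INTO may be closer to what was asked; this is a hedged sketch with hypothetical table names and key, not the thread's actual schema:

    # Hypothetical tables: clicks_table (target) and new_clicks (staging);
    # click_id is an assumed join key, for illustration only.
    spark.sql("""
        MERGE INTO clicks_table AS t
        USING new_clicks AS s
        ON t.click_id = s.click_id
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED THEN INSERT *
    """)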

4 More Replies
KVNARK
by Honored Contributor II
  • 9020 Views
  • 11 replies
  • 8 kudos

Resolved! Databricks lakehouse platform administration accreditation

How can I complete the Databricks Lakehouse Platform administration accreditation for free, just like Lakehouse Fundamentals? How can I get the accreditation for platform administrator, like Lakehouse Fundamentals?

Latest Reply
KVNARK
Honored Contributor II
  • 8 kudos

I tried through a community partner account only.

10 More Replies
Rishabh-Pandey
by Esteemed Contributor
  • 1146 Views
  • 1 replies
  • 7 kudos

Regarding my free lake house 100 points

Hi @Christy Seto, I cleared the lakehouse exam before 30 November 2022 and was eligible to get 100 community points. I cleared it with the email id manpreet.kaur@celebaltech.com, but I still haven't received the 100 points. I have edited my e...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 7 kudos

Hi @Rishabh Pandey, please raise a request via this link; it might help you.

ncouture
by Contributor
  • 5497 Views
  • 3 replies
  • 1 kudos

Resolved! How to install a JAR library via a global init script?

I have a JAR I want to be installed as a library on all clusters. I have tried both "wget /databricks/jars/ some_repo" and "cp /dbfs/FileStore/jars/name_of_jar.jar /databricks/jars/" when clusters start up, but the JAR is not installed as a library. I am aware th...

Latest Reply
ncouture
Contributor
  • 1 kudos

Found a solution:

echo /databricks/databricks-hive /databricks/jars /databricks/glue | xargs -n 1 cp /dbfs/FileStore/jars/NAME_OF_THE_JAR.jar

I had to first add the JAR as a library through the GUI via Create -> Library, then uploaded the downloaded JAR. ...
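To apply that copy on every cluster start, the command can be wrapped in an init script; here is a hedged sketch that writes one from a notebook (the copy command is from the reply above, while the script path is an assumed example):

    # The copy command comes from the reply above; the init script
    # location is an assumption, not from the thread.
    script = """#!/bin/bash
    echo /databricks/databricks-hive /databricks/jars /databricks/glue \\
      | xargs -n 1 cp /dbfs/FileStore/jars/NAME_OF_THE_JAR.jar
    """
    dbutils.fs.put("dbfs:/databricks/init-scripts/install-jar.sh", script, True)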

2 More Replies
