01-26-2024 01:47 AM
Hi team,
I wanted to know whether there is a way to connect PySpark code running in ECS to a Databricks cluster and use the Databricks compute through Databricks Connect.
I see that Databricks Connect is meant for connecting code in a local IDE to a Databricks cluster, but is there a way to connect code running in ECS to Databricks?
01-31-2024 12:53 AM
01-29-2024 08:36 AM
In addition to the answer from @Retired_mod, I would add that the result set returned from a Databricks query may be too large to process in memory on your ECS container node. Spark often excels at asynchronous workloads, not immediate result sets.
If you could briefly explain your use case, it would help us make a better recommendation.
01-31-2024 12:52 AM
01-30-2024 07:07 PM - edited 01-30-2024 07:10 PM
Noted @Retired_mod @RonDeFreitas.
I am currently using Databricks Runtime 12.2 (which is below 13.0). I followed the doc "Databricks Connect for Databricks Runtime 12.2 LTS and below", connected my local terminal to the Databricks cluster, and was able to run sample Spark code from the terminal using my cluster's compute. In parallel, I was also able to run code from a remote Jupyter notebook by following the docs.
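For reference, here is a minimal sketch of the kind of sample code this setup runs, written so it could also start inside a container. It assumes the legacy Databricks Connect client (for Runtime 12.2 LTS and below) is installed in place of plain PySpark, and that the connection is configured through the environment variables `DATABRICKS_ADDRESS`, `DATABRICKS_API_TOKEN` and `DATABRICKS_CLUSTER_ID` as I recall them from the legacy docs; please verify the names against the doc for your version. The helper names `missing_connect_vars` and `run_sample_job`, and the choice of which variables to treat as required, are my own assumptions for illustration.

```python
import os

# Variable names as recalled from the legacy Databricks Connect docs.
# Treating exactly these three as required is an assumption; some
# deployments may also need DATABRICKS_ORG_ID or DATABRICKS_PORT.
REQUIRED_VARS = (
    "DATABRICKS_ADDRESS",     # workspace URL
    "DATABRICKS_API_TOKEN",   # personal access token
    "DATABRICKS_CLUSTER_ID",  # target cluster
)


def missing_connect_vars(env=None):
    """Return the required connection variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def run_sample_job():
    """Run a trivial Spark job on the remote cluster and return the row count."""
    missing = missing_connect_vars()
    if missing:
        raise RuntimeError("Missing connection settings: " + ", ".join(missing))

    # Imported here so the check above works even where the client is absent.
    # With legacy Databricks Connect installed, this import resolves to its
    # bundled PySpark and the session is routed to the configured cluster.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    return spark.range(100).count()
```

Calling `run_sample_job()` from the container entry point fails fast with a readable message if the task definition did not pass the connection settings, instead of failing later inside Spark.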
I do have one question about this, though.
Current architecture of our system for context:
Question(s):
Approach(s):