Valued Contributor
since 02-07-2022

User Stats

  • 50 Posts
  • 1 Solution
  • 8 Kudos given
  • 26 Kudos received

User Activity

Hi! I have several tiny jobs that run in parallel and I want them to run on the same cluster: - Tasks of type Python Script: I send the parameters this way to run the PySpark scripts. - Job compute cluster created as (copied JSON from Databricks Job UI) Ho...
Hi, I'm starting to use SQL Warehouse, and most of our lake uses a compression codec other than snappy. How can I set the SQL warehouse to use a compression like gzip or zstd on CREATE, INSERT, etc.? Tried this: set spark.sql.parquet.compre...
Hi! I'm optimizing several TB of partitioned data with ZSTD level 9. The amount of shuffle write surprises me; it could make sense because of ZORDER, but I want to be sure that I'm not missing something. Here is some context: Could I be missing something...
Hi! I currently have this old generic template, amended over time to optimize Databricks Spark execution. Can you help me figure out whether it still makes sense for v10-11-12, or whether there are new recommendations? Maybe some of this is making my pr...
Hi, I have a PySpark job that takes about an hour to complete. When looking at the SQL tab in the Spark UI I see this: Those processes run for more than 1 minute on a 60-minute process. This is Ganglia for that period (the last snapshot, will look into a l...