Klusener
Contributor
since 08-27-2021
2 weeks ago

User Stats

  • 15 Posts
  • 0 Solutions
  • 12 Kudos given
  • 4 Kudos received

User Activity

Hello, currently we have Delta tables in the TBs, partitioned by year, month, and day. We perform dynamic partition overwrite, setting partitionOverwriteMode to dynamic, to handle reruns/corrections. With liquid clustering, since explicit partitions are not require...
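A minimal sketch of the two write patterns in question, assuming a Databricks notebook where spark is provided, a Delta table named events, and a date column event_date (all hypothetical); replaceWhere is one way to scope a rerun when the table is liquid-clustered rather than partitioned:

    # Hypothetical corrected batch to re-apply.
    df = spark.table("staging_events")

    # Current approach: dynamic partition overwrite on a table partitioned
    # by year/month/day -- only the partitions present in df are replaced.
    (df.write.format("delta")
        .mode("overwrite")
        .option("partitionOverwriteMode", "dynamic")
        .saveAsTable("events"))

    # With liquid clustering there are no partitions to overwrite, so a
    # replaceWhere predicate scoped to the affected dates is one alternative.
    (df.write.format("delta")
        .mode("overwrite")
        .option("replaceWhere", "event_date = '2024-01-01'")
        .saveAsTable("events"))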
I have a PySpark job reading an input volume of just ~50-55 GB of Parquet data from a Delta table on Databricks. The job uses n2-highmem-4 GCP VMs with 1-15 workers under Databricks autoscaling. Each worker VM of type n2-highmem-4 has 32 GB memory and...
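A small diagnostic sketch that often helps with sizing questions like this, assuming the source table is named input_table (hypothetical); it reports the on-disk size, file count, and read parallelism that the 1-15 n2-highmem-4 workers would have to cover:

    # Runs in a Databricks notebook/job where `spark` is provided.
    detail = spark.sql("DESCRIBE DETAIL input_table").collect()[0]
    print(f"size on disk: {detail['sizeInBytes'] / 1e9:.1f} GB, "
          f"files: {detail['numFiles']}")

    # Read parallelism for the scan of the same table.
    df = spark.read.table("input_table")
    print(f"read partitions: {df.rdd.getNumPartitions()}")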
I was referring to the doc https://kb.databricks.com/clusters/spark-executor-memory. In general, total off-heap memory = spark.executor.memoryOverhead + spark.memory.offHeap.size. The off-heap mode is controlled by the properties spark.memory.offHeap...
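A minimal sketch of the properties involved; the sizes are illustrative assumptions, not recommendations, and would normally go in the cluster's Spark config:

    # Illustrative cluster Spark config (values are assumptions, not advice):
    spark_conf = {
        "spark.executor.memoryOverhead": "4g",   # off-heap outside Spark's management (JVM overhead, buffers)
        "spark.memory.offHeap.enabled": "true",  # enables Spark-managed off-heap memory
        "spark.memory.offHeap.size": "2g",       # size of the Spark-managed off-heap region
    }
    # Total off-heap reservation = memoryOverhead + offHeap.size = 6 GB per executor here.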
I have a Spark pipeline which reads selected data from table_1 as a view, performs a few aggregations via GROUP BY in the next step, and writes to a target table. table_1 has large data, ~30 GB of compressed CSV. Step-1: create or replace temporary view base_data...
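A hedged reconstruction of the pipeline shape described; table_1 comes from the post, while the column names (key, amount) and target_table are hypothetical:

    # Step-1: select the needed subset of table_1 into a temporary view.
    spark.sql("""
        CREATE OR REPLACE TEMPORARY VIEW base_data AS
        SELECT key, amount
        FROM table_1
        WHERE amount IS NOT NULL
    """)

    # Step-2: aggregate via GROUP BY and write to the target Delta table.
    (spark.table("base_data")
        .groupBy("key")
        .agg({"amount": "sum"})
        .write.format("delta")
        .mode("overwrite")
        .saveAsTable("target_table"))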
On Databricks, I created a job task with task type Python script, sourced from S3. However, when arguments are passed via the Parameters option, the run fails with an 'unrecognized arguments' error. Code in the S3 file:
import argparse
def parse_arguments():
    parser = argpar...
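A minimal runnable sketch of such an entry point; the argument names (--input-path, --run-date) are hypothetical, and parse_known_args() is one common workaround when the runtime injects extra arguments the parser does not declare:

    import argparse

    def parse_arguments():
        parser = argparse.ArgumentParser()
        parser.add_argument("--input-path", required=True)
        parser.add_argument("--run-date", required=True)
        # parse_known_args() ignores unknown extras instead of exiting
        # with "unrecognized arguments", unlike parse_args().
        args, _unknown = parser.parse_known_args()
        return args

    if __name__ == "__main__":
        args = parse_arguments()
        print(f"input={args.input_path}, run_date={args.run_date}")

In the task's Parameters field, each flag and value would then be supplied as a separate list item, e.g. --input-path, s3://bucket/path, --run-date, 2024-01-01 (hypothetical values).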