Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Forum Posts

brickster_2018
by Databricks Employee
  • 2686 Views
  • 1 reply
  • 0 kudos

Resolved! Is it recommended to turn on Spark speculative execution permanently?

I had a job where the last step would get stuck forever. Turning on Spark speculative execution did magic and resolved the issue. Is it safe to turn on Spark speculative execution permanently?

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

It's not recommended to turn on Spark speculative execution permanently. For jobs where tasks run slow or get stuck because of transient network or storage issues, speculative execution can be very handy. However, it suppresses the actual problem...

User16790091296
by Contributor II
  • 2586 Views
  • 1 reply
  • 0 kudos

If possible, how can I update R Version on Azure Databricks?

Azure Databricks currently runs R version 3.4.4 (2018-03-15), which is unacceptable in my opinion since the latest R version on CRAN is 3.5.2 (2018-12-20). My question is: is it possible for me to upgrade and install R version 3.5.2 on Azure Databrick...

Latest Reply
User16752239289
Databricks Employee
  • 0 kudos

You can change the R version by following this document: https://docs.microsoft.com/en-us/azure/databricks/kb/r/change-r-version. The R version that comes with each DBR (Databricks Runtime) can be found in the release notes: https://docs.microsoft.com/e...

brickster_2018
by Databricks Employee
  • 2041 Views
  • 1 reply
  • 0 kudos
Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

The use case decides between a shallow clone and a deep clone. Data is physically copied to the clone table in the case of a deep clone. A deep clone is very useful to copy the data and keep a backup of the data in another region/env. The typ...
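The distinction the reply describes can be sketched with Delta Lake's CLONE syntax run through PySpark. The table names are placeholders, and `spark` is assumed to be the session a Databricks notebook provides.

```python
# Sketch of Delta Lake clones (table names are placeholders; `spark` is
# the Databricks notebook's SparkSession).

# DEEP CLONE physically copies the data files -- an independent backup,
# e.g. into another region/environment.
spark.sql("CREATE TABLE IF NOT EXISTS backup.events_copy DEEP CLONE prod.events")

# SHALLOW CLONE copies only metadata and still references the source's
# data files -- cheap and fast, but not an independent backup.
spark.sql("CREATE TABLE IF NOT EXISTS dev.events_test SHALLOW CLONE prod.events")
```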

User16790091296
by Contributor II
  • 1414 Views
  • 0 replies
  • 0 kudos

SAS Files in Databricks (Stack Overflow)

I am trying to convert SAS files to CSV in Azure Databricks. The SAS files are in Azure Blob. I am able to mount the Azure Blob storage in Databricks, but when I read from it, it shows no files even though there are files in Blob. Has anyone done this...

User16790091296
by Contributor II
  • 1557 Views
  • 1 reply
  • 1 kudos
Latest Reply
Ryan_Chynoweth
Esteemed Contributor
  • 1 kudos

The open source Spark connector for Snowflake is available by default in the Databricks Runtime. To connect you can use the following code: # Use secrets DBUtil to get Snowflake credentials. user = dbutils.secrets.get("<scope>", "<secret key>") passw...
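Assuming the truncated reply continues along the connector's standard options, a fuller sketch might look like the following. The scope, key, and Snowflake option values are placeholders, and `spark`/`dbutils` are the globals a Databricks notebook provides.

```python
# Sketch: reading a Snowflake table via the built-in connector.
# All <...> values are placeholders; `spark` and `dbutils` are the
# Databricks notebook globals.
user = dbutils.secrets.get("<scope>", "<user-key>")
password = dbutils.secrets.get("<scope>", "<password-key>")

df = (
    spark.read.format("snowflake")
    .option("sfUrl", "<account>.snowflakecomputing.com")
    .option("sfUser", user)
    .option("sfPassword", password)
    .option("sfDatabase", "<database>")
    .option("sfSchema", "<schema>")
    .option("sfWarehouse", "<warehouse>")
    .option("dbtable", "<table>")
    .load()
)
```

Keeping the credentials in a secret scope, as the reply's snippet starts to show, avoids hard-coding them in the notebook.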

User16790091296
by Contributor II
  • 1953 Views
  • 1 reply
  • 0 kudos

How to prevent duplicate entries from entering a Delta Lake on Azure Storage?

I have a DataFrame stored in Delta format in ADLS. When I try to append new updated rows to that Delta lake, is there any way I can delete the old existing record in Delta and add the new updated record? There is a uni...

Latest Reply
Ryan_Chynoweth
Esteemed Contributor
  • 0 kudos

To achieve this you should use a MERGE command, which will update the rows that already exist (matched on the unique ID) and insert the rows that do not. If you want to do it manually, you could delete rows using the DE...
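The MERGE the reply recommends can be sketched with the Delta Lake Python API. The table path, the key column `id`, and `updates_df` are placeholders; `spark` is the Databricks notebook's SparkSession.

```python
# Sketch: upsert into a Delta table keyed on a unique "id" column.
# Path, column name, and updates_df are placeholders; `spark` is the
# Databricks notebook global.
from delta.tables import DeltaTable

target = DeltaTable.forPath(spark, "/mnt/adls/my_delta_table")

(
    target.alias("t")
    .merge(updates_df.alias("s"), "t.id = s.id")
    .whenMatchedUpdateAll()      # replace the old record for an existing id
    .whenNotMatchedInsertAll()   # insert genuinely new records
    .execute()
)
```

A plain append can never deduplicate; the merge condition on the unique column is what prevents duplicate entries from accumulating.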

User16790091296
by Contributor II
  • 1135 Views
  • 0 replies
  • 1 kudos

How to get access to Databricks SQL analytics?

I am trying to do this tutorial about Databricks SQL Analytics (https://docs.microsoft.com/en-us/azure/databricks/sql/get-started/admin-quickstart), but when I create my Databricks workspace I do not have the icon at the bottom of the sidebar to access ...

brickster_2018
by Databricks Employee
  • 5703 Views
  • 1 reply
  • 0 kudos

Resolved! Does Ganglia report incorrect memory stats?

I am looking at the memory utilization of the executors and I see that the heap utilization of the executor is far less than what is reported in Ganglia. Why does Ganglia report incorrect memory details?

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

Ganglia reports the memory utilization at the system level. Say, for example, the JVM has an Xmx value of 100 GB. At some point it will occupy 100 GB, and then with a garbage collection it will clear off the heap. Once the GC frees up the memory, th...

User16790091296
by Contributor II
  • 2123 Views
  • 0 replies
  • 1 kudos

What is the most efficient way to read in a partitioned parquet file with pyspark?

I work with parquet files stored in AWS S3 buckets. They are multiple TB in size and partitioned by a numeric column containing integer values between 1 and 200, call it my_partition. I read in and perform compute actions on this data in Databricks w...
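One common approach to the question (a sketch, not a confirmed answer from the thread): filter on the partition column before any action, so Spark prunes the read down to the matching S3 prefixes. The bucket path and values are placeholders.

```python
# Sketch: partition pruning on the column the question calls my_partition.
# Because the dataset is partitioned by my_partition, this filter is
# pushed down and only the matching directories under the S3 path are
# listed and read, instead of the full multi-TB dataset.
df = (
    spark.read.parquet("s3://<bucket>/<dataset>/")
    .filter("my_partition IN (1, 2, 3)")
)
```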

brickster_2018
by Databricks Employee
  • 3113 Views
  • 1 reply
  • 0 kudos

Resolved! Is it mandatory to checkpoint my streaming query?

I have ad-hoc, one-time streaming queries where I believe checkpointing won't add any value. Should I still use checkpointing?

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

It's not mandatory. But the strong recommendation is to use checkpointing for streaming irrespective of your use case. This is because the default checkpoint location can accumulate a lot of files over time, as there is no graceful, guaranteed cleaning in pla...
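The recommendation can be sketched as follows. The paths are placeholders and `df` is assumed to be a streaming DataFrame already defined in the notebook.

```python
# Sketch: always set an explicit checkpointLocation so the query's state
# lands somewhere you control and can clean up, rather than the default
# location. Paths are placeholders; df is a streaming DataFrame.
query = (
    df.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/my_query")
    .outputMode("append")
    .start("/mnt/tables/my_table")
)
```

With an explicit location per query, removing the checkpoint directory after an ad-hoc run is a single, deliberate delete.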

