Email Notifications
Do customers need to allow-list the control plane IP ranges to get email notifications about jobs?
I have a use case where I need to delete the data completely and load new data to the existing Delta table.
It's recommended to use the overwrite option: overwrite the table data and then run a VACUUM command. To delete the data from a managed Delta table, the DROP TABLE command can be used. If it's an external table, then run a DELETE query on the table and th...
A customer wants to understand our strategy for breaking cluster logs into different partitions and files. They want to ingest these logs into a tool that needs to understand this layout. They have indicated that the logs used to all be in one file...
Best practices for Databricks pools - Databricks Documentation. Learn best practices for configuring and using Databricks pools. https://docs.databricks.com/clusters/instance-pools/pool-best-practices.html
Best practices for Azure Databricks pools - Azur...
Pre-emption is turned on by default on Databricks clusters. Turning pre-emption on or off makes more sense on a high concurrency cluster. Pre-emption ensures that a job starving for resources gets a fair share of the resources available on ...
I have a bunch of libraries that I want to uninstall. All of them are marked as auto-install.
1) Find the corresponding library definition from an existing cluster using "libraries/cluster-status?cluster_id=<cluster_id>".
$ curl -X GET 'https://cust-success.cloud.databricks.com/api/2.0/libraries/cluster-status?cluster_id=1226-232931-cuffs129' ...
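As a rough illustration of step 1, the sketch below pulls library definitions out of a cluster-status response. The sample payload is invented for illustration: its shape (library_statuses, library, status, is_library_for_all_clusters) is an assumption based on the Libraries API 2.0 response format, and the helper name auto_installed_libraries is hypothetical. A real call would also need the workspace URL and an access token.

```python
import json

# Illustrative sample only: the field names are assumed from the
# Libraries API 2.0 response format; the package and path are made up.
SAMPLE_RESPONSE = json.loads("""
{
  "cluster_id": "<cluster_id>",
  "library_statuses": [
    {"library": {"pypi": {"package": "example-package-a"}},
     "status": "INSTALLED", "is_library_for_all_clusters": true},
    {"library": {"jar": "dbfs:/example/path/lib.jar"},
     "status": "INSTALLED", "is_library_for_all_clusters": false}
  ]
}
""")


def auto_installed_libraries(cluster_status):
    """Return the library definitions flagged as installed on all clusters."""
    return [
        entry["library"]
        for entry in cluster_status.get("library_statuses", [])
        if entry.get("is_library_for_all_clusters", False)
    ]


auto_libs = auto_installed_libraries(SAMPLE_RESPONSE)
print(json.dumps(auto_libs, indent=2))
```

The extracted definitions are the pieces you would copy into whatever follow-up request removes the libraries.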
If someone saves a flat file from a cell without specifying any location, where does it save?
In this case they are writing to a directory on the driver.
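A minimal sketch of why that happens, assuming the file is written from Python with a bare filename: a relative path is resolved against the current working directory of the process, and in a notebook that process runs on the driver. The filename results.csv and the helper name are hypothetical.

```python
import os


def resolve_write_target(filename):
    """Show where a write with no explicit location would land."""
    # A bare filename is resolved against the process's current working
    # directory; in a notebook that process runs on the driver node.
    return os.path.abspath(filename)


target = resolve_write_target("results.csv")
print(target)
```

Running this in a notebook cell prints the full driver-local path that a plain open("results.csv", "w") would write to.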
I have a high concurrency cluster where multiple users are running queries. However, I see the queries running very slowly. I checked the logs and see that more time is spent on the Spark driver. In the Spark UI, I do not see any slowness.
It's possible that connectivity to the Hive metastore is causing the delay here, when there is a high degree of concurrency and contention for metastore access. Interactive clusters in DBR are configured to use up to 5 (spark.databricks.hive.metastore.cli...
You could follow the steps in this article to set up your deployment in a HIPAA-compliant manner.
Does writing to a Delta table create a version for every micro-batch of a stream?
Yes, that is correct. Every commit to a Delta table creates a version, so each micro-batch creates a version. More info: https://databricks.com/blog/2019/02/04/introducing-delta-time-travel-for-large-scale-data-lakes.html
I have data written in Delta on ADLS. As I understand it, Delta also stores its files internally in Parquet format, but when I read the data in different formats I get different record counts: spark.read.parquet() or spark.read.format('delta').load() df = spark.read.for...
I think you have written to the Delta table twice using overwrite mode. Delta is a versioned data format: when you use overwrite, it doesn't delete the previous data, it just writes new files. The old files are not deleted immediately - they are just marked as delete...
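To make the difference in counts concrete, here is a toy model in plain Python. It is not Delta's implementation, just a sketch under simplified assumptions: data files live in a dict, and a log records which files each commit adds and removes. The class name, file names, and row counts are all made up.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDeltaTable:
    """Toy model: data files on disk plus a log of add/remove actions."""
    files_on_disk: dict = field(default_factory=dict)  # file name -> row count
    log: list = field(default_factory=list)            # one entry per commit

    def overwrite(self, file_name, row_count):
        # New data is written as a new file; the old files stay on disk and
        # are only marked as removed in the log.
        removed = sorted(self.active_files())
        self.files_on_disk[file_name] = row_count
        self.log.append({"add": [file_name], "remove": removed})

    def active_files(self):
        # Replay the log to find the files referenced by the latest version.
        active = set()
        for commit in self.log:
            active -= set(commit["remove"])
            active |= set(commit["add"])
        return active

    def count_as_delta(self):
        # A Delta read only counts rows in files the log still references.
        return sum(self.files_on_disk[f] for f in self.active_files())

    def count_as_parquet(self):
        # A plain Parquet read ignores the log and sees every file on disk.
        return sum(self.files_on_disk.values())


table = ToyDeltaTable()
table.overwrite("part-0001.parquet", 100)  # first commit
table.overwrite("part-0002.parquet", 100)  # second commit, overwrites the first
print(table.count_as_delta(), table.count_as_parquet())
```

In this model the Delta-style read returns 100 rows while the Parquet-style read returns 200, because the file from the first write is still on disk even though the log no longer references it.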
Databricks does not charge DBUs while instances are idle in the pool. Instance provider billing does apply. Please refer here for more information: https://docs.databricks.com/clusters/instance-pools/index.html
I have some Parquet data in a temporary directory. Can I copy it into a Delta table directly? What are the best options?
The easiest solution is to use the COPY INTO command. The COPY INTO command ensures idempotency, so even if the operation fails there are no data inconsistencies. The COPY INTO command uses the resources of the Spark cluster and hence completes faster. h...
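For reference, a small sketch of what such a statement might look like, built as a string in Python. The table name my_db.events, the path /tmp/staging/parquet, and the helper build_copy_into are hypothetical placeholders, not names from the question.

```python
def build_copy_into(target_table, source_path, file_format="PARQUET"):
    """Build a COPY INTO statement for loading files into a Delta table."""
    # Refuse paths with quotes rather than guess at an escaping rule.
    if "'" in source_path:
        raise ValueError("source_path must not contain single quotes")
    return (
        f"COPY INTO {target_table}\n"
        f"FROM '{source_path}'\n"
        f"FILEFORMAT = {file_format}"
    )


# Hypothetical table name and staging path, for illustration only.
statement = build_copy_into("my_db.events", "/tmp/staging/parquet")
print(statement)
```

In a notebook, the resulting string would be passed to spark.sql(statement).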
I used to download SQL query output from the notebook UI, but now I am unable to download files.
This is a workspace-level configuration. Your workspace admin has probably disabled it. If you have admin privileges on your workspace, you can enable it from Admin Console -> Workspace Settings.
I have a Delta table in ADLS, and for the same table I have defined an external table in Hive. After creating the Hive table and generating manifests, I am loading the partitions using MSCK REPAIR TABLE. All the partition columns are in same But s...
Can you please check the partition column order? Is it in the same sequence as before, or has it changed?