- 9270 Views
- 4 replies
- 2 kudos
Optimizing Costs in Databricks by Dynamically Choosing Cluster Sizes
Databricks is a popular unified data analytics platform known for its powerful data processing capabilities and seamless integration with Apache Spark. However, managing and optimizing costs in Databricks can be challenging, especially when it comes ...
- 2 kudos
@Second Reply You’re right: just printing out selected_pool isn’t enough to leverage dynamic cluster sizing at runtime. In practice, the value of selected_pool would feed directly into your Databricks cluster creation API or workflow automati...
- 541 Views
- 0 replies
- 1 kudos
Handling the Chaos: Data Quality Strategies with PySpark Ingestion
Tips and Techniques for Ingesting Large JSON Files with PySpark. Introduction: If you’ve ever struggled with consuming massive JSON files with PySpark, you are aware that insufficient data can always creep in and silently d...
- 338 Views
- 0 replies
- 1 kudos
SQL Scripting in Apache Spark™ 4.0
Apache Spark™ 4.0 introduces a new feature for SQL developers and data engineers: SQL Scripting. This feature enhances the power and extends the flexibility of Spark SQL, enabling users to write procedural code within SQL queries, with t...
- 1150 Views
- 0 replies
- 0 kudos
Implementing data contracts on Databricks for industrial AI pipelines
Enforce schema consistency using declarative contracts on Databricks Lakehouse. Industrial AI is transforming how operations are optimized, from forecasting equipment failure to streamlining supply chains. But even the most advanced models are only as...
- 1384 Views
- 2 replies
- 6 kudos
🔐 How Do I Prevent Users from Accidentally Deleting Tables in Unity Catalog? 🔐
Question: I have a role called dev-dataengineer with the following privileges on the catalog dap_catalog_dev: APPLY TAG, CREATE FUNCTION, CREATE MATERIALIZED VIEW, CREATE TABLE, CREATE VOLUME, EXECUTE, READ VOLUME, REFRESH, SELECT, USE SCHEMA, WRITE VOLUME. Despite this, u...
- 6 kudos
Managing assets in UC is always a maintenance overhead. We keep these access controls in Terraform code, and it is always hard to see what level of access is given to different personas in the org. We are building an audit dashboard for it.