cancel
Showing results for 
Search instead for 
Did you mean: 
Hubert-Dudek
Esteemed Contributor III
since ‎10-12-2021
yesterday

User Stats

  • 1,438 Posts
  • 281 Solutions
  • 1,007 Kudos given
  • 2,219 Kudos received

User Activity

Real-time mode is a breakthrough that lets Spark utilize all available CPUs to process records with single-millisecond latency, while decoupling checkpointing from per-record processing.
Databricks goes native on Excel. You can now ingest + query .xls/.xlsx directly in Databricks (SQL + PySpark, batch and streaming), with auto schema/type inference, sheet + cell-range targeting, and evaluated formulas, no extra libraries anymore.
Tags, whether manually assigned or automatically assigned by the “data classification” service, can be protected using policies. Column masking can automatically mask columns with a given tag for all except some with elevated access.
Data classification automatically tags Unity Catalog tables and is now available in system tables as well.
Imagine all a data engineer or analyst needs to do to read from a REST API is use spark.read(), no direct request calls, no manual JSON parsing - just spark .read. That’s the power of a custom Spark Data Source. Soon, we will see a surge of open-sour...