Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

brickster_2018
by Databricks Employee
  • 1526 Views
  • 1 replies
  • 0 kudos

Resolved! Does Table ACL support column-level security like Ranger?

I have used Ranger in Apache Hadoop and it works fine for my use case. Now that I am migrating my workloads from Apache Hadoop to Databricks

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

Currently, Table ACL does not support column-level security. Several third-party tools, such as Privacera, integrate well with Databricks.

User16752240150
by New Contributor II
  • 7821 Views
  • 1 replies
  • 1 kudos

When to use cache vs checkpoint?

I've seen .cache() and .checkpoint() used similarly in some workflows I've come across. What's the difference, and when should I use one over the other?

Latest Reply
Srikanth_Gupta_
Databricks Employee
  • 1 kudos

Caching is more useful than checkpointing when you have a lot of available memory to hold your RDDs or DataFrames, even when they are massive. Caching will maintain the result of your transformations so that those transformations will not have to be recomp...
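A minimal PySpark sketch of the difference, under stated assumptions: the helper names `materialize` and `run_demo`, the local master, and the `/tmp/spark-checkpoints` directory are hypothetical and not from the thread. For DataFrames, `cache()` keeps the computed partitions on the executors and keeps the lineage, so lost partitions can be recomputed. `checkpoint()` writes the data to the configured checkpoint directory and truncates the lineage.

```python
def materialize(df, strategy="cache"):
    """Return a materialized version of a Spark DataFrame.

    strategy="cache":      keep partitions on the executors, lineage is kept.
    strategy="checkpoint": write to the checkpoint directory, lineage is cut.
    """
    if strategy == "cache":
        cached = df.cache()   # lazy: only marks the DataFrame for caching
        cached.count()        # an action is needed to populate the cache
        return cached
    if strategy == "checkpoint":
        # Assumes SparkContext.setCheckpointDir(...) was called beforehand.
        return df.checkpoint(eager=True)
    raise ValueError(f"unknown strategy: {strategy!r}")


def run_demo(checkpoint_dir="/tmp/spark-checkpoints"):
    """Hypothetical local demo; needs pyspark installed (imported lazily)."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[2]")
        .appName("cache-vs-checkpoint")
        .getOrCreate()
    )
    spark.sparkContext.setCheckpointDir(checkpoint_dir)

    df = spark.range(1_000_000).selectExpr("id", "id % 10 AS bucket")

    cached = materialize(df, "cache")
    checkpointed = materialize(df, "checkpoint")

    # Compare the two physical plans: the cached plan still refers to the
    # original computation, the checkpointed plan starts from saved data.
    cached.explain()
    checkpointed.explain()

    cached.unpersist()
    spark.stop()
```

Calling `run_demo()` on a machine with PySpark installed prints both plans. As a rough rule, caching fits data that is reused several times within one job and fits in cluster memory, while checkpointing fits long or iterative lineages where recomputation after a failure would be expensive.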

User16826994223
by Honored Contributor III
  • 1144 Views
  • 1 replies
  • 0 kudos
Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

1. We want a venue in which we can rapidly iterate and make new releases. The overhead of making a release as a separate project is minuscule (on the order of minutes). A release on Spark takes a lot longer (on the order of days).
2. Koalas takes a dif...

User16826994223
by Honored Contributor III
  • 1542 Views
  • 1 replies
  • 0 kudos
Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

Different projects have different focuses. Spark is already deployed in virtually every organization, and often is the primary interface to the massive amount of data stored in data lakes. Koalas was inspired by Dask, and aims to make the transition ...

User16826994223
by Honored Contributor III
  • 4932 Views
  • 1 replies
  • 0 kudos

Do login sessions into Databricks have an idle timeout?

Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

Short answer: Yes.

Detailed answer: User sessions automatically time out after six hours of idle time. This timeout is not configurable. User sessions are terminated if the user is removed from the workspace. To trigger session end for users who were remo...

Anonymous
by Not applicable
  • 1406 Views
  • 1 replies
  • 0 kudos
Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

For any other non-private previews, they can check the Admin Console under the Advanced tab. There are many toggles there to enable or disable features. If a feature is not listed there, there usually isn't an easy (or direct) way to disable it.

Anonymous
by Not applicable
  • 1214 Views
  • 1 replies
  • 2 kudos
Latest Reply
User16826994223
Honored Contributor III
  • 2 kudos

Scala runs on the JVM, and a single JVM cannot run different applications at the same time with complete isolation between tasks. That is why high concurrency clusters do not support Scala. I don't think this is on the roadmap.

MoJaMa
by Databricks Employee
  • 1192 Views
  • 1 replies
  • 0 kudos
Latest Reply
MoJaMa
Databricks Employee
  • 0 kudos

That’s only available at Premium and Enterprise SKUs in AWS. See the "Enterprise Security" section here: https://databricks.com/product/aws-pricing

User16783853501
by Databricks Employee
  • 2114 Views
  • 1 replies
  • 0 kudos

What types of files does Auto Loader support for streaming ingestion? I see good support for CSV and JSON. How can I ingest files like XML, Avro, Parquet, etc.? Would XML rely on spark-xml?

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

Please raise a feature request via the Ideas Portal for XML support in Auto Loader. As a workaround, you could read the files with wholeTextFiles (which loads the data into a PairRDD with one record per input file) and parse them with from_xml from ...
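For the non-XML formats in the question, a hedged sketch: Auto Loader picks the file format through the `cloudFiles.format` option, which to my knowledge accepts values such as `json`, `csv`, `parquet`, `avro`, `orc`, `text`, and `binaryFile`. The helper name `auto_loader_stream`, its parameters, and the example paths are hypothetical. The stream only works on a Databricks cluster where the `cloudFiles` source exists, and schema inference for Parquet and Avro may depend on the runtime version.

```python
def auto_loader_stream(spark, input_path, file_format, schema_location):
    """Build an Auto Loader streaming DataFrame for one file format.

    spark:           an active SparkSession on a Databricks cluster
    input_path:      directory to watch for new files
    file_format:     value for cloudFiles.format, e.g. "parquet" or "avro"
    schema_location: directory where the inferred schema is tracked
    """
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", file_format)
        .option("cloudFiles.schemaLocation", schema_location)
        .load(input_path)
    )
```

In a notebook this could be called as `auto_loader_stream(spark, "/mnt/raw/events", "avro", "/mnt/schemas/events")`, with both paths replaced by real locations in your workspace.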

User16790091296
by Contributor II
  • 2393 Views
  • 1 replies
  • 1 kudos

Using Databricks Connect (DBConnect)

I'd like to edit Databricks notebooks locally using my favorite editor, and then use Databricks Connect to run the notebook remotely on a Databricks cluster that I usually access via the web interface. I run "databricks-connect configure", as suggest...

Latest Reply
sajith_appukutt
Honored Contributor II
  • 1 kudos

Here is the link to the configuration properties: https://docs.databricks.com/dev-tools/databricks-connect.html#step-2-configure-connection-properties

