Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

NM
by New Contributor III
  • 1253 Views
  • 1 reply
  • 0 kudos

Resolved! How to do deterministic encryption on Databricks?

How can I do deterministic encryption on Databricks, to protect PII columns?

Latest Reply
01_binary
New Contributor III
  • 0 kudos

For deterministic encryption, we need to use AES encryption: with AES in a deterministic mode, the encrypted text will always remain the same for the same input. Databricks recently implemented the aes_encrypt and aes_decrypt functions, and they are the recommended way to perform ...
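
A minimal PySpark sketch of that approach (the key, column names, and sample values are illustrative, not from the thread). Note that it is the 'ECB' mode that makes the output deterministic; the default 'GCM' mode uses a random IV and is not:

  from pyspark.sql import SparkSession
  from pyspark.sql.functions import expr

  spark = SparkSession.builder.getOrCreate()  # already available as `spark` in a Databricks notebook

  df = spark.createDataFrame([("alice@example.com",), ("bob@example.com",)], ["email"])

  key = "abcdefghijklmnop"  # illustrative 16-byte key (AES-128); fetch real keys from a secret scope

  # 'ECB' mode is deterministic: the same input always yields the same ciphertext,
  # so the protected column can still be joined and grouped on.
  encrypted = df.withColumn(
      "email_enc", expr(f"base64(aes_encrypt(email, '{key}', 'ECB'))")
  )
  decrypted = encrypted.withColumn(
      "email_dec", expr(f"cast(aes_decrypt(unbase64(email_enc), '{key}', 'ECB') as string)")
  )
  decrypted.show(truncate=False)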

mikesilva
by New Contributor
  • 638 Views
  • 1 reply
  • 1 kudos

Resolved! Where can I learn more about Delta Live Tables?

I keep hearing about Delta Live Tables. Where can I read more about it?

Latest Reply
Fgkimball
New Contributor III
  • 1 kudos

Hello, these are some good places to start!
Getting started docs: https://databricks.com/discover/pages/getting-started-with-delta-live-tables
Notebook examples to walk through: https://github.com/databricks/delta-live-tables-notebooks
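
For a first hands-on feel, here is a minimal Delta Live Tables pipeline sketch in Python (the source path and table names are illustrative; this only runs inside a DLT pipeline, where the dlt module and spark are provided):

  import dlt

  @dlt.table(comment="Raw events ingested from cloud storage")
  def raw_events():
      return spark.read.format("json").load("/mnt/raw/events")  # illustrative path

  # The expectation drops rows that violate the constraint.
  @dlt.table(comment="Events with a non-null id")
  @dlt.expect_or_drop("valid_id", "id IS NOT NULL")
  def clean_events():
      return dlt.read("raw_events")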

Raymond_Garcia
by Contributor II
  • 2275 Views
  • 2 replies
  • 5 kudos

Resolved! Databricks Job is slower.

Hello, I have a Databricks question. A DataFrame job that writes to an S3 bucket usually takes 8 minutes to finish, but now it takes 8 to 9 hours to complete. Does anybody have any clues about this behavior? The DataFrame size is about 300 or ...

Latest Reply
" src="" />
This widget could not be displayed.
This widget could not be displayed.
This widget could not be displayed.
  • 5 kudos

This widget could not be displayed.
Hello, I have a data bricks question. A Dataframe job that writes in an s3 bucket usually takes 8 minutes to finish, but now it takes from 8 to 9 hours to complete. Does anybody have some clues about this behavior?the data frame size is about 300 or ...

This widget could not be displayed.
  • 5 kudos
This widget could not be displayed.
1 More Replies
Bomberone
by New Contributor II
  • 1256 Views
  • 1 reply
  • 2 kudos

Resolved! Autoloader checkpoint issue

Hello guys, is anyone having problems with Auto Loader checkpoints on Azure?

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

Not me, but it is good to specify both the Auto Loader checkpoint and the streaming (write) checkpoint. And it happened to me during some experiments that I had to delete everything from the Auto Loader directory.
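
A minimal sketch of that advice (all paths and the table name are illustrative; `spark` is assumed from a Databricks notebook): give Auto Loader its own schema location, and give the writer a separate checkpoint location:

  # Auto Loader source: cloudFiles.schemaLocation is where the loader keeps its inferred schema.
  stream = (
      spark.readStream.format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.schemaLocation", "abfss://data@myacct.dfs.core.windows.net/_schemas/events")
      .load("abfss://data@myacct.dfs.core.windows.net/raw/events")
  )

  # Separate checkpoint for the streaming write itself.
  (
      stream.writeStream
      .option("checkpointLocation", "abfss://data@myacct.dfs.core.windows.net/_checkpoints/events")
      .trigger(availableNow=True)
      .toTable("bronze.events")
  )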

DanielWhite
by New Contributor II
  • 910 Views
  • 2 replies
  • 2 kudos

Brilliant concept! I want to learn much more about Databricks!

Brilliant concept! I want to learn much more about Databricks!

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

Try https://customer-academy.databricks.com/. It is excellent.

1 More Replies
Anonymous
by Not applicable
  • 1016 Views
  • 2 replies
  • 4 kudos

Become a Community Champion! As you know, Data + AI Summit 2022 is June 27-30. During this time we want to see as many of our virtual and in-person at...

Become a Community Champion! As you know, Data + AI Summit 2022 is June 27-30. During this time we want to see as many of our virtual and in-person attendees becoming the best community version of themselves: a Community Champion. Because the qualifi...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 4 kudos

The cooler will be great for summer picnics.

1 More Replies
Leladams
by New Contributor III
  • 9567 Views
  • 9 replies
  • 2 kudos

What is the best way to read an MS Access .accdb database into Databricks from a mounted drive?

I am currently trying to read in .accdb files from a mounted drive. Based on my research, it looks like I would have to use a package like JayDeBeApi with UCanAccess drivers, or pyodbc with MS Access drivers. Will this work? Thanks for any help.
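
For reference, a hedged sketch of the Spark JDBC route through UCanAccess that the question mentions (the UCanAccess jar and its dependencies such as jackcess and hsqldb must be installed on the cluster; the file path and table name are illustrative):

  df = (
      spark.read.format("jdbc")
      .option("url", "jdbc:ucanaccess:///dbfs/mnt/data/mydb.accdb")  # illustrative path
      .option("driver", "net.ucanaccess.jdbc.UcanaccessDriver")
      .option("dbtable", "my_table")  # illustrative table name
      .load()
  )
  df.show()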

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Leland Adams, hope you are doing well. Thank you for posting your question and giving us additional information. Do you think you were able to solve the query? We'd love to hear from you.

8 More Replies
abd
by Contributor
  • 5621 Views
  • 2 replies
  • 2 kudos

Resolved! How will Spark handle 1TB of data on a cluster with 1GB of memory?

If my cluster memory is 1GB, for example, and my data is 1TB, how will Spark handle it? If it is in-memory computing, how does it handle data that is greater than the memory size?

Latest Reply
abd
Contributor
  • 2 kudos

@Kaniz Fatma @Cedric Law Hing Ping

1 More Replies
User16826990884
by New Contributor III
  • 1403 Views
  • 1 reply
  • 0 kudos

Disable managed tables on Azure Databricks

When a user creates a table without a path, it is written as a managed table in the root bucket. Can this functionality be disabled so users are forced to provide a storage path and follow our organization's best practices?

Latest Reply
florent
New Contributor III
  • 0 kudos

Hi, I couldn't find an option to turn it off. It might be worth banning DBFS. However, you can configure the default location using a cluster policy by adding the following configuration: "spark_conf.spark.sql.warehouse.dir": { "type": "fixed", "val...
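
The preview cuts the snippet off; a sketch of what a complete policy entry might look like, written as the Python payload you would submit to the cluster policies API (the storage location is illustrative, not from the thread):

  import json

  # Hypothetical complete policy entry pinning the default warehouse directory.
  policy = {
      "spark_conf.spark.sql.warehouse.dir": {
          "type": "fixed",
          "value": "abfss://tables@myaccount.dfs.core.windows.net/warehouse",  # illustrative
      }
  }
  print(json.dumps(policy, indent=2))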

rbarata
by New Contributor II
  • 2198 Views
  • 1 reply
  • 5 kudos

Resolved! Pyspark environment and python packages on executors

At my company we use conda-pack to make certain packages available on the Spark executors. Is there a better alternative, to get away from creating a new environment and packing it every time I need a new Python lib to be available for the execu...

Latest Reply
-werners-
Esteemed Contributor III
  • 5 kudos

Databricks provides library installation in the form of PyPI packages or wheel/egg files. If you install the packages like that on the cluster, they are automatically sent to all executors.
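
A related lightweight option is a notebook-scoped install with the %pip magic, which installs the package on the driver and executors for that notebook's session (the package name and version are illustrative):

  # In a Databricks notebook cell:
  %pip install unidecode==1.3.8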


Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group