Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Hemendra_Singh
by New Contributor II
  • 2598 Views
  • 1 reply
  • 0 kudos

Unity Catalog - external table and managed table

Do the external tables we create or manage through Unity Catalog support ACID properties and time travel? And in terms of performance, which is faster to query, and why?

Latest Reply
MoJaMa
Databricks Employee
  • 0 kudos

External Tables and UC Managed Tables are similar from the perspective of Delta's properties. So, Time Travel and ACID properties are identical. It comes down to 2 high-level differences. 1. UC External: You decide the exact path for the table at ta...
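
A minimal sketch of the two flavors (the catalog, schema, and storage path below are hypothetical); since both are Delta tables, time travel queries work identically on either:

# Managed table: Unity Catalog chooses and manages the storage path
spark.sql("CREATE TABLE main.sales.orders_managed (id INT, amount DOUBLE)")

# External table: you choose the path; dropping the table keeps the files
spark.sql("""
    CREATE TABLE main.sales.orders_external (id INT, amount DOUBLE)
    LOCATION 'abfss://data@myaccount.dfs.core.windows.net/orders'
""")

# Time travel works the same on both
spark.sql("SELECT * FROM main.sales.orders_external VERSION AS OF 0").show()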

yumnus
by New Contributor III
  • 541 Views
  • 1 reply
  • 1 kudos

Resolved! Data Types - varchar(2147483647)

Hi, when I read a PostgreSQL table containing a custom datatype, it gets translated to VARCHAR(2147483647). I would like to understand how Databricks and Delta handle this scenario. Specifically, does Delta store all the bytes for the maximum length of...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Hello @yumnus, Delta does not allocate storage for the full declared length (2147483647 characters) if you only use a portion of it. Instead, Delta stores only the bytes for the characters actually used.
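
A quick way to see this in practice (the JDBC URL, secret scope, and table names below are hypothetical): the Postgres column arrives in Spark as a plain string, and Delta/Parquet store strings as variable-length bytes, so the declared 2147483647 maximum costs nothing per row:

df = (spark.read.format("jdbc")
      .option("url", "jdbc:postgresql://db.example.com:5432/mydb")
      .option("dbtable", "public.my_table")
      .option("user", dbutils.secrets.get("my-scope", "pg-user"))
      .option("password", dbutils.secrets.get("my-scope", "pg-password"))
      .load())

df.printSchema()                              # the custom type maps to StringType
df.write.saveAsTable("main.bronze.my_table")  # stored as variable-length values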

Syed-SnapLogic
by New Contributor
  • 379 Views
  • 1 reply
  • 1 kudos

Does Databricks support the password grant type?

Hi, for my Azure Databricks instance, I am able to generate an access token using the client_credentials and authorization_code grant types. I would like to know whether Databricks supports the password grant type. Is there any document or reference to...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Hello @Syed-SnapLogic, Databricks does not support the password grant type for generating access tokens. The supported grant types for generating access tokens in Databricks are client_credentials and authorization_code. For more information, you can...
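
For reference, a client_credentials token request against the workspace OAuth token endpoint looks roughly like this (the host and service-principal credentials are placeholders):

import requests

resp = requests.post(
    "https://adb-1234567890123456.7.azuredatabricks.net/oidc/v1/token",
    auth=("<service-principal-client-id>", "<client-secret>"),
    data={"grant_type": "client_credentials", "scope": "all-apis"},
)
resp.raise_for_status()
access_token = resp.json()["access_token"]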

octo_8RT
by New Contributor II
  • 849 Views
  • 3 replies
  • 1 kudos

Resolved! Can't access Azure Databricks

Hi, I've been trying to access my notebooks in Azure Databricks for two hours now. I have tried different browsers, deleted cookies, and restarted my computer. I still get the same error after connecting with SSO (see image below). I am in Belgium, but al...

[Image: octo_8RT_0-1732712599795.png]
Latest Reply
thomas-ikt4u
New Contributor III
  • 1 kudos

And it works again!

2 More Replies
Michael_Appiah
by Contributor
  • 2115 Views
  • 2 replies
  • 0 kudos

Delta Tables: Time-To-Live

I have seen somewhere (it might have been in a Databricks Tech Talk) a Delta Table feature that allows you to specify an "expiration date" for data stored in Delta Tables. Once rows surpass their time-to-live, they are automatically deleted or archived. Do...
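
Delta Lake has no built-in per-row time-to-live; the usual pattern is a scheduled job that deletes expired rows and then vacuums the table. A minimal sketch (table and column names are hypothetical):

# Delete rows older than 90 days, then reclaim the underlying files
spark.sql("DELETE FROM main.events.clickstream WHERE event_date < date_sub(current_date(), 90)")
spark.sql("VACUUM main.events.clickstream")  # removes unreferenced files once past the retention period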

Latest Reply
diana45peters
New Contributor II
  • 0 kudos

@Michael_Appiah wrote: I have seen somewhere (might have been in a Databricks Tech Talk) a Delta Table feature which allows to specify the "expiration date" of data stored in Delta Tables. Once rows surpass their time-to-live, they are automatically d...

1 More Replies
jeremy98
by Honored Contributor
  • 1119 Views
  • 4 replies
  • 0 kudos

Resolved! How to share a Unity PROD catalog with a STAGING workspace

Hello Community, I'm looking for a secure way to share a production Unity Catalog with the staging workspace. My goal is to sync data from a schema in the production catalog to the staging workspace, enabling it to read the data and write it into some...

Latest Reply
yumnus
New Contributor III
  • 0 kudos

Hi! A potential solution to your issue could be configuring read-only access to the schema in your production catalog. This approach allows you to securely share the production catalog with your staging workspace while ensuring that users in the stagi...
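
A sketch of the read-only grants this involves, assuming both workspaces are attached to the same metastore (the catalog, schema, and group names below are hypothetical):

spark.sql("GRANT USE CATALOG ON CATALOG prod TO `staging-readers`")
spark.sql("GRANT USE SCHEMA ON SCHEMA prod.gold TO `staging-readers`")
spark.sql("GRANT SELECT ON SCHEMA prod.gold TO `staging-readers`")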

3 More Replies
Riccardo96
by New Contributor II
  • 1162 Views
  • 3 replies
  • 0 kudos

DataFrame counts before and after write command do not match

Hi, I have noticed some strange behaviour in a notebook I am developing. When I use the notebook to read a single file it works correctly, but when I set it to read multiple files at once, using the recursive lookup option, I have noticed...

Latest Reply
Riccardo96
New Contributor II
  • 0 kudos

I just found out I was populating a column with random values; these values are filtered in a join, so the numbers change at each write and count.
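
For anyone hitting the same pitfall: a non-deterministic column such as rand() is recomputed on every action, so a count() and a later write can see different rows after a filtering join. A sketch of the problem and one fix (the DataFrame names are hypothetical):

from pyspark.sql import functions as F

# source_df and dim_df are hypothetical inputs
df = source_df.withColumn("bucket", F.rand())         # recomputed on every action
joined = df.join(dim_df, "key").where("bucket > 0.5")

# joined.count() and a later write can disagree because rand() is evaluated
# again for the write; persisting materializes the random values once
joined = joined.persist()
joined.count()   # triggers computation; the subsequent write reuses the cached rows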

2 More Replies
jeft
by New Contributor II
  • 467 Views
  • 2 replies
  • 0 kudos

Error ingesting MongoDB data into Databricks

spark = SparkSession.builder \
    .appName("MongoDBToDatabricks") \
    .config("spark.jars.packages", "org.mongodb.spark:mongo-spark-connector_2.12:10.4.0") \
    .config("spark.mongodb.read.connection.uri", mongodb_uri) \
    .config("spark.mongodb.write.connection.u...
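
The snippet above is cut off by the listing; a minimal complete version, assuming the 10.x connector and a mongodb_uri variable, would look roughly like this (the database and collection names are hypothetical):

from pyspark.sql import SparkSession

mongodb_uri = "mongodb+srv://user:password@cluster.example.net"  # placeholder

spark = (SparkSession.builder
         .appName("MongoDBToDatabricks")
         .config("spark.jars.packages", "org.mongodb.spark:mongo-spark-connector_2.12:10.4.0")
         .config("spark.mongodb.read.connection.uri", mongodb_uri)
         .config("spark.mongodb.write.connection.uri", mongodb_uri)
         .getOrCreate())

df = (spark.read.format("mongodb")
      .option("database", "mydb")
      .option("collection", "mycoll")
      .load())

Note that on Databricks the connector is usually installed as a cluster library; setting spark.jars.packages from a notebook is often too late to take effect.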

Latest Reply
Nam_Nguyen
Databricks Employee
  • 0 kudos

Hello @jeft , will you be able to share some screenshots of the driver logs?

1 More Replies
Dom1
by New Contributor III
  • 3971 Views
  • 5 replies
  • 3 kudos

Show log4j messages in run output

Hi, I have an issue when running JAR jobs. I expect to see logs in the output window of a run. Unfortunately, I can only see messages that are generated with "System.out.println" or "System.err.println". Everything that is logged via slf4j is only ...

[Image: Dom1_0-1713189014582.png]
Latest Reply
dbal
New Contributor III
  • 3 kudos

Any update on this? I am also facing this issue.

4 More Replies
Volker
by Contributor
  • 1116 Views
  • 4 replies
  • 0 kudos

Failed job with "A fatal error has been detected by the Java Runtime Environment"

Hi community, I have a question regarding an error that I sometimes get when running a job:

# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007fc941e74996, pid=940, tid=0x00007fc892dff640
#
# JRE versio...

Latest Reply
Volker
Contributor
  • 0 kudos

In the last run there was additional information in the error message:

# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f168e094210, pid=1002, tid=0x00007f15dd1ff640
#
# JRE version: OpenJDK Run...

3 More Replies
dimsh
by Contributor
  • 17473 Views
  • 13 replies
  • 10 kudos

How to overcome missing query parameters in Databricks SQL?

Hi there! I'm trying to build my first dashboard based on Databricks SQL. As far as I can see, once you define a query parameter you can't skip it. I'm looking for any option where I can make my parameter optional. For instance, I have a ta...

Latest Reply
techg
New Contributor II
  • 10 kudos

Is there any solution for the above-mentioned post?
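
One common workaround is to make the query itself treat an empty value as "no filter", e.g. with parameterized SQL (the table, column, and parameter names below are hypothetical):

# Parameterized SQL (Spark 3.4+ / DBR 13+): empty value means "no filter"
query = """
    SELECT * FROM main.sales.orders
    WHERE (:region = '' OR region = :region)
"""
spark.sql(query, args={"region": ""})       # empty string returns all rows
spark.sql(query, args={"region": "EMEA"})   # non-empty value filters as usual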

12 More Replies
LasseL
by New Contributor III
  • 3639 Views
  • 6 replies
  • 3 kudos

Resolved! The best practice to remove old data from DLT pipeline created tables

Hi, I didn't find any "reasonable" way to clean old data from DLT pipeline tables. In DLT we have used materialized views and streaming tables (SCD1, append only). What is the best way to delete old data from the tables (storage size increases linearly...

Latest Reply
TinasheChinyati
New Contributor III
  • 3 kudos

@LasseL
1. Enable Change Data Capture (CDC): Enable CDC before deleting data to ensure Delta tables track inserts, updates, and deletes. This allows downstream pipelines to handle deletions correctly.

ALTER TABLE your_table SET TBLPROPERTIES ('delta.e...
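
The property name is cut off above; it presumably refers to the change data feed flag. Together with a scheduled delete and vacuum, the pattern looks roughly like this (table and column names are hypothetical):

spark.sql("ALTER TABLE my_table SET TBLPROPERTIES ('delta.enableChangeDataFeed' = 'true')")
spark.sql("DELETE FROM my_table WHERE event_date < date_sub(current_date(), 365)")
spark.sql("VACUUM my_table")   # reclaims deleted files once past the retention period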

5 More Replies
Thor
by New Contributor III
  • 625 Views
  • 1 replies
  • 1 kudos

Resolved! Asynchronous progress tracking with foreachBatch

Hello, currently the doc says that async progress tracking is available only for the Kafka sink: https://docs.databricks.com/en/structured-streaming/async-progress-checking.html I would like to know if it would work for any sink that is "exactly once"? I exp...

Latest Reply
cgrant
Databricks Employee
  • 1 kudos

Asynchronous progress tracking is a feature designed for ultra low latency use cases. You can read more in the open source SPIP doc here, but the expected gain in time is in the hundreds of milliseconds, which seems insignificant when doing merge ope...
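
For reference, the feature is enabled per query on a Kafka sink via two writer options; a sketch, assuming df is a streaming DataFrame (the broker, topic, and checkpoint path are hypothetical):

(df.writeStream
   .format("kafka")
   .option("kafka.bootstrap.servers", "broker.example.com:9092")
   .option("topic", "events-out")
   .option("checkpointLocation", "/tmp/checkpoints/events")
   .option("asyncProgressTrackingEnabled", "true")
   .option("asyncProgressTrackingCheckpointIntervalMs", "5000")
   .start())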

Krishna2110
by New Contributor II
  • 358 Views
  • 1 replies
  • 0 kudos

Catalog Sample Data is not visible with all-purpose cluster

Hi All, I need some help. Even though I have cluster access and can run it attached to a notebook, when I go to the Catalog to see the sample data I get an error. Here is the error (@ipriyanksingh, FYR). Can anyone please help us...

[Image: Krishna2110_0-1732639768443.png]
Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @Krishna2110! Based on the error, are you using any token? Ensure that the access token is valid and has not expired. Is your workspace Unity Catalog enabled, and what are your cluster settings for browsing the data?

