Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

yvishal519
by Contributor
  • 1371 Views
  • 2 replies
  • 3 kudos

Resolved! Databricks DLT with Hive Metastore and ADLS Access Issues

We are currently working on Databricks DLT tables to transform data from bronze to silver. We were specifically instructed not to use mount paths for accessing data from ADLS Gen 2. To comply, I configured storage credentials and created an externa...

[Attachment: yvishal519_0-1721908544085.png]
Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 3 kudos

Hi @yvishal519, since you're using the Hive metastore, you have no option other than mount points. Storage credentials and external locations are only supported in Unity Catalog.

1 More Replies
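
For readers hitting the same constraint, here is a minimal sketch of the mount-point approach the reply describes: an ADLS Gen2 OAuth mount using a service principal. Every name below (secret scope, container, storage account, tenant) is a placeholder, not a value from the thread.

configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope="<secret-scope>", key="<service-credential-key>"),
    "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

# Mount once (e.g. from a setup notebook); the DLT pipeline can then read /mnt/bronze
dbutils.fs.mount(
    source="abfss://<container>@<storage-account>.dfs.core.windows.net/",
    mount_point="/mnt/bronze",
    extra_configs=configs,
)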
helghe
by New Contributor II
  • 1121 Views
  • 3 replies
  • 3 kudos

Unavailable system schemas

When I list the available schemas I get the following: {"schemas":[{"schema":"storage","state":"AVAILABLE"},{"schema":"operational_data","state":"UNAVAILABLE"},{"schema":"access","state":"AVAILABLE"},{"schema":"billing","state":"ENABLE_COMPLETED"},{"s...

Latest Reply
hle
New Contributor II
  • 3 kudos

I have the same issue for the compute schema. The workspace is UC-enabled and I'm an account admin.

2 More Replies
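
For anyone else seeing UNAVAILABLE entries: system schemas are enabled per metastore through the REST API, and a schema that is UNAVAILABLE simply may not be released for enablement yet. A sketch, assuming a token with metastore-admin rights and placeholder IDs (endpoint path as documented for system tables at the time of writing):

import requests

HOST = "https://<workspace-url>"      # placeholder
TOKEN = "<personal-access-token>"     # placeholder; needs metastore admin rights
METASTORE_ID = "<metastore-id>"       # placeholder

# Enable one system schema, e.g. "access"
resp = requests.put(
    f"{HOST}/api/2.0/unity-catalog/metastores/{METASTORE_ID}/systemschemas/access",
    headers={"Authorization": f"Bearer {TOKEN}"},
)
print(resp.status_code, resp.text)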
Amit_Dass_Chmp
by New Contributor III
  • 662 Views
  • 1 reply
  • 0 kudos

Auto-tuning capability available for external tables?

If I am using Databricks Runtime 11.3 and above to create managed Delta tables cataloged in Unity Catalog (Databricks’ data catalog), I don’t need to worry about optimizing the underlying file sizes or configuring a target file size for my Delta tabl...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @Amit_Dass_Chmp, yep, according to the documentation. As for the second question, such capability will be available in the future. If you are using Databricks Runtime 11.3 and above to create managed Delta tables cataloged in Unity Catalog (Databricks’ dat...

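
Until auto-tuning covers external tables, file sizes can still be steered manually with Delta table properties. A minimal sketch with a placeholder table name; the values are illustrative only:

spark.sql("""
    ALTER TABLE my_catalog.my_schema.my_external_table
    SET TBLPROPERTIES (
        'delta.targetFileSize' = '128mb',
        'delta.autoOptimize.optimizeWrite' = 'true',
        'delta.autoOptimize.autoCompact' = 'true'
    )
""")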
dpc
by New Contributor III
  • 2142 Views
  • 3 replies
  • 4 kudos

Resolved! Approach to monthly data snapshots

Hello. I'm building a data warehouse with all the usual facts and dimensions. It will flush (truncate) and rebuild on a monthly basis. Users need to not only view the data now but also view it historically, i.e. what it was at a point in time. My initial...

Latest Reply
dpc
New Contributor III
  • 4 kudos

Great, thanks

2 More Replies
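
The accepted solution isn't visible in the excerpt; one common pattern for this requirement is to stamp each monthly load with a snapshot column and append, instead of truncating, so every month stays queryable. A sketch with placeholder table names:

from pyspark.sql import functions as F

# Stamp the current load with its snapshot month...
snapshot = (
    spark.table("staging.dim_customer")
         .withColumn("snapshot_month", F.trunc(F.current_date(), "month"))
)

# ...and append instead of truncate-and-reload; history stays queryable
(snapshot.write
         .mode("append")
         .partitionBy("snapshot_month")
         .saveAsTable("warehouse.dim_customer_history"))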
angel531
by New Contributor II
  • 1503 Views
  • 3 replies
  • 3 kudos

Resolved! Getting an error while accessing DBFS from a Databricks Community account and couldn't upload any files

Hi, I have enabled DBFS in my Databricks Community account and started the cluster. While accessing DBFS, it throws an error.

[Attachment: doubt.png]
Latest Reply
satyakiguha
New Contributor III
  • 3 kudos

Hi @Retired_mod, I am no longer facing this issue. Thanks to the team for fixing it!

2 More Replies
fdeba
by New Contributor
  • 1410 Views
  • 2 replies
  • 0 kudos

DatabricksSession and SparkConf

Hi, I want to initialize a Spark session using `DatabricksSession`. However, it seems it is not possible to call `.config()` and pass it a `SparkConf` instance. The following works: # Initialize the configuration for the Spark session confSettings = [ ("...

Latest Reply
Witold
Honored Contributor
  • 0 kudos

In almost all cases you don't need to create a new Spark session, as Databricks will do it for you automatically. If it's only about Spark configurations, there are multiple ways to set them: cluster settings, or spark.conf.set.

1 More Replies
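
A sketch of the reply's suggestion, assuming databricks-connect: let the session be created for you and set individual options afterwards rather than passing a SparkConf (the config keys below are examples only):

from databricks.connect import DatabricksSession

# DatabricksSession does not accept a SparkConf; set options on the session instead
spark = DatabricksSession.builder.getOrCreate()
spark.conf.set("spark.sql.shuffle.partitions", "200")
spark.conf.set("spark.sql.session.timeZone", "UTC")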
mkd
by New Contributor II
  • 5857 Views
  • 3 replies
  • 3 kudos

Resolved! CSV import error

Upload Error: "Error occurred when processing file tips1.csv: [object Object]". I've been trying to import a CSV file from my local machine to Databricks, and the above-mentioned error couldn't be resolved. Can anyone please help me in this regard?

Latest Reply
clentin
Contributor
  • 3 kudos

@Retired_mod - this is now fixed. Thank you so much for your prompt action. Appreciate it. 

2 More Replies
kwinsor5
by New Contributor II
  • 2881 Views
  • 2 replies
  • 0 kudos

Delta Live Table autoloader's inferColumnTypes does not work

I am experimenting with DLTs/Autoloader. I have a simple, flat JSON file that I am attempting to load into a DLT (following this guide) like so:  CREATE OR REFRESH STREAMING LIVE TABLE statistics_live COMMENT "The raw statistics data" TBLPROPERTIES (...

Latest Reply
pavlos_skev
New Contributor III
  • 0 kudos

I had the same issue with a similar JSON structure as yours. Adding the option "multiLine" set to true fixed it for me: df = (spark.readStream.format("cloudFiles") .option("multiLine", "true") .option("cloudFiles.schemaLocation", schemaLocation) ...

1 More Replies
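
A sketch completing the reply's truncated snippet (paths are placeholders): multiLine lets Auto Loader parse JSON records that span multiple lines, which otherwise derails type inference.

schema_location = "/tmp/schemas/statistics"   # placeholder schema-tracking path
source_path = "/tmp/raw/statistics"           # placeholder input path

df = (
    spark.readStream.format("cloudFiles")
         .option("cloudFiles.format", "json")
         .option("multiLine", "true")                       # the fix from the reply
         .option("cloudFiles.schemaLocation", schema_location)
         .option("cloudFiles.inferColumnTypes", "true")     # the option from the title
         .load(source_path)
)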
kapilb
by New Contributor III
  • 2824 Views
  • 5 replies
  • 2 kudos

Resolved! Regarding problem in accessing table and uploading files

Hello team, I am new to Databricks. I am using Databricks Community Edition. A few days back I was able to access my tables and create tables by uploading CSV files. But now I am getting an error: "File Browsing Error". It says Workspace is not set in Cu...

Latest Reply
kapilb
New Contributor III
  • 2 kudos

The issue has been resolved

4 More Replies
Prajwal_082
by New Contributor II
  • 816 Views
  • 0 replies
  • 0 kudos

DLT apply_changes_from_snapshot

Is there a way to get an auto-incremented value for the next version parameter, like next version = previous version + 1? In that case, how do I get the previous version value? The code below is from the documentation's "Historical snapshot processing". The APPLY CHANGES API...

[Attachment: Prajwal_082_0-1721887681729.png]
aalanis
by New Contributor II
  • 1183 Views
  • 3 replies
  • 2 kudos

Issues reading JSON files with Databricks vs OSS PySpark

Hi everyone, I'm currently developing an application in which I read JSON files with a nested structure. I developed my code locally on my laptop using the open-source version of PySpark (3.5.1), using code similar to this: sample_schema: schema = Struct...

Latest Reply
sushmithajk
New Contributor II
  • 2 kudos

Hi, I'd like to try the scenario and find a solution. Would you mind sharing a sample file? 

2 More Replies
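
The sample file isn't attached here; as a general sketch, pinning an explicit schema (plus multiLine for pretty-printed files) usually makes JSON reads behave the same on OSS PySpark and Databricks. The field names below are invented:

from pyspark.sql.types import StructType, StructField, StringType, ArrayType

schema = StructType([
    StructField("id", StringType()),
    StructField("payload", StructType([          # nested struct, as in the post
        StructField("name", StringType()),
        StructField("tags", ArrayType(StringType())),
    ])),
])

df = (
    spark.read.schema(schema)
         .option("multiLine", "true")   # needed when records span lines
         .json("/tmp/sample.json")      # placeholder path
)
df.printSchema()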
Splush
by New Contributor II
  • 2066 Views
  • 1 reply
  • 1 kudos

Resolved! Row Level Security while streaming data with Materialized views

Hey, I have the following problem when trying to add row-level security to some of our materialized views. According to the documentation this feature is still in preview; nevertheless, I'm trying to understand why this doesn't work and how it would be sup...

Latest Reply
raphaelblg
Databricks Employee
  • 1 kudos

Hello @Splush, there are two ways to create materialized views at the current moment:
1. Through Databricks SQL: Use Materialized Views in Databricks SQL. These are the limitations.
2. Through DLT: Materialized View (DLT). All DLT tables are subject to ...

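
For context, this is what row-level security looks like on a regular Unity Catalog table; per the reply, materialized views carry extra limitations while the feature is in preview. All names below are placeholders:

# Define a filter function, then bind it to the table's region column
spark.sql("""
    CREATE OR REPLACE FUNCTION main.default.region_filter(region STRING)
    RETURN IF(IS_ACCOUNT_GROUP_MEMBER('admins'), TRUE, region = 'US')
""")
spark.sql("""
    ALTER TABLE main.default.sales
    SET ROW FILTER main.default.region_filter ON (region)
""")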
robertkoss
by New Contributor III
  • 2866 Views
  • 7 replies
  • 1 kudos

Autoloader schema hints are not taken into consideration in the schema file

I am using Autoloader with schema inference to automatically load some data into S3. I have one column that is a Map, which is overwhelming Autoloader (it tries to infer it as a struct -> creating a struct with all keys as properties), so I just use a sc...

Latest Reply
Witold
Honored Contributor
  • 1 kudos

Sorry, I didn't mean that your solution is poorly designed. I was only referring to one of the main definitions of your bronze layer: you want to have a defined and optimized data layout which is source-driven at the same time. In other words: ...

6 More Replies
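
A sketch of the schema-hint approach the thread discusses: pin the problematic column to MAP so inference stops exploding it into a struct of every observed key. The column name and paths are placeholders:

df = (
    spark.readStream.format("cloudFiles")
         .option("cloudFiles.format", "json")
         .option("cloudFiles.schemaLocation", "/tmp/schemas/events")          # placeholder
         .option("cloudFiles.schemaHints", "attributes MAP<STRING, STRING>")  # force MAP, not STRUCT
         .load("s3://my-bucket/raw/events")                                   # placeholder
)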
phanindra
by New Contributor III
  • 4532 Views
  • 3 replies
  • 5 kudos

Resolved! Support for Varchar data type

In the official documentation for supported data types, VARCHAR is not listed. But in the product, we are allowed to create a field of the VARCHAR data type. We are building an integration with Databricks and we are confused about whether we should support operatio...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 5 kudos

Hi @phanindra, to be precise, the Delta Lake format is based on Parquet files. For strings, Parquet only has one data type: StringType. So, basically, the varchar(n) data type under the hood is represented as a string with a check constraint on the length of the st...

2 More Replies
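
A small illustration of the reply's point (placeholder table name): VARCHAR(n) behaves like a string with a length check, so an integration can generally treat it as a bounded string.

spark.sql("CREATE TABLE main.default.varchar_demo (code VARCHAR(3))")
spark.sql("INSERT INTO main.default.varchar_demo VALUES ('abc')")    # fits: 3 characters
# The next line would fail the implicit length check: 4 > 3
# spark.sql("INSERT INTO main.default.varchar_demo VALUES ('abcd')")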
