Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

osas
by New Contributor II
  • 2953 Views
  • 7 replies
  • 3 kudos

databricks academy setup error -data engineering

I am trying to run the setup notebook "_COMMON" for my Academy Data Engineering course, and I am getting the below error: "Configuration dbacademy.deprecation.logging is not available."

Latest Reply
iFoxz17
New Contributor
  • 3 kudos

Databricks is transitioning from the Community Edition to the Free Edition, which I am currently using. Inspecting the code, the problem seems to be related to the spark.conf.get() method, which is declared as follows in the documentation: ...
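
For what it's worth, spark.conf.get() raises an error for keys that are not set unless you pass a default value as the second argument. A minimal sketch of the guard, assuming you can edit the setup notebook yourself (the fallback value "disabled" is a placeholder, not the course's real default):

    # One-argument form fails if the key is unset:
    #   spark.conf.get("dbacademy.deprecation.logging")
    # Two-argument form returns the supplied default instead of raising:
    value = spark.conf.get("dbacademy.deprecation.logging", "disabled")  # placeholder default
    print(value)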

6 More Replies
Hatter1337
by New Contributor III
  • 3885 Views
  • 6 replies
  • 3 kudos

Resolved! Write Spark DataFrame into OpenSearch

Hi Databricks Community, I'm trying to read an index from OpenSearch or write a DataFrame into an OpenSearch index using the native Spark OpenSearch connector:

host = dbutils.secrets.get(scope="opensearch", key="host")
port = dbutils.secrets.get(scope=...

Latest Reply
SayedAbdallah
New Contributor
  • 3 kudos

Hi, I am getting the same error, and I was also able to connect using opensearch-py. I also found in this doc https://github.com/opensearch-project/opensearch-hadoop/blob/main/README.md#requirements that I need to have some JARs; I already added them witho...
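
For anyone landing here, a minimal write sketch using the opensearch-hadoop Spark connector, assuming the connector JAR from the README above is installed on the cluster; the index name and the wan.only/ssl settings are assumptions to adapt to your own OpenSearch domain:

    host = dbutils.secrets.get(scope="opensearch", key="host")
    port = dbutils.secrets.get(scope="opensearch", key="port")

    df = spark.createDataFrame([(1, "alpha"), (2, "beta")], ["id", "label"])

    (df.write
       .format("org.opensearch.spark.sql")           # provided by the opensearch-hadoop JAR
       .option("opensearch.nodes", host)
       .option("opensearch.port", port)
       .option("opensearch.nodes.wan.only", "true")  # typical for a managed/remote domain
       .option("opensearch.net.ssl", "true")
       .mode("append")
       .save("my-index"))                            # placeholder index name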

5 More Replies
Oumeima
by New Contributor III
  • 2048 Views
  • 5 replies
  • 3 kudos

Resolved! I can't use my own .whl package in Databricks app with databricks asset bundles

I am building a Databricks app using Databricks Asset Bundles. I need to use a helper package that I built as an artifact and use in other resources outside the app. The only way to use it is to have the built package inside the app source code f...

Latest Reply
nk-five1
New Contributor
  • 3 kudos

Thank you very much. I hope to translate your tips to my case, which does not use asset bundles.

4 More Replies
Rohit_hk
by New Contributor
  • 73 Views
  • 2 replies
  • 1 kudos

DLT Autoloader schemaHints from JSON file instead of inline list?

Hi @Witold, @Hubert-Dudek, I'm using a DLT pipeline to ingest real-time data from Parquet files in S3 into Delta tables using Auto Loader. The pipeline is written in SQL notebooks. Problem: Sometimes decimal columns in the Parquet files get inferred as I...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

- DLT automatically uses cloudFiles.schemaLocation, so the schema is stored automatically and in many cases it will be stable, but it does not
- Keep using cloudFiles.schemaHints, but just load the JSON into a variable and pass that variable (I guess you w...
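
A sketch of that "load the hints from JSON" idea in a Python pipeline; the hints file path, its layout, and the source path are assumptions (a SQL notebook would need the equivalent via a variable or templating step):

    import json

    # Assumed hints file layout: {"amount": "DECIMAL(38,10)", "qty": "BIGINT"}
    with open("/Volumes/main/config/schema_hints.json") as f:   # hypothetical path
        hints = json.load(f)

    # cloudFiles.schemaHints expects a DDL-style string: "col1 TYPE, col2 TYPE"
    schema_hints = ", ".join(f"{col} {dtype}" for col, dtype in hints.items())

    df = (spark.readStream
            .format("cloudFiles")
            .option("cloudFiles.format", "parquet")
            .option("cloudFiles.schemaHints", schema_hints)
            .load("s3://my-bucket/landing/"))                   # placeholder source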

1 More Replies
TimB
by New Contributor III
  • 11442 Views
  • 7 replies
  • 0 kudos

Foreign catalog - Connections using insecure transport are prohibited --require_secure_transport=ON

I have added a connection to a MySQL database in Azure, and I have created a foreign catalog in Databricks. But when I go to query the database I get the following error: Connections using insecure transport are prohibited while --require_secure_trans...

Latest Reply
Joe_Breath1
New Contributor III
  • 0 kudos

That message usually shows up when a system blocks unsafe connections, which is a good security step. It’s similar to how runt por protects users by only allowing secure access when checking vehicle data, so info stays private and safe.
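
For context, --require_secure_transport=ON means the MySQL server rejects any client that does not connect over TLS, so the connection has to be configured for SSL. As a point of comparison, a plain-JDBC read against the same server with TLS enforced might look like the sketch below (sslMode=REQUIRED is a MySQL Connector/J 8.x property; the host, table, and secret names are placeholders, and a Unity Catalog connection would need the equivalent setting in its own options):

    jdbc_url = "jdbc:mysql://myserver.mysql.database.azure.com:3306/mydb?sslMode=REQUIRED"

    df = (spark.read
            .format("jdbc")
            .option("url", jdbc_url)                 # TLS enforced via sslMode
            .option("dbtable", "my_table")           # placeholder table
            .option("user", dbutils.secrets.get("my-scope", "mysql-user"))
            .option("password", dbutils.secrets.get("my-scope", "mysql-password"))
            .load())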

6 More Replies
cdn_yyz_yul
by Contributor
  • 111 Views
  • 3 replies
  • 2 kudos

Resolved! how to avoid extra column after retry upon UnknownFieldException

With Auto Loader and .option("cloudFiles.schemaEvolutionMode", "addNewColumns"), I have done a retry after getting org.apache.spark.sql.catalyst.util.UnknownFieldException: [UNKNOWN_FIELD_EXCEPTION.NEW_FIELDS_IN_FILE] Encountered unknown fields during par...

Latest Reply
cdn_yyz_yul
Contributor
  • 2 kudos

Hi @Hubert-Dudek, the input is CSV. readStream reads the CSV with .option("cloudFiles.inferColumnTypes", "true"). Then df.toDF() is called to rename the columns. The original CSV header has spaces, which is why the error message has "test 1_2 Prime". The r...
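
A sketch of that read-then-rename pattern, with placeholder paths; the rename simply replaces spaces so a header like "test 1_2 Prime" becomes "test_1_2_Prime":

    df = (spark.readStream
            .format("cloudFiles")
            .option("cloudFiles.format", "csv")
            .option("cloudFiles.inferColumnTypes", "true")
            .option("cloudFiles.schemaEvolutionMode", "addNewColumns")
            .option("cloudFiles.schemaLocation", "/tmp/schema/")   # placeholder
            .load("/tmp/landing/"))                                # placeholder

    # toDF() takes one new name per existing column, in order
    renamed = df.toDF(*[c.replace(" ", "_") for c in df.columns])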

2 More Replies
Garrus990
by New Contributor II
  • 1792 Views
  • 5 replies
  • 2 kudos

How to run a python task that uses click for CLI operations

Hey, in my application I am using click to facilitate CLI operations. It works locally, in notebooks, and when scripts are run locally, but it fails in Databricks. I defined a task that, as an entrypoint, accepts the file where the click-decorated functio...

Latest Reply
Garrus990
New Contributor II
  • 2 kudos

Hey guys, I think I managed to find a workaround. I will leave it here for everyone who is seeking the same answers, including future me. What I did is basically this piece of code:

def main():
    try:
        assign_variants(standalone_mode=False)
    ...
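
To make the trick explicit: click commands are callable, and passing standalone_mode=False stops click from calling sys.exit() when the command finishes, which is what kills a Databricks task. A self-contained sketch with a stand-in command (assign_variants below is a hypothetical example, not the poster's real code):

    import click

    @click.command()
    @click.option("--variant", default="a")
    def assign_variants(variant):
        """Stand-in for the real click-decorated function."""
        print(f"assigning variant {variant}")

    def main():
        try:
            # standalone_mode=False: click returns instead of calling sys.exit()
            assign_variants(args=["--variant", "b"], standalone_mode=False)
        except click.ClickException as exc:
            exc.show()

    if __name__ == "__main__":
        main()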

4 More Replies
tnyein_99
by New Contributor
  • 163 Views
  • 4 replies
  • 6 kudos

Resolved! ONLY PNG format is available for databricks dashboard table download

I couldn't download the data straight from Databricks dashboards in CSV format starting from last night (the night of Dec 1st, 2025). The only format that is available right now is PNG. I've tried downloading the data on multiple browsers but only the PN...

Latest Reply
random_user77
New Contributor
  • 6 kudos

Hey @Advika, you say a quick workaround is to right-click and download the CSV from there. What do you mean? Where? I am right-clicking all over my dashboard widget and don't see a CSV download option. Can you be more specific?

3 More Replies
a_user12
by New Contributor III
  • 169 Views
  • 7 replies
  • 2 kudos

Declarative Pipelines: set Merge Schema to False

Dear Team! I want to prevent the schema of a certain table from being updated automatically. With plain structured streaming I can do the following:

silver_df.writeStream \
    .format("delta") \
    .option("mergeSchema", "false") \
    .option("checkpoi...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

It is automatic in DLT. If there are significant schema changes, you need to do a full refresh. Maybe consider storing everything (the whole JSON) in a single VARIANT column and unpacking only what is necessary later - this way you will have it under cont...
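
A sketch of the VARIANT approach, assuming a runtime with VARIANT support (recent DBR) and placeholder paths and field names:

    from pyspark.sql.functions import expr

    # Keep the raw JSON in one VARIANT column so upstream schema drift
    # cannot change the table layout.
    raw = spark.read.text("/tmp/landing/")                      # placeholder path
    silver = raw.select(expr("parse_json(value)").alias("payload"))

    # Unpack only the fields you actually need, when you need them.
    ids = silver.select(
        expr("variant_get(payload, '$.customer.id', 'string')").alias("customer_id")
    )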

6 More Replies
dpc
by Contributor
  • 177 Views
  • 3 replies
  • 2 kudos

Resolved! API Call to return more than 100 jobs

Hello, I have around 150 jobs and this is likely to increase. I use this call to get all the jobs and write them into a list called json. My logic here is to match a name to a job ID and run the job using the job ID.

response = requests.get(hostHTTPS, j...

Latest Reply
dpc
Contributor
  • 2 kudos

Looping using next_page_token works well, thanks @bianca_unifeye 
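
For anyone else, a sketch of that pagination loop against the Jobs API (the workspace URL and token are placeholders; each response carries a next_page_token until the last page, and page-size caps vary by API version):

    import requests

    host = "https://<workspace-url>"                      # placeholder
    headers = {"Authorization": f"Bearer {token}"}        # token assumed to exist

    jobs, page_token = [], None
    while True:
        params = {"limit": 25}                            # page size
        if page_token:
            params["page_token"] = page_token
        resp = requests.get(f"{host}/api/2.1/jobs/list", headers=headers, params=params)
        resp.raise_for_status()
        body = resp.json()
        jobs.extend(body.get("jobs", []))
        page_token = body.get("next_page_token")
        if not page_token:
            break

    # Match a name to a job_id, as described above
    name_to_id = {j["settings"]["name"]: j["job_id"] for j in jobs}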

2 More Replies
SanjeevPrasad
by New Contributor II
  • 133 Views
  • 3 replies
  • 5 kudos

Resolved! user standard serverless with asset bundle on Azure

Is anyone running into issues using standard serverless with an asset bundle? We tried all options with the below line:

performance_target: STANDARD

but it ignores the above value and uses a performance-optimized cluster, which is not expected. Any lead with ri...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 5 kudos

resources:
  jobs:
    my_dabs:
      performance_target: STANDARD

Please check whether it is at the correct level in the YAML. Also consider updating the CLI. I've just tested it, and it worked properly.

2 More Replies
Adrianj
by New Contributor III
  • 20190 Views
  • 19 replies
  • 12 kudos

Databricks Bundles - How to select which jobs resources to deploy per target?

Hello, my team and I are experimenting with bundles. We follow the pattern of having one main databricks.yml file and each job definition specified in a separate YAML for modularization. We wonder if it is possible to select from the main Databricks....

Latest Reply
Dimitry
Valued Contributor
  • 12 kudos

Experiencing the same issue. Solved it partially by placing top-level targets in the job YAML file, but this only works if the job goes to only one environment. If it is for two environments, but not the third, there is no way to avoid duplicating t...

18 More Replies
Penguin_eye
by New Contributor
  • 99 Views
  • 3 replies
  • 4 kudos

Getting below error when trying to create a Data Quality Monitor for the table. ‘Cannot create Monit

Getting the below error when trying to create a Data Quality Monitor for the table: 'Cannot create Monitor because it exceeds the number of limit 500'.

Data Engineering
Databricks Lakehouse monitoring
Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 4 kudos

Maybe this is not your situation, but trial accounts have lower quotas. I tried to find the quota related to monitors in "databricks resource-quotas list-quotas" but couldn't find it. Your account contact at Databricks can probably adjust it or find wit...

2 More Replies
ScottH
by New Contributor III
  • 148 Views
  • 4 replies
  • 4 kudos

Resolved! How to create a Unity Catalog Connection to SQL Server using service principal??

I am trying to use the Databricks Python SDK (v0.63.0) to create a Unity Catalog connection to an Azure-hosted SQL Server database, using an Azure service principal to authenticate. I have successfully done this via the Workspace UI, but I am trying t...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 4 kudos

Hi @ScottH, you need to configure it in the following way (I've tested it and it works). In the place where the red arrow is pointing, you need to provide your own tenant_id:
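
As a rough sketch of the equivalent SDK call (the option keys below are assumptions; a reliable trick is to fetch the working UI-created connection and mirror its options):

    from databricks.sdk import WorkspaceClient
    from databricks.sdk.service.catalog import ConnectionType

    w = WorkspaceClient()

    # Inspect the connection created through the UI to see the exact option keys
    existing = w.connections.get("my_ui_connection")      # hypothetical name
    print(existing.options)

    # Then create the new connection with the same shape (keys are assumptions)
    w.connections.create(
        name="sqlserver_sp_conn",
        connection_type=ConnectionType.SQLSERVER,
        options={
            "host": "myserver.database.windows.net",      # placeholder host
            "port": "1433",
            # plus the service-principal fields (including the tenant_id
            # mentioned above), exactly as the UI-created connection shows them
        },
    )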

3 More Replies
chad_woodhead
by New Contributor
  • 3589 Views
  • 5 replies
  • 0 kudos

Unity Catalog is missing column in Catalog Explorer

I have just altered one of my tables and added a column:

ALTER TABLE tpch.customer ADD COLUMN C_CUSTDETAILS struct<key:string,another_key:string,boolean_key:boolean,extra_key:string,int_key:long,nested_object:struct<more:long,arrayOne:array<string>>>

A...

Latest Reply
GoToJDenman
New Contributor II
  • 0 kudos

I had this error just recently. I did basically the same table transformation 4 times over the course of 2 days. Added two new fields to two different tables using the same SQL syntax. It worked 3 out of 4 times, but for 1 the column is not available...

4 More Replies
