Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

by HamidHamid_Mora (New Contributor II)
  • 4631 Views
  • 4 replies
  • 3 kudos

ganglia is unavailable on DBR 13.0

We created a library in Databricks to ingest Ganglia metrics for all jobs into our Delta tables. However, endpoint 8652 is no longer available on DBR 13.0. Is there any other endpoint available? We need to log all metrics for all executed jobs, not on...

Latest Reply
h_h_ak
Contributor
  • 3 kudos

You should have a look here: https://community.databricks.com/t5/data-engineering/azure-databricks-metrics-to-prometheus/td-p/71569

3 More Replies
by amanda3 (New Contributor II)
  • 1145 Views
  • 3 replies
  • 0 kudos

Flattening JSON while also keeping embedded types

I'm attempting to create DLT tables from a source table that includes a "data" column that is a JSON string. I'm doing something like this: sales_schema = StructType([ StructField("customer_id", IntegerType(), True), StructField("order_numbers",...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

To ensure that the "value" field retains its integer type, you can explicitly cast it after parsing the JSON. from pyspark.sql.functions import col, from_json, expr from pyspark.sql.types import StructType, StructField, IntegerType, ArrayType, LongTy...
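Spark aside, the "cast after parsing" idea can be sketched in plain Python. This is a minimal sketch, not the reply's actual code: the field names mirror the sales_schema from the question, but the helper function itself is hypothetical.

```python
import json

# Hypothetical schema mapping field names to Python types, standing in
# for the Spark StructType used in the thread.
SALES_SCHEMA = {"customer_id": int, "order_numbers": list, "value": int}

def parse_with_schema(raw: str, schema: dict) -> dict:
    """Parse a JSON string, then cast each field to its declared type
    so numeric fields keep (or regain) their integer type."""
    parsed = json.loads(raw)
    out = {}
    for field, expected in schema.items():
        if field in parsed:
            value = parsed[field]
            # Cast only when the parsed value is not already the right type.
            out[field] = value if isinstance(value, expected) else expected(value)
    return out

row = parse_with_schema(
    '{"customer_id": "7", "order_numbers": [1, 2], "value": "42"}',
    SALES_SCHEMA,
)
print(row)  # {'customer_id': 7, 'order_numbers': [1, 2], 'value': 42}
```

In Spark itself, the equivalent is passing from_json an explicit schema, or adding a .cast(...) on the parsed column, as the reply above describes.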

2 More Replies
by xhudik (New Contributor III)
  • 1220 Views
  • 1 replies
  • 1 kudos

Resolved! does stream.stop() generate "ERROR: Query termination received for []" automatically?

Whenever code contains stream.stop(), I get an error like this in STDERR (in the cluster logs): ERROR: Query termination received for [id=b7e14d07-f8ad-4ae6-99de-8a7cbba89d86, runId=5c01fd71-2d93-48ca-a53c-5f46fab726ff]. No other message, even if I try to try-cat...

Latest Reply
MuthuLakshmi
Databricks Employee
  • 1 kudos

@xhudik Does stream.stop() generate "ERROR: Query termination received for []" automatically? Yes, this is generated when there is a stream.stop(); it is written to stderr. Is "ERROR: Query termination received for []" dangerous, or is it just info that the stream was closed?...

by roberta_cereda (New Contributor)
  • 985 Views
  • 1 replies
  • 0 kudos

Describe history operationMetrics['materializeSourceTimeMs']

Hi, during some checks on MERGE execution, I was running the DESCRIBE HISTORY command, and in the operationMetrics column I noticed this information: operationMetrics['materializeSourceTimeMs']. I haven't found that metric in the documentation, so I...

Latest Reply
MuthuLakshmi
Databricks Employee
  • 0 kudos

@roberta_cereda If it's specific to "materializeSourceTimeMs", then it is the time taken to materialize the source (or to determine that materialization is not needed).

by pranav_k1 (New Contributor III)
  • 2290 Views
  • 3 replies
  • 1 kudos

Resolved! Error while loading mosaic in notebook - TimeoutException: Futures timed out after [80 seconds]

I am working on reading spatial data with Mosaic and GDAL. Previously I used databricks-mosaic version 0.3.9 with a Databricks cluster on 12.2 LTS, with the following command: %pip install databricks-mosaic==0.3.9 --quiet. Now it's giving a timeout er...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Hi @pranav_k1, Thanks for confirming it worked for you now! I see that the usual %pip install databricks-mosaic cannot install due to the fact that it has thus far allowed geopandas to essentially install the latest... As of geopandas==0.14.4, the vers...

2 More Replies
by DmitriyLamzin (New Contributor II)
  • 4853 Views
  • 2 replies
  • 1 kudos

applyInPandas function hangs in runtime 13.3 LTS ML and above

Hello, recently I tried to upgrade my runtime env to 13.3 LTS ML and found that it breaks my workload during applyInPandas. My job started to hang during the applyInPandas execution. A thread dump shows that it hangs on direct memory allocation: ...

Data Engineering
pandas udf
Latest Reply
Marcin_Milewski
New Contributor II
  • 1 kudos

Hi @Debayan, the link just redirects to the same thread? Is there any update on this issue? We see a similar issue with jobs hanging when using mapInPandas.

1 More Replies
by Sanjeev (New Contributor II)
  • 1567 Views
  • 3 replies
  • 1 kudos

Unverified Commits via Databricks Repos: Seeking Solution for GitHub Verification

The team is integrating Databricks Repos with Personal Access Tokens (PAT) to commit code directly to GitHub. Our organization requires signed commits for verification purposes. Issue: When committing via Databricks Repos, the commits appear as unveri...

Data Engineering
data engineering
Latest Reply
Sanjeev
New Contributor II
  • 1 kudos

Can you please share the link to this doc DB-I-3082. I couldn't find it.

2 More Replies
by Danny_Lee (Valued Contributor)
  • 1073 Views
  • 1 replies
  • 0 kudos

UI improvement - open multiple workspace notebooks

Hi all, I have an idea for a feature to open multiple notebooks. Currently, right-clicking a notebook in the Workspace allows you to "Open in new tab". If I multi-select notebooks, I only have the option to Move or Move to trash. Why not allow a us...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Many thanks for your feedback and great idea. We have created idea DBE-I-1544; it will be analyzed by our team and, if approved, it can be implemented in the near future.

by MOlliver (New Contributor)
  • 6814 Views
  • 1 replies
  • 0 kudos

DBT or Delta Live Tables

Quick question, when would people use DBT over Delta Live Tables? Or better yet can you use DBT to create Delta Live Tables?

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Delta Live Tables (DLT): DLT is an ETL (Extract, Transform, Load) framework designed to simplify the creation and management of data pipelines. It uses a declarative approach to build reliable data pipelines and automatically manages infrastructure a...

by Vishwanath_Rao (New Contributor II)
  • 2281 Views
  • 2 replies
  • 0 kudos

Photon plan invariant violated Error

We've run into a niche error where we get the below message only in our non-prod environment, with the same data and the same code as our prod environment: org.apache.spark.sql.execution.adaptive.InvalidAQEPlanException: Photon plan invariant violat...

Latest Reply
JAC2703
New Contributor II
  • 0 kudos

Hey, did you raise a ticket and get a resolution to this? 

1 More Replies
by felix_immanuel (New Contributor III)
  • 5311 Views
  • 4 replies
  • 2 kudos

Resolved! Error while Deploying Asset Bundle using Azure Devops

Hi, I'm trying to deploy the Asset Bundle using Azure DevOps, and it is giving me this error: Step: databricks bundle validate -t dev ========================== Starting Command Output =========================== 2024-09-02T05:41:19.9113254Z Error: failed du...

Latest Reply
sampo
New Contributor II
  • 2 kudos

I had a similar error message, but using the correct environment variables in the pipeline solved the problem, especially setting DATABRICKS_HOST to point to the account. A more detailed description is here: Databricks Asset Bundle OAuth Authentication in Az...
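For illustration only, a hedged sketch of what such a pipeline step might look like. The step layout and variable names are assumptions, not taken from the thread; DATABRICKS_HOST and the ARM_* variables follow the Databricks CLI's standard environment-variable names for Azure service-principal auth, and the exact set you need depends on your auth mode.

```yaml
# Hypothetical Azure DevOps pipeline step (not from the thread).
# DATABRICKS_HOST points at the workspace (or the account endpoint for
# account-level operations); ARM_* variables carry service-principal auth.
- script: databricks bundle validate -t dev
  env:
    DATABRICKS_HOST: $(databricksHost)
    ARM_TENANT_ID: $(tenantId)
    ARM_CLIENT_ID: $(clientId)
    ARM_CLIENT_SECRET: $(clientSecret)
```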

3 More Replies
by Kibour (Contributor)
  • 9836 Views
  • 3 replies
  • 2 kudos

Resolved! Import from repo

Hi all, I am trying the new "git folder" feature, with a repo that works fine from "Repos". In the new folder location, my imports from my own repo don't work anymore. Has anyone faced something similar? Thanks in advance for sharing your experience.

Latest Reply
Yuki
Contributor
  • 2 kudos

Hello, when I had the same issue even when using 14.3+, I tried this code:

```python
import sys
import pprint

pprint.pprint(sys.path)
```

And I noticed that the path came from the old (legacy) Repos folder; I had a folder with the same name in both the new location and Repos. Then I r...

2 More Replies
by vyasakhilesh (New Contributor)
  • 792 Views
  • 1 replies
  • 0 kudos

Error Creating Table from delta location dbfs

[UC_FILE_SCHEME_FOR_TABLE_CREATION_NOT_SUPPORTED] Creating table in Unity Catalog with file scheme dbfs is not supported. Instead, please create a federated data source connection using the CREATE CONNECTION command for the same table provider, then ...

Latest Reply
agallard2
New Contributor III
  • 0 kudos

Hi @vyasakhilesh, The error you're seeing, [UC_FILE_SCHEME_FOR_TABLE_CREATION_NOT_SUPPORTED], occurs because Unity Catalog in Databricks does not support creating tables directly from DBFS (Databricks File System) locations. In this case, you're trying...

by SethParker (New Contributor III)
  • 1632 Views
  • 6 replies
  • 0 kudos

Resolved! SQL View Formatting in Catalog - Can you turn it off?

It appears as though Databricks now formats SQL View definitions when showing them in the Catalog.  Our solution is based on views, and we have comment tags in those views.  We format these views so that it is easy for us to find and update parts of ...

Latest Reply
SethParker
New Contributor III
  • 0 kudos

Thank you! I will submit that request. In case anyone else stumbles upon this post, here is a function you can add that will return the view definition from information_schema, unformatted, with an ALTER statement at the top: DROP FUNCTION IF EXISTS <c...

5 More Replies
by John_Rotenstein (New Contributor II)
  • 6697 Views
  • 7 replies
  • 2 kudos

ODBC on Windows -- Where to specify Catalog name?

We are attempting to connect a Windows ODBC application to Unity Catalog. The "Configure the Databricks ODBC and JDBC drivers" documentation has a section titled "ODBC configuration and connection parameters" that mentions a configuration parameter call...

Latest Reply
PiotrU
Contributor II
  • 2 kudos

It's quite interesting. I am using a Mac with Simba Spark ODBC 2.8.2, and if I don't add the "Catalog" parameter, in the UI I will only see the default one (if I have access to it). That doesn't mean I cannot query other catalogs; they're just not listed in the ...
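For anyone landing here, a sketch of what the DSN entry can look like in odbc.ini, assuming the Simba Spark driver's documented Catalog and Schema parameters. The driver path, placeholders, and catalog name are illustrative, not exact values from this thread.

```ini
[Databricks]
Driver=/Library/simba/spark/lib/libsparkodbc_sb64.dylib
Host=<workspace-hostname>
Port=443
HTTPPath=<sql-warehouse-http-path>
SSL=1
ThriftTransport=2
AuthMech=3
UID=token
PWD=<personal-access-token>
; Sets the default catalog shown in tools; other catalogs remain
; queryable by fully qualified name, they are just not listed first.
Catalog=main
Schema=default
```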

6 More Replies
