Data Engineering

Forum Posts

akisugi
by New Contributor III
  • 1367 Views
  • 5 replies
  • 0 kudos

Resolved! Is it possible to control the ordering of the array values created by array_agg()?

Hi! I would be glad to ask you some questions. I have the following data. I would like to get this kind of result: I want `move` to correspond to the order of `hist`. Therefore, I considered the following query: ```with tmp as (select * from (values(1, ...

[Attached screenshots: スクリーンショット 2024-04-06 23.08.15.png, スクリーンショット 2024-04-06 23.07.34.png]
Latest Reply
akisugi
New Contributor III
  • 0 kudos

Hi @ThomazRossito, this is a great idea. It can solve my problem. Thank you.

  • 0 kudos
4 More Replies
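For the array_agg() ordering question above, the accepted suggestion is not shown in full, so here is only a common pattern in Databricks SQL, stated as an assumption rather than the thread's exact fix: aggregate `hist` and `move` together as structs, sort the array by `hist`, then project `move` back out. A minimal sketch run from a notebook; the table name `tmp` and the grouping column `id` are placeholders based on the post.

```python
# Hypothetical sketch: order the aggregated `move` values by `hist`.
# `tmp`, `id`, `hist`, and `move` are assumed from the post; adjust to your data.
result = spark.sql("""
    SELECT id,
           transform(
               array_sort(array_agg(struct(hist, move))),  -- sort structs by hist first
               x -> x.move                                  -- then keep only move
           ) AS moves
    FROM tmp
    GROUP BY id
""")
result.show()
```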
57410
by New Contributor
  • 686 Views
  • 1 reply
  • 0 kudos

Deploy python application with submodules - Poetry library management

Hi, I'm using DBX (I'll soon move to Databricks Asset Bundle, but it doesn't change anything in my situation) to deploy a Python application to Databricks. I'm also using Poetry to manage my libraries and dependencies. My project looks like this: Proje...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @57410, It seems you’re transitioning from DBX to Databricks Asset Bundles (DABs) for managing your complex data, analytics, and ML projects on the Databricks platform. Let’s dive into the details and address the issue you’re facing. Databricks...

  • 0 kudos
cool_cool_cool
by New Contributor
  • 269 Views
  • 2 replies
  • 2 kudos

Trigger Dashboard Update At The End of a Workflow

Heya, I have a workflow that computes some data and writes to a Delta table, and I have a dashboard that is based on the table. How can I trigger a refresh of the dashboard once the workflow is finished? Thanks!

Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hi @cool_cool_cool, To ensure your dashboard reflects the most up-to-date data after your Databricks workflow completes, consider the following options: Scheduled Notebook Refresh: Dashboards do not automatically live-refresh when presented from ...

  • 2 kudos
1 More Replies
mikeagicman
by New Contributor
  • 273 Views
  • 1 reply
  • 0 kudos

Handling Unknown Fields in DLT Pipeline

Hi, I'm working on a DLT pipeline where I read JSON files stored in S3. I'm using Auto Loader to identify the file schema and adding schema hints for some fields to specify their type. When running it against a single data file that contains addition...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @mikeagicman, When you encounter the error message 'terminated with exception: [UNKNOWN_FIELD_EXCEPTION.NEW_FIELDS_IN_RECORD_WITH_FILE_PATH] Encountered unknown fields during parsing.', it means that the data file contains fields that are not defi...

  • 0 kudos
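As context for the reply above: when Auto Loader's inferred or hinted schema does not cover new fields, the stream stops with UNKNOWN_FIELD_EXCEPTION and, with the default evolution mode, picks up the new columns on restart. A minimal DLT sketch, not taken from the thread; the path, table name, and hint columns are placeholders.

```python
import dlt

@dlt.table(name="bronze_events")  # placeholder table name
def bronze_events():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        # hints pin the types of known fields; other fields are still inferred
        .option("cloudFiles.schemaHints", "amount DOUBLE, event_time TIMESTAMP")
        # "addNewColumns" (the default) fails the stream once, then evolves on restart;
        # "rescue" keeps the schema fixed and routes unknown fields to _rescued_data
        .option("cloudFiles.schemaEvolutionMode", "addNewColumns")
        .load("s3://my-bucket/raw/events/")  # placeholder path
    )
```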
EdemSeitkh
by New Contributor III
  • 1493 Views
  • 4 replies
  • 0 kudos

Resolved! Pass catalog/schema/table name as a parameter to sql task

Hi, I am trying to pass a catalog name as a parameter into the query for a SQL task, and it pastes it with single quotes, which results in an error. Is there a way to pass the raw value, or are there other possible workarounds? Query: INSERT INTO {{ catalog }}.pas.product_snap...

Latest Reply
lathaniel
New Contributor II
  • 0 kudos

@EdemSeitkh can you elaborate on your workaround? Curious how you were able to implement an enum parameter in DBSQL. I'm running into this same issue now.

  • 0 kudos
3 More Replies
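The thread's own workaround is only hinted at, so the following is an assumption rather than what was marked as the solution: on recent runtimes, the IDENTIFIER() clause lets a string parameter be resolved as a catalog/schema/table name instead of being pasted as a quoted literal. A sketch using named parameter markers from a notebook; the catalog value, schema, table, and source names are all placeholders.

```python
# Hypothetical sketch: resolve a catalog name passed as a string parameter.
catalog_name = "dev"  # in a SQL task this would come from the task parameter

spark.sql(
    """
    INSERT INTO IDENTIFIER(:cat || '.my_schema.my_table')  -- placeholder schema/table
    SELECT * FROM source_view                               -- placeholder source
    """,
    args={"cat": catalog_name},
)
```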
939772
by New Contributor III
  • 658 Views
  • 1 reply
  • 0 kudos

Resolved! DLT refresh unexpectedly failing

We've been hitting an error with a Delta Live Table refresh since yesterday; nothing has changed in our system, yet there appears to be a configuration error: { ... "timestamp": "2024-04-08T23:00:10.630Z", "message": "Update b60485 is FAILED.",...

Latest Reply
939772
New Contributor III
  • 0 kudos

Apparently the `custom_tags` entry for `ResourceClass` is now extraneous -- removing it from the config corrected our problem.

  • 0 kudos
brian_zavareh
by New Contributor III
  • 1617 Views
  • 5 replies
  • 4 kudos

Resolved! Optimizing Delta Live Table Ingestion Performance for Large JSON Datasets

I'm currently facing challenges with optimizing the performance of a Delta Live Table pipeline in Azure Databricks. The task involves ingesting over 10 TB of raw JSON log files from an Azure Data Lake Storage account into a bronze Delta Live Table la...

Data Engineering
autoloader
bigdata
delta-live-tables
json
Latest Reply
standup1
New Contributor III
  • 4 kudos

Hey @brian_zavareh, see this document; I hope it can help: https://learn.microsoft.com/en-us/azure/databricks/compute/cluster-config-best-practices. Just keep in mind that there's some extra cost from the Azure VM side; check your Azure Cost Analysis for...

  • 4 kudos
4 More Replies
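The linked reply above is about cluster sizing rather than code. As a complementary, generic knob (my assumption, not advice from the thread), Auto Loader's batch-size options can cap how much of a multi-terabyte backlog each micro-batch pulls in. A sketch with placeholder paths and table names:

```python
import dlt

@dlt.table(name="bronze_logs")  # placeholder table name
def bronze_logs():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        # cap how much data each micro-batch reads from the backlog
        .option("cloudFiles.maxBytesPerTrigger", "10g")
        .option("cloudFiles.maxFilesPerTrigger", 10000)
        .load("abfss://raw@<storage-account>.dfs.core.windows.net/logs/")  # placeholder
    )
```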
standup1
by New Contributor III
  • 911 Views
  • 2 replies
  • 0 kudos

Recover a deleted DLT pipeline

Hello, does anyone know how to recover a deleted DLT pipeline, or at least recover deleted tables that were managed by the DLT pipeline? We have a pipeline that stopped working and was throwing all kinds of errors, so we decided to create a new one and de...

Latest Reply
standup1
New Contributor III
  • 0 kudos

Thank you, Kaniz. Just to confirm that I understood you correctly: if the pipeline is deleted (like in our case) without version control, backup configuration, etc. already implemented, there's no way to recover those tables, nor the pipeline. ...

  • 0 kudos
1 More Replies
Adrianj
by New Contributor III
  • 2505 Views
  • 9 replies
  • 5 kudos

Databricks Bundles - How to select which jobs resources to deploy per target?

Hello, my team and I are experimenting with bundles; we follow the pattern of having one main Databricks.yml file and each job definition specified in a separate YAML file for modularization. We wonder if it is possible to select from the main Databricks...

Latest Reply
HrushiM
New Contributor II
  • 5 kudos

Hi @Adrianj, please refer to this medium.com post. I have tried to explain how you can dynamically change the content of databricks.yml for each environment by maintaining a single databricks.yml file with an adequate level of parameters. In your...

  • 5 kudos
8 More Replies
Shas_DataE
by New Contributor II
  • 694 Views
  • 2 replies
  • 0 kudos

Alerts and Dashboard

Hi Team, in my Databricks workspace I have created an alert using a query, in such a way that the schedule runs on a daily basis and the results are populated to a dashboard. The results from the dashboard are notified via email, but I am seeing re...

Latest Reply
Ayushi_Suthar
Honored Contributor
  • 0 kudos

Hi @Shas_DataE, good day! Could you please check and confirm if there are any special characters in the table column? At this moment, special characters are not compatible with Excel. If yes, then please drop the column that has that special character a...

  • 0 kudos
1 More Replies
Kibour
by Contributor
  • 898 Views
  • 3 replies
  • 0 kudos

Resolved! date_format 'LLLL' returns '1'

Hi all, in my notebook, when I run my cell with the following code: %sql select date_format(date '1970-01-01', "LLL"); I get '1', while I expect 'Jan' according to the doc https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html. I would also expect t...

Latest Reply
Kibour
Contributor
  • 0 kudos

Hi @Kaniz, turns out it was actually a Java 8 bug: IllegalArgumentException: Java 8 has a bug to support stand-alone form (3 or more 'L' or 'q' in the pattern string). Please use 'M' or 'Q' instead, or upgrade your Java version. For more details, plea...

  • 0 kudos
2 More Replies
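Following the resolution above (the stand-alone 'L' form is broken on Java 8), the month name can be obtained with 'M'-based patterns instead. A small check from a notebook, as a sketch of the suggested substitution:

```python
# 'MMM'/'MMMM' avoid the Java 8 stand-alone-form bug mentioned in the thread
spark.sql(
    "SELECT date_format(date '1970-01-01', 'MMM')  AS short_name, "
    "       date_format(date '1970-01-01', 'MMMM') AS full_name"
).show()
# Expected: short_name = Jan, full_name = January
```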
Kibour
by Contributor
  • 403 Views
  • 1 reply
  • 0 kudos

Resolved! Trigger one workflow after completion of another workflow

Hi there, is it possible to trigger one workflow conditionally on the completion of another workflow? Typically, I would like to have my workflow W2 start automatically once workflow W1 has successfully completed. Thanks in advance for your ins...

Latest Reply
Kibour
Contributor
  • 0 kudos

Found it: you build a new workflow where you connect W1 and W2 (each as a Run Job).

  • 0 kudos
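To make the reply above concrete: the same wiring (an orchestrator job that runs W1, then W2 as Run Job tasks) can also be created programmatically. A hedged sketch with the Databricks SDK for Python; the job name and job IDs are placeholders, and the field names reflect my understanding of the SDK, so verify against the docs.

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()  # picks up credentials from the environment / config profile

# Orchestrator job: run W1, then run W2 only after W1 succeeds.
w.jobs.create(
    name="run-W1-then-W2",  # placeholder name
    tasks=[
        jobs.Task(
            task_key="run_w1",
            run_job_task=jobs.RunJobTask(job_id=111),  # placeholder job ID for W1
        ),
        jobs.Task(
            task_key="run_w2",
            depends_on=[jobs.TaskDependency(task_key="run_w1")],
            run_job_task=jobs.RunJobTask(job_id=222),  # placeholder job ID for W2
        ),
    ],
)
```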
wilco
by New Contributor II
  • 429 Views
  • 2 replies
  • 0 kudos

SQL Warehouse: Retrieving SQL ARRAY Type via JDBC driver

Hi all, we are currently running into the following issue:
  • we are using a serverless SQL warehouse
  • in a Java application we are using the latest Databricks JDBC driver (v2.6.36)
  • we are querying the warehouse with a collect_list function, which should return...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @wilco, It appears that you’re encountering an issue with the Databricks JDBC driver when retrieving an ARRAY type using the collect_list function. Let’s explore some steps to address this: JDBC Driver Version: Ensure that you’re using the la...

  • 0 kudos
1 More Replies
Braxx
by Contributor II
  • 4644 Views
  • 6 replies
  • 2 kudos

Resolved! issue with group by

I am trying to group a data frame by "PRODUCT" and "MARKET" and aggregate the remaining columns specified in col_list. There are many more columns in the list, but for simplification let's take the example below. Unfortunately, I am getting the error: "TypeError:...

Latest Reply
Ralphma
New Contributor II
  • 2 kudos

The error you're encountering, "TypeError: unhashable type: 'Column'," is likely due to the way you're defining exprs. In Python, sets use curly braces {}, but they require their items to be hashable. Since the result of sum(x).alias(x) is not hashab...

  • 2 kudos
5 More Replies
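To round out the explanation above: building exprs as a Python set literal with {} is what triggers the error, because PySpark Column objects cannot be hashed; building a list works. A minimal sketch with made-up data and column names standing in for the poster's frame:

```python
from pyspark.sql import functions as F

# Placeholder data in place of the poster's frame
df = spark.createDataFrame(
    [("A", "US", 10, 2), ("A", "US", 5, 1)],
    ["PRODUCT", "MARKET", "SALES", "QUANTITY"],
)
col_list = ["SALES", "QUANTITY"]

# A set literal {F.sum(x).alias(x) for x in col_list} raises
# "TypeError: unhashable type: 'Column'"; use a list instead.
exprs = [F.sum(c).alias(c) for c in col_list]

result = df.groupBy("PRODUCT", "MARKET").agg(*exprs)
result.show()
```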