Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

by Fikrat (New Contributor II)
  • 50 Views
  • 2 replies
  • 0 kudos

Can a SQL task pass its outputs to a ForEach task?

Hi there, If I understood correctly, Roland said the output of a SQL task can be used as input to a ForEach task in Workflows. I tried that with the expression sqlTaskName.output.rows, but Databricks rejected that expression. Does anyone know how to do this? 

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Can you confirm if these are the steps being followed: Create the SQL task: Ensure your SQL task is correctly set up and produces the desired output. For example: SELECT customer_name, market FROM example_customers; Reference the SQL Task Output i...
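For reference, a minimal sketch of how such a job might look as a Jobs API payload, written as a Python dict. The task keys, query ID, and the double-braced expression are assumptions based on this thread, so verify them against the Jobs documentation:

```python
# Hypothetical Jobs API payload fragment (as a Python dict). The key detail
# is referencing the SQL task's rows with a double-braced "{{tasks...}}"
# expression from a downstream ForEach task; all names are placeholders.
job_tasks = [
    {
        "task_key": "sql_task",
        "sql_task": {
            "query": {"query_id": "<query-id>"},    # placeholder query
            "warehouse_id": "<warehouse-id>",       # placeholder warehouse
        },
    },
    {
        "task_key": "for_each_rows",
        "depends_on": [{"task_key": "sql_task"}],
        "for_each_task": {
            # Reference the upstream SQL task's output rows.
            "inputs": "{{tasks.sql_task.output.rows}}",
            "task": {
                "task_key": "process_row",
                "notebook_task": {"notebook_path": "/path/to/notebook"},
            },
        },
    },
]
```

If the bare expression sqlTaskName.output.rows was rejected, the missing tasks. prefix and surrounding double braces are a plausible cause.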

1 More Replies
by vijaypodili (New Contributor II)
  • 45 Views
  • 3 replies
  • 0 kudos

Databricks job taking a long time to load 2.3 GB of data from Blob Storage to a SQL Server table

df_CorpBond = spark.read.format("parquet").option("header", "true").load(f"/mnt/{container_name}/raw_data/dsl.corporate.parquet")
df_CorpBond.repartition(20).write\
    .format("jdbc")\
    .option("url", url_connector)\
    .option("dbtable", "MarkIt...

Data Engineering
databricks
performance
Latest Reply
vijaypodili
New Contributor II
  • 0 kudos

I'm trying to say my cluster has enough storage space (94 GB), so it can easily handle 2.3 GB of data, but my job is taking a long time. Jobs 2 and 3 completed within 3 minutes, but job 4 is taking much longer to complete its tasks.
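Cluster storage is rarely the bottleneck for a JDBC write; throughput is usually limited by the target database and by how the write is batched. A hedged sketch of the knobs worth trying, with placeholder names from the thread and option values that are illustrative starting points, not verified optima:

```python
# Sketch: tuning the Spark JDBC write from this thread. url_connector,
# container_name, and the target table name are placeholders; batch and
# partition counts are assumptions to experiment with.
df_CorpBond = (
    spark.read.format("parquet")
    .load(f"/mnt/{container_name}/raw_data/dsl.corporate.parquet")
)

(
    df_CorpBond
    .repartition(8)                       # fewer, larger partitions -> fewer concurrent inserts
    .write.format("jdbc")
    .option("url", url_connector)
    .option("dbtable", "dbo.CorpBond")    # hypothetical target table
    .option("batchsize", 10000)           # rows per JDBC batch insert
    .option("numPartitions", 8)           # parallel connections to the database
    .mode("append")
    .save()
)
```

If job 4 is the JDBC write stage, also check the SQL Server side (indexes, locks, transaction log growth), since Spark can only write as fast as the target accepts rows.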

2 More Replies
by jeremy98 (New Contributor III)
  • 128 Views
  • 8 replies
  • 0 kudos

How to deploy unique workflows running in production

Hello, community! I have a question about deploying workflows in a production environment. Specifically, how can we deploy a group of workflows to production so that they are created only once and cannot be duplicated by others? Currently, if someone d...

Latest Reply
jeremy98
New Contributor III
  • 0 kudos

Last night I had another issue: a run failed with the error message Unable to access the notebook "/Workspace/Users/<user email>/.bundle/rnc_data_pipelines/prod/files/notebook/prod/db_sync_initial_wip". Either it does not exist, or the identity used to run...

7 More Replies
by zuzsad (New Contributor)
  • 63 Views
  • 2 replies
  • 0 kudos

Azure Asset Bundle deploy removes the continuous: true configuration

I have this pipeline configuration that I'm deploying using Azure Asset Bundles: ingest-pipeline.test.yml
```
resources:
  pipelines:
    ingest-pipeline-test:
      name: ingest-pipeline-test-2
      clusters:
        - label: default
          node_type_id:...
```

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Which CLI version are you using? Is it the latest version?

1 More Replies
by robbe (New Contributor III)
  • 1603 Views
  • 3 replies
  • 1 kudos

Resolved! Get job ID from Asset Bundles

When using Asset Bundles to deploy jobs, how does one get the job ID of the resources that are created? I would like to deploy some jobs through asset bundles, get the job IDs, and then trigger these jobs programmatically outside the CI/CD pipeline us...

Latest Reply
nvashisth
New Contributor II
  • 1 kudos

Refer to this answer; it can be a solution to the above scenario -> https://community.databricks.com/t5/data-engineering/getting-job-id-dynamically-to-create-another-job-to-refer-as-job/m-p/102860/highlight/true#M41252
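Summarizing the linked answer's idea in code form: one hedged option is to resolve the job ID at runtime by listing jobs by name. This sketch assumes the databricks-sdk package and that the bundle gives the job a unique, known name:

```python
# Sketch: look up a bundle-deployed job's ID by name, then trigger it.
# The job name is a placeholder; bundle deployments typically prefix names
# with the target (e.g. "[dev my_user] ..."), so adjust accordingly.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # host/token resolved from environment or config

matches = list(w.jobs.list(name="my_bundle_job"))   # hypothetical job name
if matches:
    job_id = matches[0].job_id
    w.jobs.run_now(job_id=job_id)                   # trigger outside CI/CD
    print(f"Triggered job {job_id}")
```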

2 More Replies
by David_Billa (New Contributor II)
  • 33 Views
  • 1 reply
  • 0 kudos

Unable to convert to date from datetime string with AM and PM

Any help to understand why it's showing 'null' instead of the date value? It shows null only for 12:00:00 AM; for any other value it shows the date correctly. TO_DATE("12/30/2022 12:00:00 AM", "MM/dd/yyyy HH:mm:ss a") AS tsDate 

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @David_Billa, Can you try with: TO_TIMESTAMP("12/30/2022 12:00:00 AM", "MM/dd/yyyy hh:mm:ss a") AS tsDate The issue you are encountering with the TO_DATE function returning null for the value "12:00:00 AM" is likely due to the format string not ma...
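The underlying rule: with a 12-hour clock plus an AM/PM marker, the pattern must use hh (clock hour 1-12), not HH (hour of day 0-23). "12:00:00 AM" means hour-of-day 0, which conflicts with HH=12, so the parse fails and returns null; every other hour happens to be consistent under both patterns. A quick check, runnable in a notebook cell:

```python
# hh (clock hour 1-12) pairs with the AM/PM marker 'a'; HH (hour of day
# 0-23) conflicts with "12 ... AM", which is why only midnight came back
# NULL. (Under ANSI mode the bad pattern may raise instead of returning NULL.)
spark.sql("""
    SELECT
      TO_DATE("12/30/2022 12:00:00 AM", "MM/dd/yyyy hh:mm:ss a") AS midnight_ok,
      TO_DATE("12/30/2022 01:30:00 PM", "MM/dd/yyyy hh:mm:ss a") AS afternoon_ok
""").show()
```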

by Svish (New Contributor)
  • 96 Views
  • 3 replies
  • 0 kudos

Resolved! DLT: Schema mismatch error

Hi, I am encountering the following error when writing a DLT pipeline. Here is my workflow:
  • Read a bronze Delta table
  • Check data quality rules
  • Write clean records to a silver table with a defined schema. I use TRY_CAST for columns where there is a mismatch be...

Latest Reply
filipniziol
Contributor III
  • 0 kudos

Hi @Svish, You have one line that differs: JOB_CERTREP_CONTRACT_INT: string (nullable = true) vs. JOB_CERTREP_CONTRACT_NUMBER: string (nullable = true) 
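If the silver table's declared schema expects JOB_CERTREP_CONTRACT_NUMBER, one minimal fix is a rename before the write. A sketch, assuming a hypothetical df_clean DataFrame holding the validated records and that the column should simply be renamed:

```python
# Align the incoming column name with the target schema before the DLT
# write; the two column names are taken from the error in this thread.
df_clean = df_clean.withColumnRenamed(
    "JOB_CERTREP_CONTRACT_INT",      # name produced by the query
    "JOB_CERTREP_CONTRACT_NUMBER",   # name the silver schema declares
)
```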

2 More Replies
by stevewb (New Contributor)
  • 76 Views
  • 2 replies
  • 1 kudos

Resolved! databricks bundle deploy fails when job includes dbt task and git_source

I am trying to deploy a dbt task as part of a Databricks job using Databricks Asset Bundles. However, there seems to be a clash when specifying a job that includes a dbt task, which causes a bizarre failure. I am using v0.237.0 of the CLI. Min...

Latest Reply
madams
Contributor
  • 1 kudos

Thanks for providing that whole example, it was really easy to fiddle with. I think I've found your solution. Update the original two tasks on the job (if you want to keep them) like this:
tasks:
  - task_key: notebook_task
    job...

1 More Replies
by Taja (New Contributor II)
  • 37 Views
  • 0 replies
  • 0 kudos

Delta Live Tables: large-scale use

Does anyone use Delta Live Tables at large scale in production pipelines? Are they satisfied with the product? Recently, I've started a PoC to evaluate DLT and noticed some concerns: - Excessive use of compute resources when you check the cluster m...

by filipniziol (Contributor III)
  • 92 Views
  • 5 replies
  • 2 kudos

Magic Commands (%sql) Not Working with Databricks Extension for VS Code

Hi Community, I’ve encountered an issue with the Databricks Extension for VS Code that seems to contradict the documentation. According to the Databricks documentation, the extension supports magic commands like %sql when used with Databricks Connect:...

Latest Reply
Walter_C
Databricks Employee
  • 2 kudos

Got it, I will check with my internal team to validate whether there is any issue around this.

4 More Replies
by tinai_long (New Contributor III)
  • 9015 Views
  • 10 replies
  • 4 kudos

Resolved! How to refresh a single table in Delta Live Tables?

Suppose I have a Delta Live Tables framework with 2 tables: Table 1 ingests from a JSON source, Table 2 reads from Table 1 and runs some transformation. In other words, the data flow is JSON source -> Table 1 -> Table 2. Now if I find some bugs in the...

Latest Reply
cpayne_vax
New Contributor III
  • 4 kudos

Answering my own question: nowadays (February 2024) this can all be done via the UI. When viewing your DLT pipeline there is a "Select tables for refresh" button in the header. If you click this, you can select individual tables, and then in the botto...
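For reference, the same selective refresh can also be triggered programmatically. A sketch against the Pipelines REST API; the endpoint path and the refresh_selection field reflect my reading of the API docs and should be verified, and all identifiers are placeholders:

```python
# Start a pipeline update that refreshes only the named tables instead of
# the whole graph. Use full_refresh_selection instead to reprocess a table
# from scratch (e.g., after fixing Table 1's ingestion bug).
import requests

host = "https://<workspace-host>"      # placeholder workspace URL
token = "<personal-access-token>"      # placeholder token
pipeline_id = "<pipeline-id>"          # placeholder pipeline ID

resp = requests.post(
    f"{host}/api/2.0/pipelines/{pipeline_id}/updates",
    headers={"Authorization": f"Bearer {token}"},
    json={"refresh_selection": ["table_1"]},   # only refresh Table 1
)
resp.raise_for_status()
print(resp.json())   # should include the update_id of the started update
```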

9 More Replies
by HoussemBL (New Contributor II)
  • 33 Views
  • 1 reply
  • 0 kudos

Impact of deleting workspace on associated catalogs

Hello Community, I have a specific scenario regarding Unity Catalog and workspace deletion that I'd like to clarify. Current setup:
  • Two Databricks workspaces: W1 and W2
  • Single Unity Catalog instance
  • Catalog1: created in W1, shared and accessible in W2
  • Cata...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @HoussemBL, When you delete a Databricks workspace, it does not directly impact the Unity Catalog or the data within it. Unity Catalog is a separate entity that manages data access and governance across multiple workspaces. Here’s what happens in ...

by thisisthemurph (New Contributor II)
  • 40 Views
  • 1 reply
  • 0 kudos

Databricks dashboards across multiple Databricks instances

We have multiple Databricks instances, one per environment (Dev-UK, Live-UK, Live-EU, Live-US, etc.), and we would like to create dashboards to present stats on our data in each of these environments. Each of these environments also has a differently n...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Hello, as you have mentioned, you could create a Python script that uses the API call https://docs.databricks.com/api/workspace/lakeview/create to generate the dashboard for each environment; the process to create the visualizations will be comple...
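A sketch of that approach: export one dashboard's JSON definition, then loop over environments and recreate it through the Lakeview create endpoint the reply links. The field names (display_name, serialized_dashboard, warehouse_id) are assumptions from my reading of the linked docs, and all hosts, tokens, and IDs are placeholders:

```python
# Recreate the same Lakeview dashboard in every environment. Each workspace
# needs its own host, token, and warehouse ID; per-environment catalog or
# schema differences would still need parameterizing inside the dashboard.
import requests

environments = {
    "dev-uk":  ("https://<dev-uk-host>",  "<dev-uk-token>",  "<warehouse-id>"),
    "live-eu": ("https://<live-eu-host>", "<live-eu-token>", "<warehouse-id>"),
}

with open("stats_dashboard.lvdash.json") as f:   # exported dashboard definition
    serialized = f.read()

for env, (host, token, warehouse_id) in environments.items():
    resp = requests.post(
        f"{host}/api/2.0/lakeview/dashboards",
        headers={"Authorization": f"Bearer {token}"},
        json={
            "display_name": f"data-stats-{env}",
            "serialized_dashboard": serialized,
            "warehouse_id": warehouse_id,
        },
    )
    resp.raise_for_status()
```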

by dollyb (Contributor)
  • 112 Views
  • 5 replies
  • 0 kudos

Accessing Workspace / Repo file works in notebook, but not from job

In a notebook attached to a normal personal cluster I can successfully do this: %fs ls file:/Workspace/Repos/$userName/$repoName/$folderName When I run an init script on a UC volume that does the same thing, I get this error: ls: cannot access...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @dollyb, Can you try with just "ls /Workspace/Repos/my_user_name@company.com/my_repo_name/my_folder_name"? I'm not sure dbutils will be useful in an init script; I will try to test it out.

4 More Replies
