Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

sreedata
by New Contributor III
  • 4954 Views
  • 4 replies
  • 5 kudos

Resolved! Databricks -->Workflows-->Job Runs

In Databricks --> Workflows --> Job Runs there is a column "Run As". Where does this value come from? We are getting a user ID here, but we need to change it to a generic account. Any help would be appreciated. Thanks

Latest Reply
Leon_K
New Contributor II
  • 5 kudos

I'm surprised there is no option to set "Run as" to something like a "system user". Why all this complication with a Service Principal? Where can I report this? @DataBricks
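For anyone else wondering: the "Run As" value is the identity a job executes under. It defaults to the job's creator/owner and can be changed to another user or to a service principal (the usual way to get a "generic" account). A minimal sketch, assuming the Jobs API 2.1 and placeholder workspace URL, token, job ID, and service principal application ID:

import requests

host = "https://<your-workspace>.cloud.databricks.com"   # placeholder
token = "<personal-access-token>"                         # placeholder

# Partially update the job so it runs as a service principal instead of a user.
resp = requests.post(
    f"{host}/api/2.1/jobs/update",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "job_id": 123456789,  # placeholder job ID
        "new_settings": {
            "run_as": {"service_principal_name": "<service-principal-application-id>"}
        },
    },
)
resp.raise_for_status()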

3 More Replies
Kayla
by Valued Contributor II
  • 682 Views
  • 1 reply
  • 0 kudos

Version Control For Alerts, Queries

Is there any inbuilt option for version control for Databricks SQL Queries and Alerts? I tried moving the files into a repo, but Git did not recognize the file types.

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Currently, Databricks does not have an inbuilt option for version control specifically for SQL Queries and Alerts.
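One workaround (not an official feature; the endpoint and field names below follow the legacy Queries API and are an assumption, so verify against your workspace's API version): export the query definitions on a schedule and commit the resulting files to Git yourself.

import json
import pathlib
import requests

host = "https://<your-workspace>.cloud.databricks.com"   # placeholder
token = "<personal-access-token>"                         # placeholder

# List saved SQL queries via the REST API.
resp = requests.get(
    f"{host}/api/2.0/preview/sql/queries",
    headers={"Authorization": f"Bearer {token}"},
    params={"page_size": 100},
)
resp.raise_for_status()

out_dir = pathlib.Path("exported_queries")
out_dir.mkdir(exist_ok=True)

# One JSON file per query; committing this folder to Git gives you a version history.
for q in resp.json().get("results", []):
    (out_dir / f"{q['id']}.json").write_text(json.dumps(q, indent=2))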

Sanjeev
by New Contributor II
  • 904 Views
  • 1 reply
  • 0 kudos

Resolved! Sending customized mail with databricks notebook with images

How can I send a customized message, including images, from within a Databricks notebook? SQL alerts are not helping.

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

You can refer to KB https://kb.databricks.com/en_US/notebooks/send-email  
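The KB above covers the details; as a rough illustration of the idea, here is a minimal sketch using only the Python standard library (the SMTP host, credentials, addresses, and image path are placeholders):

import smtplib
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

msg = MIMEMultipart("related")
msg["Subject"] = "Daily job report"
msg["From"] = "sender@example.com"        # placeholder
msg["To"] = "recipient@example.com"       # placeholder

# HTML body that references the embedded image by its Content-ID.
msg.attach(MIMEText('<p>Daily report</p><img src="cid:chart">', "html"))

# Attach an image produced by the notebook (placeholder path).
with open("/dbfs/tmp/chart.png", "rb") as f:
    img = MIMEImage(f.read())
img.add_header("Content-ID", "<chart>")
msg.attach(img)

with smtplib.SMTP("smtp.example.com", 587) as server:  # placeholder SMTP relay
    server.starttls()
    server.login("sender@example.com", "<password>")
    server.send_message(msg)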

blobbles78
by Databricks Partner
  • 2555 Views
  • 6 replies
  • 2 kudos

Resolved! SQL run on cluster creates table different to SQL Warehouse endpoint

I have a Personal cluster on 15.4 LTS (includes Apache Spark 3.5.0, Scala 2.12) and a SQL Warehouse in a Databricks environment. When I use the following code to create a table in a catalog, it gives me different column types when run on the cl...

Latest Reply
Walter_C
Databricks Employee
  • 2 kudos

It seems that, per the docs, this setting is currently true by default only in SQL warehouses; on clusters it is set to false: https://docs.databricks.com/en/sql/language-manual/sql-ref-ansi-compliance.html#ansi-compliance-in-databricks-runtime
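If the difference really does come down to ANSI mode (an assumption worth checking against the linked page), a quick way to compare is to enable it for the notebook session on the cluster and re-run the CREATE TABLE; the table and catalog names below are placeholders, and spark is the SparkSession that Databricks notebooks provide:

# Make the cluster session behave like a SQL warehouse with respect to ANSI mode.
spark.conf.set("spark.sql.ansi.enabled", "true")

# Re-create the table and inspect the resulting column types.
spark.sql("CREATE OR REPLACE TABLE main.default.ansi_check AS SELECT 1.5 AS col")
spark.table("main.default.ansi_check").printSchema()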

5 More Replies
SureshRajG
by Databricks Partner
  • 856 Views
  • 1 reply
  • 0 kudos
Latest Reply
Stefan-Koch
Databricks Partner
  • 0 kudos

Hi, you can achieve that with pandas. See the following example code:
%pip install openpyxl
import pandas as pd
file_location_xls = "path/to/excel/1.xlsx"
# read the sheet with name "Financials1" into a pandas dataframe
pdf = pd.read_excel(file_location...
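A fuller version of the truncated snippet above, with the sheet name and paths kept as illustrative placeholders (the conversion to a Spark DataFrame at the end is optional):

# Run in its own notebook cell first:  %pip install openpyxl

import pandas as pd

file_location_xls = "path/to/excel/1.xlsx"  # placeholder path

# Read the sheet named "Financials1" into a pandas DataFrame.
pdf = pd.read_excel(file_location_xls, sheet_name="Financials1")

# Optionally continue in Spark (spark and display are provided by the notebook).
df = spark.createDataFrame(pdf)
display(df)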

NhanNguyen
by Contributor III
  • 4534 Views
  • 1 reply
  • 0 kudos

How to handle timeout exception in Error Handle Task

Dear team, I have a workflow like this: task_a, task_b, and handle_error. How do I handle any timeout exception from task_a and task_b (or any future task) and log it into the handle_error task at the end? Best regards, Jensen Nguyen

[Attached image: NhanNguyen_1-1727750059373.png]
Latest Reply
VZLA
Databricks Employee
  • 0 kudos

@NhanNguyen thanks for your question! Have you maybe considered the following? Define a global error task: add a handle_error task in the workflow that runs conditionally on task failure. Set failure conditions: in the UI or through JSON, configure the handle_erro...
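A sketch of what that suggestion can look like in the job's task list (the task names come from the question; the notebook paths, timeouts, and the assumption that a timed-out task counts as failed for run_if are illustrative, so check the Jobs docs):

# Fragment of a Jobs API 2.1 / asset-bundle style task definition, as a Python dict.
job_tasks = [
    {
        "task_key": "task_a",
        "notebook_task": {"notebook_path": "/Workspace/jobs/task_a"},  # placeholder
        "timeout_seconds": 3600,
    },
    {
        "task_key": "task_b",
        "depends_on": [{"task_key": "task_a"}],
        "notebook_task": {"notebook_path": "/Workspace/jobs/task_b"},  # placeholder
        "timeout_seconds": 3600,
    },
    {
        "task_key": "handle_error",
        "depends_on": [{"task_key": "task_a"}, {"task_key": "task_b"}],
        # Run only when at least one upstream task failed (intended to cover timeouts).
        "run_if": "AT_LEAST_ONE_FAILED",
        "notebook_task": {"notebook_path": "/Workspace/jobs/handle_error"},  # placeholder
    },
]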

NaeemS
by New Contributor III
  • 4687 Views
  • 1 reply
  • 0 kudos

Static Parameters in Feature Functions

Hi, I'm implementing a machine learning pipeline using feature stores and I'm running into a limitation with feature functions. I'd like to perform multiple calculations on my columns with some minor adjustments, but I need to pass a static parameter ...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Hi @NaeemS thanks for your question! Yes, you can pass a static parameter to a feature function to control its behavior in Databricks Feature Store. This allows you to perform multiple calculations on your columns with minor adjustments without defin...
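One way this is commonly done (a sketch only; the UDF, catalog/schema, column names, and the trick of binding the static value through a literal column are illustrative assumptions, not the confirmed answer from this thread):

from pyspark.sql import functions as F
from databricks.feature_engineering import FeatureEngineeringClient, FeatureFunction

# A Unity Catalog SQL UDF with an extra "static" parameter (placeholder names).
spark.sql("""
    CREATE OR REPLACE FUNCTION main.default.scale_amount(amount DOUBLE, factor DOUBLE)
    RETURNS DOUBLE
    RETURN amount * factor
""")

fe = FeatureEngineeringClient()

# Add the static value as a literal column so it can be bound to the UDF input.
df = spark.table("main.default.transactions").withColumn("factor", F.lit(1.1))

training_set = fe.create_training_set(
    df=df,
    feature_lookups=[
        FeatureFunction(
            udf_name="main.default.scale_amount",
            input_bindings={"amount": "amount", "factor": "factor"},
            output_name="scaled_amount",
        ),
    ],
    label="label",
)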

Kaviprakash
by New Contributor
  • 1614 Views
  • 1 reply
  • 0 kudos

ORA-01830: date format picture ends before converting entire input string

Hi, recently we have been migrating our Hive metastore workloads to Unity Catalog. As part of this, we are running into the following error on DBR 15.4 (UC), whereas it works on DBR 10.4 (Hive). The requirement is to read the data from a tab...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

@Kaviprakash thanks for your question! Is this perhaps also specific to a cluster type (Shared vs. Single user)? If Shared mode, can you please try restarting your cluster with the following Spark configuration: spark.connect.perserveOptionCasing true...

dh
by New Contributor
  • 8074 Views
  • 1 reply
  • 1 kudos

Data Lineage without Spark, but with Polars (and Delta Lake) instead

Some context: I am completely new to Databricks; I have heard good stuff, but also some things that worry me. One thing that worries me is the performance (and eventual cost) of running Spark with smaller (sub-1TB) datasets. However, one requirement fr...

Latest Reply
VZLA
Databricks Employee
  • 1 kudos

Hi @dh thanks for your question! I believe it's possible to run Polars with Delta Lake on Databricks, but automatic data lineage tracking is not native outside of Spark jobs. You would likely need to implement custom lineage tracking or integrate ext...
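For reference, reading and writing Delta tables from Polars looks roughly like this (the paths are placeholders, recent Polars plus the deltalake package are assumed, and as noted above these reads/writes are not captured by Unity Catalog lineage the way Spark jobs are):

import polars as pl

# Read an existing Delta table into a Polars DataFrame.
df = pl.read_delta("/Volumes/main/default/raw/events")  # placeholder path

# A small transformation, then write the result back out as Delta.
out = (
    df.filter(pl.col("amount") > 0)
      .group_by("customer_id")
      .agg(pl.col("amount").sum().alias("total_amount"))
)
out.write_delta("/Volumes/main/default/gold/customer_totals", mode="overwrite")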

cmilligan
by Contributor II
  • 7219 Views
  • 4 replies
  • 4 kudos

Dropdown for parameters in a job

I want to be able to denote the type of run from a predetermined list of values that a user can choose from when kicking off a run using different parameters. Our team does standardized job runs on a weekly cadence but can have timeframes that change...

Latest Reply
Leon_K
New Contributor II
  • 4 kudos

I'm looking into this too. I wonder if there is a way to make a dropdown for a job parameter.
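The closest built-in picker today is a notebook widget rather than a job-level dropdown; a minimal sketch (the parameter name and values are illustrative):

# Creates a dropdown at the top of the notebook; the chosen value can also be
# overridden by a job parameter with the same name.
dbutils.widgets.dropdown("run_type", "weekly", ["weekly", "monthly", "adhoc"], "Run type")

run_type = dbutils.widgets.get("run_type")
print(f"Running as: {run_type}")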

3 More Replies
Mithos
by New Contributor
  • 848 Views
  • 1 reply
  • 0 kudos

ZCube Tags not present in Databricks Delta Tables

The design doc for Liquid Clustering for Delta refers to Z-Cubes to enable incremental clustering in batches. This is the link: https://docs.google.com/document/d/1FWR3odjOw4v4-hjFy_hVaNdxHVs4WuK1asfB6M6XEMw/edit?pli=1&tab=t.0. It is also mentioned th...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Hi @Mithos thanks for the question! This is the OSS version of LC applicable to OSS Delta. Databricks has a different implementation, so you won't be able to find it in a liquid table written by DBR. 

templier2
by Databricks Partner
  • 2705 Views
  • 3 replies
  • 0 kudos

Log jobs stdout to an Azure Logs Analytics workspace

Hello, I have enabled sending cluster logs through mspnp/spark-monitoring, but I don't see stdout/stderr/log4j logs there. Is this supported?

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Hi @templier2, if it works, it's not duct tape and chewing gum; it's a paperclip away from advanced engineering! You're right, I forgot this option is only there for AWS/S3. So yes, I think mount points are the current and only way.
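For the stdout/stderr/log4j part specifically, those files come from the cluster log delivery setting rather than from spark-monitoring; a sketch of the relevant cluster-spec fragment (the destination path is a placeholder, and shipping the delivered files onward to Log Analytics still needs your own process):

# Fragment of a cluster specification (e.g. in the Clusters/Jobs API payload).
cluster_spec_fragment = {
    "cluster_log_conf": {
        # On Azure this is typically a DBFS path, e.g. backed by a mounted storage account.
        "dbfs": {"destination": "dbfs:/mnt/cluster-logs"}  # placeholder
    }
}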

2 More Replies
theanhdo
by New Contributor III
  • 3993 Views
  • 3 replies
  • 1 kudos

Run continuous job for a period of time

Hi there, I have a job where the trigger type is configured as Continuous. I want to run the Continuous job only for a period of time per day, e.g. 8AM - 5PM. I understand that we can achieve it by manually starting and cancelling the job in the UI, o...

Latest Reply
theanhdo
New Contributor III
  • 1 kudos

Hi @MuthuLakshmi, thank you for your answer. However, it doesn't help with my question. Let me rephrase: in short, my question is how to configure a Continuous job to run for a period of time, e.g. from 8AM to 5PM every day, and ...
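One possible approach (a sketch, not an official pattern; the host, token, and job ID are placeholders): keep the job Continuous, and have two small scheduled helpers flip its pause status through the Jobs API at 8AM and 5PM.

import requests

host = "https://<your-workspace>.azuredatabricks.net"   # placeholder
token = "<personal-access-token>"                        # placeholder

def set_continuous_pause_status(job_id: int, status: str) -> None:
    """status should be 'PAUSED' or 'UNPAUSED'."""
    resp = requests.post(
        f"{host}/api/2.1/jobs/update",
        headers={"Authorization": f"Bearer {token}"},
        json={"job_id": job_id, "new_settings": {"continuous": {"pause_status": status}}},
    )
    resp.raise_for_status()

# Helper job scheduled at 8AM:  set_continuous_pause_status(123456789, "UNPAUSED")
# Helper job scheduled at 5PM:  set_continuous_pause_status(123456789, "PAUSED")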

2 More Replies
jkb7
by New Contributor III
  • 2912 Views
  • 6 replies
  • 2 kudos

Resolved! Keep history of task runs in Databricks Workflows while moving it from one job to another

We are using Databricks Asset Bundles (DAB) to orchestrate multiple workflow jobs, each containing multiple tasks. The execution schedule is managed at the job level, i.e., all tasks within a job start together. We often face the issue of rescheduling...

Latest Reply
Walter_C
Databricks Employee
  • 2 kudos

You can submit it through https://docs.databricks.com/en/resources/ideas.html#ideas

5 More Replies
vickytscv
by New Contributor II
  • 1707 Views
  • 3 replies
  • 0 kudos

Adobe query support from databricks

Hi Team, we are working with an Adobe tool for campaign metrics, which needs to pull data from AEP using the explode option. When we pass the query it takes a long time and performance is also poor. Is there a better way to pull data from AEP? Please le...

Latest Reply
jodbx
Databricks Employee
  • 0 kudos

https://github.com/Adobe-Marketing-Cloud/aep-cloud-ml-ecosystem 

2 More Replies