Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Katalin555
by New Contributor II
  • 898 Views
  • 1 replies
  • 0 kudos

Found a potential bug in Job Details/Schedule and Trigger section

One of our jobs is scheduled to run at 4:30 AM based on the GMT+1 timezone, which is visible if we click on Edit trigger (Picture 1), but under job details it is shown as if it were scheduled to run at 4:30 AM UTC (Picture 2). Based on previous runs ...

[3 screenshot attachments]
Latest Reply
Isi
Honored Contributor III
  • 0 kudos

Hey @Katalin555, even though the “Edit Trigger” panel (Picture 2) shows the time in the local timezone (e.g. GMT+1), once the schedule is saved and viewed under job details (Picture 1), Databricks always displays it as UTC, without making it visual...

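Isi's explanation can be checked with plain Python: a 4:30 AM trigger defined in a GMT+1 timezone is the same instant as 3:30 AM UTC, which is what the job-details panel renders. A minimal sketch (the fixed-offset zone name `Etc/GMT-1`, which the IANA database uses for UTC+01:00, stands in for whatever timezone id is stored with the schedule):

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # stdlib since Python 3.9

# Hypothetical trigger time: 4:30 AM in a GMT+1 timezone.
# Note the inverted sign: Etc/GMT-1 is the IANA name for UTC+01:00.
local_trigger = datetime(2025, 4, 2, 4, 30, tzinfo=ZoneInfo("Etc/GMT-1"))

# The same instant rendered in UTC, as the job-details panel does.
utc_view = local_trigger.astimezone(ZoneInfo("UTC"))
print(utc_view.strftime("%H:%M %Z"))  # 03:30 UTC
```

So both panels describe the same schedule; only the display timezone differs.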
Guigui
by New Contributor II
  • 978 Views
  • 2 replies
  • 0 kudos

Package installation for multi-tasks job

I have a job with the same task to be executed twice with two sets of parameters. Each task runs by cloning a git repo, installing it locally, and then running a notebook from that repo. However, as each task clones the same repo, I was wonderi...

Latest Reply
Guigui
New Contributor II
  • 0 kudos

That's what I've done, but I find it less elegant than setting up an environment and sharing it across multiple tasks. That seems to be impossible (unless I build a wheel file, which I don't want to) as tasks do not share environments, but anyway, as they run in p...

1 More Replies
Eric_Kieft
by New Contributor III
  • 3085 Views
  • 5 replies
  • 4 kudos

Centralized Location of Table History/Timestamps in Unity Catalog

Is there a centralized location in Unity Catalog that retains the table history, specifically the last timestamp, for managed delta tables? DESCRIBE HISTORY will provide it for a specific table, but I would like to get it for a number of tables. inform...

Latest Reply
Priyanka_Biswas
Databricks Employee
  • 4 kudos

Hi @Eric_Kieft, @noorbasha534: system.access.table_lineage includes a record for each read or write event on a Unity Catalog table or path. This includes, but is not limited to, job runs, notebook runs, and dashboards updated with the read or write even...

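Building on Priyanka's pointer, a query along these lines could surface the most recent write per table (a sketch only; the column names `target_table_full_name` and `event_time` should be verified against the system.access.table_lineage documentation for your workspace):

```sql
-- Most recent write event per Unity Catalog table (sketch).
SELECT
  target_table_full_name,
  MAX(event_time) AS last_write_time
FROM system.access.table_lineage
WHERE target_table_full_name IS NOT NULL
GROUP BY target_table_full_name
ORDER BY last_write_time DESC;
```

Filtering on `source_table_full_name` instead would give the last read per table.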
4 More Replies
William_Scardua
by Valued Contributor
  • 2227 Views
  • 1 replies
  • 1 kudos

Resolved! Upsert from Databricks to CosmosDB

Hi guys, I'm adjusting a data upsert process from Databricks to CosmosDB using the .jar connector. As the load is very large, do you know if it's possible to change only the fields that have been modified? Best regards

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Yes, you can update only the modified fields in your Cosmos DB documents from Databricks using the Partial Document Update feature (also known as Patch API). This is particularly useful for large documents where sending the entire document for update...

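Louis's suggestion can be sketched with the azure-cosmos Python SDK: diff the old and new documents, then send only the changed fields as patch operations. The container setup and field names below are hypothetical, and note that Cosmos DB's partial update accepts at most 10 operations per call, so larger diffs need batching:

```python
def build_patch_ops(old_doc: dict, new_doc: dict) -> list[dict]:
    """Return Cosmos DB patch operations for top-level fields that changed."""
    ops = []
    for key, value in new_doc.items():
        if key == "id":  # the document id is immutable
            continue
        if old_doc.get(key) != value:
            ops.append({"op": "set", "path": f"/{key}", "value": value})
    return ops

# Hypothetical usage with the azure-cosmos SDK (patch_item is available
# in recent 4.x releases); container/partition-key names are placeholders:
# container.patch_item(
#     item=new_doc["id"],
#     partition_key=new_doc["pk"],
#     patch_operations=build_patch_ops(old_doc, new_doc),
# )

old = {"id": "1", "pk": "a", "price": 9.5, "stock": 3}
new = {"id": "1", "pk": "a", "price": 9.9, "stock": 3}
print(build_patch_ops(old, new))  # [{'op': 'set', 'path': '/price', 'value': 9.9}]
```

Only `/price` is patched; unchanged fields never leave Databricks, which is the point for very large documents.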
397973
by New Contributor III
  • 3909 Views
  • 1 replies
  • 1 kudos

Resolved! What's the best way to get from Python dict > JSON > PySpark and apply as a mapping to a dataframe?

I'm migrating code from Python Linux to Databricks PySpark. I have many mappings like this: {"main": {"honda": 1.0, "toyota": 2.9, "BMW": 5.77, "Fiat": 4.5}}. I exported using json.dump, saved to s3 and was able to import with sp...

[screenshot attachment]
Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

For migrating your Python dictionary mappings to PySpark, you have several good options. Let's examine the approaches and identify the best solution. Using F.create_map (Your Current Approach) Your current approach using `F.create_map` is actually qu...

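Louis mentions `F.create_map`; the part worth sketching is flattening the loaded dict into the alternating key/value sequence that `create_map` expects. The dict below mirrors the one in the question; the PySpark lines are commented out because they need a SparkSession, and the column names in them are hypothetical:

```python
from itertools import chain

# One of the mappings from the question, after json.load.
mapping = {"honda": 1.0, "toyota": 2.9, "BMW": 5.77, "Fiat": 4.5}

# F.create_map wants alternating arguments:
#   create_map(lit("honda"), lit(1.0), lit("toyota"), lit(2.9), ...)
flat = list(chain.from_iterable(mapping.items()))
print(flat)  # ['honda', 1.0, 'toyota', 2.9, 'BMW', 5.77, 'Fiat', 4.5]

# Hypothetical PySpark usage (requires a SparkSession and a 'brand' column):
# from pyspark.sql import functions as F
# map_col = F.create_map(*[F.lit(x) for x in flat])
# df = df.withColumn("score", map_col[F.col("brand")])
```

For very large mappings, broadcasting the dict or joining against a small lookup DataFrame are the usual alternatives.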
srinum89
by New Contributor III
  • 1049 Views
  • 1 replies
  • 0 kudos

Resolved! Workflow job failing with source as Git Provider (remote github repo) with SP

Facing an issue using the GitHub App when running a job with source as "Git provider" using a Service Principal. Since we can't use a PAT with an SP on GitHub, I am using the GitHub App for authentication. Followed the documentation below, but it still gives a permission issue. ...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

When running a Databricks workflow with a Git provider source using a Service Principal, you’re encountering permission issues despite using the GitHub App for authentication. This is a common challenge because Service Principals cannot use Personal ...

bricks3
by New Contributor III
  • 2906 Views
  • 5 replies
  • 0 kudos

Resolved! How to schedule workflow in python script

I saw how to schedule a workflow using the UI, but not a Python script. Can someone help me find out how to schedule a workflow hourly in a Python script? Thank you.

Latest Reply
Isi
Honored Contributor III
  • 0 kudos

Hey @bricks3, exactly: as far as I know you define the workflow configuration in the YAML file, and under the hood, DABs handles the API calls to Databricks (including scheduling). To run your workflow hourly, you just need to include the schedule bloc...

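The schedule block Isi refers to might look like this in the job's bundle YAML (a sketch; the job key, name, and notebook path are placeholders, and the Quartz expression `0 0 * * * ?` fires at the top of every hour):

```yaml
resources:
  jobs:
    hourly_job:                                # placeholder job key
      name: hourly-job
      schedule:
        quartz_cron_expression: "0 0 * * * ?"  # second 0, minute 0, every hour
        timezone_id: "UTC"
        pause_status: "UNPAUSED"
      tasks:
        - task_key: main
          notebook_task:
            notebook_path: ./notebooks/main.py # placeholder path
```

Running `databricks bundle deploy` then creates or updates the job with that hourly trigger.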
4 More Replies
minhhung0507
by Valued Contributor
  • 4396 Views
  • 8 replies
  • 6 kudos

CANNOT_UPDATE_TABLE_SCHEMA

I'm encountering a puzzling schema merge issue with my Delta Live Table. My setup involves several master tables on Databricks, and due to a schema change in the source database, one of my Delta Live Tables has a column (e.g., "reference_score") that...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 6 kudos

Dear Hung, thank you so much for the kind words; I'm really glad the suggestions were helpful! You're absolutely doing the right thing by trying those options first before going for a full table drop. Testing with a new table and checking schema hints ...

7 More Replies
databicky
by Contributor II
  • 1378 Views
  • 2 replies
  • 0 kudos

How to acknowledge ServiceNow incidents from Databricks

I want to connect to ServiceNow from Databricks and acknowledge the incidents: assign them to a person, change the status of the incident, and update the work notes as well. How can I achieve this with the hel...

Latest Reply
Isi
Honored Contributor III
  • 0 kudos

Hi @databicky, just a quick clarification: this is the Databricks community forum, not official Databricks Support. Think of it as a place where users can share ideas, help each other, or sometimes just browse; responses are not guaranteed here. If y...

1 More Replies
whalebone711
by New Contributor
  • 1019 Views
  • 1 replies
  • 0 kudos

Issue with Delta Lake Table Optimization

Hi Databricks Community, I’m currently working on optimizing a Delta Lake table, but I’m encountering some performance issues during the vacuum process. I’ve attached a screenshot of the error message I’m receiving, along with the relevant code snipp...

Latest Reply
Isi
Honored Contributor III
  • 0 kudos

Hey @whalebone711, could you re-attach the screenshot? It doesn't seem to appear in the post. Best, Isi

Jorge3
by New Contributor III
  • 6782 Views
  • 2 replies
  • 2 kudos

Databricks Asset Bundle artifacts with module out of the bundle root (sync path)

Hello everyone!I’m currently working on a project with shared functionalities across different Databricks bundles. I have separate folders for each bundle, along with a common libs/ folder that holds some Python modules intended to be shared across b...

Latest Reply
VZLA
Databricks Employee
  • 2 kudos

Hi @Jorge3, were you able to get this issue resolved? I believe your artifact build path points outside the synced directory structure; after syncing ../libs, libs should be available within the bundle root, so the artifact path should be update...

1 More Replies
michael_mehrten
by Databricks Partner
  • 50938 Views
  • 26 replies
  • 14 kudos

Resolved! How to use Databricks Repos with a service principal for CI/CD in Azure DevOps?

Databricks Repos best-practices recommend using the Repos REST API to update a repo via your git provider. The REST API requires authentication, which can be done in one of two ways: a user / personal access token, or a service principal access token. Using a u...

Latest Reply
pbz
New Contributor II
  • 14 kudos

For anyone coming here in the future, this should explain it: https://docs.databricks.com/aws/en/repos/ci-cd-techniques-with-repos#authorize-a-service-principal-to-access-git-folders Basically: 1. Go to your service in settings -> identity and access -...

25 More Replies
pedrojunqueira
by New Contributor II
  • 18588 Views
  • 5 replies
  • 3 kudos

Resolved! Generating a personal access token for a service principal with the Databricks CLI

Hi, I am having issues generating a personal access token for my service principal. I followed the steps from here; my `~/.databrickscfg` has the following: ```[my-profile-name] host = <account-console-url> account_id = <account-id> azure_tenant_id = <azure-ser...

Latest Reply
PabloCSD
Valued Contributor II
  • 3 kudos

I want something similar: to use a service principal token instead of a PAT. Have you ever done this? https://community.databricks.com/t5/administration-architecture/use-a-service-principal-token-instead-of-personal-access-token/m-p/91629

4 More Replies
ushnish_18
by New Contributor
  • 854 Views
  • 1 replies
  • 0 kudos

Facing error while submitting lab assessment of Delivery Specialization: UC Upgrade

Hi, I attended the lab assessment of Delivery Specialization: UC Upgrade, and while submitting my answers the grade is not getting updated successfully, due to which the status is shown as failed, although all the checkpoints got validated succes...

Latest Reply
Arpita_S
Databricks Employee
  • 0 kudos

Hi @ushnish_18, Can you share the user ID or email you used to access the lab so the team can take a look? Alternatively, you can send all the details, including screenshots, here for the team to investigate in detail and guide you appropriately. Tha...

HarryRichard08
by New Contributor II
  • 1186 Views
  • 1 replies
  • 0 kudos

Access to S3 in AWS

My problem: my Databricks workspace (Serverless Compute) is in AWS Account A, but my S3 bucket is in AWS Account B. It works in Shared Compute because I am manually setting access_key and secret_key. It does NOT work in Serverless Compute.

Latest Reply
SP_6721
Honored Contributor II
  • 0 kudos

Hi @HarryRichard08, I came across a similar thread of yours. Were you able to find a resolution for this?
