Data Engineering

Forum Posts

digui
by New Contributor
  • 1947 Views
  • 4 replies
  • 0 kudos

Issues when trying to modify log4j.properties

Hi y'all. I'm trying to export metrics and logs to AWS CloudWatch, but while following their tutorial to do so, I ended up facing this error when trying to initialize my cluster with an init script they provided. This is the part where the script fail...

Latest Reply
cool_cool_cool
New Contributor II
  • 0 kudos

@digui Did you figure out what to do? We're facing the same issue; the script works for the executors. I was thinking of adding an if that checks whether log4j.properties exists and modifies it only if it does.

3 More Replies
Menegat
by Visitor
  • 22 Views
  • 1 reply
  • 0 kudos

VACUUM seems to be deleting Autoloader's log files.

Hello everyone, I have a workflow set up that updates a few Delta tables incrementally with Autoloader three times a day. Additionally, I run a separate workflow that performs VACUUM and OPTIMIZE on these tables once a week. The issue I'm facing is that...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Menegat, It seems you’re encountering an issue with your Delta tables during incremental updates. Let’s dive into this and explore potential solutions. Delta Live Tables and Incremental Updates: Delta Live Tables allow for incremental updates...
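
For reference, one common cause of this symptom is an Auto Loader checkpoint or schema location that sits inside the table directory: VACUUM removes files it does not track once they age past the retention window (directories whose names start with an underscore are skipped). The sketch below shows the usual layout with the checkpoint kept outside the table path; the bucket, paths, and table names are hypothetical and not from the thread.

# A minimal sketch, assuming a Databricks notebook with Auto Loader available.
# All paths and table names below are placeholders, not from the thread.
checkpoint_path = "s3://my-bucket/_checkpoints/events"   # NOT under the table's directory
source_path = "s3://my-bucket/landing/events"            # hypothetical landing directory

(spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", checkpoint_path)
    .load(source_path)
    .writeStream
    .option("checkpointLocation", checkpoint_path)
    .trigger(availableNow=True)
    .toTable("main.bronze.events"))                      # hypothetical target table

# The weekly maintenance job can then vacuum with an explicit retention window:
spark.sql("VACUUM main.bronze.events RETAIN 168 HOURS")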

georgef
by Visitor
  • 19 Views
  • 1 reply
  • 0 kudos

Cannot import relative python paths

Hello, some variations of this question have been asked before, but there doesn't seem to be an answer for the following simple use case. I have the following file structure in a Databricks Asset Bundles project:
src
--dir1
----file1.py
--dir2
----file2...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @georgef, It appears that you’re encountering issues with importing modules within a Databricks Asset Bundles (DABs) project. Let’s explore some potential solutions to address this problem. Bundle Deployment and Import Paths: When deploying a ...
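
As a rough illustration of one workaround, the snippet below adds the bundle's src/ root to sys.path before importing across sibling directories. It assumes the code runs as a Python file task (so __file__ is defined) and that file2.py wants something from dir1/file1.py; the function name is hypothetical.

# A minimal sketch, assuming this code lives in src/dir2/file2.py of the deployed bundle.
import os
import sys

# Resolve the src/ root relative to this file and make it importable.
project_root = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
if project_root not in sys.path:
    sys.path.insert(0, project_root)

from dir1.file1 import some_function  # hypothetical function name

some_function()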

ChingizK
by New Contributor II
  • 289 Views
  • 1 reply
  • 0 kudos

Workflow Failure Alert Webhooks for OpsGenie

I'm trying to set up a Workflow job webhook notification to send an alert to the OpsGenie REST API on job failure. We've set up Teams & Email successfully. We've created the webhook, and when I configure "On Failure" I can see it in the JSON/YAML view. How...

[Two screenshots attached]
Data Engineering
jobs
opsgenie
webhooks
Workflows
Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @ChingizK, Configuring the payload for OpsGenie Webhook integration is essential to ensure that the data sent to OpsGenie meets your requirements. Let’s walk through the steps: Create a Webhook Integration in OpsGenie: Go to Settings > Integra...
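
Before wiring the Databricks notification destination to OpsGenie, it can help to post a test alert directly to the OpsGenie Alerts REST API to confirm the integration key and the payload shape. A minimal sketch follows; the API key and message values are placeholders.

# Sanity-check sketch: send a test alert straight to OpsGenie (not the Databricks webhook itself).
import requests

OPSGENIE_API_KEY = "<your-integration-api-key>"  # placeholder

resp = requests.post(
    "https://api.opsgenie.com/v2/alerts",
    headers={
        "Authorization": f"GenieKey {OPSGENIE_API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "message": "Databricks job failed (test alert)",
        "description": "Sent manually to verify the OpsGenie integration.",
        "priority": "P3",
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json())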

lindsey
by New Contributor
  • 508 Views
  • 1 reply
  • 0 kudos

"Error: cannot read mws credentials: invalid Databricks Account configuration" on TF Destroy

I have a Terraform project that creates a workspace in Databricks, assigns it to an existing metastore, then creates an external location, storage credential, and catalog. The apply works and all expected resources are created. However, without touching any r...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @lindsey, It seems you’re encountering an issue with Terraform and Databricks when trying to destroy resources. Let’s explore some potential solutions to address this problem: Resource Order in Terraform Configuration: Ensure that the databric...

dlaxminaresh
by New Contributor
  • 277 Views
  • 1 reply
  • 0 kudos

What config do we use to set row groups for Delta tables on Databricks?

I have tried multiple ways to set the row group size for Delta tables in a Databricks notebook, but it's not working, whereas I am able to set it properly using Spark. I tried:
1. val blockSize = 1024 * 1024 * 60
spark.sparkContext.hadoopConfiguration.setInt("dfs.bloc...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @dlaxminaresh, Setting row groups for Delta tables in Databricks can be a bit tricky, but let’s explore some options to achieve this. First, let’s address the approaches you’ve tried: Setting Block Sizes: You’ve attempted to set the block size...
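
For comparison, here is a hedged PySpark translation of the Scala attempt quoted in the post, setting the Parquet block (row-group) size on the Hadoop configuration and as a write option. Whether these settings take effect for Delta writes on Databricks is exactly what the thread is questioning, so treat this as a sketch rather than a confirmed fix; the table name and output path are placeholders.

# Sketch only: mirrors the Scala attempt from the post in PySpark.
block_size = 1024 * 1024 * 60  # ~60 MB target block/row-group size, as in the post

hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()
hadoop_conf.setInt("dfs.blocksize", block_size)
hadoop_conf.setInt("parquet.block.size", block_size)

# The same value can also be passed as a write option (hypothetical table and path):
(spark.table("main.default.some_table")
     .write
     .format("delta")
     .option("parquet.block.size", block_size)
     .mode("overwrite")
     .save("/tmp/delta/row_group_test"))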

JonathanFlint
by New Contributor
  • 256 Views
  • 1 reply
  • 0 kudos

DevOps Asset Bundle Deployment to Change the Catalog a Job Writes to

I am trying to set up CI/CD with Azure DevOps and 3 workspaces (dev, test, prod) using asset bundles. All 3 workspaces will have their own catalog in Unity Catalog. I can't find a way to change which catalog should be used by the jobs and DLT pipelines ...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @JonathanFlint, Setting up CI/CD with Azure DevOps for Unity Catalog projects involving multiple workspaces and catalogs can be achieved. Here are some approaches you can consider: Catalog Switching at Runtime: At the beginning of your program, issue ...
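
A minimal sketch of that catalog-switching idea: the asset bundle passes a per-target catalog name as a job parameter, and the notebook reads it and switches catalogs before doing any work. The parameter name "target_catalog", its default, and the table/schema names are assumptions, not from the thread.

# Read the catalog name the bundle sets per target, then switch to it.
dbutils.widgets.text("target_catalog", "dev_catalog")   # hypothetical parameter name and default
catalog = dbutils.widgets.get("target_catalog")

spark.sql(f"USE CATALOG {catalog}")
spark.sql("USE SCHEMA default")        # hypothetical schema

# Unqualified table names now resolve inside the per-environment catalog.
df = spark.table("my_table")           # hypothetical table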

naveenanto
by New Contributor III
  • 175 Views
  • 1 reply
  • 0 kudos

Custom Spark Extension in SQL Warehouse

I understand that only a limited set of Spark configurations is supported in SQL Warehouse, but is it possible to add Spark extensions to SQL Warehouse clusters? Use case: we have a few restricted table properties. We prevent that with Spark extensions installed in...

Data Engineering
sql-warehouse
Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @naveenanto, While Databricks SQL Warehouses have some limitations when it comes to Spark configurations, you can indeed extend their capabilities by adding custom Spark extensions. Let me provide you with some inform...

JohanS
by New Contributor III
  • 151 Views
  • 1 reply
  • 0 kudos

WorkspaceClient authentication fails when running on a Docker cluster

from databricks.sdk import WorkspaceClient
w = WorkspaceClient()
ValueError: default auth: cannot configure default credentials ...
I'm trying to instantiate a WorkspaceClient in a notebook on a cluster running a Docker image, but authentication fails. T...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @JohanS, It seems you're encountering an authentication issue when trying to instantiate a WorkspaceClient in a notebook on a Databricks cluster running a Docker image. Let's troubleshoot this! The error message you're seeing, "default auth: cannot configure defau...
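
One hedged workaround is to bypass the default credential lookup entirely and pass the host and a token to WorkspaceClient explicitly, which the SDK supports. The secret scope and key names below are placeholders, not from the thread.

# Sketch: explicit credentials instead of default auth, for use inside a notebook.
from databricks.sdk import WorkspaceClient

host = spark.conf.get("spark.databricks.workspaceUrl")       # workspace hostname
token = dbutils.secrets.get(scope="my-scope", key="pat")     # placeholder secret with a PAT

w = WorkspaceClient(host=f"https://{host}", token=token)
print(w.current_user.me().user_name)   # simple call to confirm auth works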

Anske
by New Contributor II
  • 122 Views
  • 5 replies
  • 1 kudos

Resolved! DLT apply_changes applies only deletes and inserts not updates

Hi, I have a DLT pipeline that applies changes from a source table (cdctest_cdc_enriched) to a target table (cdctest), by the following code:
dlt.apply_changes(
    target = "cdctest",
    source = "cdctest_cdc_enriched",
    keys = ["ID"],
    sequence_by...

Data Engineering
Delta Live Tables
Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Anske, It seems you’re encountering an issue with your Delta Live Tables (DLT) pipeline where updates from the source table are not being correctly applied to the target table. Let’s troubleshoot this together! Pipeline Update Process: Whe...
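
For orientation, here is a hedged sketch of a fuller apply_changes call for this pattern. Only the table names and keys come from the post; the ordering column "tran_begin_time" and the "operation" delete flag are assumptions for illustration. Updates are only applied when the sequence_by value strictly increases for a given key, so a stale or constant sequencing column is a common reason updates appear to be skipped.

# Sketch of a complete apply_changes call; column names beyond keys are hypothetical.
import dlt
from pyspark.sql import functions as F

dlt.create_streaming_table("cdctest")

dlt.apply_changes(
    target = "cdctest",
    source = "cdctest_cdc_enriched",
    keys = ["ID"],
    sequence_by = F.col("tran_begin_time"),             # hypothetical ordering column
    apply_as_deletes = F.expr("operation = 'DELETE'"),  # hypothetical delete indicator
    stored_as_scd_type = 1,
)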

4 More Replies
jainshasha
by New Contributor
  • 96 Views
  • 6 replies
  • 0 kudos

Job Cluster in Databricks workflow

Hi, I have configured 20 different workflows in Databricks. All of them are configured with a job cluster with a different name. All 20 workflows are scheduled to run at the same time. But even with a different job cluster configured in each of them, they run sequentially w...

Latest Reply
Wojciech_BUK
Contributor III
  • 0 kudos

Hi @jainshasha, I tried to replicate your problem, but in my case I was able to run the jobs in parallel (the only difference is that I am running the notebook from the workspace, not from a repo). As you can see, the jobs did not start at exactly the same time, but they ran in par...

5 More Replies
Ameshj
by New Contributor
  • 287 Views
  • 7 replies
  • 0 kudos

Dbfs init script migration

I need help with migrating from DBFS on Databricks to workspace files. I am new to Databricks and am struggling with what is on the links provided. My workspace.yml also has DBFS hard-coded. Included is a full deployment with Great Expectations. This was don...

Data Engineering
Azure Databricks
dbfs
Great expectations
python
Latest Reply
NandiniN
Valued Contributor II
  • 0 kudos

One of the other suggestions is to use Lakehouse Federation. It is also possible that this is a driver issue (we will find out from the logs).

6 More Replies
ashraf1395
by Visitor
  • 56 Views
  • 3 replies
  • 2 kudos

Resolved! Optimising Clusters in Databricks on GCP

Hi there everyone, we are trying to get hands-on with the Databricks Lakehouse for a prospective client's project. Our major aim for the project is to compare a data lakehouse on Databricks and a BigQuery data warehouse in terms of cost and time to set up and run que...

Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hi @ashraf1395, Comparing Databricks Lakehouse and Google BigQuery is essential to make an informed decision for your project. Let's address your questions: Cluster Configurations for Databricks: Databricks provides flexibility in configuring com...
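
For a cost comparison, the knobs that usually matter most are autoscaling bounds and auto-termination. Below is a hedged sketch that creates such a cluster through the Databricks SDK; the node type, Databricks Runtime version, worker counts, and cluster name are placeholders to adapt, not recommendations.

# Sketch: a small autoscaling, auto-terminating cluster for benchmarking (placeholder values).
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import compute

w = WorkspaceClient()

cluster = w.clusters.create(
    cluster_name="lakehouse-vs-bq-benchmark",            # hypothetical name
    spark_version="14.3.x-scala2.12",                    # placeholder DBR version
    node_type_id="n2-standard-4",                        # placeholder GCP node type
    autoscale=compute.AutoScale(min_workers=1, max_workers=4),
    autotermination_minutes=20,                           # stop paying when idle
).result()

print(cluster.cluster_id)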

2 More Replies
tanjil
by New Contributor III
  • 8693 Views
  • 8 replies
  • 6 kudos

Resolved! Downloading sharepoint lists using python

Hello, I am trying to download lists from SharePoint into a pandas DataFrame. However, I cannot get any information successfully. I have attempted many of the solutions mentioned on Stack Overflow. Below is one of those attempts: # https://pypi.org/project/sha...

Latest Reply
huntaccess
Visitor
  • 6 kudos

The error "<urlopen error [Errno -2] Name or service not known>" suggests that there's an issue with the server URL or network connectivity. Double-check the server URL to ensure it's correct and accessible. Also, verify that your network connection ...
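
Once the server URL resolves correctly, a typical flow looks like the sketch below. The truncated PyPI link in the post appears to point at shareplum, but that is an assumption; the tenant URL, credentials, site, and list name are all placeholders.

# Hedged sketch of the usual shareplum flow into a pandas DataFrame (placeholder values throughout).
import pandas as pd
from shareplum import Office365, Site
from shareplum.site import Version

SHAREPOINT_URL = "https://yourtenant.sharepoint.com"             # placeholder tenant URL

authcookie = Office365(
    SHAREPOINT_URL,
    username="user@yourtenant.com",                               # placeholder account
    password=dbutils.secrets.get("my-scope", "sp-password"),      # placeholder secret
).GetCookies()

site = Site(
    f"{SHAREPOINT_URL}/sites/MySite",                             # placeholder site
    version=Version.v365,
    authcookie=authcookie,
)
items = site.List("My List").GetListItems()                       # placeholder list name
df = pd.DataFrame(items)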

7 More Replies