Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

SamGreene
by Contributor
  • 2778 Views
  • 4 replies
  • 0 kudos

Resolved! Using parameters in a SQL Notebook and COPY INTO statement

Hi, my scenario is that an export of a table is dropped into ADLS every day. I would like to load this data into a UC table and then repeat the process every day, replacing the data. This seems to rule out DLT as it is meant for incremental proc...

  • 2778 Views
  • 4 replies
  • 0 kudos
Latest Reply
SamGreene
Contributor
  • 0 kudos

The solution that worked was adding this Python cell to the notebook:

%python
from pyspark.dbutils import DBUtils
dbutils = DBUtils(spark)
dbutils.widgets.text("catalog", "my_business_app")
dbutils.widgets.text("schema", "dev")

Then in the SQL cell: CRE...

  • 0 kudos
3 More Replies
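For readers landing here later, a minimal end-to-end sketch of the accepted approach, assuming a daily full replace done with TRUNCATE plus COPY INTO; the table name, columns, and landing path below are hypothetical:

# Hypothetical sketch: daily full refresh of a UC table, driven by widgets
from pyspark.dbutils import DBUtils
dbutils = DBUtils(spark)
dbutils.widgets.text("catalog", "my_business_app")
dbutils.widgets.text("schema", "dev")
catalog = dbutils.widgets.get("catalog")
schema = dbutils.widgets.get("schema")

# Hypothetical table definition and landing path
spark.sql(f"""
    CREATE TABLE IF NOT EXISTS {catalog}.{schema}.daily_export
    (id BIGINT, name STRING, updated_at TIMESTAMP)
""")
spark.sql(f"TRUNCATE TABLE {catalog}.{schema}.daily_export")
spark.sql(f"""
    COPY INTO {catalog}.{schema}.daily_export
    FROM 'abfss://landing@mystorage.dfs.core.windows.net/table_export/'
    FILEFORMAT = PARQUET
    COPY_OPTIONS ('force' = 'true')  -- re-ingest files already seen, since we replace daily
""")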
galzamo
by New Contributor
  • 466 Views
  • 1 reply
  • 0 kudos

Job running time too long

Hi all, I'm working on my first data jobs. I created one job that consists of 4 other jobs. Yesterday I ran the 4 jobs separately and it worked fine (about half an hour); today I ran the big job, and the 4 jobs have been running for 2 hours (and still running). Why is t...

  • 466 Views
  • 1 reply
  • 0 kudos
Latest Reply
anardinelli
Contributor
  • 0 kudos

Hello @galzamo, how are you? You can check the Spark UI for long-running stages; that might give you a clue where each task is spending the most time. Some things that can be the reason: 1. Increase of data and partitions in your source data 2. Cluste...

  • 0 kudos
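As a starting point for those checks, a small hedged sketch (the table name is hypothetical) for spotting a jump in input size or partition count from a notebook:

# Hypothetical source table; a sudden jump in rows or partitions between runs
# is a common reason a previously fast job slows down.
df = spark.read.table("my_catalog.my_schema.source_table")
print("input partitions:", df.rdd.getNumPartitions())
print("input rows:", df.count())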
EDDatabricks
by Contributor
  • 1119 Views
  • 1 reply
  • 0 kudos

Expected size of managed Storage Accounts

Dear all, we are monitoring the size of the managed storage accounts associated with our deployed Azure Databricks instances. We have 5 Databricks instances for specific components of our platform, replicated in 4 environments (DEV, TEST, PREPROD, PROD). Dur...

Data Engineering
Filesize
LOGS
Managed Storage Account
  • 1119 Views
  • 1 reply
  • 0 kudos
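A hedged sketch of one way to measure how much space a storage path actually occupies from a notebook, assuming the path is readable from the workspace; the URI below is a placeholder:

# Recursively sum file sizes under a path with dbutils.fs.ls.
# The URI is a placeholder for the managed storage account's root container.
def dir_size_bytes(path: str) -> int:
    total = 0
    for info in dbutils.fs.ls(path):
        total += dir_size_bytes(info.path) if info.isDir() else info.size
    return total

print(f"{dir_size_bytes('abfss://root@mymanagedsa.dfs.core.windows.net/') / 1e9:.1f} GB")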
Kayla
by Valued Contributor
  • 1398 Views
  • 3 replies
  • 7 kudos

Resolved! SQL Warehouse Timeout / Prevent Long Running Queries

We have an external service connecting to a SQL Warehouse, running a query that normally lasts 30 minutes. On occasion an error occurs and it will run for 6 hours. This happens overnight and is contributing to a larger bill. Is there any way to force l...

  • 1398 Views
  • 3 replies
  • 7 kudos
Latest Reply
Kayla
Valued Contributor
  • 7 kudos

@lucasrocha @raphaelblg That is exactly what I was hoping to find. Thank you!

  • 7 kudos
2 More Replies
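For anyone searching later, one documented way to cap runaway queries is a statement-level timeout. A hedged sketch, assuming the Databricks SQL STATEMENT_TIMEOUT configuration parameter (in seconds), which can also be set for the whole warehouse in its settings:

# Hedged sketch: cancel any statement in this session that runs longer than 2 hours.
# STATEMENT_TIMEOUT is a Databricks SQL configuration parameter, measured in seconds.
spark.sql("SET STATEMENT_TIMEOUT = 7200")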
thiagoawstest
by Contributor
  • 830 Views
  • 0 replies
  • 0 kudos

add active directory group permission

Hi, I'm using Databricks on AWS. I did the single sign-on integration with Microsoft Entra ID (Active Directory) and everything is working fine: I can add users, but when I try to add a group that was created in AD, the group can't be found. How should I ...

  • 830 Views
  • 0 replies
  • 0 kudos
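A hedged sketch of the usual workaround: SSO only signs users in, so AD/Entra groups have to be provisioned into Databricks separately, for example via the SCIM API. The workspace URL, token, and group name below are placeholders:

import requests

# Create the group in Databricks so it can then be granted permissions;
# SSO alone does not sync groups from the identity provider.
host = "https://<workspace>.cloud.databricks.com"  # placeholder workspace URL
token = "<personal-access-token>"                  # placeholder token
resp = requests.post(
    f"{host}/api/2.0/preview/scim/v2/Groups",
    headers={"Authorization": f"Bearer {token}",
             "Content-Type": "application/scim+json"},
    json={"schemas": ["urn:ietf:params:scim:schemas:core:2.0:Group"],
          "displayName": "my-ad-group"},  # placeholder group name
)
print(resp.status_code, resp.json())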
AlokThampi
by New Contributor III
  • 346 Views
  • 0 replies
  • 0 kudos

Issues while writing into bad_records path

Hello all, I would like to get your input on a scenario that I see while writing into the bad_records file. I am reading a 'Ԓ'-delimited CSV file based on a schema that I have already defined. I have enabled error handling while reading the file to ...

  • 346 Views
  • 0 replies
  • 0 kudos
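For context, a minimal hedged sketch of the setup being described, with a hypothetical schema and placeholder paths; malformed rows are written under badRecordsPath as JSON files:

# Hedged sketch: custom-delimited CSV read with bad-record capture.
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

schema = StructType([
    StructField("id", IntegerType(), True),
    StructField("name", StringType(), True),
])

df = (spark.read.format("csv")
      .schema(schema)
      .option("sep", "Ԓ")                            # custom single-character delimiter
      .option("badRecordsPath", "/tmp/bad_records")  # malformed rows land here as JSON
      .load("/tmp/input/data.csv"))                  # placeholder input path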
LasseL
by New Contributor II
  • 520 Views
  • 1 reply
  • 0 kudos

How to use change data feed when schema is changing between delta table versions?

How to use the change data feed when the delta table schema changes between delta table versions? I tried to read the change data feed in parts (in the code snippet I read version 1372, because the 1371 and 1373 schema versions are different), but I am getting the error Unsupporte...

  • 520 Views
  • 1 reply
  • 0 kudos
Latest Reply
raphaelblg
Honored Contributor II
  • 0 kudos

Hi @LasseL, please check: What is the schema for the change data feed? It might help you.

  • 0 kudos
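Following the linked doc, a hedged sketch of reading the feed one version at a time, so each read is served under a single table schema; the table name is hypothetical and the version numbers are taken from the post:

# Hedged sketch: read the change data feed for exactly one table version.
changes = (spark.read.format("delta")
           .option("readChangeFeed", "true")
           .option("startingVersion", 1372)
           .option("endingVersion", 1372)
           .table("my_catalog.my_schema.my_table"))  # placeholder table name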
MaximeGendre
by New Contributor III
  • 723 Views
  • 0 replies
  • 0 kudos

Problem using from_avro function

Hello everyone, I need your help with a topic that has been preoccupying me for a few days. The "from_avro" function gives me a strange result when I pass it the JSON schema of a Kafka topic. ...

  • 723 Views
  • 0 replies
  • 0 kudos
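A hedged sketch of typical from_avro usage; note it expects the Avro record schema as JSON (not a Spark schema), which is a common cause of odd results. The broker, topic, and schema below are placeholders:

from pyspark.sql.avro.functions import from_avro
from pyspark.sql.functions import col

# Placeholder Avro record schema for the topic's value
avro_schema = """
{"type": "record", "name": "Event",
 "fields": [{"name": "id", "type": "string"},
            {"name": "amount", "type": "double"}]}
"""

kafka_df = (spark.readStream.format("kafka")
            .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
            .option("subscribe", "events")                     # placeholder topic
            .load())

decoded = (kafka_df
           .select(from_avro(col("value"), avro_schema).alias("data"))
           .select("data.*"))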
db_knowledge
by New Contributor II
  • 504 Views
  • 2 replies
  • 0 kudos

Merge operation with outputMode('update') in Databricks Auto Loader

Hi team, I am trying to do a merge operation along with outputMode('update') and foreachBatch using the code below, but it is not updating the data. Could you please help with this?

output = (casting_df.writeStream
          .format('delta')
          .trigger(availableNow=True)
          .optio...

  • 504 Views
  • 2 replies
  • 0 kudos
Latest Reply
anardinelli
Contributor
  • 0 kudos

Hi @db_knowledge  Please try .foreachBatch(upsertToDelta) instead of creating the lambda inside it. Best, Alessandro

  • 0 kudos
1 More Reply
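To make the suggestion concrete, a hedged sketch of the foreachBatch upsert pattern with a named function; the target table, key column, and checkpoint path are hypothetical, and casting_df stands for the streaming DataFrame from the original post:

# Hedged sketch: upsert each micro-batch into a Delta table via MERGE.
from delta.tables import DeltaTable

def upsertToDelta(batch_df, batch_id):
    target = DeltaTable.forName(spark, "my_catalog.my_schema.target")  # placeholder
    (target.alias("t")
       .merge(batch_df.alias("s"), "t.id = s.id")  # placeholder join key
       .whenMatchedUpdateAll()
       .whenNotMatchedInsertAll()
       .execute())

query = (casting_df.writeStream
         .foreachBatch(upsertToDelta)                        # named function, not a lambda
         .option("checkpointLocation", "/tmp/checkpoints/upsert")  # placeholder
         .trigger(availableNow=True)
         .start())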
Adigkar
by New Contributor
  • 613 Views
  • 2 replies
  • 0 kudos

Reprocess of old data stored in adls

Hi, we have a requirement for a scenario to reprocess old data using a Data Factory pipeline. Here are the details: storage in ADLS Gen2; a landing zone (where the data will be stored in the same format as we get it from the source); data will be loaded from SQL Server ...

  • 613 Views
  • 2 replies
  • 0 kudos
Latest Reply
Hkesharwani
Contributor II
  • 0 kudos

@Retired_mod I just posted a possible solution for the above problem and it has been rejected by a community moderator without any explanation. This has happened to me twice in the past as well. Can you please help in this case?

  • 0 kudos
1 More Reply
AmitAharon
by New Contributor
  • 642 Views
  • 0 replies
  • 0 kudos

running git clone from databricks notebook

Hey, we have a use case where we want to clone a git repository in Azure DevOps to a storage container (Blob Storage). When I try to run the "git clone" command to local storage I keep getting an `Operation not supported` error. Git is installed and I...

  • 642 Views
  • 0 replies
  • 0 kudos
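A hedged sketch of the common workaround: git needs POSIX file semantics that blob-backed paths don't provide, so clone to the driver's local disk first and then copy the tree out. The repo URL and container path are placeholders:

# Hedged sketch: clone locally, then copy the result to ADLS/Blob storage.
import subprocess

subprocess.run(
    ["git", "clone",
     "https://dev.azure.com/myorg/myproject/_git/myrepo",  # placeholder repo URL
     "/tmp/myrepo"],
    check=True,
)
dbutils.fs.cp("file:/tmp/myrepo",
              "abfss://container@account.dfs.core.windows.net/myrepo",  # placeholder
              recurse=True)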
mk1987c
by New Contributor III
  • 4250 Views
  • 5 replies
  • 1 kudos

Resolved! I am trying to use Databricks Autoloader with File Notification Mode

When I run my command for readStream using .option("cloudFiles.useNotifications", "true"), it starts reading the files from Azure Blob (please note that I did not provide configuration like subscription ID, client ID, connection string, and so on while...

  • 4250 Views
  • 5 replies
  • 1 kudos
Latest Reply
jose_gonzalez
Moderator
  • 1 kudos

Hi, I would like to share the following docs that might be able to help you with this issue: https://docs.databricks.com/ingestion/auto-loader/file-notification-mode.html#required-permissions-for-configuring-file-notification-for-adls-gen2-and-azure-b...

  • 1 kudos
4 More Replies
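From the linked docs, a hedged sketch of file notification mode with the service-principal options spelled out; all IDs and paths below are placeholders:

# Hedged sketch: Auto Loader file notification mode with explicit credentials.
df = (spark.readStream.format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.useNotifications", "true")
      .option("cloudFiles.subscriptionId", "<subscription-id>")
      .option("cloudFiles.tenantId", "<tenant-id>")
      .option("cloudFiles.clientId", "<sp-client-id>")
      .option("cloudFiles.clientSecret", "<sp-client-secret>")
      .option("cloudFiles.resourceGroup", "<resource-group>")
      .option("cloudFiles.schemaLocation", "/tmp/schemas/events")  # placeholder
      .load("abfss://landing@account.dfs.core.windows.net/events/"))  # placeholder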

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group