Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

SamGreene
by Contributor
  • 838 Views
  • 4 replies
  • 0 kudos

Resolved! Using parameters in a SQL Notebook and COPY INTO statement

Hi, my scenario is that an export of a table is dropped in ADLS every day. I would like to load this data into a UC table and then repeat the process every day, replacing the data. This seems to rule out DLT as it is meant for incremental proc...

Latest Reply
SamGreene
Contributor
  • 0 kudos

The solution that worked was adding this Python cell to the notebook:

%python
from pyspark.dbutils import DBUtils
dbutils = DBUtils(spark)
dbutils.widgets.text("catalog", "my_business_app")
dbutils.widgets.text("schema", "dev")

Then in the SQL cell: CRE...

3 More Replies
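The pattern in the accepted answer — a Python cell seeding widget parameters, then a SQL cell consuming them — comes down to composing a fully qualified UC name into the COPY INTO text. A minimal sketch of that composition; the table name and ADLS path are hypothetical (only `my_business_app` and `dev` come from the thread), and the format options are illustrative:

```python
# Sketch: render a COPY INTO statement from widget-style parameters, so the
# same notebook can target dev/prod catalogs by changing the widget values.
def build_copy_into(catalog: str, schema: str, table: str, source_path: str) -> str:
    """Render a COPY INTO statement against a fully qualified UC table."""
    target = f"{catalog}.{schema}.{table}"
    return (
        f"COPY INTO {target}\n"
        f"FROM '{source_path}'\n"
        "FILEFORMAT = CSV\n"
        "FORMAT_OPTIONS ('header' = 'true')\n"
        "COPY_OPTIONS ('mergeSchema' = 'true')"
    )

stmt = build_copy_into(
    "my_business_app", "dev", "daily_export",
    "abfss://landing@myaccount.dfs.core.windows.net/exports/",
)
```

For a full-replace load like the one described, a TRUNCATE of the target before the COPY INTO (or COPY_OPTIONS ('force' = 'true') to re-ingest already-seen files) would complete the daily overwrite.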
JUPin
by New Contributor II
  • 484 Views
  • 3 replies
  • 0 kudos

REST API for Pipeline Events does not return all records

I'm using the REST API to retrieve Pipeline Events per the documentation: https://docs.databricks.com/api/workspace/pipelines/listpipelineevents. I am able to retrieve some records, but the API stops after a call or two. I verified the number of rows us...

Latest Reply
JUPin
New Contributor II
  • 0 kudos

I've attached some screenshots of the API call. It shows 59 records retrieved (Event Log API1.png) and a populated "next_page_token"; however, when I pull the next set of data using the "next_page_token", the result set is empty (Event Log API2.png)...

2 More Replies
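The token-handling loop for a paginated list endpoint such as listpipelineevents can be sketched without the HTTP layer. In this hedged sketch, `fetch_page` stands in for the real authenticated call, and the fake two-page response is purely illustrative:

```python
# Drain a paginated listing by following next_page_token until it is
# absent or empty. fetch_page(token) stands in for the real HTTP call.
from typing import Callable, Iterator, Optional


def iter_all_events(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every event across pages, following next_page_token."""
    token = None
    while True:
        page = fetch_page(token)
        yield from page.get("events", [])
        token = page.get("next_page_token")
        if not token:  # missing or empty token ends the listing
            break


# Fake two-page endpoint for illustration only.
pages = {
    None: {"events": [{"id": 1}, {"id": 2}], "next_page_token": "t1"},
    "t1": {"events": [{"id": 3}], "next_page_token": ""},
}
event_ids = [e["id"] for e in iter_all_events(lambda tok: pages[tok])]
```

Note that, per the follow-up report in this thread, a populated next_page_token can still lead to an empty final page; a robust client should treat an empty page whose token is missing or empty as the normal end of the listing rather than an error.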
galzamo
by New Contributor
  • 144 Views
  • 1 reply
  • 0 kudos

Job running time too long

Hi all, I'm doing my first data jobs. I created one job that consists of 4 other jobs. Yesterday I ran the 4 jobs separately and it worked fine (about half an hour). Today I ran the big job, and the 4 jobs have been running for 2 hours (and are still running). Why is t...

Latest Reply
anardinelli
New Contributor III
  • 0 kudos

Hello @galzamo, how are you? You can check the Spark UI for long-running stages, which might give you a clue where each task is spending the most time. Some things that can be the reason: 1. Increase of data and partitions in your source data 2. Cluste...

EDDatabricks
by Contributor
  • 609 Views
  • 2 replies
  • 0 kudos

Expected size of managed Storage Accounts

Dear all, we are monitoring the size of the managed storage accounts associated with our deployed Azure Databricks instances. We have 5 Databricks instances for specific components of our platform, replicated in 4 environments (DEV, TEST, PREPROD, PROD). Dur...

Data Engineering
Filesize
LOGS
Managed Storage Account
Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @EDDatabricks, Let’s address your questions regarding Azure-managed storage accounts: What do these Storage Accounts contain? An Azure storage account contains various data objects, including: Blobs: Used for storing unstructured data like ima...

1 More Replies
Kayla
by Contributor II
  • 409 Views
  • 3 replies
  • 6 kudos

Resolved! SQL Warehouse Timeout / Prevent Long Running Queries

We have an external service connecting to a SQL Warehouse, running a query that normally lasts 30 minutes. On occasion an error occurs and it will run for 6 hours. This happens overnight and is contributing to a larger bill. Is there any way to force l...

Latest Reply
Kayla
Contributor II
  • 6 kudos

@lucasrocha @raphaelblg That is exactly what I was hoping to find. Thank you!

2 More Replies
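The excerpt does not show the accepted fix, but Databricks SQL warehouses support a STATEMENT_TIMEOUT configuration parameter for exactly this kind of runaway query. Independent of any warehouse-side limit, a client-side guard can also cut off a query near its usual runtime. A minimal sketch, assuming the external service wraps its own query call in a function:

```python
# Run a (possibly long) call in a worker thread and abandon it past a
# deadline, so a wedged 6-hour run is cut off near its normal 30 minutes.
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def run_with_deadline(query_fn, timeout_s: float):
    """Return query_fn()'s result, or raise TimeoutError past the deadline."""
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(query_fn)
        try:
            return future.result(timeout=timeout_s)
        except FutureTimeout:
            raise TimeoutError(f"query exceeded {timeout_s}s") from None
    finally:
        pool.shutdown(wait=False)  # don't block on an abandoned worker
```

Note the caveat in the comment: abandoning the thread does not cancel the statement on the warehouse, so pairing this with a server-side timeout (or an explicit cancel call) is what actually stops the billing.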
Rik
by New Contributor III
  • 3089 Views
  • 5 replies
  • 7 kudos

Resolved! File information is not passed to trigger job on file arrival

We are using the UC mechanism for triggering jobs on file arrival, as described here: https://learn.microsoft.com/en-us/azure/databricks/workflows/jobs/file-arrival-triggers. Unfortunately, the trigger doesn't actually pass the file-path that is gener...

Data Engineering
file arrival
trigger file
Unity Catalog
Latest Reply
marcuskw
Contributor
  • 7 kudos

Also something I'm interested in using, would be really helpful to use File Trigger and get relevant information about exactly what file triggered the workflow!

4 More Replies
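Until the trigger passes the arriving file path to the job, one common workaround is to diff the monitored location against a small checkpoint when the triggered job starts. A plain-Python sketch with a local JSON state file; in a real job the listing would come from dbutils.fs.ls on the monitored path, and all names here are hypothetical:

```python
# Identify which files are new since the last triggered run by diffing the
# current listing against a JSON checkpoint of previously seen paths.
import json
from pathlib import Path


def new_files(listing, state_file: Path):
    """Return paths not seen by a previous run, then update the checkpoint."""
    seen = set(json.loads(state_file.read_text())) if state_file.exists() else set()
    fresh = sorted(set(listing) - seen)
    state_file.write_text(json.dumps(sorted(seen | set(fresh))))
    return fresh
```

One design note: writing the checkpoint only after the files are successfully processed (rather than up front, as in this sketch) makes a failed run retry the same files instead of silently skipping them.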
AlokThampi
by New Contributor
  • 140 Views
  • 0 replies
  • 0 kudos

Issues while writing into bad_records path

Hello all, I would like to get your input on a scenario I see while writing into the bad_records file. I am reading a ‘Ԓ’-delimited CSV file based on a schema that I have already defined. I have enabled error handling while reading the file to ...

(attached screenshots: Alok1_0-1717548996735.png, Alok1_1-1717549044696.png)
LasseL
by New Contributor
  • 172 Views
  • 1 reply
  • 0 kudos

How to use change data feed when schema is changing between delta table versions?

How to use change data feed when the delta table schema changes between delta table versions? I tried to read the change data feed in parts (in the code snippet I read version 1372, because the 1371 and 1373 schema versions are different), but I am getting the error Unsupporte...

Latest Reply
raphaelblg
Contributor III
  • 0 kudos

Hi @LasseL, please check: What is the schema for the change data feed? It might help you.

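The usual workaround for this error is to read the change feed in contiguous version ranges that each share one schema, one read per range. The range-splitting step is plain bookkeeping and can be sketched without Spark; the version numbers and schema ids below are hypothetical, echoing the 1371/1372/1373 boundary from the question:

```python
# Group (version, schema_id) pairs, sorted by version, into maximal
# contiguous ranges that share a schema, so each range can be read with a
# single readChangeFeed call (startingVersion/endingVersion per range).
def schema_stable_ranges(versions):
    ranges = []  # each entry: [start, end, schema_id]
    for v, s in versions:
        if ranges and ranges[-1][2] == s and v == ranges[-1][1] + 1:
            ranges[-1][1] = v          # extend the current range
        else:
            ranges.append([v, v, s])   # start a new range
    return [(a, b, s) for a, b, s in ranges]


splits = schema_stable_ranges(
    [(1370, "A"), (1371, "A"), (1372, "B"), (1373, "C"), (1374, "C")]
)
```

Each (start, end) pair can then drive one spark.read with option("readChangeFeed", "true") and the matching startingVersion/endingVersion, avoiding a single read that spans a schema change.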
MaximeGendre
by New Contributor II
  • 494 Views
  • 0 replies
  • 0 kudos

Problem using from_avro function

Hello everyone, I need your help with a topic that has been preoccupying me for a few days. The "from_avro" function gives me a strange result when I pass it the JSON schema of a Kafka topic. =================================================================...

(attached screenshots: MaximeGendre_2-1717533967736.png, MaximeGendre_0-1717533089570.png, MaximeGendre_1-1717533556219.png)
db_knowledge
by New Contributor II
  • 198 Views
  • 2 replies
  • 0 kudos

Merge operation with ouputMode update in autoloader databricks

Hi team, I am trying to do a merge operation with outputMode('update') and foreachBatch using the code below, but it is not updating the data. Could you please help with this? output = (casting_df.writeStream.format('delta').trigger(availableNow=True).optio...

Latest Reply
anardinelli
New Contributor III
  • 0 kudos

Hi @db_knowledge, please try .foreachBatch(upsertToDelta) instead of creating the lambda inline. Best, Alessandro

1 More Replies
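The reply's suggestion is a named upsertToDelta(batch_df, batch_id) handler passed to foreachBatch, which typically runs a Delta MERGE per micro-batch (WHEN MATCHED THEN UPDATE, WHEN NOT MATCHED THEN INSERT). The merge semantics that handler applies can be illustrated without Spark; the key and column names here are hypothetical:

```python
# Plain-Python stand-in for the per-micro-batch MERGE that an
# upsertToDelta handler would run: update matching keys, insert the rest.
def upsert_batch(target: dict, batch: list, key: str = "id") -> dict:
    for row in batch:
        target[row[key]] = row  # update if key exists, else insert
    return target


table = {1: {"id": 1, "qty": 5}}
upsert_batch(table, [{"id": 1, "qty": 7}, {"id": 2, "qty": 3}])
```

In the real stream, this per-batch upsert is what makes outputMode('update') behave as expected: each micro-batch delivered by trigger(availableNow=True) is merged into the target table rather than appended.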
AyushModi038
by New Contributor III
  • 3568 Views
  • 5 replies
  • 3 kudos

Library installation in cluster taking a long time

I am trying to install the "pycaret" library on a cluster using a whl file, but it sometimes creates a dependency conflict (not always; sometimes it works too). My questions are: 1 - How to install libraries on a cluster only a single time (maybe from ...

Latest Reply
Spencer_Kent
New Contributor III
  • 3 kudos

Can any Databricks pros provide some guidance on this? My clusters that have "cluster-installed" libraries take 30 minutes or more to become usable. I'm only trying to install a handful of CRAN libraries, but having to re-install them every time a cl...

4 More Replies
Adigkar
by New Contributor
  • 254 Views
  • 3 replies
  • 0 kudos

Reprocess of old data stored in adls

Hi, we have a requirement for a scenario to reprocess old data using a Data Factory pipeline. Here are the details: storage in ADLS Gen2; landing zone (where the data will be stored in the same format as we get it from the source); data will be loaded from SQL Server ...

Latest Reply
Hkesharwani
Contributor II
  • 0 kudos

@Kaniz I just posted a possible solution for the above problem and it was rejected by a community moderator without any explanation. This has happened to me twice in the past as well. Can you please help in this case?

2 More Replies