Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Forum Posts

cmilligan
by Contributor II
  • 1858 Views
  • 1 replies
  • 0 kudos

Pull query that inserts into table

I'm trying to pull some data for table history and need to view the query that inserted into a table. My team owns the process, so I can view the current query directly, but I also want to capture changes over time witho...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Coleman Milligan: Yes, in Databricks you can use the built-in Delta Lake feature to track the history of changes made to a table, including the queries that inserted data into it. Here's an example of how to retrieve the queries that inserted data ...
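A minimal sketch of that approach, with placeholder table names: each Delta commit appears as a row whose `operation` and `operationParameters` columns describe the write.

```sql
-- Placeholder table name; DESCRIBE HISTORY lists each commit with its
-- version, timestamp, operation (e.g. WRITE, MERGE) and operationParameters.
DESCRIBE HISTORY my_schema.my_table;

-- Only the most recent commits:
DESCRIBE HISTORY my_schema.my_table LIMIT 10;
```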

Arty
by New Contributor II
  • 3952 Views
  • 5 replies
  • 6 kudos

Resolved! How to make Autoloader delete files after a successful load

Hi All, Can you please advise how I can arrange deletion of loaded files from Azure Storage upon their successful load via Autoloader? As I understand, the Spark streaming "cleanSource" option is unavailable for Autoloader, so I'm trying to find the best way to ...
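If no cleanSource-style option is available on your runtime, one common workaround is to delete each micro-batch's source files from inside foreachBatch, only after the batch has been written. This is an untested sketch for a Databricks cluster (table, container, and checkpoint names are placeholders, and the `_metadata.file_path` column requires a recent runtime):

```python
# Untested sketch; requires a Databricks runtime (spark, dbutils).
def process_batch(batch_df, batch_id):
    # Collect the distinct source files feeding this micro-batch.
    files = [r.path for r in
             batch_df.select(batch_df.source_file.alias("path")).distinct().collect()]
    # Persist the batch first ...
    batch_df.drop("source_file").write.mode("append").saveAsTable("bronze.events")
    # ... then remove the source files, so a failed write never loses data.
    for f in files:
        dbutils.fs.rm(f)

(spark.readStream
     .format("cloudFiles")
     .option("cloudFiles.format", "json")
     .load("abfss://landing@myaccount.dfs.core.windows.net/events/")
     .selectExpr("*", "_metadata.file_path AS source_file")
     .writeStream
     .foreachBatch(process_batch)
     .option("checkpointLocation", "/tmp/checkpoints/events")
     .start())
```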

Latest Reply
Anonymous
Not applicable
  • 6 kudos

Hi @Artem Sachuk, Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...

4 More Replies
Julie1
by New Contributor II
  • 2166 Views
  • 2 replies
  • 1 kudos

Resolved! Query data not showing in custom alert notifications and QUERY_RESULT_ROWS

I've set up a custom alert notification for one of my Databricks SQL queries, and it triggers correctly, but I'm not able to get the actual results of the query to appear in the notification email. I've followed the example/template in the custom ale...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

The actual query results are not displayed in the alert, unfortunately. You can pass the alert condition etc., but not the raw results of the underlying query. I hope this will be added in the future. A workaround is to add a link to the query, so the r...

1 More Replies
Mado
by Valued Contributor II
  • 2209 Views
  • 4 replies
  • 3 kudos

Resolved! Streaming Delta Live Table, if I re-run the pipeline, does it append the new data to the current table?

Hi, I have a question about a DLT table. Assume that I have a streaming DLT pipeline that reads data from a Bronze table and applies transformations to the data. The pipeline mode is triggered. If I re-run the pipeline, does it append new data to the current tabl...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

@Mohammad Saber: In a Delta Live Tables (DLT) pipeline, when you re-run the pipeline in "append" mode, new data will be appended to the existing table. Delta Lake provides built-in support for handling duplicates through its "upsert" functionali...
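A minimal untested sketch of such a pipeline (table and column names are placeholders): because the source is read as a stream, a triggered re-run only processes records added to the Bronze table since the last update, appending them rather than rebuilding the table.

```python
# Untested sketch; requires a Delta Live Tables pipeline
# (the dlt module only exists inside a DLT run).
import dlt
from pyspark.sql.functions import col

@dlt.table(name="silver_events")
def silver_events():
    return (
        dlt.read_stream("bronze_events")          # incremental, streaming read
           .where(col("event_type").isNotNull())  # placeholder transformation
    )
```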

3 More Replies
JJ_
by New Contributor II
  • 1341 Views
  • 3 replies
  • 0 kudos

ODBC Connection to Another Compute Within the Same Workspace

Hello all! I couldn't find anything definitive related to this issue, so I hope I'm not duplicating another topic :). I have imported an R repository that normally runs on another machine and uses an ODBC driver to issue Spark SQL commands to a compute (le...

Latest Reply
JJ_
New Contributor II
  • 0 kudos

Thanks @Suteja Kanuri for your response! I tried all of the steps you mentioned (and many more) but never managed to make it work. My suspicion was that our Azure networking setup was preventing this from happening. I have not found this documented ...

2 More Replies
a2_ish
by New Contributor II
  • 1442 Views
  • 1 replies
  • 0 kudos

Where are delta lake files stored by given path?

I have the code below, which works for the path below but fails when the path is an Azure storage account path. I have enough access to write to and update the storage account. I would like to know what I am doing wrong, and for the path below that works, how can I phys...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Ankit Kumar: The error message you received indicates that the user does not have sufficient permission to access the Azure Blob Storage account. You mentioned that you have enough access to write and update the storage account, but it's possible t...
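For completeness, an untested sketch of authorising and writing to ADLS Gen2 (account, container, secret-scope, and dataframe names are all placeholders); once the write succeeds, the files live under the given path as Parquet data files plus the _delta_log/ commit log:

```python
# Untested sketch; requires a Databricks runtime (spark, dbutils).
spark.conf.set(
    "fs.azure.account.key.mystorageacct.dfs.core.windows.net",
    dbutils.secrets.get(scope="my-scope", key="storage-key"))

path = "abfss://mycontainer@mystorageacct.dfs.core.windows.net/delta/events"
df.write.format("delta").mode("overwrite").save(path)
# Physically, the table is the part-*.parquet data files plus the JSON
# commits under <path>/_delta_log/.
```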

vicusbass
by New Contributor II
  • 6483 Views
  • 3 replies
  • 0 kudos

How to extract values from JSON array field?

Hi, While creating a SQL notebook, I am struggling to extract some values from a JSON array field. I need to create a view where a field would be an array with values extracted from a field like the one below; specifically, I need the `value` fi...

Latest Reply
vicusbass
New Contributor II
  • 0 kudos

Maybe I didn't explain it correctly. The JSON snippet from the description is a cell from a row from a table.
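Assuming the cell holds a JSON array of objects with a `value` field (the schema below is a guess from the description), `from_json` plus a `transform` lambda pulls the values out:

```sql
-- Assumed shape: [{"id": "a", "value": "x"}, {"id": "b", "value": "y"}]
SELECT
  transform(
    from_json(json_col, 'array<struct<id: STRING, value: STRING>>'),
    x -> x.value
  ) AS extracted_values
FROM my_table;
```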

2 More Replies
labromb
by Contributor
  • 1831 Views
  • 2 replies
  • 0 kudos

Getting Databricks SQL dashboard to recognise change to an underlying query

Hi Community, Scenario: I have created a query in Databricks SQL, built a number of visualisations from it, and published them to a dashboard. I then realise that I need to add another field to the underlying query that I want to leverage as a dashb...

Latest Reply
youssefmrini
Honored Contributor III
  • 0 kudos

Can you take a screenshot?

1 More Replies
Hitesh_goswami
by New Contributor
  • 764 Views
  • 1 replies
  • 0 kudos

Upgrading Ipython version without changing LTS version

I am using a specific PyDeequ function called ColumnProfilerRunner, which is only supported with Spark 3.0.1, so I must use 7.3 LTS. Currently, I am trying to install the "great_expectations" library, which requires IPython version==7.16.3, an...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Hitesh Goswami: Please check if the below helps! To upgrade the IPython version on a Databricks 7.3 LTS cluster, you can follow these steps: Create a new library installation command using the Databricks CLI by running the following command in your l...
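If the goal is simply to pin a package version, a notebook-scoped install is often the simplest route. An untested sketch of a notebook cell:

```
%pip install ipython==7.16.3
```

Notebook-scoped libraries apply only to the current notebook session and do not change the 7.3 LTS runtime itself.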

HamidHamid_Mora
by New Contributor II
  • 1622 Views
  • 2 replies
  • 2 kudos

Ganglia is unavailable on DBR 13.0

We created a library in Databricks to ingest Ganglia metrics for all jobs into our Delta tables. However, endpoint 8652 is no longer available on DBR 13.0. Is there any other endpoint available? We need to log all metrics for all executed jobs, not on...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

Ganglia is only supported on Databricks Runtime versions 12 and below. From Databricks Runtime 13, Ganglia is replaced by a new Databricks metrics system offering more features and integrations. To export metrics to external services, you can use Dat...

1 More Replies
JGil
by New Contributor III
  • 1824 Views
  • 5 replies
  • 0 kudos

Installing Bazel on databricks cluster

I am new to Azure Databricks and I want to install a library on a cluster; to do that, I need to install the Bazel build tool first. I checked the Bazel site, but I am still not sure how to do it in Databricks. I'd appreciate it if anyone can help me and write me...

Latest Reply
Avinash_94
New Contributor III
  • 0 kudos

Databricks migrated from the standard Scala Build Tool (SBT) to Bazel to build, test, and deploy its Scala code. Follow this doc: https://www.databricks.com/blog/2019/02/27/speedy-scala-builds-with-bazel-at-databricks.html
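For installing Bazel on the cluster itself, a cluster-scoped init script is the usual mechanism. An untested sketch, assuming the Bazelisk launcher (the download URL should be checked against the current Bazel releases page):

```bash
#!/bin/bash
# Cluster init-script sketch: install Bazel via the bazelisk launcher.
set -euo pipefail
curl -fsSL -o /usr/local/bin/bazel \
  "https://github.com/bazelbuild/bazelisk/releases/latest/download/bazelisk-linux-amd64"
chmod +x /usr/local/bin/bazel
bazel --version   # bazelisk downloads the matching Bazel on first use
```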

4 More Replies
afzi
by New Contributor II
  • 1498 Views
  • 1 replies
  • 1 kudos

Pandas DataFrame error when using to_csv

Hi Everyone, I would like to write a Pandas DataFrame to /dbfs/FileStore/ using the to_csv method. Usually it would just write the DataFrame to the path described, but it has been giving me "FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/FileStor...

Latest Reply
Avinash_94
New Contributor III
  • 1 kudos

f = open("/dbfs/mnt/blob/myNames.txt", "r")
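That error usually just means the parent directory does not exist (or DBFS is not mounted at /dbfs). Creating the directory first avoids it; a minimal stdlib sketch of the pattern, with a temp directory standing in for the /dbfs/FileStore path from the question:

```python
import csv
import tempfile
from pathlib import Path

# On Databricks you would target Path("/dbfs/FileStore/my_folder");
# a temp dir stands in for it here.
base = Path(tempfile.mkdtemp())
out_path = base / "my_folder" / "names.csv"

# to_csv (and plain open()) raise FileNotFoundError if the parent
# directory is missing -- create it first:
out_path.parent.mkdir(parents=True, exist_ok=True)

with out_path.open("w", newline="") as f:
    csv.writer(f).writerows([["name"], ["alice"], ["bob"]])

print(out_path.exists())  # True
```

With pandas the same pattern applies: call `os.makedirs(os.path.dirname(path), exist_ok=True)` before `df.to_csv(path)`.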

User16826992783
by New Contributor II
  • 935 Views
  • 1 replies
  • 0 kudos

Why are some of my AWS EBS volumes in my workspace unencrypted?

I noticed that 30GB of my EBS volumes are unencrypted, is there a reason for this, and is there a way to encrypt these volumes?

Latest Reply
Abishek
Valued Contributor
  • 0 kudos

https://docs.databricks.com/security/keys/customer-managed-keys-storage-aws.html#introduction
The Databricks cluster's EBS volumes (optional): for Databricks Runtime cluster nodes and other compute resources in the Classic data plane, you can option...

wb
by New Contributor II
  • 861 Views
  • 1 replies
  • 2 kudos

Import paths using repos and installed libraries get confused

We use Azure DevOps and Azure Databricks and have custom Python libraries. I placed my notebooks in the same repo, and the structure is like this:

mylib/
mylib/__init__.py
mylib/code.py
notebooks/
notebooks/job_notebook.py
setup.py

Azure pipelines buil...

Latest Reply
Avinash_94
New Contributor III
  • 2 kudos

It looks for the configs locally, I suppose. If you can share requirements.txt, I can elaborate.
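One way to diagnose this class of problem is to check which copy of the package Python actually resolves, since an installed wheel and the repo checkout can shadow each other depending on sys.path order. A sketch using the stdlib json module as a stand-in for the hypothetical mylib package:

```python
import importlib.util
import sys

# Substitute "mylib" for "json" on the cluster: the resolved origin shows
# whether the repo checkout or the installed wheel wins the import.
spec = importlib.util.find_spec("json")
origin = spec.origin
print(origin)        # filesystem path of the module that will be imported
print(sys.path[:3])  # the search order that decides which copy wins
```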

User16826990884
by New Contributor III
  • 757 Views
  • 1 replies
  • 1 kudos

Delta log retention

Is there an impact on performance if I increase the Delta log retention to 3000?

Latest Reply
DD_Sharma
New Contributor III
  • 1 kudos

There will be no performance impact if you keep the Delta log retention at 3000. However, it will increase storage cost, so it's not advisable to use a large number unless really needed for the business use case. The default delta.logRetenti...
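For reference, the retention is set per table as a Delta table property; a sketch with a placeholder table name (3000 is read as days here, matching the question):

```sql
-- Keep commit history for 3000 days on this table.
ALTER TABLE my_schema.my_table
SET TBLPROPERTIES ('delta.logRetentionDuration' = 'interval 3000 days');
```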
