Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

apw
by New Contributor II
  • 4508 Views
  • 1 replies
  • 2 kudos

Arrow R package fails to install

# Databricks notebook source
.libPaths()

# COMMAND ----------

dir("/databricks/spark/R/lib")

# COMMAND ----------

## Add current working directory to library paths
.libPaths(c(getwd(), .libPaths()))

# COMMAND ----------

## The latest vers...

Latest Reply
Atanu
Databricks Employee
  • 2 kudos

@Anthony McGrath, can you please download the package, upload it to DBFS, and see if the issue still persists? You can also check whether any global init script is reinstalling it on your cluster.

  • 2 kudos
Martin1
by New Contributor II
  • 4441 Views
  • 2 replies
  • 2 kudos

Notebook metadata

Hello, I would like to view metadata about the notebooks in the Workspace folder hierarchy, for example the date created, the date modified, by which user, etc. Is this possible?

Latest Reply
Atanu
Databricks Employee
  • 2 kudos

@Martin Aronsson, I believe you need to use the notebook revision history (https://docs.databricks.com/notebooks/notebooks-use.html#revision-history). You can also try https://docs.microsoft.com/en-us/azure/databricks/administration-guide/account...

  • 2 kudos
1 More Replies
User16835756816
by Databricks Employee
  • 6344 Views
  • 1 replies
  • 8 kudos

Announcing: Workflows!

Databricks is excited to announce the general availability of Databricks Workflows to you, our community. Databricks Workflows is the fully managed lakehouse orchestration service for all your teams to build reliable data, analytics, and AI workflow...

Latest Reply
PawanShukla
New Contributor III
  • 8 kudos

I am trying to run the Workflow pipeline with the sample code shared in the getting-started guide, and I am getting the error below: DataPlaneException: Failed to start the DLT service on cluster 0526-084319-7hucy1np. Please check the stack trace below or driver logs fo...

  • 8 kudos
as999
by New Contributor III
  • 3146 Views
  • 2 replies
  • 3 kudos

Terraform import multiple notebook copy from repo's?

Following the article below, I am able to copy only a single notebook to the Databricks workspace. Copying multiple notebooks with an asterisk (*) is not supported, and under resource databrick_notebook the for_each statement is not recognizing databricks_n...

Latest Reply
Atanu
Databricks Employee
  • 3 kudos

Hi @as999, are you getting an error, or is it simply not copying multiple notebooks? Can you please share your code too so that I can take a look? Thanks.

  • 3 kudos
1 More Replies
Sophia_Ars
by New Contributor II
  • 2025 Views
  • 1 replies
  • 1 kudos

Abrupt Subscription Cancellation Issues

Hello Community, I was advised by the Help desk to post this issue in the community. We have contacted every support channel (the billing team, the help desk, and the sales team), but the issue has not been resolved yet. My team (Ars Praxia) has an issue with the sudden cancellation of s...

CHANDY
by Databricks Partner
  • 1488 Views
  • 0 replies
  • 0 kudos

real time data processing

Say I am getting a customer record from a website. I want to read the message and then insert or update it in a Snowflake table. Depending on whether the insert/update is successful, I need to respond with a success or failure message within, say, 1 sec. ...

Sunny
by New Contributor III
  • 1087 Views
  • 0 replies
  • 1 kudos

Integrate exe into workflow

We need to execute a long-running exe on a Windows machine and are thinking of ways to integrate it with the workflow. The plan is to include the exe as a task in the Databricks workflow. We are thinking of a couple of approaches: Create a DB table and...

timothy_uk
by New Contributor III
  • 4003 Views
  • 2 replies
  • 4 kudos

Resolved! Optimum Standard & Premium Tier Strategy

Hi, I would like to deploy Databricks workspaces to build a Delta lakehouse to serve both scheduled jobs/processing and ad-hoc/analytical querying workloads. Databricks users include both data engineers and data analysts. In terms of requirements...

Latest Reply
timothy_uk
New Contributor III
  • 4 kudos

Hi all, thank you for the informative answers!

  • 4 kudos
1 More Replies
edwardh
by New Contributor III
  • 6847 Views
  • 5 replies
  • 6 kudos

How to call Cloud Fetch APIs?

About Cloud Fetch, mentioned in this article: https://databricks.com/blog/2021/08/11/how-we-achieved-high-bandwidth-connectivity-with-bi-tools.html Are there any public APIs that can be called directly without ODBC or JDBC drivers? Thanks.

Latest Reply
edwardh
New Contributor III
  • 6 kudos

Hi @Kaniz Fatma, can you please help with this question? Thanks.

  • 6 kudos
4 More Replies
Deepak_Bhutada
by Databricks Employee
  • 3821 Views
  • 3 replies
  • 3 kudos

Retrieve workspace instance name on E2 architecture (multi-tenant) in notebook running on job cluster

I have a Databricks job on the E2 architecture in which I want to retrieve the workspace instance name within a notebook running in a job cluster context, so that I can use it further in my use case. While the call dbutils.notebook.entry_point.getDbutils(...

Latest Reply
Thomas_B_
New Contributor II
  • 3 kudos

Found a workaround for the Azure Databricks question above: dbutils.notebook.getContext().apiUrl returns the regional URI, but this forwards to the workspace-specific one if the workspace ID is specified with o=.
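The workaround above can be sketched in plain Python as follows. This is only an illustration of appending the o= query parameter to a regional URL. The function name, the example host, and the workspace ID are hypothetical, and how the regional URL and the workspace ID are obtained inside the notebook is left to the approach described in the reply.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def workspace_url(api_url: str, workspace_id: str) -> str:
    """Return api_url with an o=<workspace_id> query parameter appended.

    api_url is assumed to be the regional URI mentioned in the reply;
    workspace_id is assumed to be known to the caller.
    """
    parts = urlsplit(api_url)
    query = urlencode({"o": workspace_id})
    # Use "/" when the regional URL has no path so the result stays a valid URL.
    return urlunsplit((parts.scheme, parts.netloc, parts.path or "/", query, ""))


# Hypothetical values, for illustration only.
print(workspace_url("https://westeurope.azuredatabricks.net", "1234567890123456"))
```

Whether the resulting URL forwards to the workspace-specific host depends on the behavior described in the reply, so it is worth checking against your own workspace.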

  • 3 kudos
2 More Replies
Phani1
by Databricks MVP
  • 2680 Views
  • 1 replies
  • 2 kudos

Resolved! Is it possible to have multiple tabs in a Dashboard? If not, is there any workaround for this?

Is it possible to have multiple tabs in a Dashboard? If not, is there any workaround for this?

Latest Reply
Prabakar
Databricks Employee
  • 2 kudos

I don't think it will be possible. However, you can raise a feature request via our ideas portal with the requirements so that it might be considered in the future. https://docs.databricks.com/resources/ideas.html

  • 2 kudos
kpendergast
by Contributor
  • 7606 Views
  • 2 replies
  • 2 kudos

Best AWS S3 Bucket Configuration for Auto Loader with Support for Glacier and Future Use Cases

As the title states, I would like to hear how others have set up an AWS S3 bucket to source data with Auto Loader while supporting the ability to archive files into Glacier objects after a certain period of time. We currently have about 20 millio...

Latest Reply
Prabakar
Databricks Employee
  • 2 kudos

@Ken Pendergast, to set up Databricks with Auto Loader, please follow the document below: https://docs.databricks.com/spark/latest/structured-streaming/auto-loader.html Fetching data from Glacier is not supported. However, you can try one of the follo...

  • 2 kudos
1 More Replies
tom_shaffner
by New Contributor III
  • 15770 Views
  • 3 replies
  • 2 kudos

How to take only the most recent record from a variable number of tables in a stream

Short version: I need a way to take only the most recent record from a variable number of tables in a stream. This is a relatively easy problem in SQL or Python pandas (group by and take the newest), but in a stream I keep hitting blocks. I could do i...

Latest Reply
Håkon_Åmdal
New Contributor III
  • 2 kudos

Did you try storing it all to a Delta table with a MERGE INTO [1]? You can optionally specify a condition on "WHEN MATCHED" such that you only update the row if the timestamp is newer. [1] https://docs.databricks.com/spark/latest/spark-sql/language-manual/del...
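A minimal sketch of the MERGE INTO suggestion, written as a small Python helper that only builds the SQL text. The table names (latest_records, updates), the key column (id), and the timestamp column (event_ts) are hypothetical placeholders, not names from the original post. The statement uses a condition on WHEN MATCHED so that an existing row is overwritten only when the incoming timestamp is newer.

```python
def build_latest_record_merge(target: str, source: str, key: str, ts_col: str) -> str:
    """Build a Delta MERGE INTO statement that keeps the newest row per key.

    All four arguments are plain identifiers chosen by the caller; they are
    interpolated as-is, so they must come from trusted code, not user input.
    """
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON t.{key} = s.{key}\n"
        # Overwrite an existing row only when the incoming one is newer.
        f"WHEN MATCHED AND s.{ts_col} > t.{ts_col} THEN UPDATE SET *\n"
        # Keys seen for the first time are simply inserted.
        f"WHEN NOT MATCHED THEN INSERT *"
    )


# Hypothetical names, for illustration only.
print(build_latest_record_merge("latest_records", "updates", "id", "event_ts"))
```

In a streaming job, a statement like this would typically be run once per micro-batch, for example from a foreachBatch function. The source side should hold at most one row per key (deduplicated to the newest record first), because a merge in which several source rows match the same target row fails.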

  • 2 kudos
2 More Replies
yopbibo
by Contributor II
  • 15813 Views
  • 8 replies
  • 1 kudos

Resolved! Notebook's Widget parameters in SQL cell => howto

dbutils.widgets.text('table', 'product')
%sql
select * from ds_data.$table

Hello, the above will work. But how can I do something like:

dbutils.widgets.text('table', 'product')
%sql
select * from ds_data.$table_v3

In that example, $table is still my ...
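One possible way around the question above, offered as a sketch and not as the accepted answer from the thread: read the widget value in Python, append the suffix there, and pass the finished statement to spark.sql. The helper name versioned_query and the identifier check are assumptions added for illustration. The dbutils and spark calls appear only in comments because they exist only inside a Databricks notebook.

```python
import re

# Conservative pattern for an unquoted SQL identifier (an assumption for this sketch).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def versioned_query(schema: str, table: str, suffix: str = "_v3") -> str:
    """Build 'SELECT * FROM <schema>.<table><suffix>' from a widget value."""
    name = f"{table}{suffix}"
    for part in (schema, name):
        # Reject anything that is not a plain identifier before interpolating it.
        if not _IDENTIFIER.fullmatch(part):
            raise ValueError(f"not a plain identifier: {part!r}")
    return f"SELECT * FROM {schema}.{name}"


# Inside a Databricks notebook the widget value would be read like this:
#   table = dbutils.widgets.get("table")
#   display(spark.sql(versioned_query("ds_data", table)))
print(versioned_query("ds_data", "product"))
```

Building the name in Python avoids any ambiguity about where the widget name ends, at the cost of moving the query out of the %sql cell.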

Latest Reply
yopbibo
Contributor II
  • 1 kudos

Maybe I should add that I use DB9.1 on a high concurrency cluster

  • 1 kudos
7 More Replies