Data Engineering

Forum Posts

kiko_roy
by New Contributor III
  • 55 Views
  • 1 replies
  • 0 kudos

DLT cluster : can be manipulated?

Hi all, I am using a DLT pipeline to pull data from a mounted ADLS Gen2 account. The cluster that gets started has its access mode set to shared. I want to change it to single user, but because the cluster is attached to DLT, I am not able to update and...

Latest Reply
AlliaKhosla
New Contributor III
  • 0 kudos

Hi @kiko_roy, greetings! You can't use a single-user cluster to query tables from a Unity Catalog-enabled Delta Live Tables pipeline, including streaming tables and materialized views in Databricks SQL. To access these tables, you need to use a share...

  • 0 kudos
sumitdesai
by New Contributor
  • 74 Views
  • 1 replies
  • 0 kudos

Job not able to access notebook from github

I have created a job in Databricks and configured it to use a cluster with single user access enabled, with GitHub as the source. When I try to run the job, I get the following error: run failed with error message Unable to access the notebook "d...

Latest Reply
ezhil
New Contributor
  • 0 kudos

I think you need to link the Git account with Databricks by passing the access token generated in GitHub. Follow the document for reference: https://docs.databricks.com/en/repos/get-access-tokens-from-git-provider.html Note: While creating the...

  • 0 kudos
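The reply above most likely refers to linking the token in the workspace user settings. As an alternative, here is a minimal sketch of registering a GitHub personal access token through the Databricks Git credentials REST endpoint. It only builds the request and does not send it. The workspace URL, both tokens, the username, and the helper name `build_git_credential_request` are hypothetical placeholders, and the endpoint and field names should be checked against the current API reference before use.

```python
import json
import urllib.request


def build_git_credential_request(host, databricks_token, git_username, github_pat):
    """Build (but do not send) a request that registers a GitHub PAT.

    Assumption: the workspace exposes POST /api/2.0/git-credentials and
    accepts "gitHub" as the provider name.
    """
    payload = {
        "git_provider": "gitHub",
        "git_username": git_username,
        "personal_access_token": github_pat,
    }
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/git-credentials",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {databricks_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # All values below are placeholders, not real credentials.
    req = build_git_credential_request(
        "https://example.cloud.databricks.com",
        "dapi-PLACEHOLDER",
        "octocat",
        "ghp-PLACEHOLDER",
    )
    print(req.get_method(), req.full_url)
```

To actually submit it, pass the returned object to `urllib.request.urlopen`. If the job runs as a service principal rather than a user, the credential may need to be registered for that principal instead.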
philipkd
by New Contributor III
  • 59 Views
  • 0 replies
  • 0 kudos

Cannot get past Query Data tutorial for Azure Databricks

I created a new workspace on Azure Databricks, and I can't get past this first step in the tutorial: DROP TABLE IF EXISTS diamonds; CREATE TABLE diamonds USING CSV OPTIONS (path "/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv", hea...

Nisha2
by New Contributor
  • 60 Views
  • 0 replies
  • 0 kudos

Databricks spark_jar_task failed when submitted via API

Hello, we are submitting jobs to the Databricks cluster using the /api/2.0/jobs/create API and running a Spark Java application (a JAR that is submitted to this API). We are noticing that the Java application is executing as expected. However, we see that the...

Data Engineering
API
Databricks
spark
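For readers unfamiliar with the request the post describes, here is a rough sketch of what a `/api/2.0/jobs/create` body with a `spark_jar_task` might look like. It does not diagnose the failure in the post, whose details are cut off. The job name, cluster ID, JAR path, main class, and the helper name `build_jar_job_payload` are all hypothetical.

```python
import json


def build_jar_job_payload(name, cluster_id, jar_uri, main_class, parameters=None):
    """Assemble a Jobs API 2.0 create payload for a JAR task.

    Assumption: the job runs on an existing cluster and the JAR is
    attached as a job library.
    """
    return {
        "name": name,
        "existing_cluster_id": cluster_id,
        # The JAR that contains the main class is attached as a library.
        "libraries": [{"jar": jar_uri}],
        "spark_jar_task": {
            "main_class_name": main_class,
            # Arguments are passed to main(String[] args) as strings.
            "parameters": list(parameters or []),
        },
    }


if __name__ == "__main__":
    # All values below are illustrative placeholders.
    payload = build_jar_job_payload(
        name="example-jar-job",
        cluster_id="1234-567890-abcde123",
        jar_uri="dbfs:/FileStore/jars/app.jar",
        main_class="com.example.Main",
        parameters=["--date", "2024-01-01"],
    )
    print(json.dumps(payload, indent=2))
```

The printed JSON is what would be sent as the POST body, along with a bearer token in the `Authorization` header.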
daz
by New Contributor III
  • 2385 Views
  • 7 replies
  • 3 kudos

DLT managed by non-existent pipeline

I am building out a new DLT pipeline and have since had to rebuild it from scratch. Having deleted the old pipeline and constructed a new one, I now get this error: Table 'X' is already managed by pipeline 'Y'. As I only have the one pipeline, how would...

Latest Reply
Vidula
Honored Contributor
  • 3 kudos

Hi there @Darron Smith, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you...

  • 3 kudos
6 More Replies
Gilg
by Contributor
  • 195 Views
  • 5 replies
  • 2 kudos

Multiple Autoloader reading the same directory path

Hi, originally I only had one pipeline looking at a directory. Now, as a test, I cloned the existing pipeline and edited the settings to use a different catalog. Both pipelines are now reading the same directory path and running in continuous mode. Que...

Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hi @Gilg, when multiple pipelines are simultaneously accessing the same directory path and utilizing Autoloader in continuous mode, it is crucial to consider the management of file locks and data consistency carefully. Let's delve into the specifi...

  • 2 kudos
4 More Replies
pjp94
by Contributor
  • 51 Views
  • 0 replies
  • 0 kudos

pyspark.pandas PandasNotImplementedError

Can someone explain why the code below is throwing an error? My intuition tells me it's my Spark version (3.2.1), but I would like confirmation: d = {'key':['a','a','c','d','e','f','g','h'], 'data':[1,2,3,4,5,6,7,8]} x = ps.DataFrame(d) x[x['...

Gauthy1825
by New Contributor II
  • 2113 Views
  • 7 replies
  • 3 kudos

How to write to Salesforce from Databricks using the spark salesforce library

Hi, I'm facing an issue while writing to a Salesforce sandbox from Databricks. I have installed the "spark-salesforce_2.12-1.1.4" library and my code is as follows: df_newLeads.write\ .format("com.springml.spark.salesforce")\ .option("username...

Latest Reply
addy
New Contributor II
  • 3 kudos

I am facing a similar issue. I am able to read from a Salesforce table but unable to write to it. Our Databricks workspace has also been whitelisted in Salesforce. I am using the same library, "com.springml.spark.salesforce". The error I am getting is not the ...

  • 3 kudos
6 More Replies
mvmiller
by New Contributor II
  • 66 Views
  • 0 replies
  • 0 kudos

Troubleshooting _handle_rpc_error GRPC Error

I am trying to run the following chunk of code in a cell of a Databricks notebook (using Databricks Runtime 14.3 LTS, Apache Spark 3.5.0, Scala 2.12): spark.sql("CREATE OR REPLACE table sample_catalog.sample_schema.sample_table_tmp AS SELECT * FROM...

Hubert-Dudek
by Esteemed Contributor III
  • 63 Views
  • 0 replies
  • 0 kudos

stored procedures

The plan for stored procedures in Databricks Spark has been announced in a few places. What might stored procedures look like in Spark SQL?

cltj
by New Contributor III
  • 88 Views
  • 1 replies
  • 0 kudos

Managed tables and ADLS - infrastructure

Hi all. I want to get this right, so I am reaching out to the community. We are on Azure and currently use one Azure Data Lake Storage account for development and one for production. These are connected to dev and prod Databricks workspaces....

Latest Reply
ossinova
Contributor
  • 0 kudos

I recommend you read this article (Managed vs External tables) and answer the following questions: Do I require direct access to the data outside of Azure Databricks clusters or Databricks SQL warehouses? If yes, then External is your only option. In rel...

  • 0 kudos
Marcin_U
by New Contributor
  • 126 Views
  • 2 replies
  • 0 kudos

AutoLoader - problem with adding new source location

Hello, I am having some trouble with AutoLoader. Currently we use many different source locations on ADLS to read Parquet files and write them to a Delta table using AutoLoader. Files in all locations have the same schema. Everything works fine until we have to ad...

Latest Reply
Marcin_U
New Contributor
  • 0 kudos

Thanks for the reply @Kaniz. I have some questions related to your answer. Checkpoint Location: Does deleting the checkpoint folder (or only the files?) mean that the next run of AutoLoader will load all files from the provided source locations? So it will duplicate ...

  • 0 kudos
1 More Replies
-werners-
by Esteemed Contributor III
  • 221 Views
  • 2 replies
  • 0 kudos

performance issues using shared compute access mode in scala

On our dev environment, I created a cluster using the shared access mode for our devs to use (instead of separate single user clusters). What I notice is that the performance of this cluster is terrible. And I mean really terrible: notebook cells wit...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

Thanks for the answer! It seems that using shared access mode adds overhead. The nodes/driver are not stressed at all (CPU/RAM/network). We use UC only. The cluster seems to be configured correctly (using the same cluster in single user mode changes perform...

  • 0 kudos
1 More Replies