Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

lauraxyz
by Contributor
  • 345 Views
  • 4 replies
  • 1 kudos

Put file into volume within Databricks

Hi! From a Databricks job, I want to copy a workspace file into a volume. How can I do that? I tried `dbutils.fs.cp("/Workspace/path/to/the/file", "/Volumes/path/to/destination")` but got: Public DBFS root is disabled. Access is denied on path: /Workspac...

Latest Reply
lauraxyz
Contributor
  • 1 kudos

Found the reason! It's the runtime: it doesn't work on Databricks Runtime 15.4 LTS, but started to work after changing to 16.0. Maybe this is something supported only from the latest version?
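
For anyone landing here, a minimal sketch of the working pattern on a 16.0+ runtime, as described above (all paths are placeholders, not the poster's actual files):

```python
# Copy a workspace file into a Unity Catalog volume from a notebook or job.
# Paths are placeholders; dbutils is only available on Databricks compute.
src = "/Workspace/path/to/the/file"
dst = "/Volumes/my_catalog/my_schema/my_volume/file"

dbutils.fs.cp(src, dst)

# Both locations are also exposed as local FUSE paths on the driver, so a
# plain file copy is a possible fallback on runtimes where fs.cp fails:
import shutil
shutil.copy(src, dst)
```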

3 More Replies
GS_S
by New Contributor III
  • 381 Views
  • 7 replies
  • 0 kudos

Resolved! Error during merge operation: 'NoneType' object has no attribute 'collect'

Why does merge.collect() not return results in SINGLE_USER access mode, but it does in USER_ISOLATION? I need to log the affected rows (inserted and updated) and can’t find a simple way to get this data in SINGLE_USER mode. Is there a solution or an...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

15.4 does not directly require serverless, but fine-grained access control does indeed require it to run on Single User, as mentioned: this data filtering is performed behind the scenes using serverless compute. In terms of costs: customers are charged for ...
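
If the goal is just to log inserted/updated row counts, one possible workaround (a sketch, not the thread's confirmed fix; the table name is a placeholder) is to read the merge's operationMetrics from the Delta history instead of collecting the merge result:

```python
# Read merge metrics from the most recent commit in the table history.
from delta.tables import DeltaTable

tbl = DeltaTable.forName(spark, "my_catalog.my_schema.target")  # placeholder
last_commit = tbl.history(1).collect()[0]   # latest operation comes first
metrics = last_commit["operationMetrics"] or {}
print("inserted:", metrics.get("numTargetRowsInserted"),
      "updated:", metrics.get("numTargetRowsUpdated"))
```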

6 More Replies
manojpatil04
by New Contributor III
  • 225 Views
  • 5 replies
  • 0 kudos

External dependency on serverless job from Airflow is not working with an S3 path or workspace path

I am working on a use case where we have to run a Python script from a serverless job through Airflow. When we try to trigger the serverless job and pass an external dependency as a wheel from an S3 path or workspace path, it is not working, but with a volume it ...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

As per the serverless compute limitations, I can see the following: task libraries are not supported for notebook tasks. Use notebook-scoped libraries instead. See Notebook-scoped Python libraries.
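
In practice that means installing the wheel from inside the notebook itself; a minimal sketch, with a placeholder volume path:

```python
# Notebook-scoped install: run this in a notebook cell instead of attaching
# the wheel as a task library. The volume path is a placeholder.
%pip install /Volumes/my_catalog/my_schema/libs/my_pkg-1.0-py3-none-any.whl
```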

4 More Replies
stadelmannkevin
by New Contributor II
  • 304 Views
  • 4 replies
  • 2 kudos

init_script breaks Notebooks

Hi everyone, we would like to use our private company Python repository for installing Python libraries with pip install. To achieve this, I created a simple script which sets the index-url configuration of pip to our private repo (a sketch of the idea is below). I set this script as a...
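
A minimal sketch of that kind of init script, generated from a notebook; the repo URL and volume path are placeholders, not the poster's actual script:

```python
# Write a cluster init script to a UC volume; attach it under the cluster's
# Advanced options -> Init scripts. URL and path are placeholders.
script = """#!/bin/bash
pip config set global.index-url https://pypi.mycompany.example/simple
"""
dbutils.fs.put("/Volumes/my_catalog/my_schema/init/pip-index.sh", script, overwrite=True)
```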

Latest Reply
Walter_C
Databricks Employee
  • 2 kudos

Did you also try cloning the cluster or using another cluster for the testing? The metastore-down message is normally a Hive Metastore issue and should not be a factor here, but you can check the log4j output under Driver logs for more details on the error.

3 More Replies
sensanjoy
by Contributor
  • 17535 Views
  • 6 replies
  • 1 kudos

Resolved! Performance issue with pyspark udf function calling rest api

Hi All, I am facing a performance issue with a PySpark UDF that posts data to a REST API (which uses a Cosmos DB backend to store the data). Please find the details below: # The Spark dataframe (df) contains roughly 30-40k rows. # I am using pyt...
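
A common mitigation for this pattern, sketched below: instead of one HTTP call per row inside a UDF, batch rows and reuse a single session per partition. The endpoint and batch size are placeholders, and direct partition access assumes a compute mode that allows it:

```python
# Post batched rows from each partition with one shared HTTP session.
import requests

def post_partition(rows):
    session = requests.Session()   # one session per partition, reused
    batch = []

    def flush():
        if batch:
            resp = session.post("https://api.example.com/ingest",  # placeholder
                                json=batch, timeout=30)
            resp.raise_for_status()
            batch.clear()

    for row in rows:
        batch.append(row.asDict())
        if len(batch) >= 500:      # tune to the API's limits
            flush()
    flush()

df.foreachPartition(post_partition)
```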

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Sanjoy Sen, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking "Select As Best" if it does. Your feedback w...

5 More Replies
wi11iamr
by New Contributor II
  • 445 Views
  • 5 replies
  • 0 kudos

PowerBI Connection: Possible to use ADOMDClient (or alternative)?

I wish to extract from PowerBI datasets the metadata of all measures, relationships and entities. In VSCode I have a Python script that connects to the PowerBI API using the Pyadomd module, connecting via the XMLA endpoint. After much trial and error I...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

I understand. Yes, it seems that this is currently not possible; the only option will be to export your dataset as a CSV file and import it into Databricks.
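
For the import side, a minimal sketch, assuming the exported CSV has been uploaded to a volume (paths and table names are placeholders):

```python
# Load an exported Power BI CSV from a UC volume into a managed table.
df = (spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("/Volumes/my_catalog/my_schema/my_volume/powerbi_export.csv"))
df.write.saveAsTable("my_catalog.my_schema.powerbi_metadata")
```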

4 More Replies
shusharin_anton
by New Contributor II
  • 264 Views
  • 1 reply
  • 1 kudos

Resolved! Sort after update on DWH

Running a query on a serverless DWH: UPDATE catalog.schema.table SET col_tmp = CAST(col AS DECIMAL(30, 15)). In query profiling, it has some sort and shuffle stages in the graph. The table is partitioned by the partition_date column. Some details in the sort node mention that so...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Hi @shusharin_anton, The sort and shuffle stages in your query profile are likely triggered by the need to redistribute and order the data based on the partition_date column. This behavior can be attributed to the way Spark handles data partitioning ...

rai00
by New Contributor
  • 140 Views
  • 1 reply
  • 0 kudos

Mock user doesn't have the required privileges to access catalog `remorph` while running 'make test'

Utility: Remorph (Databricks). Issue: 'User `me@example.com` doesn't have required privileges :: `` to access catalog `remorph`' while running the 'make test' cmd. I am encountering an issue while running tests for Databricks Labs Remorph using 'make test'...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @rai00, ensure that the mock user me@example.com has the necessary privileges at both the catalog and schema levels. The user needs specific privileges such as USE_SCHEMA and CREATE_VOLUME. Use the WorkspaceClient to check the effective privilege...
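
A sketch of that check, assuming the databricks-sdk grants API; the catalog name is taken from the error above:

```python
# List effective privileges on the catalog for each principal.
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.catalog import SecurableType

w = WorkspaceClient()
grants = w.grants.get_effective(SecurableType.CATALOG, "remorph")
for assignment in grants.privilege_assignments or []:
    privs = [p.privilege for p in assignment.privileges or []]
    print(assignment.principal, privs)
```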

cool_cool_cool
by New Contributor II
  • 1967 Views
  • 2 replies
  • 2 kudos

Resolved! Trigger Dashboard Update At The End of a Workflow

Heya, I have a workflow that computes some data and writes to a Delta table, and I have a dashboard that is based on the table. How can I trigger a refresh of the dashboard once the workflow is finished? Thanks!

Latest Reply
DanWertheimer
New Contributor II
  • 2 kudos

How does one do this with the new dashboards? I only see the ability to do this with legacy dashboards.

1 More Replies
SparkMaster
by New Contributor III
  • 8680 Views
  • 11 replies
  • 2 kudos

Why can't I delete experiments without deleting the notebook? Or better, organize experiments into folders?

My Databricks Experiments page is cluttered with a whole lot of experiments. Many of them are notebooks which are showing up there for some reason (even though they didn't have an MLflow run associated with them). I would like to delete the experiments, but it...

Latest Reply
mhiltner
Databricks Employee
  • 2 kudos

Hey @Debayan @SparkMaster, a bit late here, but I believe this is being caused by a click on the right-side experiments icon. This may look like a meaningless click, but it actually triggers a run.

10 More Replies
jeremy98
by Contributor
  • 161 Views
  • 1 reply
  • 0 kudos

Resolved! Can we modify the constraint of a primary key in an existing table?

Hello Community, is it possible to modify the schema of an existing table that currently has an ID column without any constraints? I would like to update the schema to make the ID column a primary key with auto-increment starting from the maximum id al...

Latest Reply
PiotrMi
New Contributor III
  • 0 kudos

Hey @jeremy98, based on an old article it looks like it cannot be done: "There are a few caveats you should keep in mind when adopting this new feature. Identity columns cannot be added to existing tables; the tables will need to be recreated with the new ...
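
A sketch of the recreate-and-backfill route that quote implies; all names are placeholders, and START WITH is seeded from the current maximum id as the question asks:

```python
# Recreate the table with an identity primary key, then copy the data over.
max_id = spark.sql(
    "SELECT COALESCE(MAX(id), 0) FROM my_catalog.my_schema.old_table"
).first()[0]

# GENERATED BY DEFAULT still allows the explicit ids from the backfill;
# note that primary key constraints in Unity Catalog are informational,
# not enforced.
spark.sql(f"""
    CREATE TABLE my_catalog.my_schema.new_table (
        id BIGINT GENERATED BY DEFAULT AS IDENTITY (START WITH {max_id + 1}) NOT NULL,
        payload STRING,
        CONSTRAINT new_table_pk PRIMARY KEY (id)
    )
""")
spark.sql("""
    INSERT INTO my_catalog.my_schema.new_table (id, payload)
    SELECT id, payload FROM my_catalog.my_schema.old_table
""")
```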

JUMAN4422
by New Contributor II
  • 395 Views
  • 3 replies
  • 0 kudos

Delta Live Tables - Parallel processing

How can we process multiple tables in parallel within a Delta Live Tables pipeline, passing the table names as parameters?

Latest Reply
JUMAN4422
New Contributor II
  • 0 kudos

Can we run a DLT pipeline multiple times concurrently with different parameters, using REST API calls with asyncio? I have created a function to start the pipeline using the REST API. When calling the function with asyncio, I am getting [409 Conflict]> ...
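
The 409 is consistent with a pipeline allowing only one active update at a time, so concurrent parameterized runs generally need either separate pipelines or sequencing. A sketch of the sequencing option against the Pipelines REST API; the host, token, and pipeline id are placeholders:

```python
# Start a pipeline update and block until it reaches a terminal state.
import time
import requests

HOST = "https://my-workspace.cloud.databricks.com"  # placeholder
HEADERS = {"Authorization": "Bearer <token>"}        # placeholder

def run_update_and_wait(pipeline_id: str, full_refresh: bool = False) -> str:
    resp = requests.post(f"{HOST}/api/2.0/pipelines/{pipeline_id}/updates",
                         headers=HEADERS, json={"full_refresh": full_refresh})
    resp.raise_for_status()
    update_id = resp.json()["update_id"]
    while True:
        state = requests.get(
            f"{HOST}/api/2.0/pipelines/{pipeline_id}/updates/{update_id}",
            headers=HEADERS,
        ).json()["update"]["state"]
        if state in ("COMPLETED", "FAILED", "CANCELED"):
            return state
        time.sleep(15)
```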

2 More Replies
Shreyash_Gupta
by New Contributor III
  • 398 Views
  • 4 replies
  • 0 kudos

Resolved! Can we display key vault secret in Databricks notebook

I am using a Databricks notebook and Azure Key Vault. When I use the function below, I get [REDACTED] as output: dbutils.secrets.get(scope_name, secret_name). I want to know if there is any way to display the secret in Databricks.

Latest Reply
daniel_sahal
Esteemed Contributor
  • 0 kudos

@Shreyash_Gupta You can simply iterate over each letter of the secret and print it (only the full secret value is redacted). Something like this:

for letter in dbutils.secrets.get(scope_name, secret_name):
    print(letter)

3 More Replies
francisix
by New Contributor II
  • 3894 Views
  • 5 replies
  • 1 kudos

Resolved! I haven't received badge for completion

Hi, today I completed the test for Lakehouse Fundamentals with a score of 85%, but I still haven't received the badge through my email francis@intellectyx.com. Kindly let me know please! -Francis

Latest Reply
sureshrocks1984
New Contributor II
  • 1 kudos

Hi, I completed the test for Databricks Certified Data Engineer Associate on 17 December 2024, but I still haven't received the badge through my email sureshrocks.1984@hotmail.com. Kindly let me know please! SURESHK

4 More Replies
f1nesse13
by New Contributor
  • 130 Views
  • 1 reply
  • 0 kudos

Question about notifications and failed jobs

Hello, I had a question involving rerunning a job from a checkpoint using ‘Repair Run’. I have a job which failed and I’m looking to rerun the stream from a checkpoint. My job uses notifications for file detection (cloudFiles.useNotifications). My que...

Latest Reply
BigRoux
Databricks Employee
  • 0 kudos

When rerunning your job from a checkpoint using Repair Run with cloudFiles.useNotifications, only unprocessed messages in the queue (representing new or failed-to-process files) will be consumed. Files or events already recorded in the checkpoint wil...

