Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

mannepk85
by New Contributor III
  • 459 Views
  • 0 replies
  • 0 kudos

Databricks Academy courses are defaulting to hive metastore

So far, I have started 2 Databricks Academy courses. In both courses, the schema is created in the hive metastore by default. In my org, the hive metastore is blocked and we have been asked to use Unity Catalog. Is there a way the course material in dat...
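
A minimal workaround sketch, assuming the course notebooks only rely on the session's default catalog and schema; the catalog and schema names below are hypothetical and would need to match what your org allows in Unity Catalog:

# Point the session at a Unity Catalog schema instead of hive_metastore (hypothetical names).
catalog = "training"              # hypothetical UC catalog permitted by your org
schema = "academy_course_01"      # hypothetical schema for the course exercises

spark.sql(f"USE CATALOG {catalog}")
spark.sql(f"CREATE SCHEMA IF NOT EXISTS {catalog}.{schema}")
spark.sql(f"USE SCHEMA {schema}")

# Unqualified CREATE TABLE statements in the course notebooks now land in
# training.academy_course_01 rather than hive_metastore.default.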

CaptainJack
by New Contributor III
  • 965 Views
  • 4 replies
  • 1 kudos

Get taskValue from job as task, and then pass it to next task.

I have a workflow like this. Task 1: a job as a task. Inside this job there is a task which sets parameter x as a task value using dbutils.jobs.taskValues.set. Task 2: depends on the previous job-as-a-task. I would like to access this parameter x. I tried t...
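
For context, a hedged sketch of the task-values pattern within a single job (task and key names are hypothetical); values set inside a child job triggered through a job-as-a-task may not be directly visible to the parent workflow's tasks, which appears to be the friction described here:

# In the notebook of the upstream task:
dbutils.jobs.taskValues.set(key="x", value="some_value")

# In a downstream task of the same job, referencing the upstream task by its task key:
x = dbutils.jobs.taskValues.get(taskKey="upstream_task", key="x", default=None)
print(x)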

Latest Reply
NandiniN
Databricks Employee
  • 1 kudos

I see. I have asked someone else to guide you on this. cc: @Retired_mod

3 More Replies
turtleXturtle
by New Contributor II
  • 779 Views
  • 1 reply
  • 0 kudos

Delta share existing parquet files in R2

Hi - I have existing parquet files in Cloudflare R2 storage (created outside of Databricks). I would like to share them via Delta Sharing, but I keep running into an error. Is it possible to share existing parquet files without duplicating them? I did...

Latest Reply
turtleXturtle
New Contributor II
  • 0 kudos

Thanks @Retired_mod. It's currently possible to share a Delta table stored in an S3 external location without duplication or doing the `DEEP CLONE` first. Is it on the roadmap to support this for R2 as well?
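
As a hedged sketch of the S3 flow the reply refers to (all names and paths are hypothetical, and whether the same works against R2 is exactly the open question here): convert the existing parquet directory to Delta in place, register it as an external table, then add it to a share, e.g. from a notebook:

# 1) Convert the parquet directory to Delta in place (adds a transaction log; no data copy).
spark.sql("CONVERT TO DELTA parquet.`s3://my-bucket/events/`")

# 2) Register the location as an external table in Unity Catalog.
spark.sql("""
    CREATE TABLE IF NOT EXISTS main.analytics.events
    USING DELTA
    LOCATION 's3://my-bucket/events/'
""")

# 3) Create a share and add the table to it.
spark.sql("CREATE SHARE IF NOT EXISTS my_share")
spark.sql("ALTER SHARE my_share ADD TABLE main.analytics.events")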

MYB24
by New Contributor III
  • 7377 Views
  • 6 replies
  • 0 kudos

Resolved! Error: cannot create mws credentials: invalid Databricks Account configuration

Good evening, I am configuring databricks_mws_credentials through Terraform on AWS. I am getting the following error: Error: cannot create mws credentials: invalid Databricks Account configuration │ │ with module.databricks.databricks_mws_credentials.t...

Data Engineering
AWS
credentials
Databricks
Terraform
Latest Reply
Alexandre467
New Contributor II
  • 0 kudos

Hello, I'm facing a similar issue. I tried to update my Terraform configuration with proper authentication and I get this error: ╷ │ Error: cannot create mws credentials: failed visitor: context canceled │ │ with databricks_mws_credentials.this, │ on main.tf ...

5 More Replies
riccostamendes
by New Contributor II
  • 48509 Views
  • 3 replies
  • 0 kudos

Just a doubt: can we develop a Kedro project in Databricks?

I am asking this because up to now I have only seen examples of deploying a pre-existing Kedro project in Databricks in order to run some pipelines...

Latest Reply
noklam
New Contributor II
  • 0 kudos

Hi! Kedro dev here. You can surely develop Kedro on Databricks; in fact, we have a lot of Kedro projects running on Databricks. In the past there has been some friction, mainly because Kedro is project based while Databricks focuses a lot on notebooks. T...
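
For reference, a sketch of the interactive pattern Kedro documents for Databricks notebooks; the project path is hypothetical and the exact imports may vary by Kedro version:

from kedro.framework.startup import bootstrap_project
from kedro.framework.session import KedroSession

project_root = "/Workspace/Repos/me@example.com/my-kedro-project"  # hypothetical repo path
bootstrap_project(project_root)

with KedroSession.create(project_path=project_root) as session:
    session.run()  # runs the project's default pipeline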

2 More Replies
georgef
by New Contributor III
  • 2413 Views
  • 2 replies
  • 1 kudos

Resolved! Cannot import relative python paths

Hello, Some variations of this question have been asked before, but there doesn't seem to be an answer for the following simple use case: I have the following file structure in a Databricks Asset Bundles project: src --dir1 ----file1.py --dir2 ----file2...

Latest Reply
m997al
Contributor III
  • 1 kudos

Hi. This was a long-standing issue for me too. This solution may not be what is desired, but it works perfectly for my needs. In my Python code, I have this structure: if __name__ == '__main__': # directory structure where "mycode" is this code ...
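
A condensed sketch of that sys.path workaround, assuming the src layout from the question (directory, module, and function names are hypothetical):

import os
import sys

if __name__ == "__main__":
    # Add the bundle's src/ root (the parent of dir1/ and dir2/) to sys.path so that
    # absolute imports resolve no matter which task entry point launches this file.
    src_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    if src_root not in sys.path:
        sys.path.insert(0, src_root)

    from dir1.file1 import some_function  # hypothetical module and function
    some_function()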

1 More Replies
Pierre1
by New Contributor
  • 1051 Views
  • 1 reply
  • 2 kudos

DLT with Unity Catalog: Multipart table name

Hello, I can't seem to find up-to-date info on how to handle catalog.schema.table in a DLT live table with Unity Catalog. My statement is the following and is failing with the error: Multipart table name is not supported. Is any workaround possible? Thanks a ...

Latest Reply
szymon_dybczak
Contributor III
  • 2 kudos

Hi @Pierre1, Actually, you don't provide this information in the code. You specify it when you create the DLT pipeline. If you do not select a catalog and target schema for a pipeline, tables are not published to Unity Catalog and can only b...
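
A short sketch of what that looks like in the pipeline code, assuming a Unity Catalog-enabled DLT pipeline whose catalog and target schema are set in the pipeline configuration (table and source names are hypothetical):

import dlt
from pyspark.sql import functions as F

# Single-part name only; the catalog and schema come from the pipeline settings.
@dlt.table(name="orders_clean", comment="Cleaned orders")
def orders_clean():
    return (
        spark.read.table("main.raw.orders")       # hypothetical three-part source name
             .where(F.col("status").isNotNull())
    )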

mbdata
by New Contributor II
  • 35534 Views
  • 6 replies
  • 6 kudos

Resolved! Toggle line comment

I work with Azure Databricks. The shortcut Ctrl + / to toggle line comment doesn't work on an AZERTY keyboard in Firefox... Are you aware of this issue? Is there another shortcut I can try? Thanks!

Latest Reply
Flo
New Contributor III
  • 6 kudos

'cmd + shift + 7' works for me! I'm using an AZERTY keyboard on Chrome for macOS.

5 More Replies
vishal48
by New Contributor II
  • 440 Views
  • 0 replies
  • 1 kudos

Integrating row and column level security in parent child tables with masking only selected rows

Currently I am working on a project where we need to mask PII in a few columns for VIP customers only. Let me explain briefly with an example: Table A: [personid, status, address, UID, VIPFLAG] --> Mask "UID" and "address" only where VIPFLAG is 1. Table ...
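
One hedged way to express the Table A requirement is a Unity Catalog column mask whose masking function also receives VIPFLAG; the catalog, schema, table, and function names below are hypothetical:

# A SQL UDF that masks the value only when the row's VIPFLAG is 1.
spark.sql("""
    CREATE OR REPLACE FUNCTION main.security.mask_if_vip(col_value STRING, vipflag INT)
    RETURN CASE WHEN vipflag = 1 THEN '****MASKED****' ELSE col_value END
""")

# Attach the mask to the sensitive columns, passing VIPFLAG as the extra input column.
spark.sql("""
    ALTER TABLE main.sales.table_a
    ALTER COLUMN UID SET MASK main.security.mask_if_vip USING COLUMNS (VIPFLAG)
""")
spark.sql("""
    ALTER TABLE main.sales.table_a
    ALTER COLUMN address SET MASK main.security.mask_if_vip USING COLUMNS (VIPFLAG)
""")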

guangyi
by Contributor III
  • 1001 Views
  • 3 replies
  • 1 kudos

Resolved! Complex type variable in Databricks.yml not working

For example, here I extract the schedule parameter as a complex type variable:

variables:
  schedule:
    description: schedule time
    type: complex
    default:
      quartz_cron_expression: '0 22 17 * * ?'
      timezone_id: Asia/Shanghai
      pa...

Latest Reply
pavlosskev
New Contributor III
  • 1 kudos

If the validation is fine on your colleague's laptop and not on yours, my first assumption would be that it's a version issue. Do you have the same Databricks CLI version as your colleagues? You can check with `databricks --version`. Also, according to...

2 More Replies
Kotekaman
by New Contributor
  • 350 Views
  • 1 reply
  • 1 kudos

Merge Update in Notebook Faster Than Scala script

Hi folks, I tested running a merge update using SQL queries in a notebook, and it is faster than using a Scala script. Both tests were done using the same cluster size in Databricks. How can I make the Scala script as fast as the SQL notebook?

Latest Reply
Witold
Contributor III
  • 1 kudos

Have you already compared both query plans?
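
Beyond the query plans, a hedged way to compare the two runs is the operation metrics Delta records for each MERGE (the target table name below is hypothetical):

history = spark.sql("DESCRIBE HISTORY main.sales.target_table")
(history
    .where("operation = 'MERGE'")
    .selectExpr(
        "version",
        "timestamp",
        "operationMetrics['executionTimeMs'] AS execution_time_ms",
        "operationMetrics['numTargetRowsUpdated'] AS rows_updated",
        "operationMetrics['numTargetFilesAdded'] AS files_added")
    .show(truncate=False))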

CaptainJack
by New Contributor III
  • 1438 Views
  • 3 replies
  • 2 kudos

Resolved! Error Handling and Custom Messages in Workflows

I would like to be able to get a custom error message, ideally visible from the Workflows > Jobs UI. 1. For example, a workflow failed because a file was missing and could not be found; in this case I am getting "Status" Failed and "Error Code" RunExecutionErro...

Latest Reply
Edthehead
Contributor II
  • 2 kudos

What you can do is pass the custom error message you want from the notebook back to the workflow: output = f"There was an error with {error_code} : {error_msg}" followed by dbutils.notebook.exit(output). Then when you are fetching the status of your pipeline, you c...
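
A hedged sketch of that pattern end to end in the notebook task (the input path and error fields are hypothetical):

error_code, error_msg = None, None
try:
    df = spark.read.csv("/mnt/landing/input_file.csv", header=True)  # hypothetical input
except Exception as e:
    error_code = "FILE_MISSING"
    error_msg = str(e)

if error_code:
    # Surfaces a readable message as the notebook task's output instead of a bare
    # RunExecutionError; the message can then be read back from the run's output
    # (for example via the Jobs get-run-output API) when polling the workflow.
    dbutils.notebook.exit(f"There was an error with {error_code} : {error_msg}")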

2 More Replies
Manthansingh
by New Contributor
  • 720 Views
  • 2 replies
  • 0 kudos

Writing part files to a single text file

I want to write all my part files into a single text file. Is there anything I can do?

Latest Reply
Edthehead
Contributor II
  • 0 kudos

When writing a PySpark dataframe to a file, it will always write to part files by default. This is because of partitions, even if there is only 1 partition. To write into a single file you can convert the PySpark dataframe to a pandas dataframe and ...
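
A hedged sketch of both approaches mentioned above (paths are hypothetical, and the pandas route assumes the data fits in driver memory):

# Sample single-column DataFrame standing in for the part files' content.
df = spark.createDataFrame([("line one",), ("line two",)], ["value"])

# Option A: force a single partition so Spark emits one part file under the directory.
# .text() expects exactly one string column; use .csv() for multi-column data.
df.coalesce(1).write.mode("overwrite").text("dbfs:/tmp/single_file_out")

# Option B: collect to pandas and write exactly one named file via the /dbfs FUSE mount.
df.toPandas().to_csv("/dbfs/tmp/output.txt", index=False, header=False)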

1 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group