Data Engineering

Forum Posts

NTRT
by Visitor
  • 14 Views
  • 1 replies
  • 0 kudos

performance issues when transforming json-stat2

Hi, I am relatively new to Databricks, although I am aware of lazy evaluation, transformations and actions, and persistence. I have a piece of code that I want to run on a folder with around 20 JSON files. The goal is to create a temporary table on ea...

Latest Reply
koushiknpvs
New Contributor III
  • 0 kudos

Please give me a kudos if this works. Efficiency in Data Collection: Using .collect() on large datasets can lead to out-of-memory errors as it collects all rows to the driver node. If the dataset is large, consider alternatives such as extracting only...

SamGreene
by Contributor
  • 347 Views
  • 3 replies
  • 0 kudos

Using parameters in a SQL Notebook and COPY INTO statement

Hi, my scenario is that an export of a table is dropped in ADLS every day. I would like to load this data into a UC table and then repeat the process every day, replacing the data. This seems to rule out DLT as it is meant for incremental proc...

Latest Reply
Cary
New Contributor II
  • 0 kudos

I would use widgets in the notebook, which will also be processed when it runs in Jobs. SQL in notebooks can use parameters, as can SQL in jobs, now that parameterized queries are supported.

2 More Replies
zero234
by New Contributor III
  • 25 Views
  • 0 replies
  • 0 kudos

Delta live table is inserting data multiple times

So I have created a Delta Live Table which uses spark.sql() to execute a query and uses df.write.mode(append).insertInto to insert data into the respective table. At the end I return a dummy table, since this was the requirement. So now I have also ...

TimB
by New Contributor III
  • 176 Views
  • 8 replies
  • 3 kudos

Passing multiple paths to .load in autoloader

I am trying to use autoloader to load data from two different blobs from within the same account so that spark will discover the data asynchronously. However, when I try this, it doesn't work and I get the error outlined below. Can anyone point out w...

Latest Reply
TimB
New Contributor III
  • 3 kudos

If we were to upgrade to ADLSg2, but retain the same structure, would there be scope for the method above to be improved (besides moving to notification mode)?

7 More Replies
pshuk
by New Contributor II
  • 121 Views
  • 2 replies
  • 0 kudos

run md5 using CLI

Hi, I want to run an MD5 checksum on a file uploaded to Databricks. I can generate the MD5 for the local file, but how do I generate one for the uploaded file on Databricks using the CLI (command-line interface)? Any help would be appreciated. I tried running databr...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @pshuk, Unfortunately, the databricks fs md5 command is not supported directly. You can run a Python script to compute the MD5 hash of the uploaded file. If your uploaded file is stored in Azure Blob Storage, you can use the azcopy tool to calcula...
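
The Python-script suggestion in the reply above could look roughly like the following. This is only a minimal sketch, not the script the reply had in mind: it assumes the uploaded file is reachable as an ordinary file path from wherever the code runs, and the function name `md5_of_file` and the 1 MiB chunk size are illustrative choices. Only the standard-library `hashlib` module is used.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in fixed-size chunks so that large files
    do not have to fit in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps calling fh.read() until it returns b"" (end of file).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Hypothetical usage: the path below is a placeholder, not a real location.
# print(md5_of_file("/path/to/uploaded/file"))
```

Comparing the returned digest with the one generated for the local copy shows whether the upload arrived intact.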

1 More Replies
Amit_Dass_Chmp
by New Contributor II
  • 85 Views
  • 1 replies
  • 0 kudos

On Unity Catalog - what is the best way to add members to groups

Hi All, On Unity Catalog - what is the best way to add members to groups using the API or CLI? The API should be the best option, but I thought I would check with you all.

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Amit_Dass_Chmp, In general, both API and CLI can be used to manage members and groups in the Unity Catalog. The choice between the two often depends on your specific use case and comfort level with each tool. APIs are often preferred for their...

danial
by New Contributor II
  • 3213 Views
  • 3 replies
  • 1 kudos

Connect Databricks hosted on Azure, with RDS on AWS.

We have Databricks set up and running on Azure. Now we want to connect it with RDS (AWS) to transfer data from RDS to Azure Data Lake using Databricks. I could find the documentation on how to do it within the same cloud (either AWS or Azure) but n...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Danial Malik, Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so ...

2 More Replies
chardv
by New Contributor
  • 11 Views
  • 0 replies
  • 0 kudos

Lakehouse Federation Multi-User Authorization

Since Lakehouse Fed uses only one credential per connection to the foreign database, all queries using the connection will see all the data that credential has access to. Would anyone know if Lakehouse Fed will support authorization using the cred...

Michael_Appiah
by New Contributor III
  • 2444 Views
  • 6 replies
  • 2 kudos

Resolved! Parameterized spark.sql() not working

Spark 3.4 introduced parameterized SQL queries and Databricks also discussed this new functionality in a recent blog post (https://www.databricks.com/blog/parameterized-queries-pyspark). Problem: I cannot run any of the examples provided in the PySpark...

Latest Reply
Michael_Appiah
New Contributor III
  • 2 kudos

@Cas Unfortunately I do not have any information on this. However, I have seen that DBR 14.3 and 15.0 introduced some changes to spark.sql(). I have not checked whether those changes resolve the issue outlined here. Your best bet is probably to go ah...

5 More Replies
bradleyjamrozik
by New Contributor III
  • 108 Views
  • 1 replies
  • 0 kudos

Autoloader Failure Creating EventSubscription

Posting this here too in case anyone else has run into this issue... Trying to set up Autoloader File Notifications but keep getting an "Internal Server Error" message. Failure on Write EventSubscription - Internal error - Microsoft Q&A

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @bradleyjamrozik, Ensure that your service principal for Event Grid and your storage account have the necessary permissions. Specifically, grant the Contributor role to your service principal for Event Grid and your storage account.

Phuonganh
by New Contributor II
  • 214 Views
  • 2 replies
  • 1 kudos

Databricks SDK for Python: Errors with parameters for Statement Execution

Hi team, I'm using the Databricks SDK for Python to run SQL queries. I created a variable as below: param = [{'name' : 'a', 'value' : 'x'}, {'name' : 'b', 'value' : 'y'}] and passed it to the statement as below: _ = w.statement_execution.execute_statement( warehous...

Latest Reply
DonkeyKong
New Contributor
  • 1 kudos

@Kaniz This does not help resolve the issue. I am experiencing the same issue when following the above pointers. Here is the statement: response = w.statement_execution.execute_statement( statement='ALTER TABLE users ALTER COLUMN :col_name SET NOT...

1 More Replies
cszczotka
by New Contributor III
  • 85 Views
  • 4 replies
  • 0 kudos

Shallow clone and issue with MODIFY permission to source table

Hi, I'm running a shallow clone for external Delta tables. The shallow clone is failing for source tables where I don't have MODIFY permission. I'm getting the exception below. I don't understand why MODIFY permission on the source table is required. Is there a...

Latest Reply
Amit_Dass_Chmp
New Contributor II
  • 0 kudos

Also check this documentation on access mode: Shallow clone for Unity Catalog tables | Databricks on AWS. Working with Unity Catalog shallow clones in Single User access mode, you must have permissions on the resources for the cloned table source as w...

3 More Replies