Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Ajay-Pandey
by Esteemed Contributor III
  • 6277 Views
  • 5 replies
  • 5 kudos

Support of running multiple cells at a time in databricks notebook

Hi all, Databricks notebooks now support parallel runs of commands in a single notebook, which helps run ad hoc queries simultaneously without creating a separate notebook. Once you run...

Latest Reply
SunilUIIT
New Contributor II
  • 5 kudos

Hi Team, I am observing that the functionality is not working as expected in the Trial workspace of Databricks. Is there a setting that needs to be enabled to allow independent SQL cells in a Databricks notebook to run in parallel, while dependent cel...

4 More Replies
amarnathpal
by New Contributor III
  • 879 Views
  • 4 replies
  • 0 kudos

Resolved! Integrating PySpark DataFrame into SQL Dashboard for Enhanced Visualization

I have created a DataFrame in a notebook using PySpark and am considering creating a fully-featured dashboard in SQL. My question is whether I need to first store the DataFrame as a table in order to use it in the dashboard, or if it's possible to di...

Latest Reply
hari-prasad
Valued Contributor II
  • 0 kudos

Sorry, I vaguely remember we used to create persistent views on DataFrames earlier. Currently, a Spark DataFrame doesn't allow you to create a persistent view; rather, you have to create a table to use it in a SQL warehouse. # Assuming there is an ex...
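
The truncated snippet presumably continues along these lines; a minimal sketch, assuming an existing PySpark DataFrame named df (the catalog, schema, and table names are hypothetical):

# Hedged sketch: persist the DataFrame as a Unity Catalog table so a SQL
# warehouse (and therefore a dashboard) can query it. Names are placeholders.
df.write.mode("overwrite").saveAsTable("main.sales.dashboard_source")
# The dashboard can then be built on: SELECT * FROM main.sales.dashboard_source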

3 More Replies
aayrm5
by Valued Contributor III
  • 507 Views
  • 3 replies
  • 1 kudos

Requirement to remove/skip column(s) in downstream tables/views during PII data masking

Hi there, As a compliance measure, I'm tasked with masking PII data from bronze to silver and in all the tables and views downstream. I suggested that my clients use row filters and column masks as mentioned in the doc. However, when a user who...
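
For reference, the column-mask approach from that doc looks roughly like this; a hedged sketch in which the function, table, column, and group names are all hypothetical:

# Hedged sketch of a Unity Catalog column mask; all names are placeholders.
spark.sql("""
  CREATE OR REPLACE FUNCTION main.gov.mask_email(email STRING)
  RETURN CASE WHEN is_account_group_member('pii_readers') THEN email ELSE '****' END
""")
spark.sql("ALTER TABLE main.silver.customers ALTER COLUMN email SET MASK main.gov.mask_email")
# Note: a mask redacts values but cannot remove the column itself, which is
# why this thread ends in a feature request.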

Latest Reply
Walter_C
Databricks Employee
  • 1 kudos

You are right; in this case we might need to open a feature request through https://docs.databricks.com/en/resources/ideas.html#ideas

2 More Replies
DataGeek_JT
by New Contributor II
  • 3642 Views
  • 1 reply
  • 0 kudos

[SQL_CONF_NOT_FOUND] The SQL config "/Volumes/xxx...." cannot be found. Please verify that the confi...

I am getting the below error when trying to stream data from an Azure Storage path to a Delta Live Table ([PATH] is the path to my files, which I have redacted here): [SQL_CONF_NOT_FOUND] The SQL config "/Volumes/[PATH]" cannot be found. Please verify tha...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

I believe you are not setting spark.conf.set("/Volumes/[PATH]", "your_actual_path_here"), hence when you try to get the conf, it fails. In data_source_path = spark.conf.get("/Volumes/[PATH]"), "/Volumes/[PATH]" becomes the conf name; you would not want ...
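
In other words, a small sketch (the conf key name here is arbitrary):

# Hedged sketch: the first argument is a conf *name* you choose; the volume path
# is the *value*. Passing the path itself as the key is what triggers
# [SQL_CONF_NOT_FOUND], because no conf with that name was ever set.
spark.conf.set("pipeline.data_source_path", "/Volumes/my_catalog/my_schema/my_volume/landing")
data_source_path = spark.conf.get("pipeline.data_source_path")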

meystingray
by New Contributor II
  • 4206 Views
  • 1 reply
  • 0 kudos

Azure Databricks: Cannot create volumes or tables

If I try to create a Volume, I get this error: Failed to access cloud storage: AbfsRestOperationException exceptionTraceId=fa207c57-db1a-406e-926f-4a7ff0e4afdd. When I try to create a table, I get this error: Error creating table [RequestId=4b8fedcf-24b3-...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

It seems like you are encountering issues with accessing cloud storage while trying to create a volume and a table in Databricks on Azure. The errors you are seeing, AbfsRestOperationException and INVALID_STATE.UC_CLOUD_STORAGE_ACCESS_FAILURE, indica...

AlexSantiago
by New Contributor II
  • 4335 Views
  • 15 replies
  • 4 kudos

Spotify API get token - raw_input was called, but this frontend does not support input requests.

Hello everyone, I'm trying to use Spotify's API to analyse my music data, but I'm receiving an error during authentication, specifically when I try to get the token; my code is below. Is it a Databricks bug? pip install spotipy from spotipy.oauth2 import SpotifyO...
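
The error occurs because SpotifyOAuth tries to prompt for the redirect URL interactively, which notebook frontends do not support. For endpoints that don't need user consent, one workaround is spotipy's non-interactive client-credentials flow; a hedged sketch with placeholder credentials:

# Hedged sketch: the client-credentials flow needs no interactive prompt, so it
# works in a notebook. Note that user-scoped endpoints (e.g. personal playlists)
# still require the OAuth flow with a token obtained outside the notebook.
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

auth_manager = SpotifyClientCredentials(
    client_id="YOUR_CLIENT_ID",          # placeholder
    client_secret="YOUR_CLIENT_SECRET",  # placeholder
)
sp = spotipy.Spotify(auth_manager=auth_manager)
results = sp.search(q="artist:Radiohead", type="artist", limit=1)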

Latest Reply
Scofeild618
New Contributor III
  • 4 kudos

Thanks for the solution

14 More Replies
ruoyuqian
by New Contributor II
  • 909 Views
  • 1 reply
  • 0 kudos

dbt writing parquet from Volumes to Catalog schema

I have run into a weird situation. I uploaded a few parquet files (about 10) of my sales data into the Volume in my catalog and ran dbt against it; dbt succeeded and the table was created. However, when I upload a lot more parquet files...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

When dealing with a large number of Parquet files (about 2500 in your case), the system might be running into resource limitations or timeouts. This can happen due to the sheer volume of data being processed at once. The failure might be due to insuf...

Cami
by Contributor III
  • 2128 Views
  • 2 replies
  • 0 kudos

VIEW JSON result value in a view which is based on a volume

Hello guys! I have the following case: it has been decided that the JSON file will be read via the following definition (from a volume), which more or less looks like this: CREATE OR REPLACE VIEW [catalog_name].[schema_name].v_[object_name] AS SELECT r...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

You must be getting the below error: [CONFIG_NOT_AVAILABLE] Configuration spark.sql.legacy.json.allowEmptyString.enabled is not available. That's because this config is not configurable in a warehouse, so the SQL editor won't be the best choice for this.
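
On a cluster-backed notebook, the config mentioned above can be set before querying the view; a minimal sketch, with a placeholder view name:

# Hedged sketch: this config can be set on an interactive cluster, but not on a
# SQL warehouse, which is why the SQL editor fails. The view name is a placeholder.
spark.conf.set("spark.sql.legacy.json.allowEmptyString.enabled", "true")
display(spark.sql("SELECT * FROM catalog_name.schema_name.v_object_name"))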

1 More Replies
DylanStout
by Contributor
  • 4288 Views
  • 3 replies
  • 0 kudos

UC Volumes: writing xlsx file to volume

How do you write a DataFrame to a Volume in a catalog? We tried the following code with our pandas DataFrame: dbutils.fs.put('dbfs:/Volumes/xxxx/default/input_bestanden/x test.xlsx', pandasDf.to_excel('/Volumes/xxxx/default/input_bestanden/x test.xlsx')) T...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

I was able to upload using dbutils.fs.cp('/FileStore/excel-1.xlsx', 'dbfs:/Volumes/xxx/default/xxx/x_test.xlsx'). Maybe the space in the name is causing an issue for you.
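
An alternative sketch based on the original post: pandas can write straight to a /Volumes path, so dbutils.fs.put is not needed (to_excel returns None, which is why wrapping it in fs.put fails). Path and DataFrame names follow the post; the openpyxl package is assumed to be installed:

# Hedged sketch: write the Excel file directly to the volume path, avoiding
# spaces in the file name. to_excel returns None, so it must not be passed
# to dbutils.fs.put. Assumes openpyxl is available on the cluster.
pandasDf.to_excel("/Volumes/xxxx/default/input_bestanden/x_test.xlsx", index=False)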

2 More Replies
Akash_Wadhankar
by New Contributor III
  • 168 Views
  • 0 replies
  • 0 kudos

Databricks cluster selection

Compute is one of the largest portions of cost in Databricks ETL. There is no written rule for handling this, but based on experience I have put together some rules of thumb for choosing the right cluster. Please check below: https://medium.com/@infinitylearnings1201/a-compr...

IshaBudhiraja
by New Contributor II
  • 1919 Views
  • 4 replies
  • 0 kudos

Migration of Synapse Databricks activity executions from an All-purpose cluster to a New job cluster

Hi, We have been planning to migrate the Synapse Databricks activity executions from an 'All-purpose cluster' to a 'New job cluster' to reduce overall cost. We are using Standard_D3_v2 as the cluster node type, which has 4 CPU cores in total. The current quota ...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

I also see a difference in Photon. Enable Photon for workloads with large data scans, joins, aggregations, and decimal computations; it provides significant performance benefits over the standard Databricks Runtime.
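
For reference, a hedged sketch of a Jobs API new_cluster spec with Photon enabled; the runtime version, node type, and sizing below are illustrative only:

# Hedged sketch: job-cluster spec for the Jobs API; all values are illustrative.
new_cluster = {
    "spark_version": "15.4.x-scala2.12",
    "node_type_id": "Standard_D3_v2",
    "num_workers": 2,
    "runtime_engine": "PHOTON",  # enable Photon for scan/join/aggregation-heavy work
}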

3 More Replies
Nastia
by New Contributor III
  • 2549 Views
  • 1 reply
  • 0 kudos

I am getting NoneType error when running a query from API on cluster

When I am running a query on Databricks itself from a notebook, it runs fine and gives me results. But the same query, when executed from FastAPI (Python, using the databricks library), gives me "TypeError: 'NoneType' object is not iterable". I can...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi @Nastia, can you please share the entire stack trace and the query that you are running? There is currently not much detail with which I can help you understand this. But it is totally possible it is a bug that's causing this, because there shoul...
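
Until the stack trace narrows it down, a defensive pattern on the FastAPI side might look like this; a hedged sketch assuming the databricks-sql-connector package, with placeholder connection details:

# Hedged sketch using databricks-sql-connector; all connection details are placeholders.
from databricks import sql

with sql.connect(
    server_hostname="adb-xxxx.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/xxxx",
    access_token="dapi-xxxx",
) as conn:
    with conn.cursor() as cursor:
        cursor.execute("SELECT 1")
        rows = cursor.fetchall() or []  # guard: never iterate over None
        for row in rows:
            print(row)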

ameet9257
by Contributor
  • 685 Views
  • 3 replies
  • 2 kudos

Databricks Job API: The job must have exactly one owner

Hi Team, I'm trying to set the job permissions using the Databricks Job API but am getting the below error: {"error_code": "INVALID_PARAMETER_VALUE", "message": "The job must have exactly one owner."} I first tried to get the job permissions using the below ...

Latest Reply
NR_Modugula
New Contributor II
  • 2 kudos

Hi, I have tried the same approach but it didn't work for me. I am using api/2.0 with a PUT request.
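
For what it's worth, this error usually means the PUT body replaces the full ACL without including an owner; a hedged sketch of a body that satisfies the check, with placeholder IDs, names, and token:

# Hedged sketch: PUT /api/2.0/permissions/jobs/{job_id} replaces the entire ACL,
# so the body must contain exactly one IS_OWNER entry. All values are placeholders.
import requests

payload = {
    "access_control_list": [
        {"user_name": "owner@example.com", "permission_level": "IS_OWNER"},
        {"group_name": "data-engineers", "permission_level": "CAN_MANAGE_RUN"},
    ]
}
resp = requests.put(
    "https://adb-xxxx.azuredatabricks.net/api/2.0/permissions/jobs/123",
    headers={"Authorization": "Bearer dapi-xxxx"},
    json=payload,
)
resp.raise_for_status()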

2 More Replies
Gilg
by Contributor II
  • 6632 Views
  • 2 replies
  • 0 kudos

Pivot in Databricks SQL

Hi Team, I have a table that has a key column (the column name) and a value column (the value of that column). These values are generated dynamically, and I want to pivot the table. Question 1: Is there a way we can do this without specifying all the col...

Latest Reply
NSonam
New Contributor II
  • 0 kudos

PySpark can help to list the available columns. Please find the demo snippets below.
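
As a concrete sketch of that approach: PySpark's pivot can infer the distinct key values itself, so the columns do not have to be listed up front (table and column names here are placeholders):

# Hedged sketch: pivot without enumerating columns; when no value list is passed,
# PySpark infers the distinct values of `key`. Names are placeholders.
from pyspark.sql import functions as F

df = spark.table("main.schema.kv_table")
pivoted = df.groupBy("id").pivot("key").agg(F.first("value"))
display(pivoted)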

1 More Replies
Brianben
by New Contributor III
  • 576 Views
  • 4 replies
  • 1 kudos

Procedure of retrieving archived data from delta table

Hi all, I am currently researching the archive support features in Databricks: https://docs.databricks.com/en/optimizations/archive-delta.html Let's say I have enabled archive support and configured the data to be archived after 5 years, and I also con...

Latest Reply
Brianben
New Contributor III
  • 1 kudos

@Walter_C Thank you for your reply. However, there are some parts that might need further clarification. Assume I have already set delta.timeUntilArchived to 1825 days (5 years) and have configured the lifecycle policy aligned with the Databricks setting...
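
For reference, the table property discussed above is set like this; a minimal sketch with a placeholder table name:

# Hedged sketch: set the archival threshold to 5 years; the table name is a placeholder.
spark.sql("""
  ALTER TABLE main.schema.events SET TBLPROPERTIES (
    'delta.timeUntilArchived' = '1825 days'
  )
""")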

3 More Replies
