Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

FerArribas
by Contributor
  • 5437 Views
  • 3 replies
  • 0 kudos

Resolved! Azure Databricks - Difference between protecting the WEB UI with IP Access list or disabling public access?

Hi, I am investigating the best security practices for accessing the Databricks web UI. I am unsure about the difference between protecting the web UI with (1) an IP access list (https://learn.microsoft.com/en-us/azure/databricks/security/networ...

Latest Reply
Rik
New Contributor III
  • 0 kudos

"In short, would it be the same to configure only the IP of the private endpoint in the IP access list vs disable public access?" The access list doesn't apply to private IPs, only to public IPs (internet). Relevant part from the docs: "If you use Priva...

2 More Replies
mbhakta
by New Contributor II
  • 1632 Views
  • 1 reply
  • 0 kudos

Dashboard - get value from table on user click

I'm building a dashboard via a Python notebook and trying to allow the end user to click a value in a table, and use the selected value in another query / panel. This somewhat works using widget dropdowns for a user to select which value, but I'd reall...

Latest Reply
Henrymartin
New Contributor II
  • 0 kudos

@mbhakta wrote: I'm building a dashboard via Python notebook and trying to allow the end user to click a value on a table, and use the selected value in another query / panel. This somewhat works using widget dropdowns for a user to select which value...

dream
by Contributor
  • 10144 Views
  • 1 reply
  • 2 kudos

Comparing schemas of two dataframes

So I was comparing the schemas of two different dataframes using this code: >>> df1.schema == df2.schema Out: False. But the thing is, both schemas are completely equal. When digging deeper I realized that some of the StructFields() that should have bee...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 2 kudos

Hi @dream, in this case you can use dataframe.dtypes to compare the schemas or data types of two dataframes. Metadata stores information about column properties.
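A minimal sketch of the dtypes-based comparison suggested above. In PySpark, DataFrame.dtypes is a list of (column name, type string) pairs, so field metadata and nullability do not enter the comparison. The helper name dtypes_diff and the sample lists are hypothetical stand-ins for df1.dtypes and df2.dtypes, so the snippet runs without a Spark session.

```python
def dtypes_diff(left, right):
    """Compare two lists of (column_name, type_string) pairs,
    the shape returned by DataFrame.dtypes in PySpark."""
    left_map, right_map = dict(left), dict(right)
    # Columns present on one side only
    only_left = [c for c in left_map if c not in right_map]
    only_right = [c for c in right_map if c not in left_map]
    # Columns present on both sides whose type strings differ
    type_mismatch = {
        c: (left_map[c], right_map[c])
        for c in left_map
        if c in right_map and left_map[c] != right_map[c]
    }
    return {
        "only_left": only_left,
        "only_right": only_right,
        "type_mismatch": type_mismatch,
    }


# Hypothetical stand-ins for df1.dtypes and df2.dtypes
df1_dtypes = [("id", "bigint"), ("name", "string")]
df2_dtypes = [("id", "int"), ("name", "string"), ("city", "string")]

report = dtypes_diff(df1_dtypes, df2_dtypes)
```

With real dataframes the call would be dtypes_diff(df1.dtypes, df2.dtypes). An empty report means the column names and types match, even when df1.schema == df2.schema is False because of differing metadata.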

PaulStuart
by New Contributor
  • 5147 Views
  • 1 reply
  • 1 kudos

Resolved! "Can't login to databricks socket is closed" when using vsCode Extension

Hello there. I am experiencing a problem using the Databricks Extension with Visual Studio Code, and I wonder if anyone else has experienced this. First, I have installed the Databricks CLI and configured some profiles using tokens. Those profiles ...

Latest Reply
nkls
New Contributor III
  • 1 kudos

I finally solved it! I had the same error code as you. Running Databricks Extension v1.1.1, VS Code 1.79 on Windows 10. I'm behind a company proxy and the main issue was that VS Code didn't have proxy support enabled by default. Adding this to my settings....

piterpan
by New Contributor III
  • 8476 Views
  • 8 replies
  • 11 kudos

Resolved! _sqldf not defined on Azure job cluster v12.2

Since yesterday we have errors in notebooks that were previously working: NameError: name '_sqldf' is not defined. It was working previously. We are on Azure Databricks, using job pool Driver: Standard_D4s_v5 · Workers: Standard_D4s_v5 · 1-6 workers ·...

Data Engineering
azure
Notebook
pyspark
Latest Reply
Tharun-Kumar
Databricks Employee
  • 11 kudos

@piterpan This was a regression that affected jobs where _sqldf was referenced and the notebooks weren't run interactively. Our Engineering team fixed this issue yesterday. Could you check whether you are still facing the issue?

7 More Replies
marianopenn
by New Contributor III
  • 19704 Views
  • 6 replies
  • 4 kudos

Resolved! [UDF_MAX_COUNT_EXCEEDED] Exceeded query-wide UDF limit of 5 UDFs

We are using DLT to ingest data into our Unity Catalog and then, in a separate job, we are reading and manipulating this data and then writing it to a table like: df.write.saveAsTable(name=target_table_path). We are getting an error which I cannot find ...

Data Engineering
data engineering
dlt
python
udf
Unity Catalog
Latest Reply
Tharun-Kumar
Databricks Employee
  • 4 kudos

@AlexPrev You can navigate to the Advanced Settings in the Cluster configuration and include this config in the Spark section.

5 More Replies
Atifdatabricks
by New Contributor II
  • 2074 Views
  • 2 replies
  • 1 kudos

Suspended - Databricks Certified Associate Developer for Apache Spark

In the middle of the exam I got suspended. It said this was due to my eye movement. I had the test on the left part of my monitor and the PDF (which was provided as a testing aid for this exam) on the right side. I was just moving my eyes left and right as I was using PD...

Latest Reply
Atifdatabricks
New Contributor II
  • 1 kudos

My request number is 00353935

1 More Replies
Rishi045
by New Contributor III
  • 16916 Views
  • 11 replies
  • 0 kudos

Data getting missed while reading from azure event hub using spark streaming

Hi all, I am facing an issue of data going missing. I am reading the data from Azure Event Hub and, after flattening the JSON data, I am storing it in a Parquet file and then using another Databricks notebook to perform the merge operations on my delta ...

Data Engineering
Azure event hub
Spark streaming
Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 0 kudos

- In the EventHub, you can preview the event hub job using Azure Analytics, so please first check whether all records are there.
- Please set in Databricks that it is saved directly to the bronze delta table without performing any aggregation, just 1 to 1, and...

10 More Replies
ThomasVanBilsen
by New Contributor III
  • 2363 Views
  • 1 reply
  • 1 kudos

Catalog names in DTAP scenario

Hi everyone, I'm currently in the process of migrating to Unity Catalog. I have several Azure Databricks Workspaces, one for each phase of development (development, test, acceptance, and production). In accordance with the best practices (ht...

Data Engineering
DTAP
Unity Catalog
Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

You could also store the environment name in a config file, for example in the Databricks FileStore. These config files can also be managed by CI/CD. To be honest, this has been my preferred way of working lately.
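One way the config-file approach above could look, as a rough sketch. The JSON format, the "environment" key, the function names, and the "<environment>_<name>" catalog naming scheme are all assumptions made for illustration; none of them come from the thread.

```python
import json
from pathlib import Path


def load_environment(config_path):
    """Read the environment name (e.g. "dev", "test", "acc", "prod")
    from a JSON config file. The "environment" key is a hypothetical choice."""
    config = json.loads(Path(config_path).read_text())
    return config["environment"]


def catalog_for(base_name, environment):
    """Build a catalog name from the environment.
    The "<environment>_<base_name>" pattern is an assumed convention."""
    return f"{environment}_{base_name}"
```

A notebook would call load_environment once with the path where CI/CD deployed the file for that workspace, then use catalog_for("sales", env) wherever a catalog name is needed, so the same code runs unchanged in every workspace.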

sparkstreaming
by New Contributor III
  • 8967 Views
  • 5 replies
  • 4 kudos

Resolved! Missing rows while processing records using foreachbatch in spark structured streaming from Azure Event Hub

I am new to real-time scenarios and I need to create a Spark Structured Streaming job in Databricks. I am trying to apply some rule-based validations from backend configurations on each incoming JSON message. I need to do the following actions on th...

Latest Reply
Rishi045
New Contributor III
  • 4 kudos

Were you able to find a solution? If so, could you please share it?

4 More Replies
DipsikhaDas
by New Contributor II
  • 1759 Views
  • 1 reply
  • 1 kudos

Databricks notebook exceptions into ServiceNow

Hello Community members, I am looking for options for redirecting exceptions raised within a Databricks notebook's exception block to ServiceNow. Is there a way the connection can be made directly from the notebook? Looking for suggestions. ...

Latest Reply
DipsikhaDas
New Contributor II
  • 1 kudos

Thank you for the solution. I will definitely try this and share with the community if it works.
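The solution referred to here is not shown in this listing. As one possible approach, and not necessarily the one suggested in the thread, a notebook could build a request to the ServiceNow Table API (POST to /api/now/table/incident) from inside its except block. The function name, the instance URL, the use of basic authentication, and the source label are assumptions for illustration; in practice credentials would come from a secret store rather than being written in the notebook.

```python
import base64
import json
import traceback
import urllib.request


def build_incident_request(instance_url, user, password, exc,
                           source="databricks-notebook"):
    """Build (but do not send) a POST request that would create a
    ServiceNow incident describing the given exception."""
    payload = {
        # One-line summary, truncated to keep the field short
        "short_description": f"{source}: {type(exc).__name__}: {exc}"[:160],
        # Full traceback text as the incident body
        "description": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{instance_url.rstrip('/')}/api/now/table/incident",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Usage inside a notebook (the instance URL and run_pipeline are hypothetical):
# try:
#     run_pipeline()
# except Exception as exc:
#     req = build_incident_request("https://example.service-now.com",
#                                  user, password, exc)
#     urllib.request.urlopen(req, timeout=30)
#     raise
```

Re-raising after sending keeps the job run marked as failed, so the incident supplements the normal failure handling without replacing it.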

adivandhya
by New Contributor III
  • 3371 Views
  • 3 replies
  • 4 kudos

Configuration for Job Queueing in Terraform

When defining the databricks_job resource in Terraform, we are trying to enable the Job Queueing flag for the job. However, from the Terraform Provider docs, we are not able to find any config related to queueing. Is there a different method to configure...

Latest Reply
adivandhya
New Contributor III
  • 4 kudos

I've created a feature request for this on GitHub: https://github.com/databricks/terraform-provider-databricks/issues/2531

2 More Replies
HasiCorp
by New Contributor II
  • 14765 Views
  • 3 replies
  • 2 kudos

Resolved! AnalysisException: [RequestId=... ErrorClass=INVALID_PARAMETER_VALUE] Missing cloud file system scheme

Hi community, I get an analysis exception when executing the following code in a notebook using a personal compute cluster. It seems to be a permissions issue, but I am logged in with my admin account. Any help would be appreciated. USE CATALOG catalog; ...

Latest Reply
Leonardo
New Contributor III
  • 2 kudos

I was having the same issue because I was trying to set the location with the absolute path, just like you did. I solved it by creating an external location, then copying the URL and putting it into the location of the path options.

2 More Replies
