Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Ria
by New Contributor
  • 2616 Views
  • 1 reply
  • 1 kudos

py4j.security.Py4JSecurityException

Getting this error while loading data with Auto Loader. Although table access control is already disabled, I am still getting this error: "py4j.security.Py4JSecurityException: Method public org.apache.spark.sql.streaming.DataStreamReader org.apache.spark.sql...

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Hi, are you using a High Concurrency cluster? Which DBR version are you running?

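For context, a minimal Auto Loader read looks like the sketch below; the format, schema location, and source path are illustrative assumptions, not from the original post. The cluster question matters because clusters enforcing a JVM-method whitelist (table ACLs or credential passthrough enabled) raise Py4JSecurityException for non-whitelisted calls regardless of the notebook code.

```python
# Minimal sketch, assuming JSON input; the schema location and source path
# are hypothetical. On clusters enforcing a JVM-method whitelist (table ACLs
# or credential passthrough), non-whitelisted calls raise
# py4j.security.Py4JSecurityException no matter how the read is written.
def autoloader_options(fmt: str, schema_location: str) -> dict:
    """Assemble the cloudFiles options for an Auto Loader stream."""
    return {
        "cloudFiles.format": fmt,
        "cloudFiles.schemaLocation": schema_location,
    }

# Inside a Databricks notebook this would be used roughly as:
# df = (spark.readStream.format("cloudFiles")
#         .options(**autoloader_options("json", "/tmp/_schema"))
#         .load("/mnt/raw/events"))
```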
lurban
by New Contributor II
  • 2364 Views
  • 1 reply
  • 0 kudos

Delta Live Tables Development Mode Resets Cluster On Each Trigger

I believe this is an identified bug: in the last few days, each time I trigger a test Delta Live Tables run in Development mode, the associated cluster takes 5-7 minutes to spin up. The cluster does stay on as anticipated in the comp...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Hi, can you share your cluster JSON settings? It will help us to understand the settings and VMs you are using.

manasa
by Contributor
  • 6392 Views
  • 3 replies
  • 1 kudos

Need help to insert huge data into cosmos db from azure data lake storage using databricks

I am trying to insert 6 GB of data into Cosmos DB using the OLTP Connector. Container RUs: 40000. Cluster config: cfg = { "spark.cosmos.accountEndpoint" : cosmosdbendpoint, "spark.cosmos.accountKey" : cosmosdbmasterkey, "spark.cosmos.database" : cosmosd...

Latest Reply
ImAbhishekTomar
New Contributor III
  • 1 kudos

Did anyone find a solution for this? I'm also using a similar cluster and RUs, and data ingestion is taking a lot of time…

2 More Replies
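For readers hitting the same throughput wall, here is a hedged sketch of the write-side configuration: all endpoint, key, and name values are placeholders, and the write options are from the Azure Cosmos DB Spark 3 OLTP connector. At a fixed RU budget, enabling bulk writes usually matters more than cluster size.

```python
# Hedged sketch: Cosmos DB Spark 3 OLTP connector write options. Endpoint,
# key, database, and container are placeholders. Bulk mode batches point
# writes so ingestion is not bottlenecked on per-item round trips.
def cosmos_write_config(endpoint: str, key: str, database: str, container: str) -> dict:
    return {
        "spark.cosmos.accountEndpoint": endpoint,
        "spark.cosmos.accountKey": key,
        "spark.cosmos.database": database,
        "spark.cosmos.container": container,
        "spark.cosmos.write.strategy": "ItemOverwrite",  # upsert semantics
        "spark.cosmos.write.bulk.enabled": "true",       # batched bulk writes
    }

# df.write.format("cosmos.oltp") \
#   .options(**cosmos_write_config(ep, key, "mydb", "mycontainer")) \
#   .mode("append").save()
```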
youssefmrini
by Databricks Employee
  • 1957 Views
  • 1 reply
  • 2 kudos
Latest Reply
Sivaprasad1
Databricks Employee
  • 2 kudos

@Youssef Mrini​: Please have a look at the link below, which gives the Databricks resource limits: https://docs.databricks.com/resources/limits.html

VictoriaM
by New Contributor II
  • 2110 Views
  • 2 replies
  • 0 kudos

@Chris Grabiel​  Do you have any experience connecting REDCAP API to Databricks you would be able to share?

Latest Reply
Chris_Grabiel
New Contributor III
  • 0 kudos

We absolutely do. We ingest to the lake via the REDCap API, and folks use it in notebooks. How can we help?

1 More Replies
JRT5933
by New Contributor III
  • 4572 Views
  • 4 replies
  • 7 kudos

Resolved! GOLD table slowed down at MERGE INTO

Howdy - I recently took a table FACT_TENDER and made it into a medallion-style table to test performance, since I suspected medallion would be quicker. Key differences: both tables use bronze data; the original has all logic in one long notebook; MERGE INTO t...

Latest Reply
JRT5933
New Contributor III
  • 7 kudos

I ended up instituting tried-and-true PARTITIONING and PRUNING methods to boost performance, which has succeeded.

3 More Replies
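As a sketch of the pruning approach that resolved this (table, key, and partition-column names are illustrative assumptions, not from the thread): repeating the partition column in the ON clause lets Delta skip files in partitions the source batch never touches, which is often where a slow MERGE INTO spends its time.

```python
# Hypothetical sketch: constrain a Delta MERGE to the partitions actually
# present in the source batch so files in other partitions are pruned.
# All identifiers below are assumptions for illustration.
def build_merge_sql(target, source, key, partition_col, partitions):
    part_list = ", ".join(f"'{p}'" for p in sorted(partitions))
    return (
        f"MERGE INTO {target} AS t USING {source} AS s "
        f"ON t.{key} = s.{key} AND t.{partition_col} IN ({part_list}) "
        f"WHEN MATCHED THEN UPDATE SET * "
        f"WHEN NOT MATCHED THEN INSERT *"
    )

# spark.sql(build_merge_sql("gold.fact_tender", "updates", "tender_id",
#                           "load_date", ["2023-01-01", "2023-01-02"]))
```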
JJ_LVS1
by New Contributor III
  • 4972 Views
  • 2 replies
  • 1 kudos

Resolved! DLT Notebook Error - Queries with streaming sources must be executed with writeStream.start();

I'm trying to parse incoming stream files in DLT which have variable-length records. I'm getting the error: "Queries with streaming sources must be executed with writeStream.start();". Notebook code: @dlt.table( comment="xAudit Parsed" ) def b_table_pa...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Jason Johnson​, hope all is well! Just wanted to check in if you were able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Than...

1 More Replies
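The usual cause of this error in DLT is executing an action (display(), count(), or an explicit writeStream) on the streaming DataFrame inside the @dlt.table function instead of returning it; DLT starts the stream itself. A sketch, where the function name, delimiter, and parsing helper are illustrative assumptions:

```python
# In DLT, a @dlt.table function must RETURN the streaming DataFrame; DLT
# calls writeStream.start() for you. Roughly:
#
# import dlt
# @dlt.table(comment="xAudit Parsed")
# def b_table_parsed():
#     return spark.readStream.format("cloudFiles")...  # return, don't execute
#
# The per-record parsing can live in a plain function, e.g. for
# delimiter-separated variable-length records (delimiter is an assumption):
def split_variable_record(line: str, delimiter: str = "|") -> list:
    """Split one record into trimmed fields; record lengths may vary."""
    return [field.strip() for field in line.split(delimiter)]
```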
FK
by New Contributor
  • 1424 Views
  • 1 reply
  • 1 kudos
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Faizan Khan​, thank you for reaching out, and we're sorry to hear about this log-in issue! We have a Community Edition login troubleshooting post on Community. Please take a look and follow the troubleshooting steps. If the steps do not resolv...

SG_
by New Contributor II
  • 4225 Views
  • 1 reply
  • 2 kudos

How to display Sparklyr table in a clean readable format similar to the output of display()?

There exists a built-in Databricks display() function (see documentation here) which allows users to display an R or SparkR dataframe in a clean and human-readable manner, where users can scroll to see all the columns and perform sorting on the columns. S...

Latest Reply
rich_goldberg
New Contributor II
  • 2 kudos

I found that the display() function returned this issue when it came across date-type fields that were NULL. The following function seemed to fix the problem: library(tidyverse) library(lubridate) display_fixed = function(df) { df %>% ...

KVNARK
by Honored Contributor II
  • 7022 Views
  • 2 replies
  • 7 kudos

need to fetch secrets from key vault in my local

Could you please look into this, in case I'm missing something? Getting the below error: azure.core.exceptions.ServiceRequestError: Bearer token authentication is not permitted for non-TLS protected (non-https) URLs. Using the below function for that: def get_aut...

Latest Reply
Anonymous
Not applicable
  • 7 kudos

Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can help you. C...

1 More Replies
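The error itself points at the fix: the Azure SDK refuses bearer-token authentication over plain http, so the vault URL passed to the client must be https. A minimal sketch, assuming the public vault.azure.net suffix and a placeholder vault name:

```python
# Hedged sketch: azure-keyvault-secrets requires an https vault URL; an
# http:// URL raises ServiceRequestError as in the post. This helper only
# normalizes the URL; the commented SDK calls assume azure-identity and
# azure-keyvault-secrets are installed locally.
def normalize_vault_url(name_or_url: str) -> str:
    if name_or_url.startswith("https://"):
        return name_or_url
    if name_or_url.startswith("http://"):
        return "https://" + name_or_url[len("http://"):]
    return f"https://{name_or_url}.vault.azure.net"

# from azure.identity import DefaultAzureCredential
# from azure.keyvault.secrets import SecretClient
# client = SecretClient(vault_url=normalize_vault_url("myvault"),
#                       credential=DefaultAzureCredential())
# value = client.get_secret("my-secret").value
```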
agagrins
by New Contributor III
  • 3558 Views
  • 3 replies
  • 2 kudos

How to speed up `dbx launch --from-assets`

Hiya, I'm trying to follow the testing workflow of

```
$ dbx deploy test --assets-only
$ dbx launch test --from-assets --trace --include-output stdout
```

But I find the turnaround time is quite long, even with an instance pool. The `deployment.yaml` looks ...

Latest Reply
tonkol
New Contributor II
  • 2 kudos

Hi, I have no solution; actually I had just registered to open a very similar ticket when I saw yours. According to my experiments, getting an already running VM from the pool (time between the CREATING and INIT_SCRIPTS_STARTED events) can take anything betw...

2 More Replies
joshberry
by New Contributor II
  • 3959 Views
  • 2 replies
  • 0 kudos

Resolved! Unable to add password to a user (with SSO enabled)

I am trying to add a non-SSO admin user to my account (not to a workspace). I have SSO backed off to Google for the majority of users. I can create the account OK, then go in and reset the password to something, but when I try to log in I get the err...

Latest Reply
joshberry
New Contributor II
  • 0 kudos

Ah, missed that bit of the docs. Thanks

1 More Replies
vinaykumar
by New Contributor III
  • 3176 Views
  • 3 replies
  • 0 kudos

File optimization for delta table (versioning and snapshot ) in storage S3

A Delta table generates a new file for every insert or update and also keeps the old version files for versioning and time-travel history. I have 1 TB of data as a Delta table, and every 30 minutes 90 percent of the data gets updated, so file size will b...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @vinay kumar​, hope all is well! Just wanted to check in if you were able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks...

2 More Replies
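With ~90% of a 1 TB table rewritten every 30 minutes, small files and stale versions accumulate quickly. The usual answer is a periodic maintenance loop like the sketch below; the table name and retention window are illustrative, and shortening the retention (via the table's retention settings) trades time-travel depth for storage.

```python
# Hypothetical sketch: periodic Delta maintenance. OPTIMIZE compacts small
# files; VACUUM deletes data files referenced only by versions older than
# the retention window (Delta's default is 7 days = 168 hours).
def maintenance_statements(table: str, retain_hours: int = 168) -> list:
    return [
        f"OPTIMIZE {table}",
        f"VACUUM {table} RETAIN {retain_hours} HOURS",
    ]

# In a scheduled Databricks job:
# for stmt in maintenance_statements("events_delta", retain_hours=48):
#     spark.sql(stmt)
```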
dispersion
by New Contributor
  • 2335 Views
  • 2 replies
  • 1 kudos

Running large volume of SQL queries in Python notebooks. How to minimise overheads/maintenance.

I have around 200 SQL queries I'd like to run in Databricks Python notebooks. I'd like to avoid creating an ETL process for each of the 200 SQL processes. Any suggestions on how to run the queries in a way that loops through them, so I have minimum am...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Chris French​, hope all is well! Just wanted to check in if you were able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thank...

1 More Replies
