Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

Orianh
by Valued Contributor II
  • 2533 Views
  • 2 replies
  • 2 kudos

Resolved! pyodbc read-only connection.

Hey guys, is there a way to open a read-only pyodbc connection with the Simba Spark driver? At the moment, I'm able to execute queries such as select, delete, and insert into - basically every SQL statement - using pyodbc. I tried to open a pyodbc connection but ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

readonly=True works only with some drivers. Instead, just create an additional user that is granted read-only permissions.
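For context, a minimal sketch of what was being attempted, assuming a DSN named Databricks_Cluster and token authentication (both placeholders). As noted above, the readonly flag is only a hint that the Simba Spark driver may ignore, so the dependable control is a user with read-only grants.

import pyodbc

# Hedged sketch -- DSN, credentials, and table are placeholders.
# readonly=True is passed to the driver, but not all ODBC drivers enforce it,
# so server-side read-only permissions remain the reliable option.
conn = pyodbc.connect(
    "DSN=Databricks_Cluster;UID=token;PWD=<personal-access-token>",
    autocommit=True,
    readonly=True,
)
cursor = conn.cursor()
cursor.execute("SELECT * FROM my_schema.my_table LIMIT 10")
print(cursor.fetchall())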

1 More Replies
sbahm
by New Contributor III
  • 2797 Views
  • 4 replies
  • 4 kudos

Resolved! Issue with adding GitLab credentials to Databricks for "Git integration"

Hi, we have configured our infrastructure with Terraform in Azure; now we want to configure GitLab integration with Databricks to automate notebook and job deployment. I saw that this step is currently available only via the Databricks UI; can you share som...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 4 kudos

Actually, the Repos API is already available: https://docs.databricks.com/dev-tools/api/latest/repos.html#operation/create-repo
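For reference, a hedged sketch of calling that endpoint from a script or CI job; the workspace host, token, GitLab URL, and repo path are placeholders. Note that the GitLab personal access token still has to be registered with the workspace (e.g. via the Git Credentials API) before the repo can be created.

import requests

# Hedged sketch of POST /api/2.0/repos from the linked docs.
# Workspace host, token, GitLab URL, and target path are placeholders.
host = "https://<your-workspace>.azuredatabricks.net"
token = "<personal-access-token>"

resp = requests.post(
    f"{host}/api/2.0/repos",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "url": "https://gitlab.com/<group>/<project>.git",
        "provider": "gitLab",
        "path": "/Repos/<user>/<project>",
    },
)
resp.raise_for_status()
print(resp.json())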

3 More Replies
alejandrofm
by Valued Contributor
  • 3083 Views
  • 4 replies
  • 1 kudos

Resolved! Can't see execution plan graph on all-purpose cluster

On a currently running all-purpose cluster, open the Spark UI, go to SQL, and click into a query: you can see the details and SQL properties, but the visualization doesn't appear. That graph is very useful for debugging certain scenarios. It works fine on jobs. Any ...

Latest Reply
alejandrofm
Valued Contributor
  • 1 kudos

I'm on Chrome; sometimes it appears, sometimes not. I'll look into it more to provide something reproducible. Thanks!

3 More Replies
Mr__E
by Contributor II
  • 1275 Views
  • 1 reply
  • 1 kudos

Resolved! SSO and cluster creation restriction

Accounts added after we turned on SSO don't allow me to restrict their cluster creation abilities. How can I undo this, so I can prevent business people from writing to ETLed data?

Latest Reply
Mr__E
Contributor II
  • 1 kudos

Nevermind. Turns out someone was giving everyone admin privileges when they weren't supposed to and I didn't notice.

govind
by New Contributor
  • 2091 Views
  • 2 replies
  • 0 kudos

Write 160M rows with 300 columns into Delta Table using Databricks?

Hi, I am using Databricks to load data from one Delta table into another. I'm using the Simba Spark JDBC connector to pull data from a Delta table in my source instance and write it into a Delta table in my Databricks instance. The source has...
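As a rough illustration of the setup described (not a tuned solution for 160M rows), a hedged sketch that reads over the Simba Spark JDBC driver and appends into a Delta table. The host, httpPath, token, driver class, and table names are placeholders, spark is the notebook's session, and the Simba JDBC jar is assumed to be installed on the cluster.

# Hedged sketch -- connection details and table names are placeholders.
jdbc_url = (
    "jdbc:spark://<source-host>:443/default;"
    "transportMode=http;ssl=1;httpPath=<http-path>;"
    "AuthMech=3;UID=token;PWD=<personal-access-token>"
)

source_df = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("driver", "com.simba.spark.jdbc.Driver")
    .option("dbtable", "source_schema.source_table")
    .option("fetchsize", "10000")  # larger fetches help with wide 300-column rows
    .load()
)

(
    source_df.write.format("delta")
    .mode("append")
    .saveAsTable("target_schema.target_table")
)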

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @govind@dqlabs.ai, just wanted to check in: were you able to resolve your issue, or do you need more help? We'd love to hear from you. Thanks!

1 More Replies
Zii
by New Contributor II
  • 3395 Views
  • 0 replies
  • 1 kudos

Delta Live Tables Quality check for distinct Values

Hi all, I have been having an issue identifying how to do a uniqueness check as a quality check. Below is an example: @dlt.expect("origin_not_dup", "origin is distinct from origin") def harmonized_data(): df = dlt.read("raw_data") for col in...
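Since a row-level expectation can't compare rows to each other, one commonly used workaround is a separate check table whose single aggregated row asserts the uniqueness condition. A hedged sketch follows; "raw_data" and the "origin" column come from the post, while the check-table name is made up for illustration.

import dlt
from pyspark.sql import functions as F

# Hedged sketch: a one-row metrics table whose expectation fails the update
# when the origin column contains duplicates.
@dlt.table(comment="Uniqueness check for the origin column")
@dlt.expect_or_fail("origin_is_unique", "total_rows = distinct_origins")
def origin_uniqueness_check():
    return (
        dlt.read("raw_data")
        .agg(
            F.count("*").alias("total_rows"),
            F.countDistinct("origin").alias("distinct_origins"),
        )
    )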

Vikram
by New Contributor II
  • 3078 Views
  • 4 replies
  • 4 kudos

Resolved! CVE-2022-0778

How can we update the OpenSSL version on the cluster to address this vulnerability? https://ubuntu.com/security/CVE-2022-0778 We tried a global init script to auto-update the OpenSSL version, but it does not seem to work as apt-utils is missing. apt...

Latest Reply
Atanu
Databricks Employee
  • 4 kudos

I can see the following from our internal communication. CVSSv3 score: 4.0 (Medium), vector AV:N/AC:H/PR:N/UI:N/S:C/C:N/I:N/A:L. Reference: https://www.openssl.org/news/secadv/20220315.txt. Severity: High. The BN_mod_sqrt() function, which computes a modular square root, ...

3 More Replies
pavanb
by New Contributor II
  • 10486 Views
  • 3 replies
  • 3 kudos

Resolved! Memory issues - Databricks

Hi all, all of a sudden in our Databricks dev environment we are getting memory-related exceptions such as out of memory, result too large, etc. Also, the error message is not helping to identify the issue. Can someone please guide on what would be...

Latest Reply
pavanb
New Contributor II
  • 3 kudos

Thanks for the response @Hubert Dudek. If I run the same code in the test environment, it completes successfully, but in dev it gives an out-of-memory issue. Also, the configuration of the test and dev environments is exactly the same.

2 More Replies
Vee
by New Contributor
  • 5038 Views
  • 1 reply
  • 1 kudos

Cluster configuration and optimal values for fs.s3a.connection.maximum, fs.s3a.threads.max

Could you please suggest the best cluster configuration for the use case stated below, and tips to resolve the errors shown below? Use case: there could be 4 or 5 Spark jobs that run concurrently. Each job reads 40 input files and spits out 120 output files ...
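For reference, the two settings from the title are normally supplied as cluster Spark config entries (set before the cluster starts); a hedged sketch with purely illustrative values, not a tuning recommendation.

# Hedged sketch: the S3A settings from the title as they would appear in a
# cluster's spark_conf. Values are illustrative only.
spark_conf = {
    "spark.hadoop.fs.s3a.connection.maximum": "200",
    "spark.hadoop.fs.s3a.threads.max": "128",
}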

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Hi @Vetrivel Senthil, just wondering if this question is a duplicate of this one: https://community.databricks.com/s/feed/0D53f00001qvQJcCAM?

Vee
by New Contributor
  • 3336 Views
  • 1 reply
  • 0 kudos

Tips for resolving the following errors related to AWS S3 read/write

Job aborted due to stage failure: Task 0 in stage 3084.0 failed 4 times, most recent failure: Lost task 0.3 in stage 3084.0 (TID...., ip..., executor 0): org.apache.spark.SparkExecution: Task failed while writing rowsJob aborted due to stage failure:...

Rk2
by New Contributor II
  • 1619 Views
  • 2 replies
  • 4 kudos

Resolved! Scheduling a job with multiple notebooks using a common parameter

I have a practical use case: three notebooks (PySpark) all have one common parameter. I need to schedule all three notebooks in a sequence. Is there any way to run them by setting one parameter value, as it is the same in all of them? Please suggest the ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 4 kudos

@Ramesh Kotha, in the notebook, get the parameter like this: my_parameter = dbutils.widgets.get("my_parameter") and set it in the task like this:
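The task-side screenshot from the original reply isn't rendered here; a hedged sketch of both halves follows, with notebook paths and the parameter value as placeholders (job side shown in Jobs API 2.1 payload shape).

# In each notebook: read the shared parameter, as in the reply above.
my_parameter = dbutils.widgets.get("my_parameter")

# Hedged sketch of the job side: sequential notebook tasks that all receive
# the same base parameter. Paths and the value are placeholders; the third
# notebook task follows the same pattern.
job_settings = {
    "name": "three-notebooks-shared-parameter",
    "tasks": [
        {
            "task_key": "step_1",
            "notebook_task": {
                "notebook_path": "/Repos/<user>/project/notebook_1",
                "base_parameters": {"my_parameter": "2024-01-01"},
            },
        },
        {
            "task_key": "step_2",
            "depends_on": [{"task_key": "step_1"}],
            "notebook_task": {
                "notebook_path": "/Repos/<user>/project/notebook_2",
                "base_parameters": {"my_parameter": "2024-01-01"},
            },
        },
    ],
}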

1 More Replies
WojtekJ
by New Contributor
  • 5513 Views
  • 1 reply
  • 1 kudos

Is it possible to use Iceberg instead of DeltaLake?

Hi. Do you know if it is possible to use the Iceberg table format instead of Delta Lake? Ideally, I would like to see the tables in Databricks stored as Iceberg and use them as usual in the notebooks. I read that there is also an option to link an external metasto...

SailajaB
by Valued Contributor III
  • 4089 Views
  • 3 replies
  • 7 kudos

Resolved! How can we use a config file to change PySpark dataframe names without hardcoding?

Hi, can we use a config file to change PySpark dataframe attribute names (root and nested, of both struct and array type)? Actually, in the input we are getting attributes in lowercase and we need to convert them into camel case (please note we don't have any separat...
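A minimal sketch of the idea for top-level columns, assuming the old-to-new mapping lives in a JSON config file; the file path, mapping contents, and table name are made up for illustration, and nested struct/array fields need schema rewriting on top of this.

import json

# Hedged sketch -- config path, mapping, and table name are placeholders;
# `spark` is the notebook's SparkSession.
with open("/dbfs/config/column_mapping.json") as f:
    mapping = json.load(f)  # e.g. {"first_name": "firstName", "last_name": "lastName"}

df = spark.table("raw_data")
for old_name, new_name in mapping.items():
    df = df.withColumnRenamed(old_name, new_name)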

Latest Reply
Anonymous
Not applicable
  • 7 kudos

Hi @Sailaja B, this is awesome! Thanks for coming in and posting the solution. We really appreciate it. Cheers!

2 More Replies
Tahseen0354
by Valued Contributor
  • 1554 Views
  • 1 reply
  • 1 kudos

Configure the CLI for Databricks on GCP

Hi, I have a service account in my GCP project, and the service account is added as a user in my Databricks account on GCP. Is it possible to configure the CLI for Databricks on GCP using that service account? Something similar to: databricks configure --tok...


Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group