Data Engineering

Forum Posts

Ashley1
by Contributor
  • 1205 Views
  • 5 replies
  • 1 kudos

Resolved! Can ADLS be mounted in DBFS using only ADLS account key?

I realise this is not an optimal configuration, but I'm trying to pull together a POC and I'm not at the point that I wish to ask the AAD admins to create an application for OAuth authentication. I have been able to use direct references to the ADLS co...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hey there @Ashley Betts, thank you for posting your question. And you found the solution. This is awesome! Would you be happy to mark the answer as best so that other members can find the solution more quickly? Cheers!

4 More Replies
RengarLee
by Contributor
  • 2292 Views
  • 5 replies
  • 0 kudos

Resolved! How to improve Spark Streaming writer Input Rate and Processing rate?

Hi! I have many questions about Spark Streaming and Event Hubs. Can you help me? Q1: How can I improve the Spark Streaming writer Input Rate and Processing Rate? I connect to Azure Event Hubs using Spark Streaming (Azure Databricks), but I found that if I use display, this ...

Latest Reply
RengarLee
Contributor
  • 0 kudos

My problem was that setMaxEventsPerTrigger did not equal numInputRows.

4 More Replies
shan_chandra
by Honored Contributor III
  • 1929 Views
  • 1 replies
  • 3 kudos

Resolved! How to execute matplotlib animations in a Databricks notebook?

How to execute matplotlib animations in a Databricks notebook?

Latest Reply
shan_chandra
Honored Contributor III
  • 3 kudos

Please refer to the example code below and use displayHTML(ani.to_jshtml()) to execute matplotlib animations in a Databricks notebook:
import matplotlib.pyplot as plt
import matplotlib.animation
import numpy as np
t = np.linspace(0, 2*np.pi)
x = np.si...
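The reply's snippet is cut off above. A minimal self-contained sketch of an animated sine curve (only t, x, and the to_jshtml() call come from the reply; the update function, axis limits, and frame count are assumptions for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so rendering works without a display
import matplotlib.pyplot as plt
import matplotlib.animation
import numpy as np

t = np.linspace(0, 2 * np.pi)
x = np.sin(t)

fig, ax = plt.subplots()
line, = ax.plot([], [])
ax.set_xlim(0, 2 * np.pi)
ax.set_ylim(-1.1, 1.1)

def update(i):
    # reveal one more point of the sine curve per frame
    line.set_data(t[:i], x[:i])
    return line,

ani = matplotlib.animation.FuncAnimation(fig, update, frames=len(t), blit=True)
html = ani.to_jshtml()
# in a Databricks notebook: displayHTML(html)
```

In a Databricks notebook, passing the resulting html string to displayHTML renders the interactive animation player.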

Orianh
by Valued Contributor II
  • 1403 Views
  • 2 replies
  • 2 kudos

Resolved! pyodbc read-only connection

Hey guys, is there a way to open a read-only pyodbc connection with the Simba Spark driver? At the moment, I'm able to execute queries such as SELECT, DELETE, INSERT INTO - basically every SQL statement using pyodbc. I tried to open a pyodbc connection but ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

The readonly=True flag works only with some drivers. Instead, create additional users granted read-only permissions.

1 More Replies
sbahm
by New Contributor III
  • 1318 Views
  • 4 replies
  • 4 kudos

Resolved! Issue with adding GitLab credentials to Databricks for "Git integration"

Hi, we have configured our infrastructure with Terraform in Azure; now we want to configure GitLab integration with Databricks to automate notebook and job deployment. I saw that this step is currently available only via the Databricks UI. Can you share som...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 4 kudos

Actually, the Repos API is already available: https://docs.databricks.com/dev-tools/api/latest/repos.html#operation/create-repo

3 More Replies
alejandrofm
by Valued Contributor
  • 2843 Views
  • 7 replies
  • 8 kudos

Resolved! pandas.spark.checkpoint() doesn't break lineage

Hi, I'm doing something simple in a Databricks notebook:
spark.sparkContext.setCheckpointDir("/tmp/")
import pyspark.pandas as ps
sql = """select field1, field2 from table where date >= '2021-01-01'"""
df = ps.sql(sql)
df.spark.checkpoint()
That...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 8 kudos

If you need checkpointing, please try the code below. Thanks to persist(), you will avoid reprocessing:
df = ps.sql(sql).persist()
df.spark.checkpoint()

6 More Replies
Mr__E
by Contributor II
  • 644 Views
  • 1 replies
  • 1 kudos

Resolved! SSO and cluster creation restriction

Accounts added after we turned on SSO don't allow me to restrict their cluster creation abilities. How can I undo this, so I can prevent business people from writing to ETLed data?

Latest Reply
Mr__E
Contributor II
  • 1 kudos

Never mind. It turns out someone was giving everyone admin privileges when they weren't supposed to, and I didn't notice.

govind
by New Contributor
  • 1191 Views
  • 4 replies
  • 0 kudos

Write 160M rows with 300 columns into Delta Table using Databricks?

Hi, I am using Databricks to load data from one Delta table into another. I'm using the Simba Spark JDBC connector to pull data from the Delta table in my source instance and write into a Delta table in my Databricks instance. The source has...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @govind@dqlabs.ai, just wanted to check in: were you able to resolve your issue, or do you need more help? We'd love to hear from you. Thanks!

3 More Replies
AlexDavies
by Contributor
  • 3569 Views
  • 12 replies
  • 3 kudos

Resolved! Report on SQL queries that are being executed

We have a SQL workspace with a cluster running that serves a number of self-service reports against a range of datasets. We want to be able to analyse and report on the queries our self-service users are executing, so we can get better visibility of...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hey there @Alex Davies, hope you are doing great. Just checking in: were you able to resolve your issue, or do you need more help? We'd love to hear from you. Thanks!

11 More Replies
Vikram
by New Contributor II
  • 1560 Views
  • 4 replies
  • 4 kudos

Resolved! CVE-2022-0778

How can we update the OpenSSL version on the cluster to address this vulnerability? https://ubuntu.com/security/CVE-2022-0778 I tried a global init script to auto-update the OpenSSL version, but it does not seem to work, as apt-utils is missing. apt...

Latest Reply
Atanu
Esteemed Contributor
  • 4 kudos

I can see the below from our internal communication.
CVSSv3 score: 4.0 (Medium) AV:N/AC:H/PR:N/UI:N/S:C/C:N/I:N/A:L
Reference: https://www.openssl.org/news/secadv/20220315.txt
Severity: High
The BN_mod_sqrt() function, which computes a modular square root, ...

3 More Replies
pavanb
by New Contributor II
  • 7371 Views
  • 3 replies
  • 3 kudos

Resolved! Memory issues - Databricks

Hi all, all of a sudden in our Databricks dev environment, we are getting memory-related exceptions such as out of memory, result too large, etc. Also, the error message is not helping to identify the issue. Can someone please guide on what would be...

Latest Reply
pavanb
New Contributor II
  • 3 kudos

Thanks for the response @Hubert Dudek. If I run the same code in the test environment, it completes successfully, while in dev it gives an out-of-memory error. Also, the configurations of the test and dev environments are exactly the same.

2 More Replies
Rk2
by New Contributor II
  • 704 Views
  • 3 replies
  • 4 kudos

Resolved! scheduling a job with multiple notebooks using common parameter

I have a practical use case: three notebooks (PySpark) all have one common parameter. I need to schedule all three notebooks in a sequence. Is there any way to run them by setting one parameter value, as it is the same in all? Please suggest the ...

Latest Reply
Kaniz
Community Manager
  • 4 kudos

Hi @Ramesh Kotha, just a friendly follow-up: do you still need help, or did @Hubert Dudek's response help you find the solution? Please let us know.

2 More Replies
WojtekJ
by New Contributor
  • 4002 Views
  • 2 replies
  • 3 kudos

Is it possible to use Iceberg instead of DeltaLake?

Hi. Do you know if it is possible to use the Iceberg table format instead of Delta Lake? Ideally, I would like to see the tables in Databricks stored as Iceberg and use them as usual in the notebooks. I read that there is also an option to link an external metasto...

Latest Reply
Kaniz
Community Manager
  • 3 kudos

Hi @Wojtek J, here's a thorough comparison of Delta Lake, Iceberg, and Hudi. This talk shares the research we did comparing the key features and designs these table formats hold, and the maturity of features such as the APIs exposed to end u...

1 More Replies
lshar
by New Contributor III
  • 13836 Views
  • 7 replies
  • 5 kudos

Resolved! How do I pass arguments/variables from widgets to notebooks?

Hello, I am looking for a solution to this problem, which has been known for 7 years: https://community.databricks.com/s/question/0D53f00001HKHZfCAP/how-do-i-pass-argumentsvariables-to-notebooks What I need is to parametrize my notebooks using widget infor...

Latest Reply
lshar
New Contributor III
  • 5 kudos

@Hubert Dudek @Kaniz Fatma When I use dbutils.notebook.run(...), a new cluster is started, hence I can run some other code, but I cannot use variables and functions as if I had just run them directly in the same notebook. Hence, my goal is not met....
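For context on the distinction above: %run executes a child notebook in the calling notebook's context (sharing variables and functions), while dbutils.notebook.run launches it as a separate run and only returns a string. A common pattern for the parameter side is to read each widget with a fallback default, so the same notebook works both as a scheduled job and standalone. This is a sketch; the widget name "run_date" and its default are assumptions:

```python
# Sketch: read a parameter from a Databricks widget when available,
# falling back to a default outside a notebook context. The widget
# name "run_date" and the default value are illustrative assumptions.
DEFAULT_RUN_DATE = "2022-01-01"

def get_run_date():
    try:
        # dbutils only exists inside a Databricks notebook context;
        # elsewhere the name lookup raises NameError and we fall back.
        return dbutils.widgets.get("run_date")
    except NameError:
        return DEFAULT_RUN_DATE

run_date = get_run_date()
print(run_date)
```

When the job sets the "run_date" widget, every notebook included via %run sees the same value; outside Databricks the default keeps the code runnable.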

6 More Replies
SailajaB
by Valued Contributor III
  • 2411 Views
  • 5 replies
  • 7 kudos

Resolved! How can we use a config file to change PySpark dataframe names without hardcoding?

Hi, can we use a config file to change PySpark dataframe attribute names (root and nested, of both struct and array types)? Actually, in the input we are getting attributes in lowercase and we need to convert them into camelCase (please note we don't have any separat...

Latest Reply
Anonymous
Not applicable
  • 7 kudos

Hi @Sailaja B, this is awesome! Thanks for coming in and posting the solution. We really appreciate it. Cheers!
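The accepted solution isn't quoted in this digest, but the config-driven rename idea can be sketched in plain Python: an external mapping supplies the lowercase-to-camelCase names, and a helper applies it recursively through struct-like (dict) and array-like (list) nesting. The mapping and sample record below are illustrative assumptions; in a real pipeline the mapping would come from a JSON/YAML config and the renaming would be applied to a PySpark schema:

```python
# Sketch: rename nested field names via an external config mapping.
# RENAME_MAP and the sample record are assumptions for illustration;
# since the input names have no separators, the mapping must be explicit.
RENAME_MAP = {
    "firstname": "firstName",
    "lastname": "lastName",
    "homeaddress": "homeAddress",
    "streetname": "streetName",
}

def rename_fields(value):
    """Recursively rename dict keys (struct fields) and walk lists (arrays)."""
    if isinstance(value, dict):
        return {RENAME_MAP.get(k, k): rename_fields(v) for k, v in value.items()}
    if isinstance(value, list):
        return [rename_fields(item) for item in value]
    return value

record = {"firstname": "Ada", "homeaddress": [{"streetname": "Main St"}]}
renamed = rename_fields(record)
print(renamed)  # keys are camelCase at every nesting level
```

The same recursion shape carries over to a StructType/ArrayType walk in PySpark, with the config file read once at job start.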

4 More Replies