Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

405041
by New Contributor II
  • 2856 Views
  • 2 replies
  • 0 kudos

Securing the Account Owner

Hey, As I understand, you cannot enable SSO and MFA for the Account Owner. Is there any way on the Databricks side to secure the Account Owner beyond username/password? Is there a lockout that is set up automatically for this user? What are the best pra...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Domonkos Rozsa: You are correct that Databricks does not support SSO and MFA for the Account Owner. However, there are several built-in mechanisms that can help secure the Account Owner account and protect it from unauthorized access: Password polic...

  • 0 kudos
1 More Replies
source2sea
by Contributor
  • 7041 Views
  • 1 replies
  • 0 kudos

Resolved! What is the deploy-mode when calling Spark in Databricks?

https://spark.apache.org/docs/latest/submitting-applications.html
I mainly want to know whether an extra class path can be used when I submit a job.

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@min shi: In Databricks, when you run a job, you are submitting a Spark application to run in the cluster. The deploy-mode that is used by default depends on the type of job you are running: For interactive clusters, the deploy-mode is client. This m...

  • 0 kudos
Hubert-Dudek
by Databricks MVP
  • 2558 Views
  • 2 replies
  • 8 kudos

Databricks has added new metrics to its control panel, replacing the outdated Ganglia tool. These new metrics allow users to monitor the following clu...

Databricks has added new metrics to its control panel, replacing the outdated Ganglia tool. These new metrics allow users to monitor the following cluster performance metrics easily:
- CPU utilization
- Memory usage
- Free filesystem space
- Network traf...

Latest Reply
jose_gonzalez
Databricks Employee
  • 8 kudos

Thank you for sharing, @Hubert Dudek!

  • 8 kudos
1 More Replies
Erik_L
by Contributor II
  • 13404 Views
  • 2 replies
  • 2 kudos

Joining a large amount of data causes an "Out of disk space" error: how to ingest?

What I am trying to do:
df = None
# For all of the IDs that are valid
for id in ids:
    # Get the parts of the data from different sources
    df_1 = spark.read.parquet(url_for_id)
    df_2 = spark.read.parquet(url_for_id)
    ...
    # Join together the pa...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

@Erik Louie: There are several strategies that you can use to handle large joins like this in Spark: Use a broadcast join: If one of your dataframes is relatively small (less than 10-20 GB), you can use a broadcast join to avoid shuffling data. A bro...
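
As a rough illustration of the broadcast-join idea mentioned in this reply, here is a minimal plain-Python sketch. It is not Spark code and is not taken from the thread; the function name `broadcast_hash_join` and the sample rows are hypothetical. The small side is turned into an in-memory lookup table once, and every partition of the large side is joined against it locally, so the large side never has to be shuffled.

```python
from collections import defaultdict


def broadcast_hash_join(large_partitions, small_rows, key):
    """Inner-join every partition of the large side against one shared lookup table.

    Hypothetical illustration only. In PySpark the comparable hint would be:
        from pyspark.sql.functions import broadcast
        large_df.join(broadcast(small_df), key)
    """
    # Build the lookup table once from the small side (this is the "broadcast" copy).
    lookup = defaultdict(list)
    for row in small_rows:
        lookup[row[key]].append(row)

    joined = []
    # Each partition is processed on its own; no rows move between partitions.
    for partition in large_partitions:
        for row in partition:
            for match in lookup.get(row[key], []):
                joined.append({**match, **row})
    return joined


# Hypothetical sample data: two partitions of a "large" table and one small table.
large = [
    [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}],
    [{"id": 3, "value": "c"}, {"id": 1, "value": "d"}],
]
small = [{"id": 1, "name": "x"}, {"id": 3, "name": "z"}]
result = broadcast_hash_join(large, small, "id")
```

Whether this helps with an out-of-disk-space error depends on the small side actually fitting in memory on every worker; if neither side is small, other strategies are needed.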

  • 2 kudos
1 More Replies
Khalil
by Contributor
  • 12829 Views
  • 4 replies
  • 4 kudos

Resolved! Pivot a DataFrame in Delta Live Tables (DLT)

I want to apply a pivot on a DataFrame in DLT, but I'm getting the following warning: Notebook:XXXX used `GroupedData.pivot` function that will be deprecated soon. Please fix the notebook. I get the same warning if I use the function collect. Is it risk...

Latest Reply
Khalil
Contributor
  • 4 kudos

Thanks @Kaniz Fatma for your support. The solution was to do the pivot outside of views or tables and the warning disappeared.

  • 4 kudos
3 More Replies
moski
by New Contributor II
  • 3279 Views
  • 3 replies
  • 1 kudos

How to import a data table from SQLQuery2 into Databricks notebook

Can anyone show me a few commands to import a table, say "mytable2", from Microsoft SQL Server into a Databricks notebook, using a Spark DataFrame or at least a pandas DataFrame? Cheers!

Latest Reply
irfanaziz
Contributor II
  • 1 kudos

You can read any table from MSSQL. You would need to authenticate to the db, so you would need the connection string: def dbProps(): return { "user" : "db-user", "password" : "your password", "driver" : "com.microsoft.sqlserver.jdbc.SQLServerD...
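
The reply's snippet is cut off, so the following is only a sketch of the same approach, assuming the Microsoft SQL Server JDBC driver is available on the cluster. The helper names `db_props`, `jdbc_url`, and `read_table`, as well as the host, database, and credential values, are hypothetical placeholders; `spark.read.jdbc` is the standard PySpark reader.

```python
def db_props(user, password):
    """Connection properties passed to the JDBC reader."""
    return {
        "user": user,
        "password": password,
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }


def jdbc_url(host, database, port=1433):
    """Build a SQL Server JDBC URL (1433 is the default SQL Server port)."""
    return f"jdbc:sqlserver://{host}:{port};databaseName={database}"


def read_table(spark, table, host, database, user, password):
    """Read one table into a Spark DataFrame using an existing SparkSession."""
    return spark.read.jdbc(
        url=jdbc_url(host, database),
        table=table,
        properties=db_props(user, password),
    )
```

In a Databricks notebook, where `spark` is already defined, this could be called as `df = read_table(spark, "mytable2", "myserver.example.com", "mydb", "db-user", "your password")`, and `df.toPandas()` would give a pandas DataFrame for small tables. Keeping the password in a secret scope rather than in the notebook is generally the safer choice.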

  • 1 kudos
2 More Replies
Data_Analytics_
by New Contributor II
  • 13495 Views
  • 3 replies
  • 3 kudos

Resolved! Connect to SQL Server using Windows authentication

How do I connect to an on-premises SQL Server using Windows authentication from a Databricks notebook?

Latest Reply
User16829050420
Databricks Employee
  • 3 kudos

You need network connectivity set up from the Databricks VNet to the on-prem SQL Server. You can then connect from the Databricks notebook over JDBC using a Windows-authenticated username/password - https://docs.microsoft.com/en-us/azure/databricks/data/data-sourc...

  • 3 kudos
2 More Replies
chandra_ym
by New Contributor II
  • 22064 Views
  • 7 replies
  • 2 kudos

Resolved! Recommended course?

Hello, I am new here. Are there any recommended courses for Databricks Certified Associate Developer for Apache Spark 3.0 - Python? Thank you

Latest Reply
fabio2352
Contributor
  • 2 kudos

Hi, here is a practice exam: https://files.training.databricks.com/assessments/practice-exams/PracticeExam-DCADAS3-Python.pdf?_gl=1*1kqf0to*_gcl_aw*R0NMLjE2ODI0NDkyOTcuRUFJYUlRb2JDaE1JNWFTZ2d0ekZfZ0lWSkc1dkJCMVQ2UTJNRUFBWUFpQUFFZ0pOc3ZEX0J3RQ.

  • 2 kudos
6 More Replies
uzairm
by Databricks Partner
  • 10929 Views
  • 12 replies
  • 3 kudos

Resolved! Concurrent Jobs - The spark driver has stopped unexpectedly!

Hi, I am running concurrent notebooks in concurrent workflow jobs on a job compute cluster (c5a.8xlarge) with 5-7 worker nodes. Each job has 100 concurrent child notebooks and there are 10 job instances. 8 of the 10 jobs give the error: the spark driver has sto...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @uzair mustafa, Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so...

  • 3 kudos
11 More Replies
564824
by New Contributor II
  • 2577 Views
  • 2 replies
  • 0 kudos

Job webhook alerts are not sending authorization headers

Hi, I have set up a webhook which will send the event to a lambda in AWS. I validate the event through the credentials given while creating the webhook but sometimes the event that is being sent from databricks does not contain authorization in the h...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Muthu Kumaran: If the event being sent from Databricks to your Lambda function sometimes does not contain authorization headers, you may need to modify your webhook configuration or Lambda function code to handle this situation. Here are a few sugg...
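
One way the Lambda side could handle a missing header defensively is sketched below. This is not from the thread: it assumes the webhook credentials arrive as an HTTP Basic `Authorization` header, that the event has a `headers` dictionary (as with an API Gateway or function URL proxy event), and that the expected credentials live in the hypothetical environment variables `WEBHOOK_USER` and `WEBHOOK_PASSWORD`.

```python
import base64
import hmac
import os


def is_authorized(headers, expected_user, expected_password):
    """Return True only if a Basic Authorization header carries the expected credentials."""
    if not expected_user or not expected_password:
        return False  # refuse everything when credentials are not configured
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalized = {str(k).lower(): v for k, v in (headers or {}).items()}
    auth = normalized.get("authorization") or ""
    if not auth.startswith("Basic "):
        return False
    try:
        decoded = base64.b64decode(auth[len("Basic "):]).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return False
    user, _, password = decoded.partition(":")
    # Constant-time comparison avoids leaking how much of the value matched.
    return (hmac.compare_digest(user.encode(), expected_user.encode())
            and hmac.compare_digest(password.encode(), expected_password.encode()))


def lambda_handler(event, context):
    ok = is_authorized(
        event.get("headers"),
        os.environ.get("WEBHOOK_USER", ""),      # hypothetical variable name
        os.environ.get("WEBHOOK_PASSWORD", ""),  # hypothetical variable name
    )
    if not ok:
        # Log the rejection so intermittent header-less events can be counted later.
        print("webhook rejected: missing or invalid Authorization header")
        return {"statusCode": 401, "body": "unauthorized"}
    return {"statusCode": 200, "body": "ok"}
```

Rejecting with a 401 and logging the event makes it easier to tell whether the header is truly absent or merely arrives under a different capitalisation.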

  • 0 kudos
1 More Replies
qwerty1
by Contributor
  • 9710 Views
  • 3 replies
  • 1 kudos

Is there a way to register a Scala function that is available to other notebooks?

I am in a situation where I have a notebook that runs in a pipeline that creates a "live streaming table". So, I cannot use a language other than SQL in the pipeline. I would like to format a certain column in the pipeline using Scala code (it's a ...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

No, DLT does not work with Scala, unfortunately. Delta Live Tables are not vanilla Spark. Is Python an option instead of Scala?

  • 1 kudos
2 More Replies
Sushma
by New Contributor
  • 2312 Views
  • 1 replies
  • 0 kudos

Databricks Lakehouse Fundamentals Certificate and Badge not received

I successfully passed the test after completing the course but I haven't received any certification or badge yet. Any help is much appreciated. @Vidula Khanna

Latest Reply
Vartika
Databricks Employee
  • 0 kudos

Hi @Sushma Rani, Thank you for reaching out! Please submit a ticket to our Training Team here: https://help.databricks.com/s/contact-us?ReqType=training and our team will get back to you shortly.

  • 0 kudos
qwerty1
by Contributor
  • 13008 Views
  • 2 replies
  • 2 kudos

Resolved! Doing a join within the same row in SQL

My data is a dump of JSON response from an API. The schema of the JSON is:
col_name   data_type
data       array<struct<attributes:struct<name: String, age: Int relationships:struct<address:struct<data:array<struct<id: long, type: string>>>>>>>
...

dalion
by New Contributor III
  • 6759 Views
  • 5 replies
  • 0 kudos

Azure Databricks - ADLS Gen 2.0 Access

Hi all, I have an Azure Databricks setup (non-premium) and an ADLS Gen 2.0 setup. I am trying to access the ADLS Gen 2.0 containers via a simple access key mode for testing. There is no error if the ADLS Gen 2.0 is set to "Enable from all networks". B...

Latest Reply
fabio2352
Contributor
  • 0 kudos

Hi, can you check the two links below?
https://learn.microsoft.com/en-us/azure/databricks/getting-started/connect-to-azure-storage
https://docs.databricks.com/storage/azure-storage.html

  • 0 kudos
4 More Replies
sudhanshu1
by New Contributor III
  • 7227 Views
  • 7 replies
  • 0 kudos

Incremental Data copy from one SQL DB to another DB

Hi All, I have 20 tables in a source SQL DB and we need to create a pipeline to incrementally load data into the target database. Can someone please suggest the best approach to achieve this using Azure Databricks? Should I use MERGE INTO? COPY INTO? o...

Latest Reply
Vartika
Databricks Employee
  • 0 kudos

Hi @SUDHANSHU RAJ, Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers...

  • 0 kudos
6 More Replies