Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

sagiatul
by New Contributor II
  • 6991 Views
  • 2 replies
  • 3 kudos

Databricks driver logs

I am running jobs on Databricks clusters. While the cluster is running, I can find the executor logs by going to the Spark Cluster UI Master dropdown, selecting a worker, and going through the stderr logs. However, once the job is finished and cluste...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Atul Arora, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedback w...

  • 3 kudos
1 More Replies
saikrishna3390
by New Contributor II
  • 9523 Views
  • 2 replies
  • 2 kudos

How do I configure a managed identity on a Databricks cluster and access Azure Storage using Spark config?

A partner wants to use an ADF managed identity to connect to my Databricks cluster and to my Azure Storage, and copy the data from my Azure Storage to their Azure Storage.

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @SAI PUSALA, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedback w...

  • 2 kudos
1 More Replies
js54123875
by New Contributor III
  • 3685 Views
  • 2 replies
  • 2 kudos

'Failure to initialize configuration' on SQL Warehouse Tables

Yesterday I had a basic DLT pipeline up and running, and was able to query the hive_metastore tables successfully. The pipeline uses Auto Loader to ingest a few CSV files from cloud storage to streaming live bronze and silver tables. Today after star...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Jennette Shepard, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feed...

  • 2 kudos
1 More Replies
User16752244127
by Databricks Employee
  • 2923 Views
  • 2 replies
  • 4 kudos

Resolved! DLT code examples and notebooks?

We like the examples that you show in webinars, especially DLT with Hugging Face or DLT with ingestion from Kafka. Are they publicly available?

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Frank Munz, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedback w...

  • 4 kudos
1 More Replies
Vijay_Bhau
by New Contributor II
  • 5013 Views
  • 4 replies
  • 3 kudos
Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Vijay Gadhave, hope all is well! Just wanted to check whether you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Than...

  • 3 kudos
3 More Replies
nirajtanwar
by New Contributor
  • 3022 Views
  • 2 replies
  • 2 kudos

Collecting the elements of a SparkDataFrame and coercing them into an R data frame

Hello Everyone, I am facing a challenge while collecting a Spark DataFrame into an R data frame. I need to do this because I am using the TraMineR algorithm, which is implemented only in R, and I have done the data pre-processing in PySpark. I am trying this: event...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Niraj Tanwar, hope all is well! Just wanted to check whether you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thank...

  • 2 kudos
1 More Replies
Arunsundar
by Databricks Partner
  • 4108 Views
  • 4 replies
  • 3 kudos

Automating the initial configuration of dbx

Hi Team, good morning. As of now, to deploy our code to Databricks, dbx is configured by providing parameters such as the cloud provider, Git provider, etc. Say I have a code repository in any one of the Git providers. Can this process of co...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Arunsundar Muthumanickam, hope all is well! Just wanted to check whether you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear fr...

  • 3 kudos
3 More Replies
Mado
by Valued Contributor II
  • 8759 Views
  • 4 replies
  • 1 kudos

Resolved! How to set properties for a delta table when I want to write a DataFrame?

Hi, I have a PySpark DataFrame with 11 million records. I created the DataFrame on a cluster. It is not saved on DBFS or in a storage account.

    import pyspark.sql.functions as F
    from pyspark.sql.functions import col, when, floor, expr, hour, minute, to_time...

Latest Reply
Lakshay
Databricks Employee
  • 1 kudos

Hi @Mohammad Saber, are you getting the error while writing the file to the table, or before that?
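For readers landing on this thread: one common way to attach properties to a Delta table is an ALTER TABLE ... SET TBLPROPERTIES statement run after the DataFrame has been written. The sketch below only builds that statement as a string so it can be inspected first. It is not necessarily the accepted answer in the thread. The table name `my_schema.my_table`, the helper name `tblproperties_sql`, and the two properties chosen are illustrative assumptions.

```python
def tblproperties_sql(table_name, properties):
    """Build an ALTER TABLE ... SET TBLPROPERTIES statement for a Delta table.

    `properties` maps property names to string values. Keys are sorted so the
    generated statement is deterministic.
    """
    if not properties:
        raise ValueError("at least one property is required")
    pairs = ", ".join(
        f"'{key}' = '{value}'" for key, value in sorted(properties.items())
    )
    return f"ALTER TABLE {table_name} SET TBLPROPERTIES ({pairs})"


# Hypothetical table name and example properties, chosen only for illustration.
statement = tblproperties_sql(
    "my_schema.my_table",
    {
        "delta.appendOnly": "true",
        "delta.logRetentionDuration": "interval 30 days",
    },
)
print(statement)
```

In a notebook, the statement would typically be passed to spark.sql(statement) once the table exists, for example after df.write.format("delta").saveAsTable(...).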

  • 1 kudos
3 More Replies
Andrei_Radulesc
by Contributor III
  • 10144 Views
  • 2 replies
  • 2 kudos

Resolved! FutureWarning: Deprecated in 3.0.0. Use SparkSession.builder.getOrCreate() instead.

I'm trying to get rid of the warning below: /databricks/spark/python/pyspark/sql/context.py:117: FutureWarning: Deprecated in 3.0.0. Use SparkSession.builder.getOrCreate() instead. In my setup, I have a front-end notebook that gets parameters from the ...

Latest Reply
Andrei_Radulesc
Contributor III
  • 2 kudos

That fixes it. Thanks. I need to do

    spark = SparkSession.builder.getOrCreate()
    df = spark.table("prod.some_schema.some_table")

instead of

    sc = SparkSession.builder.getOrCreate()
    sqlc = SQLContext(sc)
    df = sqlc.table(f"prod.some_schema.some...

  • 2 kudos
1 More Replies
sage5616
by Valued Contributor
  • 9684 Views
  • 1 reply
  • 3 kudos

Resolved! Set Workflow Job Concurrency Limit

Hi Everyone, I need a job to be triggered every 5 minutes. However, if that job is already running, it must not be triggered again until that run is finished. Hence, I need to set the maximum run concurrency for that job to only one instance at a time...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

@Michael Okulik: To ensure that a Databricks job is not triggered again until a running instance of the job is completed, you can set the maximum concurrency for the job to 1. Here's how you can configure this in Databricks: Go to the Databricks work...
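The same setting can also be expressed as job settings for the Jobs API instead of through the UI. Below is a minimal sketch of such a payload, assuming a single notebook task on an existing cluster. The job name, notebook path, cluster ID, and the helper name `build_job_settings` are placeholders, not values from this thread. The fields that matter here are `max_concurrent_runs` set to 1 and a Quartz cron expression that fires every 5 minutes.

```python
import json


def build_job_settings(name, notebook_path, cluster_id):
    """Return job settings that run every 5 minutes with at most one active run."""
    return {
        "name": name,
        # Only one run may be active at a time.
        "max_concurrent_runs": 1,
        "schedule": {
            # Quartz syntax: second 0 of every 5th minute.
            "quartz_cron_expression": "0 0/5 * * * ?",
            "timezone_id": "UTC",
            "pause_status": "UNPAUSED",
        },
        "tasks": [
            {
                "task_key": "main",
                "notebook_task": {"notebook_path": notebook_path},
                "existing_cluster_id": cluster_id,
            }
        ],
    }


# Placeholder values for illustration only.
settings = build_job_settings(
    "five-minute-job", "/Shared/example_notebook", "0000-000000-example"
)
print(json.dumps(settings, indent=2))
```

With the limit at 1, a scheduled trigger that arrives while a run is still active should be skipped rather than queued, which matches the behavior asked for in the question. It is worth confirming this against the run history of your own job.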

  • 3 kudos
sandeepv
by New Contributor II
  • 3651 Views
  • 3 replies
  • 0 kudos

Databricks Spark certification voucher code expired

Hi Team, I am getting an error that the voucher code has expired when trying to register for the "Databricks Certified Associate Developer for Apache Spark 3.0 - Python" certification. Can you please help?

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Sandeep Venishetti, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell ...

  • 0 kudos
2 More Replies
User16790091296
by Databricks Employee
  • 17451 Views
  • 3 replies
  • 0 kudos
Latest Reply
NubeEra
New Contributor II
  • 0 kudos

Databricks provides 4 main deployment models. They are: Public Cloud Deployment Model: Databricks can be deployed on public cloud platforms such as AWS, Azure, and Google Cloud Platform. This is the most common deployment model for Databricks and provi...

  • 0 kudos
2 More Replies
Charmin
by New Contributor
  • 2064 Views
  • 1 reply
  • 0 kudos

Why does the 'runCommand' action NOT show up in the DatabricksNotebook audit log table?

I understand Databricks can send diagnostic/audit logs to Log Analytics in Azure. There is a standard 'DatabricksNotebook' table that provides an audit log for notebook actions. In this table there is an action called 'runCommand' but this does not show...

Latest Reply
rsamant07
New Contributor III
  • 0 kudos

Hi @Charmin patel, you need to enable verbose audit logging in the workspace settings for runCommand to appear in the audit logs.

  • 0 kudos
laksh
by New Contributor II
  • 2754 Views
  • 2 replies
  • 0 kudos

Real-time data quality validation (streaming data ingestion)

I was wondering how Unity Catalog would help with data quality validation for real-time (streaming) data ingestion.

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @arun laksh, hope all is well! Just wanted to check whether you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

  • 0 kudos
1 More Replies
mahesh_vardhan_
by New Contributor
  • 6318 Views
  • 2 replies
  • 2 kudos

Resolved! How do I use numpy case when condition in pyspark.pandas?

I have some legacy pandas code that I want to migrate to Spark to leverage parallelization in Databricks. I see Databricks has launched a wrapper package on top of pandas that uses pandas nomenclature but uses the Spark engine in the backend. I comf...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @mahesh vardhan gandhi, hope all is well! Just wanted to check whether you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from ...

  • 2 kudos
1 More Replies