Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

ossinova
by Contributor II
  • 2103 Views
  • 1 replies
  • 0 kudos

Creating cluster from ADF linked service with Workspace init script

Similar issue: https://stackoverflow.com/questions/76220211/create-new-databricks-cluster-from-adf-linked-service-with-initscripts-from-abfs I am trying to create clusters using an ADF linked service where the cluster is configured with an init script. As...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Oscar Dyremyhr​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 0 kudos
AdrianQ
by New Contributor II
  • 2457 Views
  • 1 replies
  • 4 kudos

How to use HTML tags in Alert templates?

According to the alert docs (here), HTML tags should work to format messages in a custom template. When I tried using them, however, they were not recognized and the whole text was returned as-is, i.e.:

image.png
Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Adrian Quicoy​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 4 kudos
LiliL
by New Contributor
  • 1288 Views
  • 1 replies
  • 1 kudos
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Lili Levin​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 1 kudos
Chengcheng
by New Contributor III
  • 2175 Views
  • 1 replies
  • 4 kudos

Is Feature Store packaged model compatible with Spark UDF?

Hi, I tried to deploy a Feature Store packaged model into a Delta Live Table using mlflow.pyfunc.spark_udf in Azure Databricks. This model was built by Databricks AutoML with a joined Feature Table inside it. And I'm trying to make predictions using the fol...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Chengcheng Guo​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 4 kudos
vanessafvg
by New Contributor III
  • 2726 Views
  • 1 replies
  • 3 kudos

Extracting data from excel in datalake storage using openpyxl

I am trying to extract some data into Databricks but keep tripping over openpyxl (newish Databricks user).

from openpyxl import load_workbook
directory_id="hidden"
scope="hidden"
client_id="hidden"
service_credential_key="hidden"
container_name="hidden"
s...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Vanessa Van Gelder​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 3 kudos
guostong
by New Contributor III
  • 6343 Views
  • 1 replies
  • 1 kudos

How to update the items in array of struct column with sql

create table test.json_test_01 ( id int, description string, struct_address STRUCT<street_number: STRING, street_name: STRING, city: STRING, province: STRING>, arrary_phone ARRAY<STRUCT<phone_number: STRING, phone_type: STRING>> );   insert into ...
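
One possible approach to the question above, sketched under assumptions: Spark SQL's `transform` higher-order function can rebuild the array item by item, and `named_struct` can produce the replacement item. The helper below only builds the statement text. The table and column names come from the post; the row id, the phone type `home`, the new number, and the helper names are hypothetical, and the sketch assumes the table is a Delta table so that `UPDATE` is available.

```python
def sql_literal(value: str) -> str:
    """Wrap a value in single quotes for use as a SQL string literal."""
    # Keep the sketch simple: refuse characters that would need escaping.
    if "'" in value or "\\" in value:
        raise ValueError("quotes and backslashes are not supported in this sketch")
    return f"'{value}'"


def build_phone_update_sql(table: str, row_id: int, phone_type: str, new_number: str) -> str:
    """Build an UPDATE that rewrites the matching items of arrary_phone."""
    return (
        f"UPDATE {table} "
        # transform() visits every struct in the array and returns a new array.
        "SET arrary_phone = transform(arrary_phone, p -> "
        f"CASE WHEN p.phone_type = {sql_literal(phone_type)} "
        # Rebuild the struct with the new number; keep the original phone_type.
        f"THEN named_struct('phone_number', {sql_literal(new_number)}, 'phone_type', p.phone_type) "
        "ELSE p END) "
        f"WHERE id = {int(row_id)}"
    )


# Hypothetical values, chosen only for illustration.
statement = build_phone_update_sql("test.json_test_01", 1, "home", "555-0100")
print(statement)
```

In a notebook, the resulting string could be passed to `spark.sql(statement)`. Items that do not match the condition are returned unchanged, so the array keeps its length and order.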

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Richard Guo​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 1 kudos
timothy_uk
by New Contributor III
  • 1393 Views
  • 1 replies
  • 1 kudos

Mysterious simultaneous long-running Databricks Workflows

Hi, This happened across four seemingly unrelated workflows at the same time of day, and all four workflows eventually completed successfully. It appeared that all of them sat idle despite being triggered via the Jobs API. The two symptoms I have observed...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Timothy Lin​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 1 kudos
pskchai
by New Contributor
  • 2477 Views
  • 2 replies
  • 0 kudos

Resolved! Using DLT with a non-streaming large table

We have a source table that receives daily append operations, but the rows created within the last 30 days in this table can be updated or deleted. Thus, the source table is not exactly a streaming source. Our processing workflow involves performing "...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Pongsakorn Chairatanakul, hope everything is going great. Just wanted to check in on whether you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please...

  • 0 kudos
1 More Replies
Thaw
by New Contributor III
  • 2332 Views
  • 2 replies
  • 3 kudos

Resolved! How to change Instance Family in CloudFormation in a Databricks trial mode?

I deployed Databricks on AWS, and the template uses i3.xlarge. Could I switch to a smaller instance family for cost optimization? Is i3.xlarge the minimum size for using Databricks in trial mode? Thanks

Latest Reply
Thaw
New Contributor III
  • 3 kudos

Thank you so much for your reply to my question, @Vidula Khanna​ @Kaniz Fatma​ . After I took some study time, I understood the basics, and then I am on the way to Databricks.

  • 3 kudos
1 More Replies
dukebaslangic
by New Contributor II
  • 2524 Views
  • 3 replies
  • 3 kudos

Resolved! Databricks performance related documentation/books

Hi, Do you know any good resources about Databricks performance improvements (like improving query performance, monitoring/resolving performance bottlenecks, etc.)? Thanks

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Ömer Özsakarya, we haven't heard from you since the last response from @Lakshay Goel, and I was checking back to see if her suggestions helped you. Otherwise, if you have a solution, please share it with the community, as it can be helpful to ...

  • 3 kudos
2 More Replies
Ram443
by New Contributor III
  • 42495 Views
  • 9 replies
  • 5 kudos

Resolved! I created a data frame but was not able to see the data

Code to create a data frame:
from pyspark.sql import SparkSession
spark=SparkSession.builder.appName("oracle_queries").master("local[4]")\
  .config("spark.sql.warehouse.dir", "C:\\softwares\\git\\pyspark\\hive").getOrCreate()
from pyspark.sql.functions ...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 5 kudos

@ramanjaneyulu kancharla​  can you please select my answer as best answer

  • 5 kudos
8 More Replies
Paul_Seattle
by New Contributor
  • 7451 Views
  • 1 replies
  • 0 kudos

A Quick Question on Running a job from CLI

Could anyone tell me what could be wrong with my command to submit a Spark job with params? (If I don't include --spark-submit-params, it works fine.) Please see the attached snapshot.

image
Latest Reply
User16539034020
Databricks Employee
  • 0 kudos

Yes, there is no need for spark-submit-params: databricks jobs run-now --job-id ***
Reference: https://docs.databricks.com/dev-tools/cli/jobs-cli.html
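
For readers who do need to pass parameters, here is a minimal sketch. The screenshot from the question is not available, so the actual cause of the error is unknown. The sketch assumes the legacy Databricks CLI, where `--spark-submit-params` takes a JSON array as a single string; building the argument list in Python avoids shell-quoting mistakes around that JSON. The job id `123`, the class name, and the helper name are hypothetical.

```python
import json
import shlex
from typing import List, Optional


def run_now_command(job_id: int, spark_submit_params: Optional[List[str]] = None) -> List[str]:
    """Build the argument list for `databricks jobs run-now`."""
    argv = ["databricks", "jobs", "run-now", "--job-id", str(job_id)]
    if spark_submit_params:
        # The parameters travel as one JSON-array string, not as separate flags.
        argv += ["--spark-submit-params", json.dumps(spark_submit_params)]
    return argv


# Hypothetical job id and parameters, for illustration only.
argv = run_now_command(123, ["--class", "org.example.Main"])

# Print a shell-safe rendering of the command for inspection.
print(shlex.join(argv))
```

The list can be handed to `subprocess.run(argv, check=True)` as-is, which bypasses the shell entirely.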

  • 0 kudos
RonanStokes_DB
by Databricks Employee
  • 3137 Views
  • 1 replies
  • 1 kudos

Can you apply a specific cluster policy when launching a Databricks job via Azure Data Factory

When using Azure Data Factory to coordinate the launch of Databricks jobs - can you specify which cluster policy to apply to the job, either explicitly or implicitly?

Latest Reply
mvandeborne
New Contributor II
  • 1 kudos

You can, but not from ADF's UI. You need to edit the JSON of the linked service and add a 'policyId' parameter to the 'typeProperties' object, pointing to the cluster policy ID from Databricks (which you can find in the policy's URL in Databricks).
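
The edit described in this reply can be sketched as a small transformation of the linked-service JSON. This is an illustration only: the linked-service name, workspace URL, cluster settings, and policy ID below are placeholders, not values from the thread, and the helper name is hypothetical.

```python
import copy
import json


def with_cluster_policy(linked_service: dict, policy_id: str) -> dict:
    """Return a copy of the linked-service definition with 'policyId' set."""
    updated = copy.deepcopy(linked_service)
    type_properties = updated.setdefault("properties", {}).setdefault("typeProperties", {})
    # 'policyId' sits inside 'typeProperties', as described in the reply above.
    type_properties["policyId"] = policy_id
    return updated


# Placeholder definition; every value here is an assumption for illustration.
linked_service = {
    "name": "AzureDatabricksLinkedService",
    "properties": {
        "type": "AzureDatabricks",
        "typeProperties": {
            "domain": "https://adb-0000000000000000.0.azuredatabricks.net",
            "newClusterVersion": "13.3.x-scala2.12",
            "newClusterNodeType": "Standard_DS3_v2",
            "newClusterNumOfWorker": "1",
        },
    },
}

updated = with_cluster_policy(linked_service, "ABC123DEF4567890")
print(json.dumps(updated, indent=2))
```

The printed JSON can then be pasted into the linked service's code view in ADF. The original dictionary is left unchanged.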

  • 1 kudos
pcriado
by New Contributor III
  • 7699 Views
  • 2 replies
  • 1 kudos

Resolved! Requested array size exceeds VM limit when saving to feature table

Hi, I'm trying to process a small dataset (less than 300 MB) consisting of five queries that run with Spark. The end result of those queries is parsed using Python and merged into a data frame. Then I try to write this to a Delta Lake table using featu...

Latest Reply
pcriado
New Contributor III
  • 1 kudos

Hello, we have recently found that it's my user in particular that causes the memory issue. Two other users in my organization can run the same notebook without problems, but my user consistently consumes all available RAM and crashes the cluster... a...

  • 1 kudos
1 More Replies
gustavomcarmo-h
by New Contributor III
  • 4268 Views
  • 5 replies
  • 2 kudos

Resolved! Is there a way to list the dlt maintenance jobs through the API?

After creating the delta pipeline, I would like to get details from the dlt maintenance job automatically created by Databricks, like the scheduled time when the dlt maintenance tasks will be executed. However, it seems the Job API 2.1 doesn't cover ...

Latest Reply
gustavomcarmo-h
New Contributor III
  • 2 kudos

Hi @Debayan Mukherjee, actually the Databricks Jobs API documentation has not been fixed yet. The parameter `job_type` should be included in the list endpoint request documentation. Please do this in order to avoid unnecessary questions here in the ...

  • 2 kudos
4 More Replies
