cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

issibra
by New Contributor III
  • 1841 Views
  • 1 replies
  • 1 kudos

ReadStream & writeStream at gold layer level

Hello, I have seen in many places readStream and writeStream in gold layer, Is it correct to use readStream and writeStream for gold layer ? knowing that a gold table is no not valid for streaming.is there some logic when to use readStream/ writeStr...

  • 1841 Views
  • 1 replies
  • 1 kudos
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Ibrahim ISSOUANI​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 1 kudos
Jaeseon
by New Contributor II
  • 2190 Views
  • 1 replies
  • 0 kudos

Failed to import `Ray` on jupyter notebook.

While working on my school's Linux server, I encountered an issue while attempting to install and import Ray in my Jupyter Notebook. I successfully installed the package ray==2.4.0, but encountered an error when trying to import it, specifically stat...

  • 2190 Views
  • 1 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Jaeseon Song​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 0 kudos
ossinova
by Contributor II
  • 2559 Views
  • 1 replies
  • 0 kudos

Creating cluster from ADF linked service with Workspace init script

Similar issue: https://stackoverflow.com/questions/76220211/create-new-databricks-cluster-from-adf-linked-service-with-initscripts-from-abfsI am trying to create clusters using ADF linked service where the cluster is configured with a init script. As...

  • 2559 Views
  • 1 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Oscar Dyremyhr​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 0 kudos
AdrianQ
by New Contributor II
  • 3169 Views
  • 1 replies
  • 4 kudos

How to use HTML tags in Alert templates?

According to the alert docs (here), HTML tags should work to format messages in a custom template. When I tried using them, it doesn't seem able to recognize them however and just returns the whole text.ie

image.png
  • 3169 Views
  • 1 replies
  • 4 kudos
Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Adrian Quicoy​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 4 kudos
LiliL
by New Contributor
  • 1578 Views
  • 1 replies
  • 1 kudos
  • 1578 Views
  • 1 replies
  • 1 kudos
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Lili Levin​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 1 kudos
Chengcheng
by New Contributor III
  • 2533 Views
  • 1 replies
  • 4 kudos

Is Feature Store packaged model compatible with Spark UDF?

Hi, I tried to deploy a Feature Store packaged model into Delta Live Table using mlflow.pyfunc.spark_udf in Azure Databricks. This model is built by Databricks autoML with joined Feature Table inside it.And I'm trying to make prediction using the fol...

  • 2533 Views
  • 1 replies
  • 4 kudos
Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Chengcheng Guo​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 4 kudos
vanessafvg
by New Contributor III
  • 3241 Views
  • 1 replies
  • 3 kudos

Extracting data from excel in datalake storage using openpyxl

i am trying to extract some data into databricks but tripping all over openpyxl, newish user of databricks..from openpyxl import load_workbookdirectory_id="hidden"scope="hidden"client_id="hidden"service_credential_key="hidden"container_name="hidden"s...

  • 3241 Views
  • 1 replies
  • 3 kudos
Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Vanessa Van Gelder​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 3 kudos
guostong
by New Contributor III
  • 7487 Views
  • 1 replies
  • 1 kudos

How to update the items in array of struct column with sql

create table test.json_test_01 ( id int, description string, struct_address STRUCT<street_number: STRING, street_name: STRING, city: STRING, province: STRING>, arrary_phone ARRAY<STRUCT<phone_number: STRING, phone_type: STRING>> );   insert into ...

  • 7487 Views
  • 1 replies
  • 1 kudos
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Richard Guo​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 1 kudos
timothy_uk
by New Contributor III
  • 1965 Views
  • 1 replies
  • 1 kudos

Mysterious simultaneous long-running Databricks Workflows

Hi,This happened across 4x seemingly unrelated workflows at the same time of the day - all 4x workflows eventually completed successfully. It appeared that all workflows sat idling despite triggering via the Jobs API. The two symptoms I have observed...

  • 1965 Views
  • 1 replies
  • 1 kudos
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Timothy Lin​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 1 kudos
pskchai
by New Contributor
  • 3095 Views
  • 2 replies
  • 0 kudos

Resolved! Using DLT with a non-streaming large table

We have a source table that receives daily append operations, but the rows created within the last 30 days in this table can be updated or deleted. Thus, the source table is not exactly a streaming source.Our processing workflow involves performing "...

  • 3095 Views
  • 2 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Pongsakorn Chairatanakul​ Hope everything is going great.Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please...

  • 0 kudos
1 More Replies
Thaw
by New Contributor III
  • 2941 Views
  • 2 replies
  • 3 kudos

Resolved! How to change Instance Family in CloudFormation in a Databricks trial mood?

I implemented Databrick on AWS and the template is used i3.xlarge. Could I use it for down Instance Family for cost optimization? Is i3.xlarge the minimum size to use Databricks in a trial mood? Thanks

  • 2941 Views
  • 2 replies
  • 3 kudos
Latest Reply
Thaw
New Contributor III
  • 3 kudos

Thank you so much for your reply to my question, @Vidula Khanna​ @Kaniz Fatma​ . After I took some study time, I understood the basics, and then I am on the way to Databricks.

  • 3 kudos
1 More Replies
dukebaslangic
by New Contributor II
  • 3156 Views
  • 3 replies
  • 3 kudos

Resolved! Databricks performance related documentation/books

Hi,Do you know any good resources about Databricks performance improvements(like improving query performances, monitoring/resolving performance bottlenecks etc)?Thanks

  • 3156 Views
  • 3 replies
  • 3 kudos
Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Ömer Özsakarya​  We haven't heard from you since the last response from @Lakshay Goel​ â€‹, and I was checking back to see if her suggestions helped you.Or else, If you have any solution, please share it with the community, as it can be helpful to ...

  • 3 kudos
2 More Replies
Ram443
by New Contributor III
  • 48572 Views
  • 9 replies
  • 5 kudos

Resolved! I created a data frame but was not able to see the data

Code to create a data frame:from pyspark.sql import SparkSessionspark=SparkSession.builder.appName("oracle_queries").master("local[4]")\  .config("spark.sql.warehouse.dir", "C:\\softwares\\git\\pyspark\\hive").getOrCreate()from pyspark.sql.functions ...

  • 48572 Views
  • 9 replies
  • 5 kudos
Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 5 kudos

@ramanjaneyulu kancharla​  can you please select my answer as best answer

  • 5 kudos
8 More Replies
Paul_Seattle
by New Contributor
  • 7905 Views
  • 1 replies
  • 0 kudos

A Quick Question on Running a job from CLI

Could anyone tell me what could be wrong with my command to submit a spark job with params( If I don’t have --spark-submit-params, it’s fine). Please see the attached snapshot.

image
  • 7905 Views
  • 1 replies
  • 0 kudos
Latest Reply
User16539034020
Databricks Employee
  • 0 kudos

yes, there is no need for spark-submit-params. databricks jobs run-now --job-id ***reference: https://docs.databricks.com/dev-tools/cli/jobs-cli.html

  • 0 kudos
RonanStokes_DB
by Databricks Employee
  • 3730 Views
  • 1 replies
  • 1 kudos

Can you apply a specific cluster policy when launching a Databricks job via Azure Data Factory

When using Azure Data Factory to coordinate the launch of Databricks jobs - can you specify which cluster policy to apply to the job, either explicitly or implicitly?

  • 3730 Views
  • 1 replies
  • 1 kudos
Latest Reply
mvandeborne
New Contributor II
  • 1 kudos

you could, but not from ADF's UI. You need to edit the json of the linked service, adding a 'policyId' parameter in the 'typeProperties' object, pointing to the cluster policy ID from Databricks (which you could find in Databricks' URL).

  • 1 kudos
Labels