Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Liza
by New Contributor
  • 884 Views
  • 0 replies
  • 0 kudos

Work that involves Shift work, difficulties with sleep, and varying circumstances It is possible that this Modalert guide may not cover all of the possible applications for Modalert 200.Modafinil is included in the formulation known as Modalert 200 T...

Stita
by New Contributor II
  • 3815 Views
  • 1 reply
  • 2 kudos

Resolved! How do we pass the row tags dynamically while reading a XML file into a dataframe?

I have a set of xml files where the row tags change dynamically. How can we achieve this scenario in Databricks? df1=spark.read.format('xml').option('rootTag','XRoot').option('rowTag','PL1PLLL').load("dbfs:/FileStore/tables/ins/") We need to pass a val...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

If it is dynamic for the whole file, you can just use a variable:
tag = 'PL1PLLL'
df1 = spark.read.format('xml').option('rootTag', 'XRoot').option('rowTag', tag).load("dbfs:/FileStore/tables/ins/file.xml")
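If different files carry different row tags, a minimal sketch that extends the same idea, assuming the spark-xml ("xml") data source from the question is installed on the cluster; the file-to-tag mapping and file names below are hypothetical:

```python
# A minimal sketch: read each XML file with its own row tag and combine the results.
from functools import reduce

row_tags = {
    "dbfs:/FileStore/tables/ins/file1.xml": "PL1PLLL",   # hypothetical mapping
    "dbfs:/FileStore/tables/ins/file2.xml": "PL2PLLL",   # hypothetical mapping
}

dfs = []
for path, tag in row_tags.items():
    # Pass the row tag as a variable instead of a hard-coded literal
    df = (spark.read.format("xml")
          .option("rootTag", "XRoot")
          .option("rowTag", tag)
          .load(path))
    dfs.append(df)

# unionByName with allowMissingColumns (Spark 3.1+) tolerates schema
# differences between the per-tag DataFrames.
result = reduce(lambda a, b: a.unionByName(b, allowMissingColumns=True), dfs)
```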

Taha_Hussain
by Databricks Employee
  • 2991 Views
  • 2 replies
  • 8 kudos

Register for Databricks Office Hours. October 12: 8:00 - 9:00 AM PT | 3:00 - 4:00 PM GMT. October 26: 11:00 AM - 12:00 PM PT | 6:00 - 7:00 PM GMT. Databricks Office Hours connects you directly with experts to answer all your Databricks questions. Join us to...

Latest Reply
Taha_Hussain
Databricks Employee
  • 8 kudos

Here are some of the Questions and Answers from the 10/12 Office Hours (note: certain questions and answers have been condensed for reposting purposes): Q: What is the best approach for moving data from on-prem S3 storage into cloud blob storage into ...

1 More Replies
Carlton
by Contributor
  • 6714 Views
  • 8 replies
  • 1 kudos

Resolved! How to Use the CharIndex with Databricks SQL

When applying the following T-SQL I don't get any errors on MS SQL Server: SELECT DISTINCT * FROM dbo.account LEFT OUTER JOIN dbo.crm2cburl_lookup ON account.Id = CRM2CBURL_Lookup.[Key] LEFT OUTER JOIN dbo.organizations ON CRM2CBURL_Lookup.CB_UR...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

CROSS APPLY is not supported in Databricks SQL.
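For the CHARINDEX part of the original question, a minimal sketch of the usual substitutions in Databricks SQL, run here through spark.sql; the table and column names are hypothetical:

```python
# T-SQL CHARINDEX(substr, str) maps to the built-in Spark SQL functions
# locate(substr, str) or instr(str, substr); both are 1-based and return 0
# when the substring is not found.
df = spark.sql("""
    SELECT
        Id,
        locate('@', Email) AS at_position,     -- CHARINDEX('@', Email) equivalent
        instr(Email, '@')  AS at_position_alt  -- same result, arguments reversed
    FROM dbo_account                           -- hypothetical table
""")
df.show()
```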

7 More Replies
Sulfikkar
by Contributor
  • 18444 Views
  • 4 replies
  • 3 kudos

Resolved! install a custom python package from azure devops artifact to databricks cluster

I am trying to install a package that was uploaded to an Azure DevOps artifact feed onto the Databricks cluster by using pip.conf. Basically, below are the steps I followed. (Step 1: install in local IDE) Uploaded the package to the Azure DevOps feed using ...
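The post is truncated, but for the install step itself, here is a minimal sketch of pulling a package from a private Azure DevOps feed inside a notebook using pip's --extra-index-url; the organization, project, feed, secret scope, and package names are all hypothetical (and note the thread's actual blocker turned out to be network access, resolved below by IP whitelisting):

```python
# A minimal sketch: install a package from a private Azure DevOps feed,
# authenticating with a personal access token (PAT) stored in a secret scope.
import subprocess, sys

pat = dbutils.secrets.get(scope="devops", key="feed-pat")            # hypothetical secret
feed = (f"https://build:{pat}@pkgs.dev.azure.com/"
        "my-org/my-project/_packaging/my-feed/pypi/simple/")          # hypothetical feed URL

subprocess.check_call([
    sys.executable, "-m", "pip", "install",
    "--extra-index-url", feed,
    "my-private-package",                                             # hypothetical package
])
```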

Latest Reply
Sulfikkar
Contributor
  • 3 kudos

Thanks for your time @Debayan Mukherjee​ and @Kaniz Fatma​. We figured out the issue together with the infra team: we had to whitelist the public IPs of the Databricks clusters in Azure. I have checked the IP address from the Spark cluster U...

3 More Replies
joselita
by New Contributor III
  • 30525 Views
  • 4 replies
  • 8 kudos

AnalysisException: is not a Delta table.

Hello, I changed the DBR from 7.2 to 10.4 and I receive the following error: AnalysisException: is not a Delta table. The table is created using DELTA, so it is definitely a Delta table, even though I read that from vers. 8 all tables are De...

STG_DATA_LOAD
Latest Reply
Ryan_Chynoweth
Esteemed Contributor
  • 8 kudos

Hi @JOSELITA MOLTISANTI​ can you run the following commands and share the output?
table_name = "stg_data_load"
path = spark.sql(f"describe detail {table_name}").select("location").collect()[0][0].replace('dbfs:', '')
dbutils.fs.ls(path)
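As a follow-up check, a minimal sketch (an extension of the reply above, not part of it) that verifies whether the resolved location actually contains a Delta table; it assumes the delta-spark Python API available on DBR:

```python
# Resolve the table's storage location and confirm it holds a _delta_log directory.
from delta.tables import DeltaTable

table_name = "stg_data_load"
location = (spark.sql(f"DESCRIBE DETAIL {table_name}")
            .select("location")
            .collect()[0][0])

print("Is Delta table:", DeltaTable.isDeltaTable(spark, location))
print([f.name for f in dbutils.fs.ls(location)])  # expect a '_delta_log/' entry
```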

3 More Replies
kfoster
by Contributor
  • 2366 Views
  • 1 reply
  • 0 kudos

Resolved! DLT Pipelines call same table

Orchestration of when DLT runs is handled by Azure Data Factory. There are scenarios where a table within a DLT pipeline needs to run on a different schedule. Is there a pipeline configuration option that can be set to allow the same table to be run by two diff...

Latest Reply
Vivian_Wilfred
Databricks Employee
  • 0 kudos

Hi @Kristian Foster​, that is not possible. Every pipeline owns its tables, and multiple pipelines cannot write to the same table.

StephanieAlba
by Databricks Employee
  • 12189 Views
  • 6 replies
  • 9 kudos

Resolved! How do I kick off Azure Data Factory from within Databricks?

I want to kick off ingestion in ADF from Databricks. When ADF ingestion is done, my DBX bronze-silver-gold pipeline follows within DBX. I see it is possible to call Databricks notebooks from ADF. Can I also go the other way? I want to start the ingest...

Latest Reply
KKo
Contributor III
  • 9 kudos

Are you looking to pass the output of a Databricks notebook to ADF?
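For the original question (triggering ADF from Databricks rather than the other way around), a minimal sketch using the Azure Data Factory management SDK; it assumes the azure-identity and azure-mgmt-datafactory packages are installed and a service principal with rights on the factory, and every name below is hypothetical:

```python
# A minimal sketch: start an ADF pipeline run from a Databricks notebook.
from azure.identity import ClientSecretCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

credential = ClientSecretCredential(
    tenant_id="<tenant-id>",                                          # hypothetical
    client_id="<client-id>",                                          # hypothetical
    client_secret=dbutils.secrets.get(scope="adf", key="sp-secret"),  # hypothetical secret
)

adf_client = DataFactoryManagementClient(credential, "<subscription-id>")

run = adf_client.pipelines.create_run(
    resource_group_name="rg-data",         # hypothetical
    factory_name="adf-ingestion",          # hypothetical
    pipeline_name="ingest_raw_files",      # hypothetical
    parameters={"load_date": "2022-10-12"},
)
print("Started ADF run:", run.run_id)
```

The run_id can then be polled before the bronze-silver-gold steps continue, if the notebook needs to wait for ingestion to finish.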

5 More Replies
hare
by New Contributor III
  • 4978 Views
  • 1 reply
  • 5 kudos

"Databricks" - "PySpark" - Read "JSON" file - Azure Blob container - "APPEND BLOB"

Hi All, We are getting JSON files in an Azure blob container, and their "Blob Type" is "Append Blob". We are getting the error "AnalysisException: Unable to infer schema for JSON. It must be specified manually." when we try to read them using the below-mentioned scr...

Latest Reply
User16856839485
Databricks Employee
  • 5 kudos

There currently does not appear to be direct support for append blob reads; however, converting the append blob to a block blob [and then to Parquet or Delta, etc.] is a viable option: https://kb.databricks.com/en_US/data-sources/wasb-check-blob-types?_ga...
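A minimal sketch of that conversion using the azure-storage-blob Python SDK; the container, blob names, and secret scope are hypothetical, and for large files a chunked copy would be preferable to reading the whole blob into memory:

```python
# Convert an append blob to a block blob so Spark can read it.
from azure.storage.blob import BlobServiceClient

conn_str = dbutils.secrets.get(scope="storage", key="conn-string")   # hypothetical secret
service = BlobServiceClient.from_connection_string(conn_str)
container = service.get_container_client("landing")                  # hypothetical container

src = container.get_blob_client("events/2022-10-12.json")            # source append blob
dst = container.get_blob_client("events_block/2022-10-12.json")      # block blob copy

data = src.download_blob().readall()
dst.upload_blob(data, overwrite=True)   # upload_blob creates a block blob by default

# The block-blob copy can then be read normally, e.g.:
# df = spark.read.json("wasbs://landing@<account>.blob.core.windows.net/events_block/")
```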

leos1
by New Contributor II
  • 2159 Views
  • 2 replies
  • 0 kudos

Resolved! Question regarding ZORDER option of OPTIMIZE

Is the order of the columns in ZORDER important? For example, does ZORDER BY (product, site) and ZORDER BY (site, product) produce the same results?

Latest Reply
leos1
New Contributor II
  • 0 kudos

thanks for the quick reply

1 More Replies
Trey
by New Contributor III
  • 3748 Views
  • 2 replies
  • 6 kudos

Resolved! Is it a good idea to use a managed delta table as a temporary table?

Hi all! I would like to use a managed delta table as a temporary table, meaning: to create a managed table in the middle of the ETL process, and to drop the managed table right after the process. This way I can perform merge, insert, or delete operations better than...

Latest Reply
karthik_p
Esteemed Contributor
  • 6 kudos

@Kwangwon Yi​ Rather than performance, the main issue with a managed table is that whenever you delete the table, the data under that table gets deleted as well. If you have a good use case for reporting, the best approach is to go with an external storage location to store your managed t...
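A minimal sketch of the contrast the reply describes, using hypothetical table names and a hypothetical external location:

```python
# Managed table: data lives under the metastore-managed location and is deleted
# when the table is dropped.
spark.sql("CREATE TABLE IF NOT EXISTS etl_scratch_managed (id INT, amount DOUBLE) USING DELTA")

# External table: data lives at the explicit LOCATION (hypothetical path) and
# survives a DROP TABLE, which removes only the metadata.
spark.sql("""
    CREATE TABLE IF NOT EXISTS etl_scratch_external (id INT, amount DOUBLE)
    USING DELTA
    LOCATION 'abfss://scratch@myaccount.dfs.core.windows.net/etl_scratch'
""")

# ... run MERGE / INSERT / DELETE against the scratch table during the ETL job ...

spark.sql("DROP TABLE etl_scratch_managed")    # underlying data files are removed
spark.sql("DROP TABLE etl_scratch_external")   # files remain at the LOCATION above
```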

1 More Replies
Matt101122
by Contributor II
  • 2939 Views
  • 1 reply
  • 1 kudos

Resolved! why aren't rdds using all available cores of executor?

I'm extracting data from a custom format by day of month using a 32-core executor. I'm using RDDs to distribute work across the cores of the executor. I'm seeing an intermittent issue where for a run sometimes I see 31 cores being used as expected and ot...

Latest Reply
Matt101122
Contributor II
  • 1 kudos

I may have figured this out! I'm explicitly setting the number of slices instead of using the default:
days_rdd = sc.parallelize(days_to_process, len(days_to_process))
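A minimal sketch of the difference (the day list below is hypothetical):

```python
# Without an explicit slice count, sc.parallelize falls back to
# spark.default.parallelism, which may not match the number of work items;
# passing numSlices pins one partition (and therefore one task) per day.
days_to_process = list(range(1, 32))   # hypothetical: one entry per day of month

default_rdd = sc.parallelize(days_to_process)
print(default_rdd.getNumPartitions())  # driven by spark.default.parallelism

pinned_rdd = sc.parallelize(days_to_process, numSlices=len(days_to_process))
print(pinned_rdd.getNumPartitions())   # 31, one partition per day
```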

enavuio
by New Contributor II
  • 3027 Views
  • 2 replies
  • 3 kudos

Count on External Table to Azure Data Storage is taking too long

I have created an External table to Azure Data Lake Storage Gen2. The Container has about 200K Json files. The structure of the json files is created with ```CREATE EXTERNAL TABLE IF NOT EXISTS dbo.table(    ComponentInfo STRUCT<ComponentHost: STRING, ...
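The thread does not show a confirmed resolution; one common mitigation, offered here only as a hedged suggestion rather than the thread's answer, is to load the many small JSON files into a Delta table once and run counts against that instead. The source path and target table below are hypothetical:

```python
# A minimal sketch: materialize the ~200K JSON files into a Delta table so that
# subsequent counts no longer have to list and parse every JSON file.
source_path = "abfss://data@myaccount.dfs.core.windows.net/components/json/"   # hypothetical

(spark.read
     .json(source_path)                 # supplying an explicit .schema(...) avoids inference
     .write
     .format("delta")
     .mode("overwrite")
     .saveAsTable("bronze.component_info"))   # hypothetical target table

spark.sql("SELECT COUNT(*) FROM bronze.component_info").show()
```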

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Ena Vu​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!

1 More Replies
parthsalvi
by Contributor
  • 2867 Views
  • 3 replies
  • 1 kudos

Unable to update permissions in Unity Catalog object in Single User Mode DBR 11.2

We're trying to update permissions on catalogs in Single User cluster mode but are running into the following error. We were able to update permissions in Shared mode. We used Shared mode to create the objects, but using Single User mode to update permissions seems...
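For reference, a minimal sketch of the permission updates themselves, run from a context where Unity Catalog grants are supported (e.g. a shared cluster or a SQL warehouse) and assuming the caller owns the catalog or is a metastore admin; the catalog and group names are hypothetical:

```python
# A minimal sketch of granting and inspecting catalog privileges.
# Catalog "analytics" and group "data-engineers" are hypothetical.
spark.sql("GRANT USE CATALOG ON CATALOG analytics TO `data-engineers`")
spark.sql("GRANT USE SCHEMA ON CATALOG analytics TO `data-engineers`")
spark.sql("GRANT SELECT ON CATALOG analytics TO `data-engineers`")

spark.sql("SHOW GRANTS ON CATALOG analytics").show(truncate=False)
```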

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Parth Salvi​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks...

2 More Replies
AJMorgan591
by New Contributor II
  • 4743 Views
  • 4 replies
  • 0 kudos

Temporarily disable Photon

Is it possible to temporarily disable Photon? I have a large workload that greatly benefits from Photon, apart from a specific operation therein that is actually slowed by Photon. It's not worth creating a separate cluster for this operation, however, s...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Aaron Morgan​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thank...

3 More Replies
