Data Engineering

Forum Posts

by irispan • New Contributor II

08-22-2022 8:14:27 AM

5504 Views
4 replies
1 kudos

Recommended Hive metastore pattern for Trino integration

Hi, I have several questions regarding Trino integration: Is it recommended to use an external Hive metastore or to rely on the Databricks-maintained Hive metastore when enabling external query engines such as Trino? When I tried to use ex...

Latest Reply

JunlinZeng
Databricks Employee

08-25-2023 1:19:30 AM

1 kudos

> Is it recommended to use an external Hive metastore or to rely on the Databricks-maintained Hive metastore when enabling external query engines such as Trino?

The Databricks-maintained Hive metastore is not recommended for external use. ...

by CDICSteph • New Contributor

04-28-2023 9:29:59 AM

3419 Views
2 replies
0 kudos

Need pattern for loading a million small XML files

Hi, I'm looking for the right solution pattern for this scenario: we have millions of relatively small XML files (currently sitting in ADLS) that we have to load into Delta Lake. Each XML file has to be read, parsed, and pivoted before writing to a delta...

Latest Reply

Anonymous
Not applicable

04-29-2023 12:20:18 AM

0 kudos

Hi @Steph Swierenga, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answe...

by Jfoxyyc • Valued Contributor

03-06-2023 8:56:48 AM

2741 Views
2 replies
0 kudos

DLT - deduplication pattern?

Say we have an incremental append happening using Auto Loader, where the filename is added to the DataFrame and nothing else. If we want to de-duplicate this data in a rolling window, we can do something like merge into logs using dedupedLogs on ...
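To make the rolling-window idea concrete, here is a minimal sketch in plain Python rather than Spark or DLT. The join condition in the post is cut off, so everything specific here is an assumption for illustration: the key field `event_id`, the timestamp field `ts`, the helper name `merge_deduped`, and the one-day window are all hypothetical. It mimics an insert-only merge: an incoming row is skipped only if its key already appears among target rows inside the window, or earlier in the same batch.

```python
from datetime import datetime, timedelta

def merge_deduped(target, incoming, window, now):
    """Insert-only merge (hypothetical helper): append incoming rows whose
    key is not already present among target rows newer than now - window."""
    cutoff = now - window
    # Keys already seen in the target, restricted to the rolling window.
    seen = {row["event_id"] for row in target if row["ts"] >= cutoff}
    merged = list(target)
    for row in incoming:
        if row["event_id"] in seen:
            continue  # duplicate within the window, or within this batch
        seen.add(row["event_id"])
        merged.append(row)
    return merged

# Hypothetical data: "a" is recent, "b" is older than the window.
now = datetime(2023, 3, 6, 12, 0)
target = [
    {"event_id": "a", "ts": now - timedelta(hours=1)},
    {"event_id": "b", "ts": now - timedelta(days=3)},
]
incoming = [
    {"event_id": "a", "ts": now},
    {"event_id": "b", "ts": now},
    {"event_id": "c", "ts": now},
    {"event_id": "c", "ts": now},
]

result = merge_deduped(target, incoming, timedelta(days=1), now)
print([row["event_id"] for row in result])  # ['a', 'b', 'b', 'c']
```

Note the trade-off the window introduces: "b" is inserted again because its earlier copy falls outside the window, so duplicates are only prevented within the chosen lookback period.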

Latest Reply

Anonymous
Not applicable

03-31-2023 5:10:31 PM

0 kudos

Hi @Jordan Fox, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

by Srikanth_Gupta_ • Databricks Employee

06-25-2021 8:06:07 AM

1875 Views
1 reply
0 kudos

Can we subscribe to a pattern of topics (Kafka) from Structured Streaming?

Latest Reply

Srikanth_Gupta_
Databricks Employee

06-25-2021 8:06:58 AM

0 kudos

Yes, we can, using the code snippet below:

spark
  .readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "host1:port1,host2:port2")
  .option("subscribePattern", "topic.*")
  .load()
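A note on the `subscribePattern` value in the snippet above: it is a regular expression, not a glob, so the `.` in `topic.*` stands for any character. The sketch below uses Python's `re` module as a stand-in to show which topic names such a pattern would select. It assumes the pattern is matched against the whole topic name, and Python's regex syntax differs from Java's in some corner cases, so treat it as an approximation. The topic names and the helper `matching_topics` are hypothetical.

```python
import re

def matching_topics(pattern, topics):
    """Return the topics whose full name matches the regex pattern."""
    compiled = re.compile(pattern)
    return [t for t in topics if compiled.fullmatch(t)]

# Hypothetical topic names, for illustration only.
topics = ["topic", "topic.orders", "topic-clicks", "topics", "other.topic"]

# "." matches any character, so this also selects "topics" and "topic-clicks".
print(matching_topics("topic.*", topics))
# ['topic', 'topic.orders', 'topic-clicks', 'topics']

# Escaping the dot restricts the match to names that start with "topic.".
print(matching_topics(r"topic\..*", topics))
# ['topic.orders']
```

If only dot-separated names such as `topic.orders` are intended, escaping the dot in the pattern is probably the safer choice.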

Databricks Community