Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

quakenbush
by Contributor
  • 3573 Views
  • 4 replies
  • 5 kudos

Resolved! Does Databricks offer something like Oracle's dblink?

I am aware that I can load anything into a DataFrame using JDBC, and that works well for Oracle sources. Is there an equivalent in Spark SQL, so I can combine datasets as well? Basically something like so - you get the idea... select lt.field1, rt.fie...

Latest Reply
Kaniz_Fatma
Community Manager
  • 5 kudos

Hi @Roger Bieri (Customer), I appreciate your attempt to choose the best answer for us. I'm glad you got your query resolved. @Joseph Kambourakis and @Adrian Łobacz, thank you for giving excellent answers.

  • 5 kudos
3 More Replies
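
On the dblink question above: one option that stays inside Spark SQL is a temporary view backed by Spark's JDBC data source, which can then be joined like any other table. The sketch below is a minimal illustration and not necessarily what the accepted replies proposed. The helper name `jdbc_view_sql`, the view name, the connection URL, and the column names in the usage comment are all hypothetical.

```python
def jdbc_view_sql(view_name, url, dbtable, user, password):
    """Build a CREATE TEMPORARY VIEW statement that uses Spark's JDBC data source."""

    def quote(value):
        # Escape backslashes and single quotes for a Spark SQL string literal.
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

    return (
        f"CREATE TEMPORARY VIEW {view_name} "
        f"USING org.apache.spark.sql.jdbc "
        f"OPTIONS (url {quote(url)}, dbtable {quote(dbtable)}, "
        f"user {quote(user)}, password {quote(password)})"
    )


# Usage in a notebook, where `spark` is predefined (all names are hypothetical):
# spark.sql(jdbc_view_sql("oracle_orders",
#                         "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1",
#                         "SALES.ORDERS", user, password))
# spark.sql("SELECT lt.id, rt.name FROM local_table lt "
#           "JOIN oracle_orders rt ON lt.id = rt.id")
```

The password is better read from a secret scope than typed into the notebook, since the statement text ends up in the query history.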
Fred_F
by New Contributor III
  • 6507 Views
  • 7 replies
  • 5 kudos

JDBC connection timeout on workflow cluster

Hi there, I have a batch process configured in a workflow which fails due to a JDBC timeout on a Postgres DB. I checked the JDBC connection configuration and it seems to work when I query a table and do a df.show() in the process, and it displays th...

Latest Reply
Kaniz_Fatma
Community Manager
  • 5 kudos

Hi @Fred Foucart, We haven’t heard from you since the last response from @Rama Krishna N, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community, as it can be helpful to ...

  • 5 kudos
6 More Replies
Direo
by Contributor
  • 1533 Views
  • 1 replies
  • 1 kudos

Azure Databricks integration with Datadog

Before running a script that creates an agent on a cluster, you have to provide the SPARK_LOCAL_IP variable. How can I find it? Does it change over time, or is it a constant?

Latest Reply
Debayan
Esteemed Contributor III
  • 1 kudos

Hi, could you please refer to https://www.datadoghq.com/blog/databricks-monitoring-datadog/ and let us know if this helps? FYI, SPARK_LOCAL_IP is an environment variable; see https://spark.apache.org/docs/latest/configuration.html

  • 1 kudos
SIRIGIRI
by Contributor
  • 950 Views
  • 1 replies
  • 2 kudos
Latest Reply
Debayan
Esteemed Contributor III
  • 2 kudos

Hi, what kind of internal problem are you talking about? Anything in particular?

  • 2 kudos
Kajorn
by New Contributor III
  • 4982 Views
  • 2 replies
  • 0 kudos

Resolved! WHEN NOT MATCHED BY SOURCE Syntax error at or near 'BY' (DBR 11.2 ML)

Hi, I have trouble executing the SQL statement below. MERGE INTO warehouse.pdr_debit_card as TARGET USING (SELECT * FROM ( SELECT CIF, CARD_TYPE, ISSUE_DATE, MATURITY_DATE, BOO, DATA_DATE, row_number(...

Latest Reply
Debayan
Esteemed Contributor III
  • 0 kudos

Hi, please refer to: https://docs.databricks.com/sql/language-manual/delta-merge-into.html

  • 0 kudos
1 More Replies
Ender
by New Contributor II
  • 826 Views
  • 0 replies
  • 0 kudos

Delta Live Tables migration

How can I migrate a Delta Live Tables workflow to another Databricks workspace? PS: The data source/sink will remain the same. I only want to migrate the DLT config.

Lizhi_Dong
by New Contributor II
  • 1708 Views
  • 4 replies
  • 0 kudos

What would be the best plan for an independent course creator?

Hi folks! I want to use Databricks Community Edition as the platform to teach online courses. As you may know, in Community Edition you need to create a new cluster when the old one terminates. I found out, however, that tables created from the old cluster...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 0 kudos

You can create a notebook for students which recreates everything, like doing the installation of tables etc., before every exercise.

  • 0 kudos
3 More Replies
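
The reply's suggestion of a setup notebook could look roughly like the following. It is only a sketch: the database name, table name, file path, and the helper `rebuild_statements` are hypothetical, and it assumes the course data is kept as CSV files that outlive the cluster.

```python
def rebuild_statements(database, csv_tables):
    """Return the SQL that recreates `database` and its CSV-backed tables.

    `csv_tables` maps a table name to the path of its CSV file.
    """
    statements = [f"CREATE DATABASE IF NOT EXISTS {database}"]
    for table, path in csv_tables.items():
        # Drop first so that a stale definition from an old cluster is replaced.
        statements.append(f"DROP TABLE IF EXISTS {database}.{table}")
        statements.append(
            f"CREATE TABLE {database}.{table} USING CSV "
            f"OPTIONS (path '{path}', header 'true', inferSchema 'true')"
        )
    return statements


# In the first cell of each exercise notebook, where `spark` is predefined:
# for stmt in rebuild_statements("course", {"trips": "/FileStore/course/trips.csv"}):
#     spark.sql(stmt)
```

Keeping the statement list separate from the `spark.sql` call makes the setup easy to review before students run it.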
RRO
by Contributor
  • 1434 Views
  • 1 replies
  • 3 kudos

AutoML forecasting with monthly data?

ARIMA and FBProphet have the capability to forecast monthly data. When using AutoML (via the API or the UI) it seems like it is not possible to have a monthly freq (e.g. 'MS'). Is there a way / workaround to make it work with monthly data or is it pla...

Latest Reply
MateuszLomanski
New Contributor II
  • 3 kudos

It is possible to use AutoML to forecast monthly data, but it may require some additional steps or adjustments. One approach is to resample the monthly data to a lower frequency such as weekly or daily, and then use AutoML to forecast at that lower fr...

  • 3 kudos
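
The resampling step mentioned in the reply can be sketched without any libraries. This is a minimal, dependency-free illustration, assuming each monthly observation is stamped with the first day of its month and that forward-filling is an acceptable way to spread a monthly value over its days. The function name `monthly_to_daily` is hypothetical, and in practice a pandas resample would do the same job.

```python
from datetime import date, timedelta


def monthly_to_daily(monthly):
    """Forward-fill (month_start, value) pairs into one row per calendar day.

    The last observation contributes a single row, since there is no later
    month to fill up to.
    """
    monthly = sorted(monthly)
    daily = []
    for (start, value), nxt in zip(monthly, monthly[1:] + [None]):
        # Fill up to (but excluding) the next month's start.
        end = nxt[0] if nxt else start + timedelta(days=1)
        day = start
        while day < end:
            daily.append((day, value))
            day += timedelta(days=1)
    return daily


# Example input: three monthly observations (values are made up).
rows = monthly_to_daily([
    (date(2023, 1, 1), 10.0),
    (date(2023, 2, 1), 20.0),
    (date(2023, 3, 1), 30.0),
])
```

If the monthly value is a total rather than a level, dividing it by the number of days in the month is safer than repeating it, because repeated totals inflate any daily sum.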
Ajay-Pandey
by Esteemed Contributor III
  • 3060 Views
  • 9 replies
  • 11 kudos

Databricks now supports running selected text in a cell

Databricks now supports running selected text in a cell. This will help us a lot when debugging code. On Windows, just select the line of code you want to execute and press Ctrl+Shift+Enter.

Latest Reply
Nhan_Nguyen
Valued Contributor
  • 11 kudos

Thanks @Ajay Pandey, nice sharing

  • 11 kudos
8 More Replies
SIRIGIRI
by Contributor
  • 485 Views
  • 0 replies
  • 1 kudos

medium.com

Why does data move from memory to disk during a shuffle operation? Please find the detailed answer here. If you have any questions, please comment, and hit like and share if you are interested in upcoming articles. https://medium.com/@sharikrishna26/during-shuffle-operation-...

Ancil
by Contributor II
  • 1428 Views
  • 1 replies
  • 1 kudos

PythonException: 'RuntimeError: The length of output in Scalar iterator pandas UDF should be the same with the input's; however, the length of output was 1 and the length of input was 2.'.

I have a pandas_udf. It works for 4 rows, but when I tried with more than 4 rows I got the error below. PythonException: 'RuntimeError: The length of output in Scalar iterator pandas UDF should be the same with the input's; however, the length of output was...

Latest Reply
Ancil
Contributor II
  • 1 kudos

@Kaniz Fatma Can you please help me with pandas_udf? In the above scenario I used regular expressions, for which we have a Spark method, but I have other pandas_udfs with the same issue.

  • 1 kudos
tatekeller
by New Contributor
  • 1532 Views
  • 1 replies
  • 0 kudos

Can you access a repo file in an init script?

I'd like to configure a cluster with Python libraries as defined in a requirements file. I have a pip requirements.txt file in a private repo which I have integrated on Databricks (and I can access it through the UI and view it on Databricks). I upda...

Latest Reply
sher
Valued Contributor II
  • 0 kudos

You can install it on the cluster.

  • 0 kudos
KVNARK
by Honored Contributor II
  • 752 Views
  • 1 replies
  • 5 kudos

accessing secret from spark cluster.

I am passing Spark configuration to access Blob and ADLS from Data Factory while creating a job cluster. It works fine, but when we access a secret in the property it does not work: spark.hadoop.fs.azure.account.auth.type.{{secrets/scope/key}}.dfs.core.wi...

Latest Reply
sher
Valued Contributor II
  • 5 kudos

Check here: https://docs.databricks.com/security/secrets/secrets.html

  • 5 kudos
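
One thing worth checking in the configuration above: the Databricks secrets documentation shows the `{{secrets/<scope>/<key>}}` reference on the value side of a Spark property, whereas the snippet embeds it in the property name. The sketch below is a hypothetical pre-flight check that flags property names containing such a reference. The helper `secret_refs_in_names` and the sample properties are assumptions for illustration, not part of any Databricks API.

```python
import re

# Matches a secret reference of the form {{secrets/<scope>/<key>}}.
SECRET_REF = re.compile(r"\{\{secrets/[^/{}]+/[^/{}]+\}\}")


def secret_refs_in_names(spark_conf):
    """Return the property names that embed a secret reference."""
    return [name for name in spark_conf if SECRET_REF.search(name)]


# Hypothetical sample: the first property hides a reference in its name,
# the second uses a reference as its value, which is the documented form.
sample_conf = {
    "fs.azure.account.key.{{secrets/demo-scope/account-name}}.dfs.core.windows.net": "placeholder",
    "fs.azure.account.oauth2.client.secret": "{{secrets/demo-scope/client-secret}}",
}
flagged = secret_refs_in_names(sample_conf)
```

If a name is flagged, a likely fix is to write the storage account name literally in the property name and keep the secret reference only in the value.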
sonali1996
by New Contributor
  • 1072 Views
  • 2 replies
  • 0 kudos

Adding a widget as a column and populating its value every time in that column in a table.

Hi, I want to pass the runtime date from ADF as @utcnow() (a base parameter of the notebook activity in ADF), read it in ADB using a widget named runtime_date, and then add it as a column to my table X, populated with the value from the widget. Eve...

Latest Reply
sher
Valued Contributor II
  • 0 kudos

You can use current_timestamp() or now(). Reference link: https://docs.databricks.com/sql/language-manual/functions/current_timestamp.html

  • 0 kudos
1 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group