- 801 Views
- 2 replies
- 0 kudos
I find myself constantly having to do display(df) and then "recompute with <5g records and download". I was just hoping I could skip the middleman and download from the get-go. Ideally it'd be a function like download(df, num_rows="max") where num_rows i...
Latest Reply
Question: where do you want to download it to? If to a cloud location, use the regular DataFrameWriter. You can install, for example, Azure Storage Explorer on your computer. Some cloud storage can even be mounted on your system as a folder or network share.
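A minimal sketch of the DataFrameWriter approach (the container, storage account, and output path are placeholders):
# Write the DataFrame to cloud storage as a single CSV file.
# coalesce(1) yields one output file; avoid it for very large data.
(df.coalesce(1)
   .write
   .format("csv")
   .option("header", "true")
   .mode("overwrite")
   .save("abfss://mycontainer@mystorageaccount.dfs.core.windows.net/exports/df_csv"))
From there the file can be downloaded with a tool such as Azure Storage Explorer.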
- 1120 Views
- 2 replies
- 0 kudos
Hi, is there any way to share the run_id from task_A to task_B within the same job when task_A is a dbt task?
Latest Reply
Hi, you can pass {{job_id}} and {{run_id}} in the Job arguments, print that information, and save it wherever it is needed. Please find the documentation for this below: https://docs.databricks.com/data-engineering/jobs/jobs.html#task-parameter-varia...
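A minimal sketch, assuming task_B is a notebook task (the parameter names here are an assumption, not fixed names):
# In the job configuration, set task_B's parameters to:
#   {"job_id": "{{job_id}}", "run_id": "{{run_id}}"}
# Inside the task_B notebook:
dbutils.widgets.text("job_id", "")  # declare widgets so the parameters bind
dbutils.widgets.text("run_id", "")
job_id = dbutils.widgets.get("job_id")
run_id = dbutils.widgets.get("run_id")
print(f"job_id={job_id}, run_id={run_id}")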
- 2081 Views
- 5 replies
- 4 kudos
I have a table that looks like this:
/* input */
-- | parent | child |
-- | ------ | ----- |
-- | 1      | 2     |
-- | 2      | 3     |
-- | 3      | 4     |
-- | 5      | 6     |
-- | 6      | 7     |
-- | 8      | 9     |
-- | 10     | 11    |
and I...
Latest Reply
@Landan George Hey, I am looking into the same issue, but when I execute what's suggested in the post for CTE_Recursive (https://medium.com/globant/how-to-implement-recursive-queries-in-spark-3d26f7ed3bc9) I get the error: Error in SQL statement: AnalysisExcep...
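Since Spark SQL has no recursive CTE support, one workaround is an iterative self-join in PySpark until a fixed point is reached; a minimal sketch using the table from the question:
from pyspark.sql import functions as F
edges = spark.createDataFrame(
    [(1, 2), (2, 3), (3, 4), (5, 6), (6, 7), (8, 9), (10, 11)],
    ["parent", "child"],
)
# Repeatedly extend each path by one more parent->child hop
# until no new pairs appear (transitive closure).
paths = edges
while True:
    extended = (paths.alias("p")
                .join(edges.alias("e"), F.col("p.child") == F.col("e.parent"))
                .select(F.col("p.parent"), F.col("e.child")))
    new_paths = paths.union(extended).distinct()
    if new_paths.count() == paths.count():
        break
    paths = new_paths
paths.orderBy("parent", "child").show()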
- 1253 Views
- 4 replies
- 0 kudos
I have used MLflow and got my model served through a REST API. It works fine when all model features are provided. But my use case is that only a single feature (the primary key) will be provided by the consumer application, and my code has to look up th...
Latest Reply
You can create a custom endpoint for your REST API that handles the data massaging before calling the model.predict function. This endpoint can take in the primary key as an input, retrieve the additional features from the database based on that key, ...
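One way to package that lookup with the model is a custom mlflow.pyfunc wrapper whose predict does the enrichment; a minimal sketch, where lookup_features and the "inner_model" artifact are hypothetical:
import mlflow.pyfunc
class LookupModel(mlflow.pyfunc.PythonModel):
    # Callers pass only the primary key; predict fetches the rest.
    def load_context(self, context):
        self.model = mlflow.pyfunc.load_model(context.artifacts["inner_model"])
    def predict(self, context, model_input):
        # model_input carries only the key column; enrich it first.
        features = lookup_features(model_input["id"])  # hypothetical DB/feature-store lookup
        return self.model.predict(features)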
- 3512 Views
- 1 replies
- 0 kudos
Hi community, I'm trying to read XML data from Azure Data Lake Gen 2 using com.databricks:spark-xml_2.12:0.12.0:
spark.read.format('XML').load('abfss://[CONTAINER]@[storageaccount].dfs.core.windows.net/PATH/TO/FILE.xml')
The code above gives the followin...
Latest Reply
The issue was also raised here: https://github.com/databricks/spark-xml/issues/591. A fix is to use the "spark.hadoop" prefix in front of the fs.azure Spark config keys: spark.hadoop.fs.azure.account.oauth2.client.id.nubulosdpdlsdev01.dfs.core.windows.n...
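For reference, the full set of prefixed keys would look like this in the cluster's Spark config (all account names and credential values are placeholders):
spark.hadoop.fs.azure.account.auth.type.<account>.dfs.core.windows.net OAuth
spark.hadoop.fs.azure.account.oauth.provider.type.<account>.dfs.core.windows.net org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
spark.hadoop.fs.azure.account.oauth2.client.id.<account>.dfs.core.windows.net <client-id>
spark.hadoop.fs.azure.account.oauth2.client.secret.<account>.dfs.core.windows.net <client-secret>
spark.hadoop.fs.azure.account.oauth2.client.endpoint.<account>.dfs.core.windows.net https://login.microsoftonline.com/<tenant-id>/oauth2/token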
by sid_de • New Contributor II
- 2062 Views
- 3 replies
- 2 kudos
We are installing google-chrome-stable on a Databricks cluster using apt-get install. This has been working fine for a long time, but over the past few days it has started to fail intermittently. The following is the code that we run:
%sh
sudo curl -s...
Latest Reply
Hi, the issue is still persistent. We are trying to solve this by using a Docker image with a preinstalled Selenium driver and Chrome browser. Regards, Dharmin
- 2073 Views
- 4 replies
- 5 kudos
I am aware that I can load anything into a DataFrame using JDBC; that works well from Oracle sources. Is there an equivalent in Spark SQL, so I can combine datasets as well? Basically something like so (you get the idea):
select
lt.field1,
rt.fie...
Latest Reply
Hi @Roger Bieri (Customer), I appreciate your attempt to choose the best answer for us. I'm glad you got your query resolved. @Joseph Kambourakis and @Adrian Łobacz, thank you for giving excellent answers.
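For reference, Spark SQL can expose a JDBC source as a temporary view and join it like any other table; a minimal sketch, with placeholder connection details:
# The JDBC connection details below are placeholders.
spark.sql("""
CREATE TEMPORARY VIEW oracle_lt
USING org.apache.spark.sql.jdbc
OPTIONS (
  url 'jdbc:oracle:thin:@//dbhost:1521/service',
  dbtable 'SCHEMA.SOME_TABLE',
  user 'scott',
  password '...'
)
""")
# The JDBC-backed view can now be joined in plain SQL.
spark.sql("""
SELECT lt.field1, rt.field2
FROM oracle_lt lt
JOIN some_local_table rt ON lt.id = rt.id
""").show()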
by Fred_F • New Contributor III
- 4003 Views
- 7 replies
- 5 kudos
Hi there, I have a batch process configured in a workflow which fails due to a JDBC timeout on a Postgres DB. I checked the JDBC connection configuration, and it seems to work when I query a table: doing a df.show() in the process displays th...
Latest Reply
Hi @Fred Foucart, we haven't heard from you since the last response from @Rama Krishna N, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community, as it can be helpful to ...
- 803 Views
- 1 replies
- 1 kudos
Before running a script which would create an agent on a cluster, you have to provide the SPARK_LOCAL_IP variable. How can I find it? Does it change over time, or is it a constant?
Latest Reply
Hi, could you please refer to https://www.datadoghq.com/blog/databricks-monitoring-datadog/ and let us know if this helps. SPARK_LOCAL_IP is an environment variable; see https://spark.apache.org/docs/latest/configuration.html.
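A quick way to inspect it from a notebook, assuming the variable is exported in the node's environment (it is node-specific, so it can change whenever the cluster's nodes change):
import os
# Returns None if the variable is not set on this node.
print(os.environ.get("SPARK_LOCAL_IP"))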
- 546 Views
- 1 replies
- 2 kudos
@DataBricksHelp232 @Arjun Krishna S R @akash kumar
Latest Reply
Hi, what kind of internal problem are you talking about? Anything in particular?
by Kajorn • New Contributor III
- 3270 Views
- 2 replies
- 0 kudos
Hi, I have trouble with executing the SQL statement given below.
MERGE INTO warehouse.pdr_debit_card AS TARGET
USING (SELECT * FROM (
  SELECT CIF,
         CARD_TYPE,
         ISSUE_DATE,
         MATURITY_DATE,
         BOO,
         DATA_DATE,
         row_number(...
Latest Reply
Hi, please refer to https://docs.databricks.com/sql/language-manual/delta-merge-into.html
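A common cause of MERGE failures with this pattern is multiple source rows matching one target row; a minimal sketch that keeps only the newest row per key before merging (the staging table and ON clause are assumptions; column names follow the question):
spark.sql("""
MERGE INTO warehouse.pdr_debit_card AS target
USING (
  SELECT CIF, CARD_TYPE, ISSUE_DATE, MATURITY_DATE, BOO, DATA_DATE
  FROM (
    SELECT *, row_number() OVER (PARTITION BY CIF ORDER BY DATA_DATE DESC) AS rn
    FROM staging.pdr_debit_card
  )
  WHERE rn = 1
) AS source
ON target.CIF = source.CIF
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
""")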
by Ender • New Contributor
- 528 Views
- 0 replies
- 0 kudos
How can I migrate a Delta Live Tables workflow to another Databricks workspace? PS: The data source/sink will remain the same; I only want to migrate the DLT config.
- 988 Views
- 4 replies
- 0 kudos
Hi folks! I want to use Databricks Community Edition as the platform to teach online courses. As you may know, in Community Edition you need to create a new cluster when the old one terminates. I found out, however, that tables created from the old cluster...
Latest Reply
You can create a notebook for students which recreates everything, such as creating the tables, before every exercise.
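A minimal sketch of such a setup notebook, using one of the built-in sample datasets as a stand-in for the course data (database and table names are placeholders):
# Recreate the course database and tables at the start of every session.
spark.sql("CREATE DATABASE IF NOT EXISTS course")
df = (spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("/databricks-datasets/samples/population-vs-price/data_geo.csv"))
df.write.mode("overwrite").saveAsTable("course.sample_data")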
- 3521 Views
- 4 replies
- 0 kudos
I have a workflow created in Dev; now I want to move the whole thing to Prod and schedule it. The workflow has multiple notebooks, dependent libraries, parameters, and such. How do I move the whole thing to Prod, instead of moving each notebook and rec...
Latest Reply
Alternatively, you can just click the three-dots menu on the workflow and choose "View JSON" to save the JSON. Then use it in a REST API call to create the new workflow/job from that JSON (usually some parts need to be removed first).
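A minimal sketch of that call against the Jobs 2.1 API, assuming the exported JSON has been trimmed down to just the job settings (host, token, and file name are placeholders):
import json
import requests
host = "https://<target-workspace>.cloud.databricks.com"
token = "<personal-access-token>"
with open("job.json") as f:
    settings = json.load(f)  # exported JSON, minus run-specific fields like job_id
resp = requests.post(
    f"{host}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {token}"},
    json=settings,
)
resp.raise_for_status()
print(resp.json())  # contains the new job_id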
- 950 Views
- 1 replies
- 3 kudos
ARIMA and FBProphet have the capability to forecast monthly data. When using AutoML (via the API or the UI), it seems it is not possible to use a monthly freq (e.g. 'MS'). Is there a way/workaround to make it work with monthly data, or is it pla...
Latest Reply
It is possible to use AutoML to forecast monthly data, but it may require some additional steps or adjustments. One approach is to resample the monthly data to a finer granularity, such as weekly or daily, and then use AutoML to forecast at that fr...
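A minimal sketch of that resampling step in pandas (the ds/y schema and the values are placeholders):
import pandas as pd
# Tiny monthly series; "MS" = month start.
monthly = pd.DataFrame(
    {"ds": pd.date_range("2023-01-01", periods=6, freq="MS"),
     "y": [10, 12, 9, 14, 13, 15]}
).set_index("ds")
# Upsample to daily and forward-fill so AutoML sees a supported frequency;
# the daily forecasts can be aggregated back to monthly afterwards.
daily = monthly.resample("D").ffill().reset_index()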