Hi, could someone help me understand how I can get all the parameters in the task (from the widget)? I.e., I want to get the input as parameter 'Start_Date', but this will not always be passed. It could be 'Run_Date' as well ...
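One possible approach, sketched here under assumptions: try each candidate parameter name in order and take the first one that resolves. The helper name `first_available_param` is hypothetical. The sketch assumes that the widget getter (for example `dbutils.widgets.get` in a Databricks notebook) raises an exception when the named widget was not passed, so the getter is taken as an argument and the function can be tried outside a notebook too.

```python
from typing import Callable, Iterable, Optional, Tuple


def first_available_param(
    get_widget: Callable[[str], str],
    names: Iterable[str],
) -> Optional[Tuple[str, str]]:
    """Return (name, value) for the first name the getter resolves, else None.

    Assumption: get_widget raises an exception for a name that was not passed.
    """
    for name in names:
        try:
            return name, get_widget(name)
        except Exception:
            # This name was not supplied; fall through to the next candidate.
            continue
    return None


# In a notebook this might be called as (assumes dbutils is available):
# param = first_available_param(dbutils.widgets.get, ["Start_Date", "Run_Date"])
```

The order of `names` decides which parameter wins if more than one is passed.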
Hi @Bency Mathew​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.Thank...
I'm trying to create a new column that fills in the nulls below. I tried using leads and lags, but it isn't turning out right. Basically, I'm trying to figure out who is in "possession" of the record, given the TransferFrom and TransferTo columns and sequence...
Hi there @Eric Lohbeck, does @Hubert Dudek's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!
Hi, I want to set up custom tags on an all-purpose cluster for cost breakdown and chargebacks. Specifically, I want to capture JobName, JobID, and the UserId of whoever ran the job. I can set other custom tags such as Business Unit, Owner... However,...
Hey there @DIEUDONNE MAYI, does @Kaniz Fatma's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!
Hi all, do I need admin privileges to clone a Git repository on Databricks? Cloning was not possible with an account that did not have administrator privileges. Regards.
Navigate to Settings > Admin Console > Workspace settings > Repos and check the value for "Repos Git URL Allow List permissions". When set to 'Disabled (no restrictions)', users can clone or commit and push to any Git repository. When set to 'Restrict ...
Here's the code:
    val result = spark
      .createDataset(List("test"))
      .rdd
      .repartition(100000)
      .map { _ =>
        "test"
      }
      .collect()
      .toList
    println(result)
I write tests to test for correctness, so I wonde...
Hey there @tanin​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.Thank...
We need to connect to an on-prem SQL database to extract data, and we are using the Apache Spark SQL connector. The problem is that we can't connect; the connection fails with SQLServerException: The TCP/IP connection to the host ***.***.X.XX, port 1433 has fail...
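Before debugging the connector itself, it can help to check whether the cluster can open a TCP connection to the database host at all. The following is a minimal sketch using only the Python standard library. The function name `can_reach` is hypothetical, and the host shown in the usage comment is a placeholder, not a value from the post.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call from a notebook cell (placeholder host, default SQL Server port):
# print(can_reach("my-sql-host.example.internal", 1433))
```

If this returns False from the driver, the failure is likely at the network level (routing, firewall, or DNS) rather than in the connector configuration, though a True result does not by itself prove the JDBC connection will succeed.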
Hi @Rob S (Customer), we haven't heard from you on the last response from @Mohit Miglani, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community as it can be helpful to...
Is it possible to set environment variables at the notebook level instead of the cluster level? Will they be available in the workers in addition to the driver? Can they override the env variables set at the cluster level?
Hi @Martim Lobao, we haven't heard from you on the last response from @Prabakar, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community as it can be helpful to others. Als...
I'm new to Python and Databricks, so I'm still running tests on features, and I'm not sure how much of this can be run without Databricks, which I guess requires an AWS or Google Cloud account? Can I do all three stages without the AWS Databricks or how fa...
Hi @Andrew Schell, please don't forget to click on the "Select As Best" button whenever the information provided helps resolve your question. @Hubert Dudek, thank you for your response.
Hi, what is the maximum number of jobs we can execute in an hour for a given workspace? This page mentions 5000: https://docs.microsoft.com/en-us/azure/databricks/data-engineering/jobs/jobs The number of jobs a workspace can create in an hour is limited ...
Hi @E H, we haven't heard from you on the last response from @Sivaprasad C S, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community as it can be helpful to others.
I have created an Azure AD group of type "Microsoft 365" with its own email address, which has been added to the notifications of a Databricks job (on failure). But no mail is sent to the Azure group mailbox when the job fails. I am able to send a d...
Hi @Md Tahseen Anam, just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best? If not, please tell us so we can help you. Thanks!
I've seen many posts here in the Community as potential solutions to this error, but none seem to be a solution for us. We are trying to launch the 14 day free trial of Databricks from the AWS Marketplace and are getting the error below. Moreover, ...
Here are some answers:
copyObject error - we were using a Databricks-provided CloudFormation template, but this error goes away when we use the AWS-provided template
createWorkspace error - we had subscribed>unsubscribed>resubscribed to Databricks via t...
Hi, currently I'm using Structured Streaming to insert/update/delete rows in a table. A row will be deleted if the value in the 'Operation' column is 'deleted'. Everything seems to work fine until there's a new column. Since I don't need the 'Operation' column in the t...
I have the following basic script that works fine using PyCharm on my machine.
    from pyspark.sql import SparkSession
    print("START")
    spark = SparkSession \
        .Builder() \
        .appName("myapp") \
        .master('local[*, 4]') \
        .getOrCreate()
    print(spark)
    dat...
There does not seem to be a way to log into and view the recent "paid" training sessions from the 2022 Data/AI Summit. I was able to log in and view the videos yesterday, but the website currently posted has no option for logging in/access. Is the...
Hey there @Christopher Warner, just wanted to check in if you were able to resolve your issue or do you need more help? We'd love to hear from you. Thanks!
I have a Delta table spark101.airlines (sourced from `/databricks-datasets/airlines/`) partitioned by `Year`. My `spark.sql.shuffle.partitions` is set to the default of 200. I run a simple query:
    select Origin, count(*)
    from spark101.airlines
    group by Origi...