Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

SaiN
by New Contributor II
  • 1852 Views
  • 3 replies
  • 4 kudos

How to get Cost Per Job on a Single Cluster?

How can I get granular cost-per-job information for a single cluster in Azure Databricks? I know we can set tags on jobs as well as on clusters, but in Cost Analysis on the Azure Portal I can see only the cluster tags, not the job tags. ...

Latest Reply
Kaniz_Fatma
Community Manager

Hi @Sainath Nagare, we haven't heard from you on the last response from @Prabakar Ammeappin, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community as it can be helpful t...
2 More Replies
colt
by New Contributor III
  • 2250 Views
  • 3 replies
  • 2 kudos

Using built-in SQL functions in Delta Live tables

Do Delta Live Tables have different built-in SQL functions than the corresponding Databricks runtime? I created a cluster with Databricks runtime 10.3 (the current DLT runtime) so I could test my Delta Live Tables code before running it as a pipeline...

Latest Reply
Vidula
Honored Contributor

Hi @Colt Kesselring, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Th...
2 More Replies
Akshith_Rajesh
by New Contributor III
  • 4254 Views
  • 4 replies
  • 6 kudos

Unable to write Data frame to Azure Synapse Table

When I try to insert records into an Azure Synapse table using JDBC, it throws the error below: com.microsoft.sqlserver.jdbc.SQLServerException: The statement failed. Column 'COMPANY_ADDRESS_STATE' has a data type that cannot participate ...

Latest Reply
Kaniz_Fatma
Community Manager

Hi @Rajesh Akshith, we haven't heard from you on the last response from @Hubert Dudek, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community as it can be helpful to othe...
3 More Replies
Bency
by New Contributor III
  • 1371 Views
  • 2 replies
  • 1 kudos

How to get the list of parameters passed from widget

Hi, could someone help me understand how to get all the parameters in the task (from the widget)? That is, I want to read the input parameter 'Start_Date', but it will not always be passed. It could be 'Run_Date' as well ...

Latest Reply
Vidula
Honored Contributor

Hi @Bency Mathew, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thank...
1 More Replies
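
For the widget question above, one way to handle a parameter that may arrive under different names is to look through the supplied parameters for the first candidate that is present. This is only a minimal sketch: `first_supplied` and `task_params` are hypothetical names, and how the dictionary gets filled from the notebook widgets is deliberately left out.

```python
from typing import Mapping, Optional, Sequence, Tuple


def first_supplied(
    params: Mapping[str, str], candidates: Sequence[str]
) -> Optional[Tuple[str, str]]:
    """Return (name, value) for the first candidate that is present and non-empty."""
    for name in candidates:
        value = params.get(name)
        if value:  # skips missing keys and empty strings
            return name, value
    return None


# Hypothetical parameters for one run: only 'Run_Date' was passed.
task_params = {"Run_Date": "2022-06-01"}
print(first_supplied(task_params, ["Start_Date", "Run_Date"]))
```

Returning the name together with the value lets the caller tell which of the two dates was actually supplied.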
ejloh
by New Contributor II
  • 2710 Views
  • 3 replies
  • 1 kudos

SQL query with leads and lags

I'm trying to create a new column that fills in the nulls below. I tried using leads and lags, but it isn't turning out right. Basically, I'm trying to figure out who is in "possession" of the record, given the TransferFrom and TransferTo columns and sequence...

Latest Reply
Vidula
Honored Contributor

Hi there @Eric Lohbeck, does @Hubert Dudek's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!
2 More Replies
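
The leads-and-lags question above is cut off, so the actual data layout is unknown. Assuming the goal is to carry the last non-null value forward over rows already sorted by sequence, the logic can be sketched in plain Python as below. `carry_forward` and the sample `transfer_to` values are hypothetical; in SQL the same idea is normally expressed with a window function that ignores nulls rather than with a fixed lead or lag offset.

```python
from typing import List, Optional


def carry_forward(values: List[Optional[str]]) -> List[Optional[str]]:
    """Replace each None with the most recent non-None value seen before it."""
    filled: List[Optional[str]] = []
    last_seen: Optional[str] = None
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


# Hypothetical TransferTo column, already ordered by sequence.
transfer_to = ["Alice", None, None, "Bob", None]
print(carry_forward(transfer_to))
```

A fixed lag of one row only reaches the immediately preceding row, which is why runs of several consecutive nulls stay unfilled with that approach.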
dmayi
by New Contributor
  • 3973 Views
  • 2 replies
  • 1 kudos

Resolved! Setting up custom tags (JobName, JobID, UserId) on an all-purpose cluster

Hi, I want to set up custom tags on an all-purpose cluster for cost breakdown and chargebacks. Specifically, I want to capture JobName, JobID, and the UserId of whoever ran the job. I can set other custom tags such as Business Unit, Owner... However,...

Latest Reply
Vidula
Honored Contributor

Hey there @DIEUDONNE MAYI, does @Kaniz Fatma's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!
1 More Replies
ao1
by New Contributor III
  • 1459 Views
  • 2 replies
  • 1 kudos

Privileges needed to clone a Git repository on Databricks

Hi all, do I need admin privileges to clone a Git repository on Databricks? Cloning was not possible with an account that did not have administrator privileges. Regards.

Latest Reply
AmanSehgal
Honored Contributor III

Navigate to Settings > admin console > Workspace settings > Repos and check the value for "Repos Git URL Allow List permissions". When set to 'Disabled (no restrictions)', users can clone or commit and push to any Git repository. When set to 'Restrict ...
1 More Replies
tanin
by Contributor
  • 3775 Views
  • 8 replies
  • 8 kudos

Using .repartition(100000) causes the unit test to be extremely slow (>20 mins). Is there a way to speed it up?

Here's the code:

val result = spark
  .createDataset(List("test"))
  .rdd
  .repartition(100000)
  .map { _ => "test" }
  .collect()
  .toList

println(result)

I write tests to test for correctness, so I wonde...

Latest Reply
Vidula
Honored Contributor

Hey there @tanin, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thank...
7 More Replies
thushar
by Contributor
  • 22202 Views
  • 3 replies
  • 2 kudos

Resolved! Connect to an on-prem SQL server database

We need to connect to an on-prem SQL database to extract data, and we are using the Apache Spark SQL connector. The problem is that the connection fails with: SQLServerException: The TCP/IP connection to the host ***.***.X.XX, port 1433 has fail...

Latest Reply
Kaniz_Fatma
Community Manager

Hi @Rob S (Customer), we haven't heard from you on the last response from @Mohit Miglani, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community as it can be helpful to...
2 More Replies
572509
by New Contributor
  • 1539 Views
  • 3 replies
  • 1 kudos

Resolved! Notebook-scoped env variables?

Is it possible to set environment variables at the notebook level instead of the cluster level? Will they be available in the workers in addition to the driver? Can they override the env variables set at the cluster level?

Latest Reply
Kaniz_Fatma
Community Manager

Hi @Martim Lobao, we haven't heard from you on the last response from @Prabakar, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community as it can be helpful to others. Als...
2 More Replies
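
As background for the environment-variable question above, the general Python behavior can be sketched as follows: assigning to `os.environ` changes the environment of the current process and of child processes it starts afterwards, and nothing else. The variable name `MY_SETTING` is hypothetical, and the sketch does not show what Databricks does on workers; processes that are already running on other machines do not inherit the change.

```python
import os
import subprocess
import sys

# Set a variable for the current Python process only (hypothetical name).
os.environ["MY_SETTING"] = "notebook-value"

# A child process started afterwards inherits the modified environment.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ.get('MY_SETTING'))"],
    capture_output=True,
    text=True,
    check=True,
)
child_value = child.stdout.strip()
print(child_value)
```

A value set this way also shadows any value of the same name that the process inherited at startup, but only within this process and its children.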
data_testing1
by New Contributor III
  • 3177 Views
  • 6 replies
  • 6 kudos

Resolved! How much of this tutorial or blog post can I run before starting a cloud instance of Databricks?

I'm new to Python and Databricks, so I'm still testing features, and I'm not sure how much of this can be run without Databricks, which I guess requires an AWS or Google Cloud account. Can I do all three stages without the AWS Databricks or how fa...

Latest Reply
Kaniz_Fatma
Community Manager

Hi @Andrew Schell, please don't forget to click on the "Select As Best" button whenever the information provided helps resolve your question. @Hubert Dudek, thank you for your response.
5 More Replies
RicksDB
by Contributor II
  • 2293 Views
  • 3 replies
  • 3 kudos

Resolved! Maximum job execution per hour

Hi, what is the maximum number of jobs we can execute in an hour for a given workspace? This page mentions 5000: https://docs.microsoft.com/en-us/azure/databricks/data-engineering/jobs/jobs The number of jobs a workspace can create in an hour is limited ...

Latest Reply
Kaniz_Fatma
Community Manager

Hi @E H, we haven't heard from you on the last response from @Sivaprasad C S, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community as it can be helpful to others.
2 More Replies
BradSheridan
by Valued Contributor
  • 9527 Views
  • 20 replies
  • 3 kudos

Resolved! CloudFormation error when launching Databricks in AWS

I've seen many posts here in the Community as potential solutions to this error, but none seem to be a solution for us. We are trying to launch the 14 day free trial of Databricks from the AWS Marketplace and are getting the error below. Moreover, ...

Latest Reply
BradSheridan
Valued Contributor

Here are some answers:
copyObject error - we were using a Databricks-provided CloudFormation template, but this error goes away when we use the AWS-provided template.
createWorkspace error - we had subscribed>unsubscribed>resubscribed to Databricks via t...
19 More Replies
noimeta
by Contributor II
  • 2918 Views
  • 4 replies
  • 1 kudos

Apply change data with delete and schema evolution

Hi, currently I'm using Structured Streaming to insert, update, and delete rows in a table. A row is deleted if the value in the 'Operation' column is 'deleted'. Everything seems to work fine until there's a new column. Since I don't need 'Operation' column in the t...

Latest Reply
User16753725469
Contributor II

Please go through this documentation: https://docs.delta.io/latest/api/python/index.html
3 More Replies
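
The change-data question above is truncated, so the exact requirement is unclear. The following plain-Python sketch only illustrates the general pattern it describes: apply each change by key, delete when 'Operation' is 'deleted', and never store the 'Operation' field itself. `apply_changes`, the key name `id`, and the sample rows are all hypothetical, and this is not the Delta Lake API; columns that appear for the first time are simply merged into the stored row.

```python
from typing import Any, Dict, Iterable, Mapping


def apply_changes(
    table: Dict[Any, Dict[str, Any]],
    changes: Iterable[Mapping[str, Any]],
    key: str = "id",
) -> Dict[Any, Dict[str, Any]]:
    """Apply change records in order; 'Operation' steers the change but is not stored."""
    for change in changes:
        row = dict(change)
        operation = row.pop("Operation", None)  # drop the control column
        row_key = row[key]
        if operation == "deleted":
            table.pop(row_key, None)
        else:
            # Upsert: existing columns are kept, new or changed ones are merged in.
            table[row_key] = {**table.get(row_key, {}), **row}
    return table


target = {1: {"id": 1, "name": "a"}}
batch = [
    {"id": 1, "name": "b", "Operation": "updated"},
    {"id": 2, "name": "c", "city": "Oslo", "Operation": "inserted"},  # new column 'city'
    {"id": 1, "Operation": "deleted"},
]
print(apply_changes(target, batch))
```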
170017
by New Contributor II
  • 1464 Views
  • 2 replies
  • 1 kudos

Spark Error when running python script on databricks

I have the following basic script that works fine using PyCharm on my machine.

from pyspark.sql import SparkSession

print("START")
spark = SparkSession \
    .Builder() \
    .appName("myapp") \
    .master('local[*, 4]') \
    .getOrCreate()
print(spark)
dat...

Latest Reply
Vidula
Honored Contributor

Hi @Patricia Mayer, just wanted to check in: were you able to resolve your issue, or do you need more help? We'd love to hear from you. Thanks!
1 More Replies
