Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

ejloh
by New Contributor II
  • 5547 Views
  • 3 replies
  • 1 kudos

SQL query with leads and lags

I'm trying to create a new column that fills in the nulls below. I tried using leads and lags, but it isn't turning out right. Basically I'm trying to figure out who is in "possession" of the record, given the TransferFrom and TransferTo columns and sequence...

Latest Reply
Vidula
Databricks Partner
  • 1 kudos

Hi there @Eric Lohbeck, does @Hubert Dudek's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

2 More Replies
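The fill-forward ("possession") logic from the thread above can be sketched in plain Python. This is an illustration only: the column names `seq` and `owner` are hypothetical stand-ins for the sequence and TransferFrom/TransferTo columns in the post.

```python
# Sketch of the gap-fill logic: carry the last known owner forward
# across null rows, in sequence order. Column names are hypothetical.
def fill_possession(rows):
    """rows: list of dicts with 'seq' and 'owner' (None where unknown)."""
    filled, current = [], None
    for row in sorted(rows, key=lambda r: r["seq"]):
        if row["owner"] is not None:
            current = row["owner"]
        filled.append({**row, "owner": current})
    return filled

rows = [
    {"seq": 1, "owner": "A"},
    {"seq": 2, "owner": None},  # carried forward: A
    {"seq": 3, "owner": "B"},
    {"seq": 4, "owner": None},  # carried forward: B
]
```

In Spark SQL the usual equivalent is a window function such as `last(owner, true)` (last value, ignoring nulls) over an ordered window, rather than `lead`/`lag`.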
dmayi
by Databricks Partner
  • 5852 Views
  • 1 replies
  • 0 kudos

Setting up custom tags (JobName, JobID, UserId) on an all-purpose cluster

Hi, I want to set up custom tags on an all-purpose cluster for purposes of cost breakdown and chargebacks. Specifically, I want to capture the JobName, JobID, and UserId of whoever ran the job. I can set other custom tags such as Business Unit, Owner... However,...

Latest Reply
Vidula
Databricks Partner
  • 0 kudos

Hey there @DIEUDONNE MAYI, does @Kaniz Fatma's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

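For reference, static custom tags go in the cluster specification; a sketch of the relevant fragment of a Clusters API payload follows (field and tag values are assumptions, not taken from the thread). Note that per-run values such as JobID generally cannot be static tags on an all-purpose cluster, since they change with every run; job clusters receive run-specific default tags automatically.

```json
{
  "cluster_name": "shared-analytics",
  "custom_tags": {
    "BusinessUnit": "Finance",
    "Owner": "data-platform-team"
  }
}
```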
ao1
by Databricks Partner
  • 3023 Views
  • 2 replies
  • 1 kudos

About privileges that clone a Git repository on Databricks

Hi all, Do I need admin privileges to clone a Git repository on Databricks? Cloning was not possible with an account that did not have administrator privileges. Regards.

Latest Reply
AmanSehgal
Honored Contributor III
  • 1 kudos

Navigate to Settings > Admin Console > Workspace settings > Repos and check the value for "Repos Git URL Allow List permissions". When set to 'Disabled (no restrictions)', users can clone or commit and push to any Git repository. When set to 'Restrict ...

1 More Replies
tanin
by Contributor
  • 7321 Views
  • 8 replies
  • 8 kudos

Using .repartition(100000) causes the unit test to be extremely slow (>20 mins). Is there a way to speed it up?

Here's the code:

val result = spark
  .createDataset(List("test"))
  .rdd
  .repartition(100000)
  .map { _ => "test" }
  .collect()
  .toList
println(result)

I write tests to test for correctness, so I wonde...

Latest Reply
Vidula
Databricks Partner
  • 8 kudos

Hey there @tanin, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thank...

7 More Replies
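A common fix for the slow test above is to make the partition count configurable, so production keeps the large value while tests use a tiny one. A minimal sketch in plain Python (the variable name is hypothetical; correctness tests rarely depend on the real partition count):

```python
import os

def shuffle_partitions(default=100_000):
    # Tests export TEST_PARTITIONS=4 so .repartition(shuffle_partitions())
    # stays cheap; production falls back to the large default.
    return int(os.environ.get("TEST_PARTITIONS", default))
```

The same pattern works in Scala via `sys.env.getOrElse` or a test-scoped config value.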
thushar
by Databricks Partner
  • 39033 Views
  • 2 replies
  • 2 kudos

Resolved! Connect to an on-prem SQL server database

Need to connect to an on-prem SQL database to extract data; we are using the Apache Spark SQL connector. The problem is that we can't connect, and get a connection failure: SQLServerException: The TCP/IP connection to the host ***.***.X.XX, port 1433 has fail...

Latest Reply
Mohit_m
Databricks Employee
  • 2 kudos

Maybe you can check the docs below and see if something is missing in the setup: https://docs.microsoft.com/en-us/azure/databricks/administration-guide/cloud-configurations/azure/on-prem-network

1 More Replies
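Since the error above is a plain TCP failure on port 1433, a quick first check is whether the host is reachable at all from the cluster, before debugging the Spark connector. A sketch (host and port are placeholders):

```python
import socket

def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If this returns False from a notebook, the problem is network routing or a firewall (e.g. no path from the Databricks VNet to the on-prem network), not the connector.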
572509
by New Contributor
  • 2763 Views
  • 2 replies
  • 1 kudos

Resolved! Notebook-scoped env variables?

Is it possible to set environment variables at the notebook level instead of the cluster level? Will they be available in the workers in addition to the driver? Can they override the env variables set at the cluster level?

Latest Reply
Prabakar
Databricks Employee
  • 1 kudos

It is not possible to set it from the notebook level.

1 More Replies
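One nuance worth adding to the reply above: while cluster-level environment variables cannot be set from a notebook, a notebook can still mutate `os.environ`, which affects only the driver's Python process for that session, not the executors. A sketch:

```python
import os

# Visible to driver-side Python in this session only; executor processes
# do not inherit changes made here (per-process semantics of os.environ).
os.environ["MY_FLAG"] = "notebook-value"

def read_flag():
    return os.environ.get("MY_FLAG")
```

Because the assignment happens in-process, it also shadows any cluster-level value of the same name for driver-side code.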
data_testing1
by New Contributor III
  • 5725 Views
  • 5 replies
  • 5 kudos

Resolved! How much of this tutorial or blog post can I run before starting a cloud instance of databricks?

I'm new to Python and Databricks, so I'm still running tests on features, and I'm not sure how much of this can be run without Databricks, which I guess requires an AWS or Google Cloud account? Can I do all three stages without AWS Databricks, or how fa...

Latest Reply
Hubert-Dudek
Databricks MVP
  • 5 kudos

Hi, to run it you need Databricks. You can try to open a free Community Edition account. Here is an explanation of how: https://community.databricks.com/s/feed/0D53f00001ebEasCAE

4 More Replies
RicksDB
by Contributor III
  • 3909 Views
  • 2 replies
  • 3 kudos

Resolved! Maximum job execution per hour

Hi, what is the maximum number of jobs we can execute in an hour for a given workspace? This page mentions 5000: https://docs.microsoft.com/en-us/azure/databricks/data-engineering/jobs/jobs, which says the number of jobs a workspace can create in an hour is limited ...

Latest Reply
Sivaprasad1
Databricks Employee
  • 3 kudos

Up to 5000 jobs (both normal and ephemeral) may be created per hour in a single workspace.

1 More Replies
BradSheridan
by Databricks Partner
  • 18899 Views
  • 20 replies
  • 3 kudos

Resolved! Cloudformation error when launching Databricks in AWS

I've seen many posts here in the Community as potential solutions to this error, but none seem to be a solution for us. We are trying to launch the 14-day free trial of Databricks from the AWS Marketplace and are getting the error below. Moreover, ...

Latest Reply
BradSheridan
Databricks Partner
  • 3 kudos

Here are some answers:
  • copyObject error: we were using a Databricks-provided CloudFormation template, but this error goes away when we use the AWS-provided template
  • createWorkspace error: we had subscribed > unsubscribed > resubscribed to Databricks via t...

19 More Replies
noimeta
by Contributor III
  • 7065 Views
  • 4 replies
  • 1 kudos

Apply change data with delete and schema evolution

Hi, Currently I'm using Structured Streaming to insert/update/delete rows in a table. A row will be deleted if the value in the 'Operation' column is 'deleted'. Everything seems to work fine until there's a new column. Since I don't need the 'Operation' column in the t...

Latest Reply
User16753725469
Databricks Employee
  • 1 kudos

Please go through this documentation: https://docs.delta.io/latest/api/python/index.html

3 More Replies
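The delete-on-'Operation' pattern described above is usually expressed as a MERGE; a sketch in Spark SQL follows (table and column names other than 'Operation' are hypothetical). For the new-column case, Delta's schema evolution during merges is governed by the `spark.databricks.delta.schema.autoMerge.enabled` setting.

```sql
MERGE INTO target t
USING updates s
  ON t.id = s.id
WHEN MATCHED AND s.Operation = 'deleted' THEN DELETE
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED AND s.Operation <> 'deleted' THEN INSERT *
```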
170017
by New Contributor II
  • 3108 Views
  • 1 replies
  • 1 kudos

Spark Error when running python script on databricks

I have the following basic script that works fine using PyCharm on my machine.

from pyspark.sql import SparkSession
print("START")
spark = SparkSession \
    .Builder() \
    .appName("myapp") \
    .master('local[*, 4]') \
    .getOrCreate()
print(spark)
dat...

Latest Reply
Vidula
Databricks Partner
  • 1 kudos

Hi @Patricia Mayer, just wanted to check in: were you able to resolve your issue, or do you need more help? We'd love to hear from you. Thanks!

563641
by New Contributor II
  • 1259 Views
  • 1 replies
  • 2 kudos

Advanced ML Virtual Training Video from 2022 Summit (not currently accessible)

There does not seem to be a way to log into and view the recent "paid" training sessions from the 2022 Data/AI Summit. I was able to log in and view the videos yesterday, but the website currently posted has no option for logging in/access. Is the...

Latest Reply
Vidula
Databricks Partner
  • 2 kudos

Hey there @Christopher Warner, just wanted to check in: were you able to resolve your issue, or do you need more help? We'd love to hear from you. Thanks!

pawelmitrus
by Contributor
  • 6819 Views
  • 4 replies
  • 1 kudos

Why Databricks spawns multiple jobs

I have a Delta table spark101.airlines (sourced from `/databricks-datasets/airlines/`) partitioned by `Year`. My `spark.sql.shuffle.partitions` is set to the default 200. I run a simple query:

select Origin, count(*) from spark101.airlines group by Origi...

Latest Reply
User16753725469
Databricks Employee
  • 1 kudos

Could you please paste the query plan here so we can analyse the issue?

3 More Replies
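To produce the plan the reply asks for, Spark SQL's EXPLAIN can be run on the same query (a sketch using the table name from the post):

```sql
EXPLAIN FORMATTED
SELECT Origin, count(*) FROM spark101.airlines GROUP BY Origin
```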
hamzatazib96
by Databricks Partner
  • 3616 Views
  • 1 replies
  • 1 kudos

Snowflake/GCP error: Premature end of chunk coded message body: closing chunk expected

Hello all, I've been experiencing the error described below, where I try to query a table from Snowflake which is about ~5.5B rows and ~30 columns, and it fails almost systematically; specifically, either the Spark job doesn't even start or I get the ...

Latest Reply
Vidula
Databricks Partner
  • 1 kudos

Hey there @hamzatazib96, does @Kaniz Fatma's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

Anonymous
by Not applicable
  • 5825 Views
  • 5 replies
  • 3 kudos

Encryption/Decryption options in ADB

Hello all, We are working on one of our client's requirements to implement suitable data encryption in Azure Databricks. We should be able to encrypt and decrypt the data based on access. We explored the fernet library, but the client declined it, saying it degr...

Latest Reply
Vidula
Databricks Partner
  • 3 kudos

Hi @purushotham Chanda, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you...

4 More Replies