Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Forum Posts

lawal-hash
by New Contributor
  • 1135 Views
  • 1 replies
  • 0 kudos

Unable to clone Bitbucket repository on Databricks Legacy

For the past 3 weeks, I have been unable to clone a Bitbucket repository into Databricks legacy. I got the following error message: "Error Creating Repo. There was an error performing the operation. Please try again or open a support ticket." I would ap...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @lawal-hash, I understand that you’ve encountered issues while trying to clone a Bitbucket repository into Databricks legacy. The error message you received, “Error Creating Repo. There was an error performing the operation. Please try again or open a support...

elgeo
by Valued Contributor II
  • 3420 Views
  • 1 replies
  • 1 kudos

Resolved! Add columns to a Delta table

Hello. Do you know if you can add columns at a specific position (before/after an existing column) by altering a Delta table?

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

Yes, using the FIRST or AFTER parameter: https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-alter-table-manage-column.html#add-column
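As a quick sketch of that syntax (the table and column names here are hypothetical):

```sql
-- Add a column after an existing one
ALTER TABLE main.sales.orders ADD COLUMN discount DOUBLE AFTER price;

-- Add a column in the first position
ALTER TABLE main.sales.orders ADD COLUMN ingest_note STRING FIRST;
```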

serelk
by New Contributor III
  • 1913 Views
  • 2 replies
  • 0 kudos

Delta Live Tables (DLT) in a production environment

I’m keen to learn more about employing Delta Live Tables (DLT) in a production environment. Databricks promotes DLT as a comprehensive framework for creating robust, maintainable, and testable data processin...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @serelk, Managing disaster recovery for DLT pipelines is crucial, especially when valuable data is at stake.   Let’s explore some strategies and practices:   Delta Live Tables (DLT) and Disaster Recovery: DLT simplifies and streamlines disaster re...

1 More Replies
memo
by New Contributor II
  • 6156 Views
  • 3 replies
  • 0 kudos

Pivot on multiple columns

I want to pass multiple columns as arguments to pivot a dataframe in pyspark, like mydf.groupBy("id").pivot("day","city").agg(F.sum("price").alias("price"), F.sum("units").alias("units")).show(). One way I found is to create multiple df with differ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @memo , Let’s call this function pivot_udf. Here’s how you can implement it: from pyspark.sql import functions as F def pivot_udf(df, *cols): mydf = df.select('id').drop_duplicates() for c in cols: mydf = mydf.join( df...
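For the multi-column case, Spark SQL's PIVOT clause also accepts several aggregates and a multi-column FOR list. A sketch against a hypothetical sales table with columns id, day, city, price, and units:

```sql
SELECT *
FROM sales
PIVOT (
  SUM(price) AS price, SUM(units) AS units
  FOR (day, city) IN (
    ('Mon', 'NYC') AS mon_nyc,
    ('Tue', 'LA')  AS tue_la
  )
);
```

Each (day, city) pair in the IN list produces one output column per aggregate, e.g. mon_nyc_price and mon_nyc_units.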

2 More Replies
rudyevers
by New Contributor III
  • 5045 Views
  • 6 replies
  • 5 kudos

Resolved! Unity catalog - external table lastUpdateversion

We are currently upgrading our Lakehouse to use the Unity Catalog benefits. We will mostly use external tables because all our Delta tables are already stored in Azure Storage. I am trying to figure out how to update the table property "delta.lastUpdateve...

Latest Reply
-werners-
Esteemed Contributor III
  • 5 kudos

I am in the same boat. That is the reason I opted to use managed tables instead. OK, it means migrating tables and changing notebooks, but besides not having to struggle with external tables, you also get something in return (liquid clustering, for example).

5 More Replies
neca36
by New Contributor
  • 1380 Views
  • 1 replies
  • 0 kudos

Databricks data engineer associate got paused

Hi team, I've had a disappointing experience during my first certification attempt and need help resolving the issue. While attempting the certification (Databricks Data Engineer Associate), every 2-3 questions I kept receiving a message that the ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Thank you for posting your concern on Community! To expedite your request, please list your concerns on our ticketing portal. Our support staff would be able to act faster on the resolution (our standard resolution time is 24-48 hours).

Shree23
by New Contributor III
  • 3020 Views
  • 6 replies
  • 0 kudos

Primary key and not null

Hi Expert, how can we get a primary key, not null, and a clustered index in table creation?
%sql
create table table1 (id int, product char)
Expected output:
create table table1 (id int not null primary key, product char), plus a clustered index

Latest Reply
Shree23
New Contributor III
  • 0 kudos

Any suggestion please?
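As a sketch of what the requested DDL could look like on Databricks (names are illustrative; PRIMARY KEY constraints are informational and require Unity Catalog, and since Databricks has no clustered indexes, liquid clustering via CLUSTER BY is the closest analogue):

```sql
CREATE TABLE table1 (
  id INT NOT NULL,
  product STRING,
  CONSTRAINT table1_pk PRIMARY KEY (id)
)
CLUSTER BY (id);
```

Note that the primary key column must also be declared NOT NULL.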

5 More Replies
mvmiller
by New Contributor III
  • 1148 Views
  • 1 replies
  • 0 kudos

Sharing compute between tasks of a job

Is there a way to set up a workflow with multiple tasks, so that different tasks can share the same compute resource, at the same time?I understand that an instance pool may be an option, here. Wasn't sure if there were other possible options to cons...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @mvmiller, Certainly! When orchestrating workflows with multiple tasks, it’s essential to optimize resource usage. In Databricks Workflows, tasks within the same job can share compute by defining a shared job cluster in the job settings and assigning it to each task. Instance pools, as you note, are a complementary option for reducing startup time by reusing idle instances across clusters...
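In Jobs API 2.1 terms, a shared-job-cluster setup looks roughly like this (cluster size, notebook paths, and keys are placeholder values):

```json
{
  "name": "example-job",
  "job_clusters": [
    {
      "job_cluster_key": "shared_cluster",
      "new_cluster": {
        "spark_version": "14.3.x-scala2.12",
        "node_type_id": "i3.xlarge",
        "num_workers": 2
      }
    }
  ],
  "tasks": [
    {
      "task_key": "task_a",
      "job_cluster_key": "shared_cluster",
      "notebook_task": { "notebook_path": "/Jobs/task_a" }
    },
    {
      "task_key": "task_b",
      "job_cluster_key": "shared_cluster",
      "notebook_task": { "notebook_path": "/Jobs/task_b" }
    }
  ]
}
```

Both tasks reference the same job_cluster_key, so they run on one cluster (concurrently, if neither depends on the other) instead of each provisioning its own.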

Bhanu1
by New Contributor III
  • 1366 Views
  • 1 replies
  • 0 kudos

Thoughts on how to improve string search queries

Please see the sample code I am running below. What options can I explore to improve the speed of query execution in such a scenario? The current full code takes about 4 hrs to run on 1.5 billion rows. Thanks! SELECT fullVisitorId ,VisitId ,EventDate ,PagePath ,d...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Bhanu1, When dealing with large datasets and slow query execution, there are several strategies you can explore to improve performance. Let’s dive into some options: Indexing: Indexing is a critical technique for enhancing SQL query performance o...
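Two Databricks-specific options worth trying for selective string filters, sketched with hypothetical table and column names (Z-ORDER improves data skipping on commonly filtered columns; Bloom filter indexes help needle-in-a-haystack equality lookups, though not `LIKE '%substring%'` scans):

```sql
OPTIMIZE web_events ZORDER BY (PagePath);

CREATE BLOOMFILTER INDEX ON TABLE web_events
FOR COLUMNS (PagePath OPTIONS (fpp = 0.1, numItems = 50000000));
```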

rpl
by New Contributor III
  • 1881 Views
  • 1 replies
  • 0 kudos

Resolved! Read file with Delta Live Tables from external location (Unity Catalog)

As far as I understand, Delta Live Tables should now support reading data from an external location, but I can’t get it to work. I’ve added an ADLS container to Unity Catalog as an external location. There’s a folder in the container containing an ex...

Community Platform Discussions
Delta Live Tables
Unity Catalog
Latest Reply
rpl
New Contributor III
  • 0 kudos

I misspelled the folder name; I got it working now. The error message could have been more informative.

Jun_NN
by New Contributor
  • 9183 Views
  • 2 replies
  • 1 kudos

Deleted the S3 bucket associated with the metastore

I deleted the AWS S3 bucket for the Databricks metastore by mistake. How can I fix this? Can I re-create the S3 bucket? Or can I delete the metastore (I don't have much data in it) and re-generate one? Thank you!

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @Jun_NN , Indeed, deleting the AWS S3 bucket for the Databricks metastore can be a nerve-wracking task. However, there are strategies to recover from such situations. Let’s explore some options: Regenerate Metastore: If your metastore doesn’t ...

1 More Replies
hagarciaj
by New Contributor
  • 980 Views
  • 1 replies
  • 0 kudos

Highly Performant Data Ingestion and Processing Pipelines

Hi everyone, I am working on a project that requires highly performant pipelines for ingesting, validating, and processing large data volumes from IoT devices. I am interested in knowing:- The best way to ingest from EventHub/Kafka sinks-...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @hagarciaj, Certainly! Handling data pipelines for large volumes from IoT devices is crucial.   Let’s dive into each aspect:   Ingestion from EventHub/Kafka Sinks: Azure Event Hubs provides an Apache Kafka endpoint, allowing you to connect using t...
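On the EventHub/Kafka ingestion point, one possible sketch, assuming a runtime where the read_kafka table-valued function is available (DBR 13.1+); the broker address and topic are placeholders, Event Hubs exposes its Kafka-compatible endpoint on port 9093, and authentication options are omitted for brevity:

```sql
CREATE OR REFRESH STREAMING TABLE iot_raw AS
SELECT CAST(value AS STRING) AS payload, timestamp
FROM STREAM read_kafka(
  bootstrapServers => 'my-eventhub-namespace.servicebus.windows.net:9093',
  subscribe => 'iot-events'
);
```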

Adil
by New Contributor
  • 1796 Views
  • 1 replies
  • 0 kudos

Find value in any column in a table

Hi, I'm not sure if this is a possible scenario, but is there, by any chance, a way to query all the columns of a table to search for a value? Explanation: I want to search for a specific value in all the columns of a Databricks table. I don't know whi...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Adil, Certainly! When you need to search for a specific value across all columns in a table, you can use SQL queries to achieve this. You can construct a query that checks each column for the desired value.   Here are a few approaches you can con...
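Since the column list isn't known up front, one approach is to generate the predicate from the table's column metadata. A minimal pure-Python sketch (the table and column names are made up, and the naive quoting is for illustration only; prefer parameterized queries in real code):

```python
def build_search_sql(table, columns, value):
    # Cast every column to string so one predicate shape works for all types
    escaped = str(value).replace("'", "''")  # naive escaping, illustration only
    predicates = [f"CAST({col} AS STRING) = '{escaped}'" for col in columns]
    return f"SELECT * FROM {table} WHERE " + " OR ".join(predicates)

# Hypothetical table with three columns; in practice the column list
# could come from DESCRIBE TABLE or information_schema.columns
sql = build_search_sql("events", ["id", "name", "city"], "Berlin")
print(sql)
```

The generated statement ORs one CAST-based comparison per column, so a single query scans the whole row.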

osalawu
by New Contributor
  • 1008 Views
  • 2 replies
  • 0 kudos

Got Suspended while taking Databricks Certified Data Analyst Associate Assessment

Hi Team, my experience with the proctor today was very frustrating. I was taking my assessment today, 11-26-2023 at 2pm, and was between questions 22 and 25 when my assessmen...

Latest Reply
Cert-Team
Esteemed Contributor
  • 0 kudos

@osalawu Sorry to hear you had an issue with your exam. In order to protect your Webassessor account information, please file a ticket with our support team. Please include your Webassessor login ID, the exam, and a couple of dates and times that wil...

1 More Replies
Klusener
by New Contributor
  • 1227 Views
  • 0 replies
  • 0 kudos

Arguments parsing in Databricks python jobs

On Databricks, I created a job task with the task type "Python script" from S3. However, when arguments are passed via the Parameters option, I run into an "unrecognized arguments" error. Code in the S3 file: import argparse def parse_arguments(): parser = argpar...

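A common cause of this error is the task passing parameters that the script's parser never declares. One defensive sketch (the --env flag is hypothetical) uses parse_known_args, which collects unrecognized parameters instead of aborting:

```python
import argparse

def parse_arguments(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--env", default="dev")  # hypothetical flag
    # parse_known_args returns (known, leftover) instead of raising
    # "unrecognized arguments" for parameters the parser does not declare
    args, unknown = parser.parse_known_args(argv)
    return args, unknown

args, unknown = parse_arguments(["--env", "prod", "--run_id", "123"])
print(args.env, unknown)
```

The other half of the fix is checking that every entry in the task's Parameters list matches an add_argument call in the script.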
