- 1135 Views
- 1 replies
- 0 kudos
Unable to clone Bitbucket repository on Databricks Legacy
For the past 3 weeks, I have been unable to clone the Bitbucket repository into Databricks legacy. I got the following error message: "Error Creating Repo. There was an error performing the operation. Please try again or open a support ticket." I would ap...
Hi @lawal-hash, I understand that you’ve encountered issues while trying to clone a Bitbucket repository into Databricks legacy. The error message you received, “Error Creating Repo. There was an error performing the operation. Please try again or open a support...
- 3420 Views
- 1 replies
- 1 kudos
Resolved! Add columns to a Delta table
Hello. Do you know if you can add columns at a specific position (before/after an existing column) by altering a Delta table?
Yes, using the FIRST or AFTER parameter: https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-alter-table-manage-column.html#add-column
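A minimal sketch of that syntax, assuming a Delta table (table and column names are illustrative):

```sql
-- Add a column after an existing one
ALTER TABLE my_table ADD COLUMN discount DOUBLE AFTER price;

-- Or place it as the first column
ALTER TABLE my_table ADD COLUMN row_loaded_at TIMESTAMP FIRST;
```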
- 1913 Views
- 2 replies
- 0 kudos
Delta Live Tables (DLT) in a production environment
I’m keen to learn more about employing Delta Live Tables (DLT) in a production environment. Databricks is indeed promoting DLT as a comprehensive framework for creating robust, maintainable, and testable data processin...
Hi @serelk, Managing disaster recovery for DLT pipelines is crucial, especially when valuable data is at stake. Let’s explore some strategies and practices: Delta Live Tables (DLT) and Disaster Recovery: DLT simplifies and streamlines disaster re...
- 6156 Views
- 3 replies
- 0 kudos
Pivot on multiple columns
I want to pass multiple columns as arguments to pivot a dataframe in pyspark, like mydf.groupBy("id").pivot("day","city").agg(F.sum("price").alias("price"), F.sum("units").alias("units")).show(). One way I found is to create multiple df with differ...
Hi @memo, Let’s call this function pivot_udf. Here’s how you can implement it:
from pyspark.sql import functions as F

def pivot_udf(df, *cols):
    mydf = df.select('id').drop_duplicates()
    for c in cols:
        mydf = mydf.join(df...
- 5045 Views
- 6 replies
- 5 kudos
Resolved! Unity catalog - external table lastUpdateversion
We are currently upgrading our Lakehouse to use the Unity Catalog benefits. We will mostly use external tables because all our DELTA tables are already stored in Azure Storage. I am trying to figure out how to update the table property "delta.lastUpdateve...
I am in the same boat. That is the reason I opted to use managed tables instead. OK, it means migrating tables and changing notebooks, but besides not having to struggle with external tables, you also get something in return (e.g., liquid clustering).
- 1380 Views
- 1 replies
- 0 kudos
Databricks data engineer associate got paused
Hi team, I've faced a disappointing experience during my first certification attempt and need help in resolving the issue. While taking the Databricks Data Engineer Associate certification, every 2-3 questions I kept receiving a message that the ...
Thank you for posting your concern on Community! To expedite your request, please list your concerns on our ticketing portal. Our support staff would be able to act faster on the resolution (our standard resolution time is 24-48 hours).
- 3020 Views
- 6 replies
- 0 kudos
Primary key and not null
Hi Expert, how can we get a primary key, not null, and a clustered index in table creation?
%sql
create table table1 values (id int, product char)
Expected output:
create table table1 values (id int not null primary key, product char) and a clustered index
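A hedged sketch of what this could look like on Databricks with Unity Catalog, where PRIMARY KEY constraints are informational (not enforced) and CLUSTER BY (liquid clustering) plays roughly the role of a clustered index (names are illustrative):

```sql
CREATE TABLE table1 (
  id INT NOT NULL,
  product STRING,
  CONSTRAINT table1_pk PRIMARY KEY (id)
)
USING DELTA
CLUSTER BY (id);
```

Note that NOT NULL is enforced, while the PRIMARY KEY constraint is metadata only.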
- 1148 Views
- 1 replies
- 0 kudos
Sharing compute between tasks of a job
Is there a way to set up a workflow with multiple tasks, so that different tasks can share the same compute resource at the same time? I understand that an instance pool may be an option here. Wasn't sure if there were other possible options to cons...
Hi @mvmiller, Certainly! When orchestrating workflows with multiple tasks, it’s essential to optimize resource usage. In Databricks Workflows, the usual approach is a shared job cluster: you define the cluster once at the job level and point multiple tasks at it, so they run on the same compute. Instance pools, as you mention, are a complementary option for cutting cluster start-up time. Howe...
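One concrete option in Databricks Workflows is a shared job cluster: the job defines the cluster once under `job_clusters`, and each task references it via `job_cluster_key`. A sketch of the Jobs API JSON, with illustrative names and sizes:

```json
{
  "name": "example-job",
  "job_clusters": [
    {
      "job_cluster_key": "shared_cluster",
      "new_cluster": {
        "spark_version": "14.3.x-scala2.12",
        "node_type_id": "i3.xlarge",
        "num_workers": 2
      }
    }
  ],
  "tasks": [
    { "task_key": "task_a", "job_cluster_key": "shared_cluster",
      "notebook_task": { "notebook_path": "/Jobs/task_a" } },
    { "task_key": "task_b", "job_cluster_key": "shared_cluster",
      "notebook_task": { "notebook_path": "/Jobs/task_b" } }
  ]
}
```

Tasks that reference the same `job_cluster_key` can run concurrently on that one cluster.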
- 1366 Views
- 1 replies
- 0 kudos
Thoughts on how to improve string search queries
Please see sample code I am running below. What options can I explore to improve the speed of query execution in such a scenario? The current full code takes about 4 hrs to run on 1.5 billion rows. Thanks! SELECT fullVisitorId, VisitId, EventDate, PagePath, d...
Hi @Bhanu1, When dealing with large datasets and slow query execution, there are several strategies you can explore to improve performance. Let’s dive into some options: Data layout: Delta Lake has no traditional indexes; file-level data skipping does the equivalent job, so laying out data by the filtered columns is a critical technique for enhancing SQL query performance o...
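On Delta tables specifically, two commonly suggested layout options are Z-ordering and Bloom filter indexes; a hedged sketch with illustrative table/column names:

```sql
-- Co-locate rows by the filtered column so files can be skipped
OPTIMIZE hits ZORDER BY (PagePath);

-- For needle-in-haystack equality lookups, a Bloom filter index can help
CREATE BLOOMFILTER INDEX ON TABLE hits FOR COLUMNS (fullVisitorId);
```

Caveat: these mainly help equality and prefix predicates; `LIKE '%term%'` substring scans benefit more from selecting fewer columns and pruning rows early.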
- 1881 Views
- 1 replies
- 0 kudos
Resolved! Read file with Delta Live Tables from external location (Unity Catalog)
As far as I understand, Delta Live Tables should now support reading data from an external location, but I can’t get it to work. I’ve added an ADLS container to Unity Catalog as an external location. There’s a folder in the container containing an ex...
I misspelled the folder name; I got it working now. The error message could have been more informative.
- 9183 Views
- 2 replies
- 1 kudos
Deleted the S3 bucket associated with metastore
I deleted the AWS S3 bucket for the Databricks metastore by mistake. How can I fix this? Can I re-create the S3 bucket? Or can I delete the metastore (I don't have much data in it) and re-generate one? Thank you!
Hi @Jun_NN , Indeed, deleting the AWS S3 bucket for the Databricks metastore can be a nerve-wracking task. However, there are strategies to recover from such situations. Let’s explore some options: Regenerate Metastore: If your metastore doesn’t ...
- 980 Views
- 1 replies
- 0 kudos
Highly Performant Data Ingestion and Processing Pipelines
Hi everyone, I am working on a project that requires highly performant pipelines for managing data ingestion, validation, and processing of large data volumes from IoT devices. I am interested in knowing:- The best way to ingest from EventHub/Kafka sinks-...
Hi @hagarciaj, Certainly! Handling data pipelines for large volumes from IoT devices is crucial. Let’s dive into each aspect: Ingestion from EventHub/Kafka Sinks: Azure Event Hubs provides an Apache Kafka endpoint, allowing you to connect using t...
- 1796 Views
- 1 replies
- 0 kudos
Find value in any column in a table
Hi, I'm not sure if this is a possible scenario, but is there, by any chance, a way to query all the columns of a table to search for a value? Explanation: I want to search for a specific value in all the columns of a Databricks table. I don't know whi...
Hi @Adil, Certainly! When you need to search for a specific value across all columns in a table, you can use SQL queries to achieve this. You can construct a query that checks each column for the desired value. Here are a few approaches you can con...
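Since the reply is cut off, here is one hedged sketch of that idea: generate a `WHERE` clause over all columns, casting each to a string. The helper and its names are illustrative, not a built-in Databricks function:

```python
# Build a SQL query that searches every column of a table for a value.
# Illustrative helper; on Databricks you would obtain `columns` from
# spark.table(name).columns and run the result with spark.sql(query).
def build_search_query(table, columns, value):
    safe = value.replace("'", "''")  # naive quote escaping for the sketch
    predicates = [f"CAST({c} AS STRING) = '{safe}'" for c in columns]
    return f"SELECT * FROM {table} WHERE " + " OR ".join(predicates)

query = build_search_query("my_table", ["id", "name", "city"], "Atlanta")
print(query)
# SELECT * FROM my_table WHERE CAST(id AS STRING) = 'Atlanta' OR CAST(name AS STRING) = 'Atlanta' OR CAST(city AS STRING) = 'Atlanta'
```

Note this scans the whole table once per query, so it is for ad hoc exploration, not production paths.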
- 1008 Views
- 2 replies
- 0 kudos
Got Suspended while taking Databricks Certified Data Analyst Associate Assessment
Hi Team, what I experienced from the proctor today was very frustrating and disappointing. I was taking my assessment today, 11-26-2023, at 2 pm, and was somewhere between questions 22 and 25 when my assessmen...
@osalawu Sorry to hear you had an issue with your exam. In order to protect your Webassessor account information, please file a ticket with our support team. Please include your Webassessor login ID, the exam, and a couple of dates and times that wil...
- 1227 Views
- 0 replies
- 0 kudos
Arguments parsing in Databricks python jobs
On Databricks, I created a job task with the task type Python script from S3. However, when arguments are passed via the Parameters option, I run into an 'unrecognized arguments' error. Code in the S3 file:
import argparse

def parse_arguments():
    parser = argpar...
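The original code is cut off, but a common cause of this error is `parse_args()` exiting when it sees parameters the parser does not define. A hedged sketch using `parse_known_args()` to tolerate extras instead (the `--env` argument is illustrative):

```python
import argparse

def parse_arguments(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--env", default="dev")  # illustrative argument
    # parse_known_args() returns (namespace, leftovers) rather than
    # failing with "unrecognized arguments" when extras are present
    args, unknown = parser.parse_known_args(argv)
    return args, unknown

args, unknown = parse_arguments(["--env", "prod", "--extra-flag", "1"])
print(args.env)   # prod
print(unknown)    # ['--extra-flag', '1']
```

The alternative fix is to make the job's Parameters exactly match the argument names the parser defines.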