Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Forum Posts

thibault
by Contributor III
  • 8602 Views
  • 6 replies
  • 0 kudos

Asset Bundles git branch per target

Hi, I am migrating a deployment setup from dbx to Databricks Asset Bundles (DAB), where I have specific parameters per environment. This worked well with dbx, and I am now trying to define those parameters by defining targets (3 targets: dev, uat, p...

Latest Reply
thibault
Contributor III
  • 0 kudos

Something must have changed in the meantime on the Databricks side. I have only updated the Databricks CLI to 016, and now, using a git / branch entry under each target, deploying this setup, where feature-dab is the branch I want the job to pull sources from, I see t...

5 More Replies
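For others hitting the same migration, per-target branch pinning in a bundle is typically expressed along the lines of the following sketch of a `databricks.yml` fragment; the bundle name, target names, and branch names here are placeholders to adapt, not a definitive layout:

```yaml
# Hypothetical databricks.yml fragment: pin a git branch per deployment target.
bundle:
  name: my_bundle  # placeholder name

targets:
  dev:
    git:
      branch: feature-dab   # branch the dev jobs should pull sources from
  uat:
    git:
      branch: uat
  prod:
    git:
      branch: main
```

`databricks bundle validate -t dev` checks the configuration, and `databricks bundle deploy -t dev` then deploys using that target's settings.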
Phani1
by Valued Contributor II
  • 14448 Views
  • 1 reply
  • 0 kudos

RBAC

Hi Team, could you provide step-by-step instructions on how to set up role-based access control and attribute-based access control in Databricks? Regards, Phanindra

sanjay
by Valued Contributor II
  • 6311 Views
  • 5 replies
  • 0 kudos

Resolved! maxFilesPerTrigger not working while loading data from Unity Catalog table

Hi, I am using streaming on Unity Catalog tables and trying to limit the number of records read in each batch. Here is my code, but it is not respecting maxFilesPerTrigger and instead reads all available data. (spark.readStream.option("skipChangeCommits",...

Latest Reply
Witold
Honored Contributor
  • 0 kudos

I believe you misunderstand the fundamentals of delta tables. `maxFilesPerTrigger` has nothing to do with how many rows you will process at the same time. And if you really want to control the number of records per file, then you need to adapt the wr...

4 More Replies
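To make the point in that reply concrete: `maxFilesPerTrigger` bounds the number of *files* per micro-batch, not the number of rows. A toy simulation in plain Python (no Spark involved; the file row counts are made up) shows why batches can still vary wildly in row count:

```python
# Toy model (plain Python, not Spark): each micro-batch takes up to
# max_files_per_trigger files, regardless of how many rows each file holds.
def plan_micro_batches(file_row_counts, max_files_per_trigger):
    """Split a list of per-file row counts into micro-batches of files."""
    batches = []
    for i in range(0, len(file_row_counts), max_files_per_trigger):
        batch = file_row_counts[i:i + max_files_per_trigger]
        batches.append({"files": len(batch), "rows": sum(batch)})
    return batches

# Five Delta files with very uneven row counts.
files = [10, 10, 5000, 10, 10]
for b in plan_micro_batches(files, max_files_per_trigger=2):
    print(b)
# One micro-batch ends up with 5010 rows even though it only read 2 files.
```

If the goal is to cap rows rather than files, the write side has to be adapted so files hold a bounded number of records, as the reply suggests.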
Henrik
by New Contributor III
  • 2366 Views
  • 1 reply
  • 1 kudos

Resolved! Serving pay-per-token Chat LLM Model

We have built a chat solution on an LLM RAG chat model, but we face an issue when we spin up a serving endpoint to host the model. According to the documentation, there should be several LLM models available as pay-per-token endpoints, for instance the DB...

Latest Reply
daniel_sahal
Esteemed Contributor
  • 1 kudos

@Henrik The documentation clearly states that it should be available in West Europe, but I'm also unable to see the DBRX pay-per-token endpoint. I think it would be best to raise an Azure Support ticket - they should either somehow enable it on your workspace...

NelsonE
by New Contributor III
  • 1046 Views
  • 1 reply
  • 0 kudos

Databricks repo not working with installed python libraries

Hello, I'm trying to use some installed libraries in my cluster. I created a single-node cluster with Runtime version 14.3 LTS. I also installed libraries such as oracledb==2.2.1. Then, when I try to use Python to load these libraries in the worksp...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 0 kudos

Hello Nelson, how are you doing today? Try checking the permissions on your repo folder to ensure your cluster can access it without issues. Use absolute paths when running from your GitHub repo to avoid directory confusion. Reinstall the oracledb libr...

unity_Catalog
by New Contributor III
  • 1000 Views
  • 1 reply
  • 0 kudos

Concurrent installation of UCX in multiple workspaces

I am trying to install UCX in multiple workspaces concurrently through a bash script but keep running into an issue. I have created separate directories for each workspace. I'm facing the below error every time. Installing UCX in Workspace1 Error: lib: cle...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 0 kudos

Hi @unity_Catalog, how are you doing today? Try running the UCX installations sequentially to avoid file access conflicts, adding a small delay between each. Ensure each workspace uses a separate installation directory to prevent overlap. You could als...

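The sequential-with-delay approach from that reply can be sketched as a small driver script. This is a hedged sketch, not a UCX-specific tool: the install command is passed in as a parameter, and the commented `databricks labs install ucx` invocation and profile names are assumptions to replace with your own setup:

```python
# Sketch: run an install command once per workspace profile, sequentially,
# with a short pause between runs to avoid concurrent file-access conflicts.
import os
import subprocess
import time

def install_sequentially(profiles, cmd, delay_seconds=5):
    """Run `cmd` once per Databricks CLI profile, one profile at a time.

    Returns a mapping of profile name -> process return code.
    """
    results = {}
    for profile in profiles:
        # Point the CLI at one workspace per iteration via its config profile.
        env = dict(os.environ, DATABRICKS_CONFIG_PROFILE=profile)
        completed = subprocess.run(cmd, env=env)
        results[profile] = completed.returncode
        time.sleep(delay_seconds)  # small gap between installs
    return results

# Hypothetical usage (command and profiles are assumptions):
# install_sequentially(["ws1", "ws2"], ["databricks", "labs", "install", "ucx"])
```

Running installs one at a time, each against its own profile and directory, avoids the overlapping file access that concurrent runs of the same installer can trigger.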
rohit_kumar
by New Contributor
  • 1587 Views
  • 1 reply
  • 0 kudos

Informatica API data retrieval through Databricks Workflows or ADF: which is better?

The above set of activities took some 4 hours in ADF to explore and design, with greater ease of use, connections, and monitoring, and it could probably have taken 4 days or more using Databrick...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 0 kudos

Hi @rohit_kumar, how are you doing today? To answer your subject-line question: if you're looking for flexibility and integration, Databricks Workflows might be better, since it offers native support for complex data transformations and seamless inte...

Haneen_Heeba
by New Contributor III
  • 1105 Views
  • 1 reply
  • 0 kudos

Exam suspended

Hello Databricks Team, I had a terrible experience during my certification exam, and I have also raised a ticket with the Databricks team but haven't received any response to the mail so far. I appeared for the Databricks Certified Associate Developer for Apa...

Latest Reply
Haneen_Heeba
New Contributor III
  • 0 kudos

Hi @Cert-Team, could you please look into this issue and assist me in rescheduling my exam, since it's very important for me to provide my certification to my employer at the earliest. Thanks and regards, Haneen Heeba

sanjay
by Valued Contributor II
  • 2413 Views
  • 1 reply
  • 1 kudos

Resolved! Remove duplicate records using pyspark

Hi, I am trying to remove duplicate records from a PySpark dataframe and keep the latest one. But somehow df.dropDuplicates(["id"]) keeps the first one instead of the latest. One option is to use pandas drop_duplicates. Is there any solution in PySpark...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @sanjay, you can write a window function that will rank your rows and then filter rows based on that rank. Take a look at the Stack Overflow thread below: https://stackoverflow.com/questions/63343958/how-to-drop-duplicates-but-keep-first-in-pyspark-datafr...

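The keep-latest idea behind that window-function answer can be sketched outside Spark as well. The plain-Python version below mirrors the logic (partition by key, keep the row with the greatest ordering value); the column names `id` and `updated_at` are assumptions for illustration:

```python
# Plain-Python sketch of "drop duplicates, keep latest": for each key,
# keep the record with the greatest value in the ordering column.
def keep_latest(records, key="id", order_by="updated_at"):
    latest = {}
    for row in records:
        k = row[key]
        # Replace the stored row only when this row is strictly newer.
        if k not in latest or row[order_by] > latest[k][order_by]:
            latest[k] = row
    return list(latest.values())

rows = [
    {"id": 1, "updated_at": "2024-01-01", "val": "old"},
    {"id": 1, "updated_at": "2024-02-01", "val": "new"},
    {"id": 2, "updated_at": "2024-01-15", "val": "only"},
]
print(keep_latest(rows))  # id 1 keeps the "new" row; id 2 is untouched
```

In PySpark the same effect comes from `row_number()` over a window partitioned by the id column and ordered by the timestamp descending, then filtering to rank 1, as the linked thread shows.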
Phani1
by Valued Contributor II
  • 1720 Views
  • 1 reply
  • 0 kudos

best practices for implementing early arriving fact handling

Hi All, can you please share best practices for implementing early-arriving-fact handling in Databricks for streaming data processed in near real time using Structured Streaming? There are many ways to handle this use case in batch/mini-batch. ...

Latest Reply
Phani1
Valued Contributor II
  • 0 kudos

Greetings Team, I would like to ask whether any of you have suggestions regarding this query.

lean-ai
by New Contributor
  • 995 Views
  • 0 replies
  • 0 kudos

Any women in data+AI interested in gathering to form a casual networking group based in Sydney?

Hi all, inspired by the Women in Data+AI Breakfast at the Databricks World Tour in Sydney, I'm considering the potential for a new women's networking community based in Sydney. This group would cater to women currently working in or interested in purs...

Raven91
by New Contributor III
  • 7458 Views
  • 13 replies
  • 4 kudos

Can't activate users

A while back, a user apparently became inactive on our Databricks platform for an unknown reason. So far, everything we have tried hasn't worked: delete and manually re-create the user; delete and let SSO create the user on login; use the Databricks CLI - shows no ...

Latest Reply
AsgerLarsen
New Contributor III
  • 4 kudos

I had the same issue with my admin account seemingly becoming inactive at random. The problem occurred after helping our platform team test out a new setup of another Databricks workspace. I was testing the setup logging in as a standard user and an ...

12 More Replies
vengroff
by New Contributor II
  • 2152 Views
  • 0 replies
  • 1 kudos

Installation of cluster requirements.txt does not appear to run as google service account

I am running on 15.4 LTS Beta, which supports cluster-level requirements.txt files. The particular requirements.txt I have uploaded to my workspace specifies an extra index URL using a first line that looks like--extra-index-url https://us-central1-p...

VReddy1601
by New Contributor
  • 1143 Views
  • 0 replies
  • 0 kudos

Apply Changes Error in DLT Pipeline

Hi Team, I am trying to use Apply Changes from Bronze to Silver using the below. @dlt.table(name="Silver_Orders", comment="This table - hive_metastore.silver.Orders reads data from the Bronze layer and writes it t...

samandrew3
by New Contributor
  • 2534 Views
  • 1 reply
  • 0 kudos

Unlocking the Power of Databricks: A Comprehensive Guide for Beginners

In the rapidly evolving world of big data, Databricks has emerged as a leading platform for data engineering, data science, and machine learning. Whether you're a data professional or someone looking to expand your knowledge, understanding Databricks...

Latest Reply
Rishabh-Pandey
Esteemed Contributor
  • 0 kudos

@samandrew3 keep up the good work 

