Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.
Hi, I would like to get some support in creating a Community User Group in Madrid, Spain. It would be nice to host events/meetings/discussions ... Regards, Ángel
When I save a certain Python notebook where I have selected Hide Code and Hide Results on certain cells, those conditions persist. For example, when I come back the next day in a new session, the hidden material is still hidden. When the notebook is ...
In my situation we cannot split this notebook, as the ADF pipeline is already in PROD. I have tried to use the %%capture option. It helps to run the notebook within the size limits, but somehow it is corrupting the output. Also checked in the Databricks AI and...
Hello all, when I try to deploy my bundle, I get the following error. I can't edit the bundle.tf.json; I suppose it is created automatically. Does anyone have a solution for the same problem? Many thanks, Can
$ databricks bundle deploy -t dev
Building my_p...
I have a setup where I am replicating Delta Live Tables Parquet and checkpoint files using Azure RA-GZRS to a peer region for disaster recovery. When I load the replicated files in the peer region using Delta format, I get an error that _delta_log/0000...
Team, I get a ConcurrentAppendException: Files were added to the root of the table by a concurrent update when trying to update a table, which runs via jobs inside a ForEach activity in ADF. I tried with Databricks Runtime 14.x and set the delete vect...
In case of such an issue, I would suggest applying retry and try/except logic (you can use one of the existing libraries) in both concurrent updates. It should help, and the jobs won't report any errors.
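A minimal sketch of the retry idea, in plain Python so it runs anywhere. The names `run_with_retry` and `ConcurrentAppendError` are hypothetical; on Databricks you would pass the real conflict exception class in `retry_on` (to my knowledge delta-spark exposes it as `delta.exceptions.ConcurrentAppendException`, but please verify against your runtime). Retrying is only safe if the update is idempotent, e.g. a MERGE keyed on a unique ID.

```python
import random
import time


class ConcurrentAppendError(Exception):
    """Hypothetical stand-in for the conflict exception raised by the table writer."""


def run_with_retry(operation, retry_on=(ConcurrentAppendError,), max_attempts=5,
                   base_delay=1.0, sleep=time.sleep):
    """Call operation(); if it raises one of retry_on, back off and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error to the job
            # Exponential backoff plus jitter so concurrent writers do not retry in lockstep.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay)
            sleep(delay)
```

Usage would look like `run_with_retry(lambda: update_table(batch))`, where `update_table` is whatever function performs the write (also a placeholder name here).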
Hello, we are using a 5 worker node DLT job compute for a continuous mode streaming pipeline. The worker configuration is Standard_D4ads_v5, i.e. 4 cores, so total cores across 5 workers is 20 cores. We have wide transformation at some places in the pipe...
Hi @PushkarDeole, each Delta Live Tables pipeline has two associated clusters: the updates cluster processes pipeline updates, and the maintenance cluster runs daily maintenance tasks. According to the docs, if you want to configure settings at the pipeline lev...
Hi, I am planning to take the Databricks Associate exam in the upcoming week. My current ID proof is my original driving license issued by the Tamil Nadu government; however, it is laminated rather than a hard plastic card. Could you please confirm if ...
Hi Databricks community and @Cert-Team, can you please help me here? The reason I am asking is that Microsoft certification doesn’t accept laminated ID proofs.
We currently have approximately 60 tables that collectively exceed 500GB, representing 99.9% of our total database size. One potential solution is to migrate these larger tables to Databricks, which may help us mitigate synchronization costs and addr...
I'm trying to create a Community Edition account, but every time I try it shows "An error has occurred. Please try again later." I have attached the screenshot. I also tried accessing Databricks on a different PC based on the answers from previous threads...
By any chance is that email address already registered with another Databricks Tier (like company or paid)?
Can you try in incognito mode with all ad blockers/popup blockers disabled?
I’m trying to fetch billing data from an AWS account using boto3 to assume a role that has access to this information. This operation works fine in No Isolation and Single User access modes, but it fails in Shared access mode. Since I need to store t...
Instance profiles are open to all, so they won't work in Shared access mode.
Use storage credentials to create external locations; it is a cleaner way to govern access.
I am attempting to create a Databricks Repo in a workspace via Terraform. I would like the Repo and the associated Git Credential to be associated with a Service Principal. In my initial run, the Terraform provider is associated with the user defined ...
Hi Kinger and Debi-Moha,
Do the steps in the "Use a service principal with Databricks Git folders" documentation work for you?
Specifically for Terraform: https://docs.databricks.com/en/repos/ci-cd-techniques-with-repos.html#terraform-integration
Th...
Hey guys, after some successful data preprocessing without any errors, I have a final DataFrame with a shape of ~(200M, 150). The cluster I am using has sufficient RAM + CPUs + autoscaling, and all metrics looked fine after the job was done. The pr...
@szymon_dybczak I was able to resolve it! Basically, I broke the process down into further subprocesses; for each subprocess, I cached the results and wrote them all into a Delta table (without overwriting). The next subprocess needs to read data in the Delta tabl...
I am displaying a table in a notebook dashboard. One column of the data is conceptually a list of strings. I can originate or convert the list as whatever format would be useful (as a string representing a JSON array, as an ARRAY struct, etc.). I w...
Hi @DavidKxx, what you can do is convert your array into an HTML-formatted string with bullet points. Here is the code:
# Sample data with an array column
data = [
    (1, ['Apple', 'Banana', 'Cherry']),
    (2, ['Dug', 'Elephant']),
    (3, ['Fish...
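Since the snippet above is cut off, here is a self-contained sketch of the same idea in plain Python. It is not the original poster's code: the helper names `to_bullet_html` and `rows_to_html_table` and the sample rows are made up for illustration. In a Databricks notebook, the resulting string could be passed to `displayHTML`.

```python
from html import escape


def to_bullet_html(items):
    """Render a list of strings as an HTML unordered list."""
    bullets = "".join(f"<li>{escape(str(item))}</li>" for item in items)
    return f"<ul>{bullets}</ul>"


def rows_to_html_table(rows):
    """Build a two-column HTML table: an id and its items as a bulleted list."""
    body = "".join(
        f"<tr><td>{escape(str(row_id))}</td><td>{to_bullet_html(items)}</td></tr>"
        for row_id, items in rows
    )
    return f"<table><tr><th>id</th><th>items</th></tr>{body}</table>"


# Illustrative rows (hypothetical data).
sample_rows = [(1, ["alpha", "beta"]), (2, ["gamma & delta"])]
bullet_table = rows_to_html_table(sample_rows)
# In a notebook: displayHTML(bullet_table)
```

Escaping each value with `html.escape` keeps characters such as `&` or `<` in the data from breaking the markup.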
When I'm displaying a Table-style visualization in a notebook dashboard, is there a setting I can apply to a text column so that it automatically word-wraps text longer than the display width of the column? For example, in the following dashboard disp...
Hi @DavidKxx, that is quite a similar question to the one about displaying an array as a bullet list. Since you were successful in implementing displayHTML, what do you think about doing something similar in this case?
# Sample DataFrame with long text
data = [
    (1, '...
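The reply's code is truncated, so the following is only a sketch of how word-wrapping could be done with HTML and CSS, under the assumption that the table is rendered via `displayHTML`. The function name `wrapped_table_html`, the `max_width_px` parameter, and the sample row are hypothetical.

```python
from html import escape


def wrapped_table_html(rows, headers, max_width_px=300):
    """Return an HTML table whose cells wrap text wider than max_width_px."""
    cell_style = (
        f"max-width:{max_width_px}px; white-space:normal; "
        "overflow-wrap:break-word; vertical-align:top;"
    )
    head = "".join(f"<th>{escape(str(h))}</th>" for h in headers)
    body = "".join(
        "<tr>"
        + "".join(f'<td style="{cell_style}">{escape(str(v))}</td>' for v in row)
        + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


# Illustrative row (hypothetical data).
wrapped = wrapped_table_html([(1, "a fairly long sentence " * 10)], ["id", "text"])
# In a notebook: displayHTML(wrapped)
```

The wrapping comes from the CSS on each cell: `max-width` caps the column width and `white-space:normal` with `overflow-wrap:break-word` lets long text flow onto additional lines.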