- 2427 Views
- 2 replies
- 1 kudos
Data lineage on views
I do not know if this is the intended behavior of data lineage, but to me it is odd. When I create a view based on two tables, the upstream data lineage looks correct. But when I replace the view to use only one of the tables, then the upstream data lineage ...
After some thought, I have come to this conclusion: data lineage on views is working as one should expect it to. I strongly recommend that this feature be redesigned so it shows the result of the latest view.
- 4768 Views
- 3 replies
- 0 kudos
Iterative reads and writes cause java.lang.OutOfMemoryError: GC overhead limit exceeded
I have an iterative algorithm that reads and writes a dataframe while iterating through a list of partitions, like this: for p in partitions_list: df = spark.read.parquet("adls_storage/p") df.write.format("delta").mode("overwrite").option("partitionOver...
@daniel_sahal I've attached the wrong snip. Actually it is Full GC (Ergonomics) that was bothering me. I am now attaching the correct snip. But, as you said, I scaled up a bit. The thing I forgot to mention is that the table is wide - more than 300 column...
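A mitigation sometimes suggested for this pattern (an assumption here, not something stated in the thread) is to process the partition list in small batches, letting each write job finish and release memory before the next begins. A minimal Python sketch, with `partitions_list` and the storage paths assumed from the original snippet:

```python
# Hedged sketch: processing partitions in small batches keeps fewer JVM
# objects live at once, which can help with "GC overhead limit exceeded"
# on wide tables. `partitions_list` and the paths are assumptions taken
# from the original (truncated) snippet.

def chunks(items, size):
    """Yield successive batches of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# On a cluster one might then write batch by batch (note the f-string --
# the literal "adls_storage/p" in the original snippet would read the
# same path on every iteration):
#
# for batch in chunks(partitions_list, 10):
#     for p in batch:
#         df = spark.read.parquet(f"adls_storage/{p}")
#         (df.write.format("delta").mode("overwrite")
#            .option("partitionOverwriteMode", "dynamic")
#            .save("delta_storage"))
```

Batching does not change the total work, but it bounds how much driver- and executor-side state accumulates between garbage-collection cycles.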
- 2324 Views
- 1 replies
- 3 kudos
Resolved! Using DeltaTable.merge() and generating surrogate keys on insert?
I'm using merge to upsert data into a table: DeltaTable.forName(DESTINATION_TABLE).as("target").merge(merge_df.as("source"), "source.topic = target.topic and source.key = target.key").whenMatched().updateAll().whenNotMatched().insertAll().execute() I'd ...
@Dekova 1) uuid() is non-deterministic, meaning it will give you a different result each time you run the function. 2) Per the documentation: "For Databricks Runtime 9.1 and above, MERGE operations support generated columns when you set spark.databri...
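The non-determinism point in the reply can be illustrated with plain Python's uuid module. The `surrogate_key` helper below is a hypothetical alternative (not from the thread): deriving the key deterministically from the natural key means a re-run produces the same surrogate key.

```python
import uuid

# uuid4() is non-deterministic: two calls never agree, so a surrogate key
# based on it cannot be recomputed on a re-run of the merge.
a, b = uuid.uuid4(), uuid.uuid4()
assert a != b

# Hypothetical deterministic alternative: derive the key from the natural
# key (topic, key) with uuid5, so the same pair always maps to the same
# surrogate key.
def surrogate_key(topic: str, key: str) -> uuid.UUID:
    return uuid.uuid5(uuid.NAMESPACE_URL, f"{topic}/{key}")

assert surrogate_key("orders", "42") == surrogate_key("orders", "42")
```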
- 2976 Views
- 4 replies
- 0 kudos
Import dbfs file into workspace using Python SDK
Hello, I am looking to replicate the functionality provided by the databricks_cli Python package using the Python SDK. Previously, using the databricks_cli WorkspaceApi object, I could use the import_workspace or import_workspace_dir methods to move a...
I am also looking for a way to bring files present in S3 into the Workspace programmatically.
- 602 Views
- 0 replies
- 0 kudos
Big time differences in reading tables
When I read a managed table in #databricks# I see big differences in the time spent. A small test table with just 2 records is loaded once in 3 seconds and another time in 30 seconds. Reading table_change for this tiny table took 15 minutes. I don't know ...
- 1953 Views
- 2 replies
- 4 kudos
Resolved! Is there a plan to support workflow jobs to be stored in a subfolder?
I have many workflow jobs created and they are all in a flat list. Is there a way to create (something like) subfolders so that I can categorize my Databricks workflow jobs (a kind of organizer)?...
@Anonymous thanks for the suggestion. And thanks a lot @Vinay_M_R for answering the question. The solution mentioned is doable, but a less optimal way to do it. Everyone on the team has to follow the same rules, especially for shared jobs, and sometimes n...
- 2088 Views
- 0 replies
- 0 kudos
terraform jobs depends_on
I am attempting to automate job creation using the Databricks Terraform provider. I have a number of tasks that will "depends_on" each other and am trying to use dynamic content to do this. Each task name is stored in a string array, so looping over th...
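A hedged sketch of what a dynamic "depends_on" chain might look like with the Databricks Terraform provider. The resource shape follows the provider's nested task/depends_on blocks, but the variable name and the chaining logic are assumptions, and index() misbehaves if task names repeat:

```hcl
# Hypothetical config fragment: build a linear chain of tasks from a list.
variable "task_names" {
  type = list(string)
}

resource "databricks_job" "this" {
  name = "chained-tasks"

  dynamic "task" {
    for_each = var.task_names
    content {
      task_key = task.value

      # Each task after the first depends on its predecessor in the list.
      dynamic "depends_on" {
        for_each = (
          index(var.task_names, task.value) == 0
          ? []
          : [var.task_names[index(var.task_names, task.value) - 1]]
        )
        content {
          task_key = depends_on.value
        }
      }

      # (task implementation, e.g. a notebook_task block, omitted)
    }
  }
}
```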
- 505 Views
- 0 replies
- 0 kudos
Init scripts in legacy workspace (pre-E2)
Hello, I've got a legacy workspace (not E2) and I am trying to move my cluster-scoped init script from DBFS to the workspace area. It doesn't seem to be possible to store a shell script in the workspace area (accepted formats: .dbc, .scala, .py, .sq...
- 1436 Views
- 3 replies
- 1 kudos
Databricks on AWS
I want to host Databricks on AWS. I want to know: if we create Databricks on top of AWS, will it be created in the same account's VPC, or will it be created outside of my AWS account? If it is going to be created in my account, will it create a new VPC for me? T...
Hi, if you want to know more about how to properly set up Databricks on top of AWS, I would really recommend the AWS platform administrator course from Databricks. It explains everything you need to know. Hope this helps. Kin...
- 1670 Views
- 1 replies
- 0 kudos
503 Error from Databricks when Cluster Inactive/Starting Up via Alteryx
Hello, I have been connecting to Databricks via Alteryx. It works fine when our cluster is active, but returns a 503 Service Unavailable error if the cluster is inactive/starting up. I have previously reached out to Alteryx, but they have told me this...
I should have mentioned in the original post that we are using Microsoft Azure and the Simba Spark ODBC driver.
- 683 Views
- 0 replies
- 0 kudos
How to access ADLS Gen2 hdfs from a databricks cluster which has credential passthrough enabled?
When executing through a Databricks cluster with credential passthrough enabled, I wish to obtain supplementary file attributes in ADLS, such as the file's last-modified time, which are currently unavailable through the dbutils.fs.ls function. W...
- 2443 Views
- 1 replies
- 1 kudos
No points shown on the new Databricks Community page
There are no points displayed on the new Databricks Community page. Is it the same for everyone, or only for me because I have done something wrong?
I have the same concern. On my account I also could not find where the points for my account are displayed.
- 2969 Views
- 3 replies
- 1 kudos
Unable to access S3 objects from Databricks using IAM access keys in both AWS and Azure Databricks
Hi Team, we are trying to connect to an Amazon S3 bucket from Databricks running on both AWS and Azure, using IAM access keys directly through Scala code in a notebook, and we are facing com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden; with stat...
Hi @Obulreddy, we haven't heard from you since the last response from @KaKa, and I was checking back to see if her suggestions helped you. If you have found a solution, please share it with the community, as it can be helpful to others. Also,...
- 2584 Views
- 4 replies
- 3 kudos
How to read GCS paths with square brackets?
Hi! I'm trying to read a file from GCS using Scala, where the file path contains square brackets. I keep getting the following error: URISyntaxException: Illegal character in path at index 209. I tried putting an extra forward slash in front of them but it sti...
Hi @Kaniz! Thank you for your help. However, when I try using your code I still get an error: "URISyntaxException: Illegal character in path at index ...". I'm trying to read a txt file. This is the file path: "gs://my-bucket/my Data/sparkTests/GM-1220,...
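One workaround (an assumption, not confirmed in the thread) is to percent-encode the illegal characters before handing the path to the reader, since "[" and "]" are not legal in a URI path, which is exactly what URISyntaxException complains about. The bucket and file names below are hypothetical stand-ins for the truncated path:

```python
from urllib.parse import quote

# Hypothetical path modeled on the thread; the exact bucket and file
# names are assumptions. Percent-encoding replaces the characters that
# are illegal in a URI path ("[", "]", space, comma) while leaving the
# "/" separators intact.
raw = "gs://my-bucket/my Data/sparkTests/GM-1220,[test].txt"

scheme, rest = raw.split("://", 1)
encoded = scheme + "://" + quote(rest, safe="/")

print(encoded)
# gs://my-bucket/my%20Data/sparkTests/GM-1220%2C%5Btest%5D.txt
```

Whether the GCS connector then resolves the encoded path to the original object name depends on the connector version, so this is a starting point rather than a guaranteed fix.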
- 3046 Views
- 0 replies
- 0 kudos
List all delta tables in a database with total size, last snapshot size and user using python/sql
I am trying to list all delta tables in a database and retrieve the following columns: `totalsizeinbyte`, `sizeinbyte` (i.e. the size of the last snapshot) and `created_by` (`lastmodified_by` could also work). Checking online, I came across the foll...
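For the snapshot size, DESCRIBE DETAIL is the usual starting point; note that its output does not include a created_by column, which typically has to come from the table history instead. A sketch under those assumptions (the database name and Spark session are assumed, shown in comments), with a small pure-Python formatter for the byte counts:

```python
# Hedged sketch. DESCRIBE DETAIL reports the current snapshot's
# sizeInBytes; the creating user is not part of its output (DESCRIBE
# HISTORY's userName column is the usual place to look for that).

def human_size(n: int) -> str:
    """Render a byte count as a short human-readable string."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if n < 1024:
            return f"{n:.0f} {unit}"
        n /= 1024
    return f"{n:.0f} PB"

# On a cluster one might then loop over a database (names assumed):
#
# for t in spark.catalog.listTables("my_db"):
#     d = spark.sql(f"DESCRIBE DETAIL {t.database}.{t.name}").collect()[0]
#     print(t.name, human_size(d.sizeInBytes))
```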