Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

mppradeesh
by New Contributor
  • 1124 Views
  • 1 replies
  • 1 kudos

Connecting to Redshift from Databricks notebooks without a password using IAM

Hello all, have you ever tried to connect to Redshift from Databricks notebooks without a password, using IAM? Pradeesh M P

Latest Reply
datastones
Contributor
  • 1 kudos

You should create an IAM role that can be assumed, adding the AWS account that hosts your Databricks environment as the principal.
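A minimal sketch of the trust relationship this reply describes, assuming a hypothetical account ID (`123456789012`) in place of the AWS account that hosts the Databricks environment. The helper name `build_trust_policy` is illustrative only, and the reply does not say which Redshift permissions the role needs, so none are shown here.

```python
import json


def build_trust_policy(databricks_account_id: str) -> dict:
    """Return an IAM trust policy that lets the given AWS account assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The account hosting Databricks is the trusted principal (placeholder ID).
                "Principal": {"AWS": f"arn:aws:iam::{databricks_account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


policy = build_trust_policy("123456789012")
print(json.dumps(policy, indent=2))
```

The printed JSON is the kind of document that would be attached as the role's trust relationship. The permissions policy for Redshift access is a separate document.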

lisaiyer
by New Contributor II
  • 1718 Views
  • 3 replies
  • 0 kudos

Resolved! fs ls lists files that I cannot navigate to and view

Hi Community - I have an issue and have not found an effective solution, so I am hoping someone can help here. When I use %fs ls "dbfs:/Workspace/Shared/" I see 2 folders, but when I navigate to the folder I only see 1. Can someone help me with this issue....

Latest Reply
gchandra
Databricks Employee
  • 0 kudos

Possible UI bug. Submit a case with the Engineering team at https://help.databricks.com/s/ (use SUBMIT CASE at the top right).

2 More Replies
Srini41
by New Contributor
  • 1247 Views
  • 1 replies
  • 0 kudos

org.rocksdb.RocksDBException: No space left on device

A Structured Streaming job is failing intermittently with the following message: org.rocksdb.RocksDBException: No space left on device. Is there a setting on Databricks to assign limited disk space to checkpoint tracking? Appreciate any help with resolving ...

Latest Reply
gchandra
Databricks Employee
  • 0 kudos

Can you share details such as the DBR version, and whether you are using any foreachBatch?

my_community2
by New Contributor III
  • 24087 Views
  • 9 replies
  • 6 kudos

Resolved! dropping a managed table does not remove the underlying files

The documentation states that "drop table": Deletes the table and removes the directory associated with the table from the file system if the table is not EXTERNAL table. An exception is thrown if the table does not exist. In case of an external table...

Latest Reply
MajdSAAD_7953
New Contributor II
  • 6 kudos

Hi, is there a way to force-delete the files after dropping a table, rather than waiting 30 days for the size in S3 to decrease? The tables I dropped belong to dev and staging, and I don't want to keep their files for 30 days.

8 More Replies
PearceR
by New Contributor III
  • 18090 Views
  • 4 replies
  • 1 kudos

Resolved! custom upsert for delta live tables apply_changes()

Hello community :). I am currently implementing some pipelines using DLT. They are working great for my medallion architecture for landed JSON in bronze -> silver (using apply_changes), then materialized gold views on top. However, I am attempting to crea...

Latest Reply
Harsh141220
Databricks Partner
  • 1 kudos

Is it possible to have custom upserts for streaming tables in Delta Live Tables? I'm getting the error: pyspark.errors.exceptions.captured.AnalysisException: `blusmart_poc.information_schema.sessions` is not a Delta table.

3 More Replies
Ambika
by New Contributor
  • 6421 Views
  • 2 replies
  • 1 kudos

Error while Resetting my Community Edition Password

I recently tried to create my account with Databricks Community Edition. I signed up and received the verification email. After that, I have to reset my password, but whenever I do so I get the following error. Can someone help me ...

Latest Reply
swredb
New Contributor II
  • 1 kudos

Receiving the same error when creating a new account - "An error has occurred. Please try again later."

1 More Replies
pinaki1
by New Contributor III
  • 2615 Views
  • 1 replies
  • 0 kudos

PySparkRuntimeError: [CONTEXT_ONLY_VALID_ON_DRIVER] It appears that you are attempting to reference

Getting the above error for this line: result_df.rdd.foreachPartition(self.process_partition)

Latest Reply
Pradeep54
Databricks Employee
  • 0 kudos

The error message "CONTEXT_ONLY_VALID_ON_DRIVER" indicates that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that runs on workers. This is ...
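A minimal sketch of one way to keep the per-partition logic free of driver-only objects, assuming the work can be expressed as a plain module-level function. The function body, the column names `id` and `value`, and the sample rows are hypothetical. The original `self.process_partition` is not shown in the post.

```python
from typing import Iterable, List


def process_partition(rows: Iterable[dict]) -> List[str]:
    """Runs on a worker: uses only the rows it is handed, never SparkContext or SparkSession."""
    return [f"{row['id']}:{row['value']}" for row in rows]


# Local stand-in for one partition, so the function can be checked without a cluster.
sample_partition = [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]
processed = process_partition(sample_partition)
print(processed)

# On a cluster the same function would be handed to Spark, for example:
#   result_df.rdd.foreachPartition(process_partition)
# Passing a bound method such as self.process_partition also serializes `self`,
# which fails when `self` holds a SparkSession or SparkContext.
```

Because the function takes only an iterator of rows, nothing from the driver has to be shipped to the workers along with it.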

Saf4Databricks
by Contributor
  • 9734 Views
  • 4 replies
  • 0 kudos

Reading JSON from Databricks Workspace

I am using the second example from Databricks' official document here: Work with workspace files. But I'm getting the following error. Question: What could be the cause of the error, and how can we fix it? ERROR: Since Spark 2.3, the queries from raw JSON/CSV fi...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @Saf4Databricks, as you said, you probably need to add the multiline option to make it work. You can use this option when creating a temporary view or when using the PySpark API. Below is an example of creating a temporary view: CREATE TEMPORARY VIEW multilineJson U...
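A minimal sketch of why the multiline option matters, written in plain Python rather than Spark so it runs anywhere. It assumes the workspace file is a pretty-printed JSON document. The sample content and the helper `parse_line_by_line` are hypothetical and only mimic a reader that expects one JSON value per line.

```python
import json

# A pretty-printed (multi-line) JSON document, standing in for the workspace file.
pretty_json = """{
  "name": "example",
  "values": [1, 2, 3]
}"""


def parse_line_by_line(text: str) -> list:
    """Mimic a line-delimited reader: every line must be a complete JSON value."""
    records = []
    for line in text.splitlines():
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            records.append(None)  # comparable to a corrupt record
    return records


line_results = parse_line_by_line(pretty_json)
whole_document = json.loads(pretty_json)

print(line_results)
print(whole_document)

# The PySpark counterpart of parsing the whole document (path is a placeholder):
#   spark.read.option("multiLine", True).json("<path-to-file>")
```

Read line by line, no line of the document is valid JSON on its own, which is consistent with a result that contains only corrupt records. Read as a whole, the same text parses cleanly.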

3 More Replies
jaredrohe
by New Contributor III
  • 7094 Views
  • 5 replies
  • 2 kudos

Instance Profiles Do Not Work with Delta Live Tables Default Cluster Policy Access Mode "Shared"

Hello, I am attempting to configure Auto Loader in file notification mode with Delta Live Tables. I configured an instance profile, but it is not working because I immediately get AWS access denied errors. This is the same issue that is referenced here...

Data Engineering
Access Mode
Delta Live Tables
Instance Profiles
No Isolation Shared
Latest Reply
AcrobaticMonkey
New Contributor III
  • 2 kudos

Same issue here: the instance profile works fine for both No Isolation and Single User access modes, but not for Shared.

4 More Replies
IsmaelHenzel1
by New Contributor II
  • 2726 Views
  • 1 replies
  • 1 kudos

Resolved! Delta Live Tables - ForeachBatch

I am wondering how to create complex streaming queries using Delta Live Tables (DLT). I can't find a way to use foreachBatch with it, and this is causing me some difficulty. I need to create a window using a lag without a time range, which is not pos...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 1 kudos

Hi @IsmaelHenzel1, how are you doing today? As per my understanding, consider using Delta Live Tables (DLT) materialized views to handle complex streaming logic, as DLT doesn’t currently support foreachBatch. For windowing with lag, DLT materialized views ...

indianaDE
by New Contributor
  • 1392 Views
  • 1 replies
  • 0 kudos

%run and Repos path error

We have one notebook (N1) which uses the %run command to call a second notebook (N2), which in turn calls a third notebook (N3) using %run. When running the %run cell within N2, N3 is successfully called and run. When running the %run cell within N1 we get...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 0 kudos

Hi @indianaDE, how are you doing today? As per my understanding, consider checking the relative paths you’re using in the %run commands, as the recent update might have changed how Databricks resolves paths for notebooks under the new Workspace/Repos ...

oakhill
by New Contributor III
  • 3501 Views
  • 3 replies
  • 0 kudos

Cannot develop Delta Live Tables using Runtime 14 or 15.

When trying to develop a Delta Live Tables pipeline with my very generic clusters (runtime 14.3 or 15.4 LTS), I get the following error: The Delta Live Tables (DLT) module is not supported on this cluster. You should either create a new pipeline or us...

Latest Reply
zoe-durand
Databricks Employee
  • 0 kudos

Hi @oakhill, as stated above, in order for DLT notebooks to work well you need to create a pipeline (which it sounds like you did!). You are correct - running a notebook cell will trigger a "Validate" action on the entire pipeline code. Alternativel...

2 More Replies
Sampath_Kumar
by New Contributor II
  • 14731 Views
  • 2 replies
  • 0 kudos

Volume Limitations

I have a use case to create a table using JSON files. There are 36 million files upstream (in an S3 bucket). I just created a volume on top of it, so the volume has 36M files. I'm trying to form a DataFrame by reading this volume using the below sp...

yagmur
by New Contributor II
  • 1748 Views
  • 1 replies
  • 0 kudos

Authentication error on Git status fetch

When I try to change the branch, I cannot; it says I need to create a repo. Then I try to create a repo, but it says my Git credentials need to be corrected. I tried both an access token and Azure Active Directory, but it is still not working. Do I need anoth...

Latest Reply
nicole_lu_PM
Databricks Employee
  • 0 kudos

Hi Yagmur, You should not need admin access in the workspace to create Git folders, but you need access to the remote repository you are trying to clone. Can you check your token by cloning the remote repo locally? If you continue to run into issues,...
