Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

m997al
by Contributor III
  • 167 Views
  • 1 replies
  • 0 kudos

Errors using Databricks Extension for VS Code on Windows

Hi - I am trying to get my VS Code (running on Windows) to work with the Databricks extension for VS Code. It seems like I can almost get this to work. Here is my setup: 1. Using Databricks Extension v2.4.0  2. Connecting to Databricks cluster with ru...

Latest Reply
m997al
Contributor III
  • 0 kudos

So I found my problem(s). I had a local environment variable called "DATABRICKS_HOST" that was set to the wrong URL. My Databricks runtime version and the databricks-connect version were not the same. When I made them both 15.4.x, everything works a...
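
A quick way to double-check both of those locally is a small sketch like the one below (assuming a local Python environment with databricks-connect installed; 15.4.x is just an example and should match whatever DBR your cluster runs):

import os
from importlib.metadata import PackageNotFoundError, version

# DATABRICKS_HOST, if set, overrides other config and must point at the right workspace URL.
print("DATABRICKS_HOST =", os.environ.get("DATABRICKS_HOST", "<not set>"))

# databricks-connect should match the cluster's Databricks Runtime, e.g. 15.4.x for DBR 15.4.
try:
    print("databricks-connect =", version("databricks-connect"))
except PackageNotFoundError:
    print("databricks-connect is not installed in this environment")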

Dave1967
by New Contributor III
  • 118 Views
  • 2 replies
  • 0 kudos

Resolved! Serverless Compute no support for Caching data frames

Can anyone please tell me why df.cache() and df.persist() are not supported in Serverless compute? Many thanks

Latest Reply
gchandra
Valued Contributor II
  • 0 kudos

Global caching functionality (and other global state used on classic clusters) is conceptually hard to represent on serverless compute. The serverless Spark cluster optimizes caching itself, rather than leaving it to the user.
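
A rough sketch of the usual alternative on serverless: materialize an intermediate result to a Delta table instead of calling cache()/persist(). This assumes the ambient spark session of a Databricks notebook, and the main.tmp catalog/schema names are placeholders:

# Example intermediate result (in practice, your transformed DataFrame).
intermediate_df = spark.range(1000).withColumnRenamed("id", "value")

# Materialize it once instead of calling cache()/persist()...
intermediate_df.write.mode("overwrite").saveAsTable("main.tmp.intermediate_result")

# ...and read it back wherever the cached DataFrame would have been reused.
reused_df = spark.table("main.tmp.intermediate_result")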

1 More Replies
mppradeesh
by New Contributor
  • 82 Views
  • 1 replies
  • 1 kudos

Connecting to redshift from databricks notebooks without password using IAM

Hello all, have you ever tried to connect to Redshift from Databricks notebooks without a password, using IAM? Pradeesh M P

Latest Reply
datastones
Contributor
  • 1 kudos

You should create an IAM assumed role, adding the AWS account that hosts your Databricks environment as the principal.
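
A hedged sketch of how that can look from a notebook, assuming the cluster's instance profile (or assumed role) is allowed to call redshift:GetClusterCredentials; the region, user, database, cluster identifier, URL and bucket below are all placeholders:

import boto3

# Temporary database credentials minted via the cluster's instance profile / assumed role.
redshift_api = boto3.client("redshift", region_name="us-east-1")
creds = redshift_api.get_cluster_credentials(
    DbUser="analytics_user",
    DbName="dev",
    ClusterIdentifier="my-redshift-cluster",
    DurationSeconds=900,
)

df = (
    spark.read.format("redshift")
    .option("url", "jdbc:redshift://my-redshift-cluster.abc123.us-east-1.redshift.amazonaws.com:5439/dev")
    .option("user", creds["DbUser"])
    .option("password", creds["DbPassword"])
    .option("dbtable", "public.my_table")
    .option("tempdir", "s3a://my-temp-bucket/redshift-staging/")
    .option("forward_spark_s3_credentials", "true")
    .load()
)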

lisaiyer
by New Contributor
  • 139 Views
  • 3 replies
  • 0 kudos

Resolved! fs ls lists files that I cannot navigate to and view

Hi Community - I have an issue and I did not find any effective solution, so hoping someone can help here. When I use %fs ls "dbfs:/Workspace/Shared/" I see 2 folders, but when I navigate to the folder I only see 1. Can someone help me with this issue....

Latest Reply
gchandra
Valued Contributor II
  • 0 kudos

Possibly a UI bug. Submit a case with the engineering team: https://help.databricks.com/s/ (top right, SUBMIT CASE).

2 More Replies
Srini41
by New Contributor
  • 73 Views
  • 1 replies
  • 0 kudos

org.rocksdb.RocksDBException: No space left on device

A structured stream is failing intermittently with the following message: org.rocksdb.RocksDBException: No space left on device. Is there a setting on Databricks that assigns limited disk space to the checkpoint tracking? Appreciate any help with resolving ...

Latest Reply
gchandra
Valued Contributor II
  • 0 kudos

Can you share details like the DBR version, and are you using any foreachBatch?

my_community2
by New Contributor III
  • 11238 Views
  • 9 replies
  • 6 kudos

Resolved! dropping a managed table does not remove the underlying files

The documentation states that "drop table": Deletes the table and removes the directory associated with the table from the file system if the table is not an EXTERNAL table. An exception is thrown if the table does not exist. In case of an external table...

Latest Reply
MajdSAAD_7953
New Contributor II
  • 6 kudos

Hi, is there a way to force delete files after dropping the table, without waiting 30 days to see the size in S3 decrease? The tables I dropped relate to dev and staging; I don't want to keep their files for 30 days.
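
For managed Delta tables you still control, one common (but risky) way to shrink storage sooner is to VACUUM with a short retention before dropping; whether this helps with the 30-day window for already-dropped tables depends on how your workspace retains dropped managed tables. Only a sketch, with a placeholder table name:

# Allow a retention window below the default safety threshold (use with care).
spark.conf.set("spark.databricks.delta.retentionDurationCheck.enabled", "false")

# Remove files no longer referenced by the current table version, then drop the table.
spark.sql("VACUUM dev_catalog.staging.my_table RETAIN 0 HOURS")
spark.sql("DROP TABLE dev_catalog.staging.my_table")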

8 More Replies
804082
by New Contributor III
  • 166 Views
  • 2 replies
  • 1 kudos

DLT Direct Publishing Mode

Hello, I'm working on a DLT pipeline and have a block of SQL that runs... USE CATALOG catalog_a; USE SCHEMA schema_a; CREATE OR REFRESH MATERIALIZED VIEW table_a AS SELECT ... FROM catalog_b.schema_b.table_b; Executing this block returns the following.....

Latest Reply
804082
New Contributor III
  • 1 kudos

Ah, thank you! I couldn't find anything about Direct Publishing Mode through my Google search (in fact, if you search it now, this post is the top result). I believe I'll be waiting until it goes into public preview/GA. Removing USE CATALOG and USE SC...

1 More Replies
AH
by New Contributor III
  • 138 Views
  • 2 replies
  • 0 kudos

Databricks Genie AI Next Level Q&A

Hi team, could you please answer these three questions to onboard Genie Space for analytics? Do you know if we can use Genie Space in our web application through an API or SDK? Is there any way to manage access control other than Unity Catalog? Can we ad...

Latest Reply
jennie258fitz
New Contributor II
  • 0 kudos

@AH wrote: Hi team, could you please answer these three questions to onboard Genie Space for analytics? Do you know if we can use Genie Space in our web application through an API or SDK? Is there any way to manage access control other than the Uni...

1 More Replies
shadowinc
by New Contributor III
  • 1750 Views
  • 6 replies
  • 3 kudos

Databricks SQL endpoint as Linked Service in Azure Data Factory

We have a special endpoint that grants access to Delta tables, and we want to know if we can use SQL endpoints as a linked service in ADF. If yes, which ADF linked service would be suitable for this? Appreciate your support on this.

Data Engineering
SQL endpoint
Latest Reply
SANJAYKJ
New Contributor II
  • 3 kudos

I see the linked service connector "Azure Databricks Delta Lake" in Azure Data Factory; isn't that connecting to the Databricks SQL endpoint?

5 More Replies
MrJava
by New Contributor III
  • 7364 Views
  • 13 replies
  • 12 kudos

How to know, who started a job run?

Hi there! We have different jobs/workflows configured in our Databricks workspace running on AWS and would like to know who actually started the job run. Are they started by a user or by a service principal using curl? Currently one can only see who is t...

Latest Reply
Gianfranco
New Contributor II
  • 12 kudos

It is possible to retrieve this information through the system tables. Here is an example:

select *, user_identity.email
from system.access.audit
where event_date = 'xxx'
  and workspace_id = 'xxx'
  and service_name = 'jobs'
  and request_params.job_id = 'xxx'

12 More Replies
PearceR
by New Contributor III
  • 11267 Views
  • 4 replies
  • 1 kudos

Resolved! custom upsert for delta live tables apply_changes()

Hello community :). I am currently implementing some pipelines using DLT. They are working great for my medallion architecture: landed JSON in bronze -> silver (using apply_changes), then materialized gold views on top. However, I am attempting to crea...

Latest Reply
Harsh141220
New Contributor II
  • 1 kudos

Is it possible to have custom upserts for streaming tables in Delta Live Tables? I'm getting the error: pyspark.errors.exceptions.captured.AnalysisException: `blusmart_poc.information_schema.sessions` is not a Delta table.
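
For context, outside of DLT's apply_changes the usual pattern for a custom upsert on a streaming source is foreachBatch plus a Delta MERGE on a plain Structured Streaming job. A sketch with placeholder table names, key column and checkpoint path:

from delta.tables import DeltaTable

def upsert_batch(batch_df, batch_id):
    # MERGE each micro-batch into the target table on a key column.
    target = DeltaTable.forName(spark, "dev_catalog.silver.events")
    (target.alias("t")
        .merge(batch_df.alias("s"), "t.id = s.id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())

(spark.readStream.table("dev_catalog.bronze.events")
    .writeStream
    .foreachBatch(upsert_batch)
    .option("checkpointLocation", "/Volumes/dev_catalog/silver/checkpoints/events")
    .start())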

3 More Replies
Ambika
by New Contributor
  • 4425 Views
  • 2 replies
  • 1 kudos

Error while Resetting my Community Edition Password

I recently tried to create my account with Databricks Community Edition. I signed up for it and received a verification email. After that I had to reset my password, but while doing so I always get the following error. Can someone help me ...

Latest Reply
swredb
New Contributor II
  • 1 kudos

Receiving the same error when creating a new account - "An error has occurred. Please try again later."

1 More Replies
majo2
by New Contributor II
  • 292 Views
  • 1 replies
  • 2 kudos

tqdm progressbar in Databricks jobs

Hi, I'm using Databricks workflows to run a training job using `pytorch` + `lightning`. `lightning` has a built-in progress bar built on `tqdm` that tracks the progress. It works OK when I run the notebook outside of a workflow. But when I try to run n...

Latest Reply
spatel
New Contributor II
  • 2 kudos

I'm having the same issue. I even tried the following:

from tqdm.autonotebook import tqdm

but when I do this:

for row in tqdm(df.itertuples(index=False)):

I don't see the progress bar in the Databricks workflow notebook.
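
One workaround that sometimes helps in job runs (only a sketch, assuming a pandas DataFrame named df as in the snippet above) is to force tqdm into plain-text output on stdout so the driver log can render it:

import sys
from tqdm import tqdm

# Plain-text progress on stdout, refreshed at most every few seconds.
for row in tqdm(df.itertuples(index=False), total=len(df),
                file=sys.stdout, ascii=True, mininterval=5.0):
    pass  # per-row work goes here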

pinaki1
by New Contributor III
  • 155 Views
  • 1 replies
  • 0 kudos

PySparkRuntimeError: [CONTEXT_ONLY_VALID_ON_DRIVER] It appears that you are attempting to reference

Getting the above error for this line: result_df.rdd.foreachPartition(self.process_partition)

Latest Reply
User16753724828
New Contributor III
  • 0 kudos

The error message "CONTEXT_ONLY_VALID_ON_DRIVER" indicates that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that runs on workers. This is ...
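
A minimal sketch of the usual fix for the line in the original post (result_df is assumed from that post): make the partition handler a plain module-level function, or a @staticmethod, so the closure sent to the workers does not capture self and, through it, the SparkSession/SparkContext:

# Module-level function: nothing driver-only (spark, sc, dbutils, self) is captured
# in the closure that gets shipped to the workers.
def process_partition(rows):
    for row in rows:
        ...  # pure-Python, per-row work

result_df.rdd.foreachPartition(process_partition)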

