Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Forum Posts

varshanagarajan
by New Contributor
  • 1382 Views
  • 2 replies
  • 1 kudos

Pandas API on Spark creates huge query plans

Hello, I have a piece of code written in PySpark and Pandas API on Spark. On comparing the query plans, I see Pandas API on Spark creates huge query plans whereas the PySpark plan is a tiny one. Furthermore, with Pandas API on Spark, we see a lot of incon...

Latest Reply
BS_THE_ANALYST
Honored Contributor III
  • 1 kudos

@FRB1984 could you provide some examples? I'm curious. My first thoughts would be around the shuffling. Check this out: https://spark.apache.org/docs/3.5.4/api/python/user_guide/pandas_on_spark/best_practices.html . There's an argument to be made abo...

  • 1 kudos
1 More Replies
sumner-williams
by New Contributor II
  • 191 Views
  • 1 replies
  • 1 kudos

Resolved! Table Counts

Hello, my company loads a lot of tables into a Databricks schema. I would like to build a dashboard on what has been loaded, but SQL commands like select * from information_schema do not work. Instead we have SHOW TABLES {FROM} LIKE {}; and that fails...

Latest Reply
BS_THE_ANALYST
Honored Contributor III
  • 1 kudos

Just trying to rule out some of the lower-hanging stuff. When you run your SQL statements, i.e. select * from information_schema: Are you using the correct namespace syntax, i.e. {catalog_here}.information_schema? Are you using Unity Catalog? Example of th...

  • 1 kudos
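
As a rough illustration of the namespace point raised in the reply above, the sketch below builds a catalog-qualified query against `information_schema.tables`. It assumes Unity Catalog is enabled. The helper name `build_table_inventory_query` and the catalog and schema names `main` and `sales` are hypothetical and do not come from the thread.

```python
import re

# Hypothetical helper, not from the thread: builds a catalog-qualified
# information_schema query that could feed a "what has been loaded" dashboard.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_table_inventory_query(catalog: str, schema: str) -> str:
    """Return SQL listing the tables of one schema via <catalog>.information_schema."""
    for name in (catalog, schema):
        # Accept only plain identifiers so the names can be interpolated safely.
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a simple identifier: {name!r}")
    return (
        "SELECT table_catalog, table_schema, table_name, table_type "
        f"FROM {catalog}.information_schema.tables "
        f"WHERE table_schema = '{schema}' "
        "ORDER BY table_name"
    )


# "main" and "sales" are placeholder names for this example.
query = build_table_inventory_query("main", "sales")
print(query)
```

In a Databricks notebook the resulting string could be passed to `spark.sql(query)`. Whether the query succeeds still depends on the caller's privileges on the catalog and schema.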
DataDp
by New Contributor
  • 217 Views
  • 3 replies
  • 0 kudos

What permissions are needed to fix [INSUFFICIENT_PERMISSIONS] User does not have permission to SELECT

Hi, I am getting the following error in Databricks when running a SELECT query: [INSUFFICIENT_PERMISSIONS] Insufficient privileges: User does not have permission SELECT on any file. SQLSTATE: 42501 Context: Environment: Unity Catalog enabled. I am tryin...

Latest Reply
WiliamRosa
New Contributor II
  • 0 kudos

If you’re getting an “Insufficient Permissions” error in Databricks, it usually means your user is missing one or more privileges required for the action you’re trying to perform. In Unity Catalog, for example, querying a view in dedicated compute mo...

  • 0 kudos
2 More Replies
leticialima__
by New Contributor III
  • 559 Views
  • 4 replies
  • 2 kudos

Resolved! add new column to a table and failing the previous jobs

Hello community! I’m new to Databricks and currently working on a project structured in Bronze / Silver / Gold layers using Delta Lake and Change Data Feed. I recently added 3 new columns to a table and initially applied these changes via PySpark SQ...

Latest Reply
Khaja_Zaffer
Contributor
  • 2 kudos

Hello @leticialima__, good day. Can you please share the error observed in the driver log? Is it [Errno 13] Permission denied, or No such file or directory? Please let me know the error in the driver log. Thank you.

  • 2 kudos
3 More Replies
aw1
by New Contributor
  • 148 Views
  • 1 replies
  • 0 kudos

Streamlit in Databricks

Hi, I have developed a Streamlit app locally on my desktop using dummy data, and now I want to be able to use actual data stored in Azure Blob Storage. I have tried to run the same code within a notebook, but keep on getting dependency errors. Is there...

Latest Reply
Advika
Databricks Employee
  • 0 kudos

Hello @aw1! What exact dependency errors or permission failures are you getting? Can you please share the error message?

  • 0 kudos
abhirupa7
by New Contributor
  • 217 Views
  • 2 replies
  • 0 kudos

databricks dashboard deployment (schema and catalog modification)

I have a Databricks dashboard. I have deployed the lvdash.json file through yml (resource.json) from the dev to the qa env. Now I can see my dashboard's published version in the resources folder. I want to change the catalog and schema of those underlying queries I ...

Latest Reply
alexajames
New Contributor II
  • 0 kudos

You can try using DAB to promote the dashboard and parameterize the query. For more details, check out the DAB dashboard documentation.

  • 0 kudos
1 More Replies
UddP
by New Contributor III
  • 19678 Views
  • 34 replies
  • 1 kudos

Resolved! My Databricks exam got suspended just for coming closer to laptop screen to read question and options

Hi team, my Databricks Certified Data Engineer Associate exam got suspended within 10 minutes. I had also shown my exam room to the proctor. My exam got suspended due to eye movement. I was not moving my eyes away from the laptop screen. It's hard to focus...

Latest Reply
Kavya_AD
New Contributor II
  • 1 kudos

@Cert-TeamOPS I am writing to raise a concern regarding an interruption that occurred during my Databricks Certified Data Engineer Associate exam scheduled for today at 1:15 PM. I began the exam at 1:00 PM, and the experience was smooth until I recei...

  • 1 kudos
33 More Replies
Alex79
by New Contributor II
  • 321 Views
  • 7 replies
  • 5 kudos

Resolved! How to create classes that can be instantiated from other notebooks?

Hi, I am familiar with object-oriented programming and cannot really get my head around the philosophy of coding in Databricks. My approach, which naturally consists of creating classes and instantiating objects, does not seem to be the right one. Can som...

Latest Reply
BS_THE_ANALYST
Honored Contributor III
  • 5 kudos

Legendary, @szymon_dybczak! All the best, BS

  • 5 kudos
6 More Replies
itamarwe
by New Contributor II
  • 1653 Views
  • 3 replies
  • 1 kudos

Google PubSub for DLT - Error

I'm trying to create a delta live table from a Google PubSub stream. Unfortunately I'm getting the following error: org.apache.spark.sql.streaming.StreamingQueryException: [PS_FETCH_RETRY_EXCEPTION] Task in pubsub fetch stage cannot be retried. Partiti...

Latest Reply
sahilsagar302
New Contributor II
  • 1 kudos

@itamarwe can you please share which permission resulted in the issue and how it got resolved?

  • 1 kudos
2 More Replies
Srajole
by New Contributor
  • 1562 Views
  • 2 replies
  • 2 kudos

Data load issue

I have a job in Databricks which completed successfully, but the data has not been written into the target table. I have checked everything I can think of, and each and every thing is correct in the code: target table name, source table name, etc. It is a Fu...

Latest Reply
cgrant
Databricks Employee
  • 2 kudos

This looks like a misconfigured Query Watchdog, specifically the below config: spark.conf.get("spark.databricks.queryWatchdog.outputRatioThreshold") Please check the value of this config - it is 1000 by default. Also, we recommend using Jobs Comput...

  • 2 kudos
1 More Replies
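
The check suggested in the reply above could be sketched roughly as follows. The `outputRatioThreshold` key and its default of 1000 come from the reply, and the `enabled` key is the companion Query Watchdog setting. The helper name `read_watchdog_settings`, the `<not set>` placeholder and the dictionary standing in for `spark.conf` are assumptions made so the example runs outside Databricks.

```python
from typing import Callable, Dict

# Query Watchdog settings to inspect; the second key is the one quoted in the reply.
WATCHDOG_KEYS = (
    "spark.databricks.queryWatchdog.enabled",
    "spark.databricks.queryWatchdog.outputRatioThreshold",
)


def read_watchdog_settings(conf_get: Callable[[str, str], str]) -> Dict[str, str]:
    """Collect the Query Watchdog settings using a getter such as spark.conf.get."""
    # The second argument is the fallback returned when a key has no value.
    return {key: conf_get(key, "<not set>") for key in WATCHDOG_KEYS}


# Stand-in for spark.conf.get (hypothetical values) so the sketch runs anywhere.
_fake_conf = {"spark.databricks.queryWatchdog.outputRatioThreshold": "1000"}
settings = read_watchdog_settings(lambda key, default: _fake_conf.get(key, default))

for key, value in settings.items():
    print(f"{key} = {value}")
```

On a cluster, the same helper could be called as `read_watchdog_settings(spark.conf.get)`, since `spark.conf.get` accepts a key and an optional default.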
jano
by New Contributor III
  • 290 Views
  • 1 replies
  • 1 kudos

Delta UniForm

When we save a Delta table using the UniForm option, we are seeing a 50% drop in table size. When we add UniForm to an existing Delta table after the fact, we are seeing no change in data size. Is this expected, or are others seeing this as well?

Get Started Discussions
Data Size
delta
UniForm
Latest Reply
BigRoux
Databricks Employee
  • 1 kudos

Re: "When we save a delta table using the UniForm option we are seeing a 50% drop in table size": What format are you starting with, e.g. CSV -> Delta?

  • 1 kudos
ChristianRRL
by Valued Contributor III
  • 380 Views
  • 1 replies
  • 2 kudos

Resolved! AutoLoader Pros/Cons When Extracting Data (Cross-Post)

Cross-posting from: https://community.databricks.com/t5/data-engineering/autoloader-pros-cons-when-extracting-data/td-p/127400 Hi there, I am interested in using AutoLoader, but I'd like to get a bit of clarity if it makes sense in my case. Based on e...

Latest Reply
BS_THE_ANALYST
Honored Contributor III
  • 2 kudos

You’ve already identified data duplication as a potential con of landing the data first, but there are several benefits to this approach that might not be immediately obvious: Schema Inference and Evolution: AutoLoader can automatically infer the sche...

  • 2 kudos
FedeRaimondi
by Contributor
  • 342 Views
  • 3 replies
  • 2 kudos

Resolved! Python module import with Dedicated access mode

I currently have a repo connected in Databricks and I was able to correctly import a Python module from the src folder located in the same root. Since I am using a Machine Learning runtime, I am forced to choose a Dedicated (formerly: Single user) access m...

Latest Reply
FedeRaimondi
Contributor
  • 2 kudos

Thanks @szymon_dybczak! I confirm that's a permission issue and assigning "CAN MANAGE" solves it. I still find it not really intuitive, since the goal is to use a shared cluster (with ML runtime) for development purposes. I mean, it would make sense ...

  • 2 kudos
2 More Replies
