Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Jujiro
by New Contributor III
  • 12354 Views
  • 11 replies
  • 7 kudos

Random error: At least one column must be specified for the table?

I have the following code in a notebook. It randomly gives me the error, "At least one column must be specified for the table." The error occurs (if it occurs at all) only on the first run after attaching to a cluster. Cluster details: Summary5-1...

dbr-bug
Latest Reply
Harold
New Contributor II
  • 7 kudos

Please check whether this helps: spark.databricks.delta.catalog.update.enabled false
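A minimal sketch of applying the suggested setting from a notebook. It assumes the key can be set at session level with `spark.conf.set`; the reply gives it in cluster Spark config form (key, space, value), so session-level behavior is an assumption. The helper name `disable_delta_catalog_update` is hypothetical, and `spark` is the session object that Databricks notebooks provide.

```python
# Key taken verbatim from the reply above.
CONF_KEY = "spark.databricks.delta.catalog.update.enabled"


def disable_delta_catalog_update(spark):
    """Set the suggested flag on the given session and return the stored value.

    Assumption: the flag is honored when set at session level.
    """
    spark.conf.set(CONF_KEY, "false")
    return spark.conf.get(CONF_KEY)
```

In a notebook this would be called as `disable_delta_catalog_update(spark)` before running the failing cell. The alternative is to paste the key and value from the reply into the cluster's Spark config.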

  • 7 kudos
10 More Replies
LidorAbo
by New Contributor II
  • 7891 Views
  • 1 replies
  • 1 kudos

bucket ownership of s3 bucket in databricks

We have a Databricks job with strange behavior: when we pass the string 'output_path' to the saveAsTextFile function instead of the output_path variable, the data is saved to the following path: s3://dev-databricks-hy1-rootbucket/nvirginiaprod/3219117805926709/output_pa...

s3
Latest Reply
User16752239289
Databricks Employee
  • 1 kudos

I suspect you provided a DBFS path to save the data, hence the data was saved under your workspace root bucket. For the workspace root bucket, the Databricks workspace will interact with the Databricks credential to make sure Databricks has access to it and able t...
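To illustrate the difference between passing the quoted string and the variable, here is a simplified model of path resolution. It is not Spark's actual implementation. It assumes DBFS is the cluster's default filesystem; the function name `resolve_output` and the bucket `s3://example-bucket/results` are hypothetical.

```python
from urllib.parse import urlparse

# Assumption: paths without a scheme resolve against the DBFS root.
DEFAULT_ROOT = "dbfs:/"


def resolve_output(path: str, default_root: str = DEFAULT_ROOT) -> str:
    """Return the path unchanged if it has a scheme, else place it under the default root."""
    if urlparse(path).scheme:
        return path
    return default_root + path.lstrip("/")


output_path = "s3://example-bucket/results"  # hypothetical destination

# Quoted literal: the string has no scheme, so it lands under the default root.
print(resolve_output("output_path"))  # dbfs:/output_path

# Variable: the value carries its own scheme and bucket.
print(resolve_output(output_path))  # s3://example-bucket/results
```

Under this model, the quoted literal is treated as a relative path name, which matches the reply's suspicion that the data ended up under the workspace root bucket.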

  • 1 kudos
qwerty1
by Contributor
  • 2054 Views
  • 1 replies
  • 0 kudos

Unable to create bloom filter index

I am unable to create a bloom filter index on my table. CREATE BLOOMFILTER INDEX ON TABLE my_namespace.foo FOR COLUMNS (id OPTIONS (fpp = 0.1, numItems = 6000000)) gives the error: AnalysisException: Table `spark_catalog`.`my_namespace`.`foo` did not specif...

Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi, you can refer to https://issues.apache.org/jira/browse/SPARK-27617 for the above error. Please let us know if this helps. Also, please tag @Debayan in your next response, which will notify me. Thank you!

  • 0 kudos
gg_047320_gg_94
by New Contributor II
  • 8715 Views
  • 1 replies
  • 1 kudos

DLT Spark readstream fails on the source table which is overwritten

I am reading a source table that is updated every day. It is usually updated by append/merge and is occasionally overwritten for other reasons. df = spark.readStream.schema(schema).format("delta").option("ignoreChanges", True).option('starting...

Latest Reply
Debayan
Databricks Employee
  • 1 kudos

Hi, could you please confirm the DLT and DBR versions? Also, please tag @Debayan in your next response, which will notify me. Thank you!

  • 1 kudos
eyalo
by New Contributor II
  • 6947 Views
  • 6 replies
  • 0 kudos

Why doesn't the SFTP ingest work?

Hi, I wrote the following code, but the cluster seems to run for a long time and then stops without any results. My code is attached. (I used the 'com.springml.spark.sftp' library and installed it via Maven.) Also, I whitelisted my lo...

Latest Reply
eyalo
New Contributor II
  • 0 kudos

@Debayan Mukherjee Hi, I don't know if you got my reply, so I am sending my message to you again. Thanks.

  • 0 kudos
5 More Replies
Ajay-Pandey
by Esteemed Contributor III
  • 4456 Views
  • 2 replies
  • 3 kudos

Resolved! Column is accessible after dropping the same column

Hi, today I saw some very strange behavior in Databricks. I dropped a column from a dataframe and assigned the result to a new dataframe, but I am still able to use the dropped column in the filter command. In the general scenario I should get an error but ...

Latest Reply
Sandeep
Contributor III
  • 3 kudos

@Ajay Pandey, this is known behavior. Please refer to this JIRA for details: https://issues.apache.org/jira/browse/SPARK-30421

  • 3 kudos
1 More Replies
KarenBT
by New Contributor III
  • 7711 Views
  • 15 replies
  • 4 kudos

Welcome 2023 Virtual hackathon participants, we're happy to have you! ✋  Please use this space to ask questions, we'll have some folks from Da...

Welcome 2023 Virtual hackathon participants, we're happy to have you! Please use this space to ask questions, we'll have some folks from Databricks and the community join to help out. We're really excited to see what you work on and if you have any ...

Latest Reply
sanggusti
New Contributor II
  • 4 kudos

Hi, I also have another question. Do we get any Databricks platform access for the period of the hackathon? My company doesn't use it, and the trial is only 14 days. I'm well aware of the capability, and since the hackathon is held by Databricks, I think...

  • 4 kudos
14 More Replies
Michelle_-_Devp
by New Contributor III
  • 1369 Views
  • 1 replies
  • 1 kudos

Resolved! How is brainstorming going?

Wondering if anyone is willing to share their project ideas here. It would be great to know how things are going and if anyone has a good open-source dataset they are willing to share.

Latest Reply
bayang
New Contributor III
  • 1 kudos

Good, read their docs to get a lot of info to sharpen this hackathon

  • 1 kudos
IndihomeTV
by New Contributor
  • 1370 Views
  • 1 replies
  • 0 kudos

Databricks to redash

We have an issued security in Redash when we use Databricks as a connector to Redash. Can you support us? https://www.databricks.com/blog/2020/06/24/welcoming-redash-to-databricks.html

Latest Reply
arpit
Databricks Employee
  • 0 kudos

Hi @Probis Useetv, thank you for reaching out to us. Would you please elaborate on your use case regarding the "issued security in redash"?

  • 0 kudos
Ismail1
by New Contributor III
  • 3678 Views
  • 3 replies
  • 3 kudos

Resolved! Generating an Account console PAT token

I can't seem to find any documentation on generating an account console PAT token. Can anyone link me to it or guide me?

Latest Reply
fkseki
New Contributor III
  • 3 kudos

You can't create a Personal Access Token at the account level to use REST APIs. If you want to use SCIM at the account level, you'll find the user provisioning tab in the account console settings. There you can generate the SCIM token. If you want to acces...

  • 3 kudos
2 More Replies
pantelis_mare
by Contributor III
  • 41890 Views
  • 30 replies
  • 15 kudos

Resolved! Repos configuration for Azure Service Principal

Hello community! I would like to update a repo from within my Azure DevOps release pipeline. In the pipeline I generate a token using an AAD Service Principal as recommended, and I set up the Databricks API using that token. When I pass the databricks re...

Latest Reply
xiangzhu
Contributor III
  • 15 kudos

A traditional PAT may have a long lifespan, but the new SP feature uses an AAD token, which should have a much shorter lifespan, maybe around one hour. This could be a limiting factor. However, I haven't tested this yet, so these are merely hypotheses. Neve...
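One way to check the lifespan hypothesis is to read the token's own `iat` (issued at) and `exp` (expiry) claims. A minimal sketch, assuming the AAD token is a standard JWT that carries both claims; the helper name `jwt_lifetime_seconds` is hypothetical, and the signature is deliberately not verified because only the timestamps are inspected.

```python
import base64
import json


def jwt_lifetime_seconds(token: str) -> int:
    """Return exp - iat from a JWT payload without verifying the signature."""
    # A JWT has three dot-separated parts: header, payload, signature.
    payload_b64 = token.split(".")[1]
    # JWTs drop base64 padding, so restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    return int(claims["exp"]) - int(claims["iat"])
```

Calling this on the token generated in the pipeline would show how long a run has before the token expires, which is the limiting factor the reply speculates about.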

  • 15 kudos
29 More Replies
Phani1
by Valued Contributor II
  • 3486 Views
  • 2 replies
  • 1 kudos

Integration Dolly with Databricks

Hi Databricks Team, could you please share any links, docs, or sample notebooks for integrating Dolly with Databricks? Our aim is to generate SQL queries from free text and execute them via a Databricks cluster/SQL warehouse.

Latest Reply
sean_owen
Databricks Employee
  • 1 kudos

https://www.dbdemos.ai/demo.html?demoName=llm-dolly-chatbot is a good demonstration of Dolly (or really any LLM) for question answering. LLMs like this are not for SQL generation, but other LLMs are, like starcoderbase

  • 1 kudos
1 More Replies
sanjay
by Valued Contributor II
  • 2938 Views
  • 2 replies
  • 1 kudos

Resolved! How can I prioritize message in autoloader

Hi, I am using Auto Loader. It picks up data from AWS S3 and stores it in a Delta table. When there is a large number of messages, I would like to process them by priority. Is it possible to prioritize messages in Auto Loader? Regards, Sanjay

Latest Reply
sanjay
Valued Contributor II
  • 1 kudos

Thank you, Sandeep. Another option is to keep the messages in 2 different folders in S3. Can Auto Loader read messages from multiple folders?

  • 1 kudos
1 More Replies
pauloquantile
by New Contributor III
  • 5985 Views
  • 8 replies
  • 0 kudos

Resolved! Disable scheduling of notebooks

Hi, we are wondering whether it is possible to disable scheduling of a notebook. A client wants to give many analysts access to Databricks, but a concern is the possibility of setting schedules (the fastest is every minute!). Is...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Paulo Rijnberg, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedba...

  • 0 kudos
7 More Replies
