Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Dave_Nithio
by Contributor
  • 5256 Views
  • 4 replies
  • 2 kudos

Resolved! How to use autoloader with csv containing spaces in attribute names?

I am attempting to use autoloader to add a number of csv files to a delta table. The underlying csv files have spaces in the attribute names though (i.e. 'Account Number' instead of 'AccountNumber'). When I run my autoload, I get the following error ...

Latest Reply
Dave_Nithio
Contributor
  • 2 kudos

@Hubert Dudek​ thanks for your response! I was able to use what you proposed above to generate the schema. The issue is that the schema sets all attributes to STRING values and renames them numerically ('_c0', '_c1', etc.). Although this allows us to...

3 More Replies
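The accepted fix is not visible in the truncated thread, but one common pattern for this situation is sketched below: let Auto Loader keep the real CSV header instead of the inferred _c0, _c1 names, then rename the columns so Delta accepts them. All paths, the schema/checkpoint locations, and the target table name are placeholders, not values from the thread.

from pyspark.sql import functions as F

# Read with Auto Loader, keeping the header so columns are not renamed to _c0, _c1, ...
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "csv")
      .option("header", "true")
      .option("cloudFiles.schemaLocation", "/tmp/autoloader/schema")   # placeholder
      .load("/path/to/raw/csv"))                                       # placeholder

# Replace spaces so Delta column-name restrictions are satisfied, e.g. 'Account Number' -> 'Account_Number'
df = df.select([F.col(f"`{c}`").alias(c.replace(" ", "_")) for c in df.columns])

(df.writeStream
   .format("delta")
   .option("checkpointLocation", "/tmp/autoloader/checkpoint")         # placeholder
   .toTable("accounts"))                                               # placeholder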
Arby
by New Contributor II
  • 10574 Views
  • 4 replies
  • 0 kudos

Help With OSError: [Errno 95] Operation not supported: '/Workspace/Repos/Connectors....

Hello, I am experiencing issues with importing the schema file I created from the utils repo. This is the logic we use for all ingestion, and all other schemas live in this repo (utills/schemas). I am unable to access the file I created for a new ingestion pipe...

Latest Reply
Arby
New Contributor II
  • 0 kudos

@Debayan Mukherjee Hello, thank you for your response. Please let me know if these are the correct commands to access the file from the notebook. I can see the files in the repo folder, but I just noticed this: the file I am trying to access has a size of 0 b...

3 More Replies
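Since the latest reply narrows the problem down to a 0-byte file, a quick way to confirm that from a notebook is the small sketch below; the path is a placeholder modeled on the utills/schemas layout mentioned in the post.

import os

path = "/Workspace/Repos/<user>/<repo>/utills/schemas/my_schema.py"  # placeholder

if os.path.exists(path):
    print(os.path.getsize(path))  # 0 here would match the symptom described in the reply
else:
    print("file not found")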
Ludo
by New Contributor III
  • 5595 Views
  • 7 replies
  • 2 kudos

Resolved! Jobs with multi-tasking are failing to retry; how to fix this issue?

Hello, this is a question about our platform with `Databricks Runtime 11.3 LTS`. I'm running a job with multiple tasks in parallel using a shared cluster. Each task runs a dedicated Scala class within a JAR library attached as a dependency. One of the tasks fails (c...

Latest Reply
YoshiCoppens61
New Contributor II
  • 2 kudos

Hi, this actually should not be marked as solved. We are having the same problem: whenever a shared job cluster crashes for some reason (generally OOM), all tasks will keep failing indefinitely, with the error message described above. This is ac...

6 More Replies
CrisCampos
by New Contributor II
  • 3172 Views
  • 1 replies
  • 1 kudos

How to load a "pickle/joblib" file on Databricks

Hi Community, I am trying to load a joblib file on Databricks, but it doesn't seem to be working. I'm getting an error message: "Incompatible format detected". Any idea how to load this type of file on Databricks? Thanks!

Latest Reply
tapash-db
Databricks Employee
  • 1 kudos

You can import the joblib/joblibspark package to load joblib files.

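A minimal sketch of what the reply suggests, assuming the artifact sits on DBFS and plain joblib is installed on the cluster; the path is a placeholder. (The "Incompatible format detected" message usually means a Spark/Delta reader was pointed at the file instead, but that is an assumption about this thread.)

import joblib

# Load through the /dbfs FUSE mount; placeholder path.
model = joblib.load("/dbfs/FileStore/models/my_model.joblib")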
harraz
by New Contributor III
  • 1998 Views
  • 1 replies
  • 0 kudos

Issues loading CSV files that contain a BOM (Byte Order Mark) character

I keep getting an error when creating a DataFrame or stream from certain CSV files where the header contains a BOM (Byte Order Mark) character. This is the error message: AnalysisException: [RequestId=e09c7c8d-2399-4d6a-84ae-216e6a9f8f6e ErrorClass=INVALI...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @mohamed harraz​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

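The thread preview shows no accepted answer; one workaround (an assumption, not the thread's fix) is to strip the BOM before Spark parses the header, for example by reading through pandas with the utf-8-sig codec and converting afterwards. The path is a placeholder.

import pandas as pd

pdf = pd.read_csv("/dbfs/path/to/file_with_bom.csv", encoding="utf-8-sig")  # utf-8-sig drops a leading BOM
df = spark.createDataFrame(pdf)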
Prabhakar1
by New Contributor III
  • 15398 Views
  • 5 replies
  • 8 kudos

How does Selenium WebDriver work on Azure Databricks? I am unable to run a simple piece of code.

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
drivers = webdriver.Chrome(ChromeDriverManager().install())
drivers.g...

Latest Reply
Evan_MCK
Contributor
  • 8 kudos

I also got that error. What worked for me was downloading the Chrome driver and ensuring it's the latest version with shell scripts in the same notebook I used for web scraping. I could not use the web driver manager. You can see all the details here...

4 More Replies
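A rough sketch along the lines of the reply, assuming Selenium 4 and that a Chromium build plus a matching chromedriver have already been installed on the driver node (for example from a %sh cell, as the reply describes); both paths below are placeholders.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

options = Options()
options.add_argument("--headless")
options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")
options.binary_location = "/usr/bin/chromium-browser"                      # placeholder browser path

driver = webdriver.Chrome(service=Service("/usr/local/bin/chromedriver"),  # placeholder driver path
                          options=options)
driver.get("https://www.databricks.com")
print(driver.title)
driver.quit()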
Databricks3
by Contributor
  • 1467 Views
  • 1 replies
  • 1 kudos

Concurrent insert on a Delta table fails if the table contains identity columns. Error message is added below. MetadataChangedException: The metadata ...

Concurrent insert on a Delta table fails if the table contains identity columns. The error message is added below. MetadataChangedException: The metadata of the Delta table has been changed by a concurrent update. Please try the operation again.

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @SK ASIF ALI​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

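The preview stops at the error text; a common mitigation (not confirmed as this thread's resolution) is to retry the write, because an insert into a table with identity columns updates table metadata and can collide with a concurrent writer. A rough sketch, assuming a recent delta-spark release that exposes delta.exceptions:

import time
from delta.exceptions import MetadataChangedException  # assumption: available in recent delta-spark releases

def append_with_retry(df, table_name, attempts=5):
    # Retry an append that conflicts with a concurrent identity-column metadata update.
    for attempt in range(attempts):
        try:
            df.write.format("delta").mode("append").saveAsTable(table_name)
            return
        except MetadataChangedException:
            time.sleep(2 ** attempt)  # back off, then retry against the refreshed metadata
    raise RuntimeError(f"append to {table_name} still conflicting after {attempts} attempts")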
Qwetroman
by New Contributor
  • 1396 Views
  • 1 replies
  • 0 kudos

AutoML runs fail after 5 seconds

Hi everyone, I am exploring AutoML and I met a strange problem: after I launch a classification experiment on my personal newly created cluster (screenshot attached), it successfully performs data exploration, but after that, all runs fail after appro...

Latest Reply
swethaNandan
Databricks Employee
  • 0 kudos

Hi Qwetroman, we can see the following error message in the notebook - ExecutionTimeoutError: Execution timed out before any trials could be successfully run. Please increase the timeout for AutoML to run some trials. What's the size of the dataset? St...

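Following the reply's suggestion to raise the timeout, a minimal sketch of the AutoML Python API; the table name and target column are placeholders.

from databricks import automl

summary = automl.classify(
    dataset=spark.table("my_schema.training_data"),  # placeholder dataset
    target_col="label",                              # placeholder target column
    timeout_minutes=120,                             # give trials more time than the failing run had
)
print(summary.best_trial.model_path)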
qwerty1
by Contributor
  • 1309 Views
  • 1 replies
  • 0 kudos

Unable to create bloom filter index

I am unable to create a bloom filter index on my table. CREATE BLOOMFILTER INDEX ON TABLE my_namespace.foo FOR COLUMNS (id OPTIONS (fpp = 0.1, numItems = 6000000)) gives the error AnalysisException: Table `spark_catalog`.`my_namespace`.`foo` did not specif...

Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi, you can refer to https://issues.apache.org/jira/browse/SPARK-27617 for the above error. Please let us know if this helps; also, please tag @Debayan in your next response, which will notify me. Thank you!

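For context, bloom filter indexes are a Delta Lake feature on Databricks, so the statement from the post only succeeds against a table created with USING DELTA. A hedged sketch (table, column, and option values mirror the post; the table schema itself is invented):

spark.sql("""
    CREATE TABLE IF NOT EXISTS my_namespace.foo (id BIGINT, payload STRING)
    USING DELTA
""")

spark.sql("""
    CREATE BLOOMFILTER INDEX ON TABLE my_namespace.foo
    FOR COLUMNS (id OPTIONS (fpp = 0.1, numItems = 6000000))
""")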
Tim2407
by New Contributor
  • 1616 Views
  • 1 replies
  • 1 kudos

Connection Error DataGrip Databricks

When trying to connect DataGrip with Databricks SQL, I'm able to do it for multiple connections by using a Token. However, for one specific connection it is not working. We internally tried everything, but we are not able to connect. Below is the err...

Latest Reply
Debayan
Databricks Employee
  • 1 kudos

Hi, it looks like the requested resource is forbidden. Could you please check the destination web server and recheck the configuration? Please tag @Debayan in your next comment so that I will get notified. Thank you!

Hari_Dbrc
by New Contributor II
  • 2731 Views
  • 2 replies
  • 0 kudos

Issue while using community edition

Hello, is anyone facing an issue with their Community Edition? It shows the error below and I can't access the workspace or previously created notebooks. I tried accessing from different devices (not a cache issue). Error popup (screenshots attached): Unable to view...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Hari N​ Thank you for reaching out, and we’re sorry to hear about this log-in issue! We have this Community Edition login troubleshooting post on Community. Please take a look, and follow the troubleshooting steps. If the steps do not resolve the...

1 More Replies
owen1
by New Contributor
  • 1146 Views
  • 2 replies
  • 2 kudos

Workflow cluster creation error

I set the workflow to run at 12:00 every day, but the workflow failed with the error message below, and I don't know why. Run result unavailable: run failed with error message Unexpected failure while waiting for the cluster (0506-0233...

Latest Reply
Murthy1
Contributor II
  • 2 kudos

Hello @Sangwoo Lee, as mentioned by vignesh, it seems like an infra-related issue. > Does the user (which executes the job) have access to start a cluster? > In case it is not an access issue, and in case you are starting a lot of workflow jobs tog...

1 More Replies
Dean_Lovelace
by New Contributor III
  • 4287 Views
  • 3 replies
  • 0 kudos

Delta Table Optimize Error

I have started getting an error message when running the following optimize command: deltaTable.optimize().executeCompaction() Error: java.util.concurrent.ExecutionException: java.lang.IllegalStateException: Number of records changed after Optimi...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Dean Lovelace: The error message suggests that the number of records in the Delta table changed after the optimize() command was run. The optimize() command is used to improve the performance of Delta tables by removing small files and compacting l...

2 More Replies
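For reference, a minimal sketch showing the command from the post in context; the table name is a placeholder, and this does not by itself reproduce or fix the reported IllegalStateException.

from delta.tables import DeltaTable

deltaTable = DeltaTable.forName(spark, "my_schema.my_table")  # placeholder table name
deltaTable.optimize().executeCompaction()                     # compacts small files into larger ones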
JLSy
by New Contributor III
  • 12729 Views
  • 5 replies
  • 6 kudos

cannot convert Parquet type INT64 to Photon type string

I am receiving an error similar to the post in this link: https://community.databricks.com/s/question/0D58Y00009d8h4tSAA/cannot-convert-parquet-type-int64-to-photon-type-double However, instead of type double, the error message states that the type can...

Latest Reply
Anonymous
Not applicable
  • 6 kudos

@John Laurence Sy: It sounds like you are encountering a schema conversion error when trying to read in a Parquet file that contains an INT64 column that cannot be converted to a string type. This error can occur when the Parquet file has a schema t...

4 More Replies
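One way to reconcile such a mismatch (an assumption, not confirmed as the accepted fix here) is to read the Parquet files with the types they actually carry and cast afterwards, instead of forcing the target table's string schema at scan time. The path and column name below are hypothetical.

from pyspark.sql import functions as F

raw = spark.read.parquet("/path/to/parquet")                          # placeholder path
fixed = raw.withColumn("event_id", F.col("event_id").cast("string"))  # hypothetical INT64 column cast to string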
jose_gonzalez
by Databricks Employee
  • 2977 Views
  • 2 replies
  • 3 kudos

NoSuchObjectException(message:There is no database named global_temp)

I can see the following error message in my driver logs. What does it mean, and how do I solve it? ERROR RetryingHMSHandler: NoSuchObjectException(message:There is no database named global_temp)

Latest Reply
source2sea
Contributor
  • 3 kudos

Should one create it in the workspace manually via the UI? Would it get overwritten if the workspace is created via Terraform? I use the 10.4 LTS runtime.

1 More Replies
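For context on what global_temp refers to, a minimal sketch of global temporary views, which Spark backs with that reserved database name; whether the logged metastore lookup is harmless in this particular setup is an assumption, not a statement from the thread.

df = spark.range(5)
df.createOrReplaceGlobalTempView("my_view")            # registered under the reserved global_temp database
spark.sql("SELECT * FROM global_temp.my_view").show()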