Hello, I changed the DBR from 7.2 to 10.4 and I receive the following error: AnalysisException: is not a Delta table. The table was created USING DELTA, so it is certainly a Delta table. Moreover, I read that from vers. 8 all tables are De...
Hi @JOSELITA MOLTISANTI, can you run the following commands and share the output?
# Look up the table's storage location from its Delta metadata
table_name = "stg_data_load"
path = spark.sql(f"describe detail {table_name}").select("location").collect()[0][0].replace('dbfs:', '')
# List the files at that location
dbutils.fs.ls(path)
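If that listing succeeds, a quick follow-up check (a sketch, reusing the same path variable from above) is to look for the _delta_log directory, since its presence is what makes Spark treat a location as a Delta table:
# A Delta table is identified by a _delta_log directory at its root;
# if this listing fails or comes back empty, Spark will not read the path as Delta.
dbutils.fs.ls(path + "/_delta_log")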
Is there a way to change my Databricks Academy username(email)? It is greyed out in my profile and I cannot update it. How do I go about getting it updated?
Hi, please go through the Databricks Academy FAQs here: https://files.training.databricks.com/lms/docebo/databricks-academy-faq.pdf and also the post here: https://community.databricks.com/s/feed/0D53f00001dq6W6CAI
The change in the UI is really confusing about what to use where. Earlier I had High Concurrency (HC) clusters, and now I can't find them in the new UI; it says HC clusters are not available. I want to use the HC cluster functionality. Where can I get that?
I have a Delta table already created, and now I want to enable the change data feed. I read that I have to set the delta.enableChangeDataFeed property to true. However, this cannot be done using the Scala API. I tried using this but it didn't work. I am ...
delta.enableChangeDataFeed has to be without quotes:
spark.sql("ALTER TABLE delta_training.onaudience_dpm SET TBLPROPERTIES (delta.enableChangeDataFeed = true)").show()
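To confirm the property took effect, a quick check (a sketch, assuming the same table name as above) is to list the table properties:
# The row delta.enableChangeDataFeed = true should appear in the output.
spark.sql("SHOW TBLPROPERTIES delta_training.onaudience_dpm").show(truncate=False)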
When I run the below query in Databricks SQL, the precision and scale of the decimal column get changed:
SELECT typeof(COALESCE(CAST(3.45 AS DECIMAL(15,6)), 0));
o/p: decimal(16,6)
expected o/p: decimal(15,6)
Any reason why the precision and scale i...
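This is most likely Spark's implicit type coercion rather than a bug: the literal 0 is an INT, which Spark widens to DECIMAL(10,0) when unifying the COALESCE arguments, and the common type of DECIMAL(15,6) and DECIMAL(10,0) keeps scale 6 but needs max(15-6, 10-0) = 10 integer digits, giving DECIMAL(16,6). A sketch of a workaround is to cast the fallback to the same type:
# Casting both COALESCE arguments to the same decimal type avoids the widening;
# this should report decimal(15,6).
spark.sql("SELECT typeof(COALESCE(CAST(3.45 AS DECIMAL(15,6)), CAST(0 AS DECIMAL(15,6))))").show()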
I have a notebook functioning as a pipeline, where multiple notebooks are chained together. The issue I'm facing is that some of the notebooks are Spark-optimized, others aren't, and what I want is to use one cluster for the former and another for the ...
Yes, you can achieve this by setting up two different job clusters. In the screenshot, you can see I have used two job clusters, PipelineTest and pipelinetest2. You can refer to the doc https://docs.databricks.com/jobs.html#cluster-config-tips
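For reference, a minimal sketch of how this looks as a Jobs API 2.1 job definition (the cluster keys, node type, worker counts, and notebook paths below are placeholders, not from the original post): each task is pinned to one of two shared job clusters via its job_cluster_key.
# Hypothetical Jobs API 2.1 payload with two job clusters.
job_config = {
    "name": "notebook-pipeline",
    "job_clusters": [
        {"job_cluster_key": "spark_optimized",
         "new_cluster": {"spark_version": "10.4.x-scala2.12",
                         "node_type_id": "i3.xlarge", "num_workers": 4}},
        {"job_cluster_key": "general",
         "new_cluster": {"spark_version": "10.4.x-scala2.12",
                         "node_type_id": "i3.xlarge", "num_workers": 1}},
    ],
    "tasks": [
        # The Spark-optimized notebook runs on the larger cluster.
        {"task_key": "heavy_step", "job_cluster_key": "spark_optimized",
         "notebook_task": {"notebook_path": "/pipeline/heavy"}},
        # The remaining notebook runs on the smaller cluster after it.
        {"task_key": "light_step", "job_cluster_key": "general",
         "depends_on": [{"task_key": "heavy_step"}],
         "notebook_task": {"notebook_path": "/pipeline/light"}},
    ],
}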
Hello, I have created my table in Databricks, and at this point everything is perfect: I got the same values as in my CSV. For my column "Exposure" I have:
0    0,00
1    0,00
2    0,00
3    0,00
4    0,00
...
But when I load my fi...
Hi @Anis Ben Salem, how do you read your CSV file? Do you use the Pandas or PySpark APIs? Also, how did you create your table? Could you share more details on the code you are trying to run?
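If the file uses a comma as the decimal separator (as the 0,00 values suggest), one common approach is to read the column as a string and convert it explicitly. A minimal PySpark sketch, assuming a hypothetical semicolon-delimited file path and column name:
from pyspark.sql import functions as F

# Read the raw file; path, delimiter, and column name are placeholders.
df = spark.read.option("header", True).option("sep", ";").csv("/mnt/data/exposure.csv")

# Replace the decimal comma with a dot, then cast to a numeric type.
df = df.withColumn("Exposure",
                   F.regexp_replace("Exposure", ",", ".").cast("decimal(15,6)"))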
Your schema is strict, but make sure that the conversion to it does not throw an exception.
Try with memory-optimized nodes; you may be fine.
My problem was parsing a lot of data from sequence files containing 10K XML files and saving them as a table...