I have written a CTE in Spark SQL:

WITH temp_data AS (
......
)
CREATE VIEW AS temp_view FROM SELECT * FROM temp_view;

I get a cryptic error. Is there a way to create a temp view from a CTE using Spark SQL in Databricks?
In the CTE you can't do a CREATE. It expects an expression in the form expression_name [ ( column_name [ , ... ] ) ] [ AS ] ( query ), where expression_name specifies a name for the common table expression. If you want to create a view from a CTE, y...
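For example, a minimal PySpark sketch (with a placeholder body standing in for the elided query): put the WITH clause inside the view definition instead of wrapping CREATE VIEW in the CTE:

spark.sql("""
CREATE OR REPLACE TEMP VIEW temp_view AS
WITH temp_data AS (
  SELECT 1 AS id, 'a' AS val  -- placeholder for the elided query body
)
SELECT * FROM temp_data
""")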
Is Spark case sensitive? Spark is not case sensitive by default. If you have the same column name in different cases (Name, name) and you try to select either the "Name" or "name" column, you will get a column ambiguity error. There is a way to handle this issue b...
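A minimal sketch of the configuration being alluded to, assuming a session-level setting is acceptable:

# Enable case-sensitive column resolution for the current session
spark.conf.set("spark.sql.caseSensitive", "true")

df = spark.createDataFrame([(1, 2)], ["Name", "name"])
df.select("Name").show()  # unambiguous only while caseSensitive is true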
Hi, even though I set the conf to true, on writing to disk I got exceptions complaining about duplicate columns. Below is the error message:
org.apache.spark.sql.AnalysisException: Found duplicate column(s) in the data to save: branchavailablity....
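The thread's resolution is truncated, but one common workaround sketch is to deduplicate the clashing names before saving; the suffix and output path below are assumptions, and df stands for the DataFrame that failed to write:

# Rebuild the column list so no two names collide case-insensitively, then write
new_cols, seen = [], set()
for c in df.columns:
    key = c.lower()
    new_cols.append(c if key not in seen else c + "_2")
    seen.add(key)

df.toDF(*new_cols).write.mode("overwrite").parquet("/tmp/out")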
I am using Databricks community edition for learning purposes.
I created some Hive-managed tables through Spark SQL as well as with df.saveAsTable options. But when I connect to a new cluster,
"Show databases"
only returns the default database....
This "feature" in the Community edition is based on the fact that I cannot restart a cluster. So, in the morning I create a cluster for study purposes, and in the afternoon I have to recreate the cluster. If there are any dependent objects from previou...
I took the Azure datasets that are available for practice. I got the 10 days of data from that dataset and now I want to save this data into DBFS in CSV format. I am facing an error: "No such file or directory: 'No such file or directory: '/dbfs...
Hi, after some exercises you need to be aware that a folder created with dbutils.fs.mkdirs("/dbfs/tmp/myfolder") is actually created at /dbfs/dbfs/tmp/myfolder. If you want to access the path to_csv("/dbfs/tmp/myfolder/mytest.csv"), you should create it with this script: dbutils.fs...
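A minimal sketch of the path duality being described: dbutils.fs paths are rooted at dbfs:/, so no /dbfs prefix is needed there, while local file APIs such as pandas reach the same location through the /dbfs mount:

dbutils.fs.mkdirs("/tmp/myfolder")  # creates dbfs:/tmp/myfolder

import pandas as pd
pd.DataFrame({"a": [1, 2, 3]}).to_csv("/dbfs/tmp/myfolder/mytest.csv")  # same folder, via the local mount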
Hi, I'm wondering if this is the expected behavior when using last or last_value in a window function? I've written a query like this:

select
col1,
col2,
last_value(col2) over (partition by col1 order by col2) as column2_last
from values
...
For those stumbling across this: it seems LAST_VALUE emulates the same functionality as it does in SQL Server, which does not, in most people's minds, have a proper row/range frame for the window. You can adjust it with the below syntax. I understand l...
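A runnable sketch of the frame adjustment in question: with an ORDER BY, the default frame ends at CURRENT ROW, so last_value returns the current row's value; widening the frame to the whole partition gives the true last value:

spark.sql("""
SELECT col1, col2,
       last_value(col2) OVER (
         PARTITION BY col1 ORDER BY col2
         ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS column2_last
FROM VALUES (1, 10), (1, 20), (2, 30) AS t(col1, col2)
""").show()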
From the 10.4 LTS version we have low shuffle merge, so merge is faster. But what about the MERGE INTO command that we run in a SQL notebook in Databricks? Is there any performance difference when we use the Databricks PySpark ".merge" function vs Databricks...
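For reference, a hedged sketch of the two forms being compared; both should compile down to the same Delta Lake MERGE command, and the table names here are illustrative assumptions:

from delta.tables import DeltaTable

target = DeltaTable.forName(spark, "target_table")
updates = spark.table("updates")

# PySpark API form
(target.alias("t")
    .merge(updates.alias("s"), "t.id = s.id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute())

# SQL form of the same merge
spark.sql("""
MERGE INTO target_table t
USING updates s ON t.id = s.id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
""")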
The problem is very simple: when you use a TUMBLING window with append mode, the window is closed only when the next message arrives (+ watermark logic). In the current implementation, if you stop incoming streaming data, the last window will NEVER...
No, the problem remains the same. The meaning doesn't change because you increased the timeout a little bit; since the window did not close, it does not close until a new message arrives.
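A minimal sketch of the setup under discussion (the rate source stands in for the real stream): in append mode a window is emitted only after the watermark passes its end, and the watermark only advances on new events, which is why the last window stays open when the stream goes quiet:

from pyspark.sql import functions as F

events = spark.readStream.format("rate").load()  # stand-in source with a 'timestamp' column

agg = (events
    .withWatermark("timestamp", "10 minutes")
    .groupBy(F.window("timestamp", "5 minutes"))  # tumbling window
    .count())

(agg.writeStream
    .outputMode("append")
    .format("memory")
    .queryName("windowed_counts")
    .start())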
While inserting into the target table I am getting an error, "not enough data columns; target has 3 but the inserted data has 2", but it's the identity column which is the 8th column.

insert into table A (col1, col2, col3)
select col2, col3
from table B
join t...
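A hedged sketch of the pattern the error points at, assuming a Delta identity column (table and column names are illustrative): name only the columns you supply and let the identity column populate itself:

spark.sql("""
CREATE TABLE IF NOT EXISTS table_a (
  id   BIGINT GENERATED ALWAYS AS IDENTITY,
  col2 STRING,
  col3 STRING
)
""")

spark.sql("INSERT INTO table_a (col2, col3) SELECT col2, col3 FROM table_b")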
Support for running multiple cells at a time in a Databricks notebook

Hi all, Databricks notebooks now support parallel runs of commands in a single notebook, which will help run ad hoc queries simultaneously without creating a separate notebook. Once you run...
Hi amazing community folks, feel free to share your experience or knowledge regarding the questions below:

1.) Can we pass a CTE SQL statement into Spark JDBC? I tried to do it and couldn't, but I can pass normal SQL (SELECT * FROM ...) and it works. I heard th...
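For context, a sketch of the JDBC read pattern in question: Spark wraps whatever is passed via the dbtable/query options in a subquery (SELECT * FROM (<text>) t), which is why a leading WITH clause fails while a plain SELECT works; the URL and credentials below are placeholder assumptions:

df = (spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://host:1433;databaseName=mydb")
    .option("query", "SELECT * FROM some_table")  # plain SELECT works; a WITH ... prefix does not
    .option("user", "user")
    .option("password", "password")
    .load())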
I'm using Spark version 3.2.1 on Databricks (DBR 10.4 LTS), and I'm trying to convert a SQL Server query to a new query that runs on a Spark cluster using Spark SQL syntax. However, Spark SQL does not seem to support XML PATH as a functi...
Posting the solution that I ended up using:

%sql
DROP TABLE if exists UserCountry;
CREATE TABLE if not exists UserCountry (
UserID INT,
Country VARCHAR(5000)
);
INSERT INTO UserCountry
SELECT
L.UserID AS UserID,
CONCAT_WS(',', co...
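For anyone landing here, a hedged sketch of the usual Spark stand-in for FOR XML PATH string aggregation, matching the truncated CONCAT_WS above; the source table name is an illustrative assumption:

spark.sql("""
SELECT UserID,
       concat_ws(',', collect_list(Country)) AS Country
FROM UserCountrySource
GROUP BY UserID
""")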
Coming from an MS SQL background, I'm trying to write a query in Spark SQL that simply updates a column value of table A (source table) by INNER JOINing a new table B with a filter. The MS SQL query looks like this:

UPDATE T
SET T.OfferAmount = OSE.EndpointEve...
Posting answer to my question:
MERGE into TempOffer VIEW
USING OfferSeq OSE ON VIEW.OfferId = OSE.OfferID AND OSE.OfferId = 1
WHEN MATCHED THEN UPDATE set VIEW.OfferAmount = OSE.EndpointEventAmountValue;
Hi All, I came from MS SQL and just started learning more about Spark SQL. Here is one part that I'm trying to perform. In MS SQL it can be easily done, but it seems like it can't be in Spark. So, I want to make a simple update to the record if the co...
@Hubert Dudek​ Hello, I'm having the same issue using UPDATE in Spark SQL and came across your answer. When you say "replace source_table_reference with view" in MERGE, do you mean to replace "P" with "VIEW", looking something like below:

%sql
ME...
Let's say there is a database db in which there are 700 tables, and we need to find the names of all tables in which the column "project_id" is present. Just an example for understanding the question.
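A minimal sketch of one way to do this with the PySpark catalog API, assuming a standard metastore:

# Scan every table in database `db` and keep those with a `project_id` column
matches = []
for t in spark.catalog.listTables("db"):
    cols = [c.name for c in spark.catalog.listColumns(t.name, dbName="db")]
    if "project_id" in cols:
        matches.append(t.name)

print(matches)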