I have a query that I'm trying to insert overwrite into a table. To speed up the query, I added a range join hint. After adding it, I started getting the error below. I can get around this, though, by creating a temporary view of the ...
In Databricks on AWS, I am trying to run a streaming query (trigger=Once) with delta.enableChangeDataFeed=true in the table definition as instructed, but this always fails with: ERROR: Some streams terminated before this command could finish!
com.d...
Hi @daniel e, can you try running the SELECT command on table changes from version 0 and see if you get output? SELECT * FROM table_changes('tableName', 0). Also, please share the streaming query that you are running.
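For anyone who prefers to run the same check from a notebook cell, here is a minimal sketch of a batch read of the change feed from version 0. It assumes the Delta reader options `readChangeFeed` and `startingVersion`; `read_changes` is a hypothetical helper name, and `spark` is the session a Databricks notebook normally provides.

```python
def read_changes(spark, table_name, starting_version=0):
    """Batch-read the change data feed of a Delta table.

    Intended as a rough Python counterpart of
    SELECT * FROM table_changes('<table_name>', <starting_version>).
    """
    return (
        spark.read.format("delta")
        .option("readChangeFeed", "true")            # ask for change rows, not the current snapshot
        .option("startingVersion", starting_version)  # first table version to include
        .table(table_name)
    )


# Usage in a notebook (not executed here):
# read_changes(spark, "tableName").show()
```

If this batch read fails as well, the problem is more likely in the table's change feed than in the streaming query itself.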
I noticed that a query based on inline selects and joins fails on the client with 'Error occurred while deserializing arrow data', i.e. the query succeeds on Databricks but the client (DBeaver, AtScale) receives an error. The error is only noticed with Databri...
Typo error in my second point of the previous post. Click the execution plan of your task [this is available under the SQL/Dataframe tab in the Spark UI]. It shows which operations ran in the Photon engine and which were not executed by Photon.
Hi, I'm new here. Currently I have to read information from a query in Databricks. I've used the query API to get the query definition, but so far I'm not able to run the query and get the results. Is it possible? Thanks
When using the Jobs API, you need to call dbutils.notebook.exit("returnValue") to pass the results once the notebook has finished its job (https://docs.databricks.com/notebooks/notebook-workflows.html#notebook-workflows-exit). Then you can get notebook_...
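A minimal sketch of reading the exit value back, under some assumptions: it uses the Jobs API 2.1 `runs/get-output` endpoint and expects the value passed to `dbutils.notebook.exit` under `notebook_output.result` in the response. The host, token, and run ID are placeholders, both helper names are hypothetical, and the sample body is hand-written rather than real API output.

```python
import json
import urllib.parse
import urllib.request


def build_get_output_request(host, token, run_id):
    """Build (but do not send) a GET request for a job run's output.

    Assumes the Jobs API 2.1 path /api/2.1/jobs/runs/get-output.
    """
    query = urllib.parse.urlencode({"run_id": run_id})
    url = f"{host.rstrip('/')}/api/2.1/jobs/runs/get-output?{query}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def extract_notebook_result(response_body):
    """Return the string passed to dbutils.notebook.exit(), or None if absent."""
    payload = json.loads(response_body)
    return payload.get("notebook_output", {}).get("result")


# Hand-written sample body for illustration only.
sample = '{"notebook_output": {"result": "returnValue", "truncated": false}}'
print(extract_notebook_result(sample))
```

To send the request, pass the object to `urllib.request.urlopen` and feed the decoded body to `extract_notebook_result`. The exit value is always a string, so larger results are usually serialized (for example as JSON) before calling `dbutils.notebook.exit`.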
I am reading data from IBM DB2 and saving it into MS SQL Server (the first step is moving the code itself to Databricks, and then we will move the databases to Databricks itself). The problem I'm running into is that doing something like the below will take > ...
Hi, it is related to partitioning optimization. By default, the JDBC driver queries the source database with only a single thread. Only one partition was created, so the write came from that single partition and used a single core. When you used pandas, it d...
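One common way to get more than one partition is to give the Spark JDBC reader a numeric partition column and bounds. The sketch below only builds the options; `partitioned_jdbc_options` is a hypothetical helper, and the URL, table, and column names in the usage comment are placeholders, not values from the original post.

```python
def partitioned_jdbc_options(url, table, column, lower, upper, num_partitions):
    """Build Spark JDBC reader options for a parallel (partitioned) read.

    lower/upper only decide how the column range is split into partitions;
    they do not filter rows out of the result.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return {
        "url": url,
        "dbtable": table,
        "partitionColumn": column,   # numeric, date, or timestamp column
        "lowerBound": str(lower),
        "upperBound": str(upper),
        "numPartitions": str(num_partitions),
    }


# Usage in a notebook (placeholder names, not executed here):
# opts = partitioned_jdbc_options(
#     "jdbc:db2://host:50000/DB", "SCHEMA.ORDERS", "ORDER_ID", 1, 1_000_000, 8
# )
# df = spark.read.format("jdbc").options(**opts).load()
```

With these options, Spark issues one query per partition, so both the read and the downstream write can use several cores. A column with evenly distributed values gives the most balanced partitions.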
We have a small use case where we need to query the SQL database with 1 million values (dynamically returned from a Python function) in the condition. E.g.: select * from id in (1,2,23,33........1M). I feel this is a very bad approach. Is ther...
You can also create a temporary view with the output from the Python code (one id = one row) and then inner join the view to the table. IMO it will improve the readability of your code.
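A minimal sketch of that suggestion, assuming the target table has a column named `id` and that `spark` is an active SparkSession. The helper name `select_matching_rows` and the view name `ids_to_match` are hypothetical.

```python
def select_matching_rows(spark, table_name, ids, view_name="ids_to_match"):
    """Register the ids as a one-column temp view and inner join it to the table."""
    # One id per row, in a single column named "id".
    ids_df = spark.createDataFrame([(i,) for i in ids], ["id"])
    ids_df.createOrReplaceTempView(view_name)
    # The join replaces a huge IN (...) list in the WHERE clause.
    return spark.sql(
        f"SELECT t.* FROM {table_name} AS t "
        f"INNER JOIN {view_name} AS v ON t.id = v.id"
    )


# Usage in a notebook (not executed here):
# result_df = select_matching_rows(spark, "my_table", ids_from_python)
```

Note that an inner join returns a table row once per matching id, so deduplicate the id list first if it may contain repeats.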
I have a large delta table that I need to analyze in native R. The only option I have currently is to query the delta table then use collect() to bring that spark dataframe into an R dataframe. Is there an alternative method that would allow me to qu...
I wanted to query a MySQL table using Databricks rather than reading the complete data using the dbtable option, which will help with incremental loads. remote_table = (spark.read .format("jdbc") .option("driver", driver) .option("url", URL) .option("quer...
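As a rough illustration of an incremental read with the `query` option, the sketch below builds the reader options from a watermark value. Everything specific here is an assumption: the helper name, the `updated_at` column, the table name, and the connection URL are placeholders and do not come from the original post.

```python
def incremental_query_options(url, driver, table, watermark_column, last_value):
    """Build Spark JDBC options that read only rows newer than last_value.

    Uses the "query" option instead of "dbtable"; Spark does not allow both
    at once, and "query" cannot be combined with "partitionColumn".
    """
    # Values are interpolated directly for brevity; validate them in real code.
    query = f"SELECT * FROM {table} WHERE {watermark_column} > '{last_value}'"
    return {"driver": driver, "url": url, "query": query}


# Usage in a notebook (placeholder names, not executed here):
# opts = incremental_query_options(
#     "jdbc:mysql://host:3306/db", "com.mysql.cj.jdbc.Driver",
#     "orders", "updated_at", "2023-01-01 00:00:00",
# )
# remote_table = spark.read.format("jdbc").options(**opts).load()
```

Spark wraps the query as a subquery in the FROM clause, so the filter runs on the MySQL side and only the new rows are transferred.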
Does it still make sense to run this job on a cluster with Photon enabled when I am receiving the following? This is the code I ran: CREATE OR REPLACE TABLE ${tbl_name}_dups
SELECT src.*,
ROW_NUMBER() OVER (
PARTITION BY src.id
...
I have a query that is hitting a table I have access to. Granting access to everyone is not an option. I am using this query in a SQL Dashboard. One of the where clause conditions uses a parameter populated by another query. I want this parameter qu...
Hello, I have an issue: when I created my query with an Order By, and my visualization, everything was in the right order. Then, when I put it in my dashboard, the order changed. As you can see in the pictures, when I click on "edit visualization", ...
Hi @Laure Decaudin​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.Tha...
Hi, I am trying to run the following query: SELECT table_schema, table_name, COUNT(column_name) FROM {db_name}.INFORMATION_SCHEMA.COLUMNS GROUP BY table_schema, table_name and I am getting the following error: Error in SQL statement: AnalysisException: Catalog n...
Hi @Suman Karki​, can you check if UC is enabled in the advanced settings of the endpoint? Also, what DBR is your DE cluster, and what Security Mode did you choose?
In Python cursor.execute, can you call a saved query with a parameter, like calling a stored procedure in a relational DB? https://docs.microsoft.com/en-us/azure/databricks/dev-tools/python-sql-connector#cursor-method
Hi @lizou​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.Thanks!