Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

cmilligan
by Contributor II
  • 1504 Views
  • 3 replies
  • 0 kudos

Undescriptive error when trying to insert overwrite into a table

I have a query that I'm trying to insert overwrite into a table. In an effort to speed up the query I added a range join hint. After adding it I started getting the error below. I can get around this, though, by creating a temporary view of the ...

Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Could you share your code and the full error stack trace please? Check the driver logs for the full stack trace.

2 More Replies
nimble
by New Contributor
  • 1586 Views
  • 2 replies
  • 0 kudos

How can I run a streaming query on a new table with tbl property: change data feed enabled?

In Databricks on AWS, I am trying to run a streaming query (trigger=Once) with delta.enableChangeDataFeed=true in the table definition as instructed, but this always fails with: ERROR: Some streams terminated before this command could finish!   com.d...

Latest Reply
swethaNandan
New Contributor III
  • 0 kudos

Hi @daniel e, can you try running the select command on table changes from the 0th version and see if you get output? SELECT * FROM table_changes('tableName', 0) Also, please share the streaming query that you are running.

1 More Replies
kilaki
by New Contributor II
  • 2348 Views
  • 3 replies
  • 0 kudos

Query fails with 'Error occurred while deserializing arrow data' on Databricks SQL with Channel set to Preview

Noticed that a query based on inline selects and joins fails on the client with 'Error occurred while deserializing arrow data', i.e. the query succeeds on Databricks but the client (DBeaver, AtScale) receives an error. The error is only noticed with Databri...

Latest Reply
franco_patano
New Contributor III
  • 0 kudos

Opened an ES on this, looks like an issue with the Preview channel. Thanks for your help!

2 More Replies
User16783854657
by New Contributor III
  • 2426 Views
  • 4 replies
  • 6 kudos

How do I know how much of a query/job used Photon?

I'm trying to use the native execution engine, Photon. How can I tell if a query is using Photon or is falling back to the non-native Spark engine?

Latest Reply
venkat09
New Contributor III
  • 6 kudos

There was a typo in the second point of my previous post. Click the execution plan of your task (available under the SQL/DataFrame tab in the Spark UI). It shows which operations ran in the Photon engine and which were not executed by Photon.

3 More Replies
lenonlmsv
by New Contributor II
  • 1296 Views
  • 3 replies
  • 0 kudos

Query API Result

Hi, I'm new here. Currently I have to read information from a query in Databricks. I've used the query API to get the query definition, but so far I'm not able to run the query and get the results. Is it possible? Thanks

Latest Reply
daniel_sahal
Esteemed Contributor
  • 0 kudos

When using the Jobs API you need to call dbutils.notebook.exit("returnValue") to pass the results once the notebook has finished its job (https://docs.databricks.com/notebooks/notebook-workflows.html#notebook-workflows-exit). Then you can get notebook_...

2 More Replies
jonathan-dufaul
by Valued Contributor
  • 1056 Views
  • 3 replies
  • 3 kudos

Resolved! Why does chaining spark.read from one system/driver and .write to another system/driver take so much longer than doing each piece individually?

I am reading data from IBM DB2 and saving it into MS SQL Server (the first step is moving the code itself to Databricks, and then we will move the databases to Databricks itself). The problem I'm running into is that doing something like the below will take > ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

Hi, it is related to partitioning optimization. By default, the JDBC driver queries the source database with only a single thread. As a result, only one partition was created, so the write also ran from that single partition using a single core. When you used pandas, it d...
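
One way to avoid the single-partition read described in this reply is to give Spark's JDBC source a partitioning column and bounds. The sketch below only builds the option map; `partitionColumn`, `lowerBound`, `upperBound` and `numPartitions` are the standard Spark JDBC options, while the helper name `jdbc_partition_options`, the column `ID` and the bounds are hypothetical and not taken from the thread.

```python
def jdbc_partition_options(column, lower, upper, num_partitions):
    """Build the Spark JDBC options that split a read into parallel ranges.

    `column` must be a numeric, date or timestamp column in the source table.
    The bounds only decide how the range is split; they do not filter rows.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if lower >= upper:
        raise ValueError("lower must be smaller than upper")
    return {
        "partitionColumn": column,
        "lowerBound": str(lower),
        "upperBound": str(upper),
        "numPartitions": str(num_partitions),
    }


# Hypothetical column name and bounds, for illustration only.
opts = jdbc_partition_options("ID", 1, 1_000_000, 8)

# On a cluster these would be passed to the reader together with the
# connection settings (url and table are placeholders here):
#   spark.read.format("jdbc").option("url", url).option("dbtable", table)
#        .options(**opts).load()
```

With more than one partition on the read side, the subsequent write can also run on several cores instead of one.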

2 More Replies
KVNARK
by Honored Contributor II
  • 786 Views
  • 1 replies
  • 4 kudos

Resolved! a usecase to query millions of values.

We have a small use case where we need to query the SQL database with 1 million values (dynamically returned from a Python function) in the condition, e.g.: select * from id in (1,2,23,33........1M). I feel this is a very bad approach. Is ther...

Latest Reply
daniel_sahal
Esteemed Contributor
  • 4 kudos

You can also create a temporary view with the output from the Python code (one id = one row) and then inner join the view to the table. IMO this will improve the readability of your code.
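
The join-instead-of-IN-list idea from this reply can be sketched as follows. This is a minimal illustration, not the poster's code: it uses SQLite from Python's standard library as a stand-in for the SQL database, and the table `events`, the temporary table `wanted_ids` and the helper `fetch_by_ids` are hypothetical names.

```python
import sqlite3


def fetch_by_ids(conn, ids):
    """Return rows of the (hypothetical) events table whose id is in `ids`.

    Instead of building a huge IN (...) list, the ids are loaded into a
    temporary table (one id = one row) and inner joined to the target table.
    """
    cur = conn.cursor()
    cur.execute(
        "CREATE TEMP TABLE IF NOT EXISTS wanted_ids (id INTEGER PRIMARY KEY)"
    )
    cur.execute("DELETE FROM wanted_ids")
    # INSERT OR IGNORE silently drops duplicate ids.
    cur.executemany(
        "INSERT OR IGNORE INTO wanted_ids (id) VALUES (?)",
        ((i,) for i in ids),
    )
    cur.execute(
        "SELECT e.id, e.payload FROM events AS e "
        "INNER JOIN wanted_ids AS w ON w.id = e.id "
        "ORDER BY e.id"
    )
    return cur.fetchall()


# Small in-memory demo data set.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(i, f"row-{i}") for i in range(1, 101)],
)

# 1000 is not in the table, so it simply produces no match.
rows = fetch_by_ids(conn, [1, 2, 23, 33, 1000])
```

In Spark the same pattern would presumably be a DataFrame of ids registered with `createOrReplaceTempView` and joined to the table; that variant is not shown here because it needs a running cluster.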

Dave_Nithio
by Contributor
  • 763 Views
  • 0 replies
  • 1 kudos

Natively Query Delta Lake with R

I have a large delta table that I need to analyze in native R. The only option I have currently is to query the delta table then use collect() to bring that spark dataframe into an R dataframe. Is there an alternative method that would allow me to qu...

Swapnil1998
by New Contributor III
  • 620 Views
  • 0 replies
  • 0 kudos

How to query a MySQL Table from Databricks?

I wanted to query a MySQL table using Databricks rather than reading the complete data using the dbtable option, which will help with incremental loads. remote_table = (spark.read .format("jdbc") .option("driver", driver) .option("url", URL) .option("quer...

lawrence009
by Contributor
  • 2053 Views
  • 4 replies
  • 8 kudos

Photon does not fully support the query because of dynamic pruning

Does it still make sense to run this job on a cluster with Photon enabled when I am receiving the following? This is the code I ran: CREATE OR REPLACE TABLE ${tbl_name}_dups SELECT src.*, ROW_NUMBER() OVER ( PARTITION BY src.id ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 8 kudos

Hi @Lawrence Chen​, Please tell us your DBR version.

3 More Replies
ncouture
by Contributor
  • 992 Views
  • 1 replies
  • 0 kudos

Resolved! How do you run a query as the owner but use a parameter as a viewer

I have a query that is hitting a table I have access to. Granting access to everyone is not an option. I am using this query in a SQL Dashboard. One of the where clause conditions uses a parameter populated by another query. I want this parameter qu...

Latest Reply
ncouture
Contributor
  • 0 kudos

It is not possible to do what I want. It somewhat seems like a security flaw, but whatever.

Laure_Decaudin
by New Contributor
  • 1248 Views
  • 2 replies
  • 0 kudos

Order By disabled in sql dashboard

Hello, I have an issue: when I created my query with an Order By, and my visualization, everything was in the right order. Then, when I put it in my dashboard, the order changed. As you can see in the pictures, when I click on "edit visualization", ...

Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hi @Laure Decaudin, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Tha...

1 More Replies
113775
by New Contributor
  • 3320 Views
  • 1 replies
  • 0 kudos

Running query on INFORMATION_SCHEMA.COLUMNS

Hi, I am trying to run the following query: SELECT table_schema, table_name, COUNT(column_name) FROM {db_name}.INFORMATION_SCHEMA.COLUMNS GROUP BY table_schema, table_name and I am getting the following error: Error in SQL statement: AnalysisException: Catalog n...

Latest Reply
User16741082858
Contributor III
  • 0 kudos

Hi @Suman Karki​, can you check if UC is enabled in the advanced settings of the endpoint? Also, what DBR is your DE cluster, and what Security Mode did you choose?

lizou
by Contributor II
  • 1828 Views
  • 4 replies
  • 2 kudos

call saved query in sql warehouse

In Python cursor.execute, can you call a saved query with a parameter, like calling a stored procedure in a relational DB? https://docs.microsoft.com/en-us/azure/databricks/dev-tools/python-sql-connector#cursor-method

Latest Reply
Vidula
Honored Contributor
  • 2 kudos

Hi @lizou, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!

3 More Replies