Topics with Label: Query

Forum Posts

by Harsh1 • New Contributor II

08-23-2022 9:36:02 AM

810 Views
2 replies
1 kudos

Query on DBFS migration

We are doing a DBFS migration. The root DBFS of the legacy workspace has a folder 'user' holding 5.8 TB of data. We ran AWS CLI sync/cp from Legacy to Target, and then again from the Target bucket to the Target DBFS. While implemen...

Data Engineering

Latest Reply

Harsh1
New Contributor II

08-24-2022 5:43:27 AM

1 kudos

Thanks for the quick response. Regarding the suggested AWS DataSync approach: we have tried DataSync in multiple ways, but it creates the folders in the S3 bucket itself, not on DBFS, whereas our task is to copy from the bucket to DBFS. It seems that it only supports ...

1 More Replies

by _Orc • New Contributor

02-22-2022 10:02:34 AM

11725 Views
6 replies
3 kudos

Resolved! Precision and scale is getting changed in the dataframe while casting to decimal

When I run the query below in Databricks SQL, the precision and scale of the decimal column change.
Select typeof(COALESCE(Cast(3.45 as decimal(15,6)),0));
o/p: decimal(16,6)
expected o/p: decimal(15,6)
Any reason why the Precision and scale i...

Data Engineering

Latest Reply

berserkersap
Contributor

08-13-2022 12:05:19 PM

3 kudos

You can use typeof(COALESCE(Cast(3.45 as decimal(15,6)),0.0)); (instead of 0)
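Why swapping 0 for 0.0 helps can be sketched in plain Python. This is a rough model, not Spark's own code, and it rests on two assumptions: that Spark treats an integer literal as decimal(10,0) and the literal 0.0 as decimal(1,1) when it looks for a common type, and that the common type keeps the larger scale plus the larger number of integer digits. The names DecimalType and wider_decimal are made up for this illustration.

```python
from typing import NamedTuple

MAX_PRECISION = 38  # assumed upper bound on decimal precision


class DecimalType(NamedTuple):
    """Hypothetical stand-in for a decimal(precision, scale) type."""
    precision: int
    scale: int


def wider_decimal(a: DecimalType, b: DecimalType) -> DecimalType:
    """Smallest decimal type that can hold values of both a and b."""
    scale = max(a.scale, b.scale)
    # Digits left of the decimal point needed by each side.
    int_digits = max(a.precision - a.scale, b.precision - b.scale)
    return DecimalType(min(int_digits + scale, MAX_PRECISION), scale)


column = DecimalType(15, 6)
int_literal = DecimalType(10, 0)      # assumption: 0 is widened as decimal(10,0)
decimal_literal = DecimalType(1, 1)   # assumption: 0.0 is typed as decimal(1,1)

with_int = wider_decimal(column, int_literal)
with_decimal = wider_decimal(column, decimal_literal)
print(with_int)      # DecimalType(precision=16, scale=6)
print(with_decimal)  # DecimalType(precision=15, scale=6)
```

Under these assumptions the integer literal needs 10 integer digits while decimal(15,6) only has 9, so the result grows to decimal(16,6). The 0.0 literal needs none, so the column type is kept.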

5 More Replies

by shan_chandra • Honored Contributor III

06-04-2022 12:11:17 PM

3345 Views
1 reply
1 kudos

Resolved! Insert query fails with error "The query is not executed because it tries to launch ***** tasks in a single stage, while maximum allowed tasks one query can launch is 100000;"

Py4JJavaError: An error occurred while calling o236.sql. : org.apache.spark.SparkException: Job aborted. at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:201) at org.apache.spark.sql.execution.datasources.I...

Data Engineering

Latest Reply

shan_chandra
Honored Contributor III

06-04-2022 12:21:57 PM

1 kudos

Could you please increase the config below (at the cluster level) to a higher value, or set it to zero?
spark.databricks.queryWatchdog.maxQueryTasks 0
This Spark config alleviates the issue.

by Raymond_Garcia • Contributor II

05-20-2022 3:50:16 PM

2233 Views
3 replies
5 kudos

Resolved! Manipulate Column that is an array of objects

I have a column that is an array of objects (let's call it ARRAY), and I would like to query / manipulate its elements without using the explode function. As an example, for each element in that column I would like to create a path. .wit...

Data Engineering

Latest Reply

Raymond_Garcia
Contributor II

05-23-2022 12:29:17 PM

5 kudos

Hello, I am working with Scala, and I used something similar:
def play(col: Column): Column = {
  concat_ws("", lit(imagePath), lit("/"), col("field1"), lit("/"), col("field2"), lit(".ext"))
}
val variable = spark.lot_of_stuff. .withColumn("...

2 More Replies

by Suresh1 • New Contributor

04-25-2022 11:57:44 AM

650 Views
0 replies
0 kudos

Query failures are seen during the TPC-DS performance benchmark run

When I run the TPC-DS (1TB) benchmark on Photon 10.2, I see the following failures: Queries Q06, Q09 and Q41 fail with the error "Query: AEValueSubQuery is not supported". Q66 fails with the error "[MISSING_COLUMN] org.apache.spark.sql.A...

Data Engineering

by TS • New Contributor III

04-05-2022 12:39:13 AM

2361 Views
3 replies
3 kudos

Resolved! Turn spark.sql query into scala function

Hello, I'm learning Scala / Spark and trying to understand what's wrong with my function. I have a spark.sql query, stored in a variable:
val uViewName = spark.sql(""" SELECT v.Data_View_Name FROM apoHierarchy AS h INNER JOIN apoView AS v ON h.View_N...

Data Engineering

Latest Reply

Hubert-Dudek
Esteemed Contributor III

04-05-2022 2:17:08 AM

3 kudos

Try adding .first()(0). It returns only the value from the first row/column, whereas currently you are returning a Dataset:
var uViewName = spark.sql(s""" SELECT v.Data_View_Name FROM apoHierarchy AS h INNER JOIN apoView AS v ON h.View_Name = v.Context_View_N...

2 More Replies

by alejandrofm • Valued Contributor

02-12-2022 1:39:51 PM

5758 Views
11 replies
1 kudos

Resolved! How can I view the query history, duration, etc for all users

Hi! I have some jobs that stay idle for some time when getting data from an S3 mount on DBFS. These are all SQL queries on Delta. How can I find out where the bottleneck is (duration, queue) to diagnose the slow Spark performance that I think is on the proc...

Data Engineering

Latest Reply

alejandrofm
Valued Contributor

03-16-2022 7:16:45 AM

1 kudos

We found out we were regenerating the symlink manifest for all the partitions in this case. And for some reason it was executed twice, at the start and end of the job:
delta_table.generate('symlink_format_manifest')
We configured the table with:
ALTER TABLE ...

10 More Replies

by prasadvaze • Valued Contributor

10-30-2021 10:20:53 AM

11830 Views
14 replies
6 kudos

Resolved! How to query delta lake using SQL desktop tools like SSMS or DBVisualizer

Is there a way to use SQL desktop tools? I ask because neither Delta OSS nor Databricks provides a desktop client (similar to Azure Data Studio) to browse and query Delta Lake objects. I currently use Databricks SQL, a web UI in the Databricks workspace, but se...

Data Engineering

Latest Reply

prasadvaze
Valued Contributor

03-12-2022 11:38:40 AM

6 kudos

DSR is Delta Standalone Reader. See more here: https://docs.delta.io/latest/delta-standalone.html
It's a crate (and also now a py library) that allows you to connect to delta tables without using spark (e.g. directly from python and not using pyspa...

13 More Replies

by LukaszJ • Contributor III

02-23-2022 3:32:41 AM

7104 Views
5 replies
0 kudos

Resolved! Send UPDATE from Databricks to Azure SQL DataBase

Hello. I want to know how to do an UPDATE on Azure SQL Database from Azure Databricks using PySpark. I know how to run a SELECT query and turn it into a DataFrame, but how do I send some data back (as an UPDATE on rows)? I want to use built-in PySpark instead...

Data Engineering

Latest Reply

-werners-
Esteemed Contributor III

02-23-2022 11:42:44 PM

0 kudos

This is discussed on Stack Overflow. As you can see, there is a way for Azure Synapse, but for a plain SQL database you will have to use some kind of driver like ODBC/JDBC.

4 More Replies

by Ian • New Contributor III

01-03-2022 10:49:20 AM

2864 Views
6 replies
0 kudos

Resolved! Databricks-Connect and Change Data Feed query error

I have installed Databricks-Connect (9.1 LTS). I am able to send queries to the cluster. However, when the query includes a call to the 'table_changes' function that is a part of Change Data Feed, I get the following error:
AnalysisException("could ...

Data Engineering

Latest Reply

Ian
New Contributor III

01-21-2022 11:00:36 AM

0 kudos

Hi @Kaniz Fatma, the table_changes function is an internal Databricks function used in Change Data Feed (CDF). Please refer to the article below. It discusses the table_changes function.
https://docs.databricks.com/delta/delta-change-data-feed.html

5 More Replies

by Soma • Valued Contributor

01-18-2022 3:28:09 AM

1228 Views
4 replies
2 kudos

Resolved! Query RestAPI end point in Databricks Standard Workspace

Do we have an option to query a Delta table using a Standard Workspace as an endpoint instead of JDBC?

Data Engineering

Latest Reply

Anonymous
Not applicable

01-20-2022 8:48:13 AM

2 kudos

@somanath Sankaran - Would you be happy to mark @Hubert Dudek's answer as best if it resolved the problem? That helps other members who are searching for answers find the solution more quickly.

3 More Replies

by omsas • New Contributor

10-15-2021 4:48:38 AM

1727 Views
2 replies
0 kudos

How to add Columns for Automatic Fill on Pandas Python

1. I have data x. I would like to create a new column whose values are 1, 2 or 3.
2. The name of the column is SHIFT; it will be filled automatically if the TIME_CREATED column meets the conditions.
3. the conditi...

Data Engineering

Latest Reply

Ryan_Chynoweth
Honored Contributor III

10-15-2021 12:59:20 PM

0 kudos

You can do something like this in pandas. Note there could be a more performant way to do this too.
import pandas as pd
import numpy as np
df = pd.DataFrame({'a':[1,2,3,4]})
df.head()
>    a
> 0  1
> 1  2
> 2  3
> 3  4
conditions = [(df['a'] <=2...
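For the SHIFT column in the question, the same np.select idea might look roughly like the sketch below. The shift conditions are cut off in the preview above, so the hour boundaries here (06:00 to 14:00 is shift 1, 14:00 to 22:00 is shift 2, everything else is shift 3) and the sample timestamps are purely hypothetical placeholders.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the TIME_CREATED column name comes from the question.
df = pd.DataFrame({
    "TIME_CREATED": pd.to_datetime([
        "2021-10-15 06:30:00",
        "2021-10-15 14:05:00",
        "2021-10-15 23:45:00",
        "2021-10-15 02:10:00",
    ])
})

hour = df["TIME_CREATED"].dt.hour

# Assumed shift boundaries; replace them with the real conditions.
conditions = [
    (hour >= 6) & (hour < 14),   # shift 1
    (hour >= 14) & (hour < 22),  # shift 2
]
choices = [1, 2]

# Rows matching neither condition fall through to the default, shift 3.
df["SHIFT"] = np.select(conditions, choices, default=3)
print(df)
```

np.select evaluates the conditions in order and takes the first match per row, so overlapping conditions should be listed from most to least specific.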

1 More Replies

by Sam • New Contributor III

09-13-2021 4:24:14 PM

2192 Views
2 replies
1 kudos

Resolved! Query Pushdown in Snowflake

Hi, I am wondering what documentation exists on Query Pushdown in Snowflake. I noticed that a single function (monotonically_increasing_id()) prevented the entire query from being pushed down to Snowflake during an ETL process. Is Pushdown coming from the S...

Data Engineering

Latest Reply

siddhathPanchal
New Contributor III

10-11-2021 9:18:18 AM

1 kudos

Hi Sam, The Spark Connector applies predicate and query pushdown by capturing and analyzing the Spark logical plans for SQL operations. When the data source is Snowflake, the operations are translated into a SQL query and then executed in Snowflake to...

1 More Replies

by User16868770416 • Contributor

06-25-2021 1:20:34 PM

450 Views
0 replies
0 kudos

Is it possible to query an RDS Logical Replication from Databricks?

Data Engineering

by Anonymous • Not applicable

06-02-2021 4:38:45 PM

467 Views
1 reply
0 kudos

Photon usage

How do I know how much of a query/job used Photon?

Data Engineering

Latest Reply

sajith_appukutt
Honored Contributor II

06-23-2021 12:28:16 AM

0 kudos

If you are using Photon on Databricks SQL:
1. Click the Query History icon on the sidebar.
2. Click the line containing the query you’d like to analyze.
3. On the Query Details pop-up, click Execution Details.
4. Look at the Task Time in Photon metric at the bottom.
