Data Engineering

Forum Posts

by kishorekumar • New Contributor

06-20-2023 5:46:52 AM

2284 Views
1 reply
0 kudos

Silent failure in DataFrameWriter when loading data to Redshift

Context: I'm using DataFrameWriter to load a Dataset into Redshift. DataFrameWriter writes the Dataset to S3, then loads the data from S3 into Redshift by issuing the Redshift COPY command. Issue: Infrequently, we observe that the data is present in t...
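One way to surface a silent load failure of this kind is to compare row counts after the write. The sketch below is only an illustration under stated assumptions, not the connector's own behavior: `check_counts`, `write_and_verify`, and `auth_options` are hypothetical names, `spark` and `df` are an existing SparkSession and DataFrame, and the format name `com.databricks.spark.redshift` is borrowed from another thread on this page (adjust it to whatever your runtime expects). It also assumes the table is overwritten, since a count comparison is not meaningful for appends.

```python
class RowCountMismatch(Exception):
    """Raised when Redshift reports a different row count than was written."""


def check_counts(expected: int, actual: int, table: str) -> None:
    """Fail loudly if the loaded row count differs from the source count."""
    if expected != actual:
        raise RowCountMismatch(
            f"{table}: wrote {expected} rows but Redshift reports {actual}"
        )


def write_and_verify(spark, df, jdbc_url: str, table: str, tempdir: str,
                     auth_options: dict) -> int:
    """Overwrite `table` with `df`, then read the row count back from Redshift."""
    # auth_options is a hypothetical dict of whichever credential options
    # your setup uses, e.g. {"forward_spark_s3_credentials": "true"}.
    options = {"url": jdbc_url, "tempdir": tempdir, **auth_options}
    expected = df.count()

    # The connector stages the data under tempdir in S3, then issues COPY.
    (df.write.format("com.databricks.spark.redshift")
        .options(dbtable=table, **options)
        .mode("overwrite")
        .save())

    # Count on the Redshift side by passing SQL through the `query` option.
    actual = (spark.read.format("com.databricks.spark.redshift")
        .options(query=f"SELECT COUNT(*) AS n FROM {table}", **options)
        .load()
        .first()["n"])

    check_counts(expected, int(actual), table)
    return expected
```

Raising an exception here turns a load that quietly wrote nothing into a failed job, which is usually easier to alert on than missing rows discovered later.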

Latest Reply

Anonymous
Not applicable

06-20-2023 8:23:29 PM

0 kudos

Hi @Kishorekumar Somasundaram, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

by Edwin • New Contributor II

05-30-2023 10:57:11 AM

1562 Views
0 replies
1 kudos

Unable to load data from Redshift

I've been trying to connect to Redshift following the Databricks documentation. I have validated that I'm using runtime version 11.3 on my cluster and that I have read/write privileges on the tempdir bucket. However, I'm unable to load data from Redshift to a S...

by Lonnie • New Contributor

05-19-2022 1:17:15 PM

2543 Views
0 replies
0 kudos

Recommended Redshift-2-Delta Migration Path

Hello All! My team is previewing Databricks and is contemplating the steps to take to perform one-time migrations of datasets from Redshift to Delta. Based on our understanding of the tool, here are our initial thoughts: Export data from Redshift-2-S...
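The post is cut off before the steps are listed, so the following is one possible shape for such a migration rather than the poster's plan. It assumes the export goes to S3 as Parquet via Redshift's UNLOAD command and that the files are then rewritten as a Delta table. `build_unload_sql` and `parquet_to_delta` are hypothetical helper names, and the bucket, role ARN, and paths are placeholders.

```python
def build_unload_sql(select_sql: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Build a Redshift UNLOAD statement that exports a query result as Parquet."""
    # UNLOAD takes the query as a quoted string, so embedded single quotes
    # must be doubled.
    escaped = select_sql.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}') "
        f"TO '{s3_prefix}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "FORMAT AS PARQUET"
    )


def parquet_to_delta(spark, parquet_path: str, delta_path: str) -> None:
    """Rewrite the exported Parquet files as a Delta table at delta_path."""
    (spark.read.parquet(parquet_path)
        .write.format("delta")
        .mode("overwrite")
        .save(delta_path))
```

The UNLOAD statement would be run against Redshift with any SQL client; `parquet_to_delta` would then be called from a Databricks notebook or job with the same S3 prefix as `parquet_path`.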

by LorenRD • Contributor

04-04-2022 5:02:54 AM

12887 Views
9 replies
13 kudos

Resolved! Is it possible to connect Databricks SQL with AWS Redshift DB?

I would like to know whether the Databricks SQL module can connect not only to the internal Metastore DB and tables from the Data Science and Engineering module, but also to an AWS Redshift DB, so that I can run queries and create alerts.

Latest Reply

LorenRD
Contributor

04-26-2022 6:20:33 AM

13 kudos

Hi @Kaniz Fatma, I contacted Customer Support about this issue. They told me that this feature is not implemented yet, but it's on the roadmap with no ETA. It would be great if you could ping me back when it's possible to access Redshift tables from SQ...

8 More Replies

by Anonymous • Not applicable

11-07-2021 12:25:16 AM

5006 Views
5 replies
0 kudos

Resolved! How to use from standalone Spark Jar running from Intellij Idea the library installed in Databricks DBR?

Hello, I tried without success to use several libraries that we installed on the Databricks 9.1 cluster (not provided by default in DBR) from a standalone Spark application run from IntelliJ IDEA. For instance, for connecting to Redshift it works onl...

Latest Reply

Anonymous
Not applicable

11-22-2021 9:44:20 AM

0 kudos

Unfortunately, I did not find any solution. We have to package the JAR and run it as a Databricks job to test and debug. This is not efficient, but no solution for remote debugging has been found or provided.

4 More Replies

by nicole_wong • Databricks Employee

10-29-2021 10:46:11 AM

3637 Views
1 reply
1 kudos

Resolved! Best practices for working with Redshift

I have a customer with the following question, and I'm posting on their behalf to introduce them to the community. For modeling in a Python environment, what is our best practice for getting the data from Redshift? A "load" option seems to leave me...

Latest Reply

jose_gonzalez
Databricks Employee

11-15-2021 4:16:55 PM

1 kudos

Hi @Nicole Wong, have you checked the docs here? As far as I know, this might be the only way to read/write data to/from Redshift.

by sajith_appukutt • Databricks Employee

06-09-2021 1:28:20 AM

2778 Views
1 reply
0 kudos

Resolved! I'm using the Redshift data source to load data into spark SQL data frames. However, I'm not seeing predicate push down for my queries ran on Redshift - is that expected?

I was expecting filter operations to be pushed down to Redshift by the optimizer. However, the entire dataset is getting loaded from Redshift.

Latest Reply

sajith_appukutt
Databricks Employee

06-21-2021 6:02:06 PM

0 kudos

The Spark driver for Redshift pushes the following operators down into Redshift: Filter, Project, Sort, Limit, Aggregation, and Join. However, it does not support expressions operating on dates and timestamps today. If you have a similar requirement, please add a fea...
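Where a filter is not pushed down, for example one on a date column, a possible workaround is to put the filter into SQL that Redshift runs itself, using the connector's `query` option in place of `dbtable`. This is a suggestion that does not come from the thread. In the sketch below, `build_filtered_query`, `read_filtered`, and `auth_options` are hypothetical names, and the format name is the one used in another thread on this page.

```python
from datetime import date


def build_filtered_query(table: str, columns: list[str], date_column: str,
                         start: date, end: date) -> str:
    """Build SQL that applies a half-open date-range filter inside Redshift."""
    cols = ", ".join(columns)
    # isoformat() yields YYYY-MM-DD, which Redshift accepts as a date literal.
    return (
        f"SELECT {cols} FROM {table} "
        f"WHERE {date_column} >= '{start.isoformat()}' "
        f"AND {date_column} < '{end.isoformat()}'"
    )


def read_filtered(spark, jdbc_url: str, tempdir: str, query: str,
                  auth_options: dict):
    """Run `query` in Redshift and load only its result into a DataFrame."""
    return (spark.read.format("com.databricks.spark.redshift")
        .option("url", jdbc_url)
        .option("tempdir", tempdir)
        .option("query", query)  # used in place of dbtable
        .options(**auth_options)
        .load())
```

Because the WHERE clause is part of the statement Redshift executes, only the matching rows are unloaded to S3 and read by Spark, regardless of what the optimizer chooses to push down.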

by sajith_appukutt • Databricks Employee

06-08-2021 10:22:38 PM

2809 Views
1 reply
1 kudos

Resolved! Are there any ways to automatically cleanup temporary files created in s3 by the Amazon Redshift connector

The Amazon Redshift data source in Databricks seems to use S3 for storing intermediate results. Is there any way to automatically clean up the temporary files created in S3?

Latest Reply

sajith_appukutt
Databricks Employee

06-17-2021 5:29:49 PM

1 kudos

You could use a storage lifecycle policy on the S3 bucket that holds the intermediate results and configure expiration actions. That way, temporary/intermediate results are cleaned up automatically.
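A lifecycle policy like this could be set up with boto3 roughly as follows. This is a minimal sketch: `build_expiration_rule` and `apply_rule` are hypothetical helpers, the bucket name and prefix in the usage note are placeholders, and the retention period is whatever suits your jobs. `s3_client` is assumed to be a client created with `boto3.client("s3")`.

```python
def build_expiration_rule(prefix: str, days: int) -> dict:
    """Build an S3 lifecycle configuration that expires objects under prefix."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Rules": [
            {
                "ID": f"expire-{prefix.strip('/').replace('/', '-')}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Delete objects this many days after they were created.
                "Expiration": {"Days": days},
                # Also clean up uploads that never completed.
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
            }
        ]
    }


def apply_rule(s3_client, bucket: str, prefix: str, days: int) -> None:
    """Apply the rule. This replaces the bucket's existing lifecycle config."""
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=build_expiration_rule(prefix, days),
    )
```

For example, `apply_rule(boto3.client("s3"), "example-temp-bucket", "redshift-temp/", 7)` would expire anything under `redshift-temp/` after a week. Pointing the connector's `tempdir` at a dedicated prefix keeps the rule from touching other data in the bucket.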

by cfregly • Contributor

05-09-2015 2:35:31 PM

6968 Views
4 replies
0 kudos

SSL connection java.sql.SQLException with Redshift

Latest Reply

TianziCai
New Contributor II

06-14-2017 2:09:16 PM

0 kudos

sample = (spark.read
    .format("com.databricks.spark.redshift")
    .option("url", jdbcUrl)
    .option("dbtable", "xx.xxx")  # schema, table
    .option("forward_spark_s3_credentials", True)
    .option("tempdir", tem...
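Since the reply above is truncated, here is a separate, self-contained sketch of a read over an SSL connection. It is not a reconstruction of the reply. It assumes the JDBC driver honors `ssl=true` in the URL and that credentials are passed through the connector's `user` and `password` options, which avoids escaping problems with special characters in the URL. `build_jdbc_url` and `read_table` are hypothetical names, and the host and database in the usage note are placeholders.

```python
def build_jdbc_url(host: str, port: int, database: str, ssl: bool = True) -> str:
    """Build a Redshift JDBC URL, optionally requesting an SSL connection."""
    url = f"jdbc:redshift://{host}:{port}/{database}"
    return f"{url}?ssl=true" if ssl else url


def read_table(spark, jdbc_url: str, table: str, tempdir: str,
               user: str, password: str):
    """Load a Redshift table into a DataFrame via the S3 tempdir."""
    return (spark.read.format("com.databricks.spark.redshift")
        .option("url", jdbc_url)
        .option("user", user)          # credentials kept out of the URL
        .option("password", password)
        .option("dbtable", table)      # "schema.table"
        .option("forward_spark_s3_credentials", True)
        .option("tempdir", tempdir)
        .load())
```

A call might look like `read_table(spark, build_jdbc_url("example-cluster.example.com", 5439, "dev"), "public.events", "s3a://example-bucket/tmp/", user, password)`. If the connection still fails with a `java.sql.SQLException`, the driver version and certificate setup are the next things to check.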

3 More Replies

Databricks Community
