Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

saninanda
by New Contributor II
  • 12913 Views
  • 7 replies
  • 0 kudos

How to read a schema from a text file stored in cloud storage

I have a file a.csv or a.parquet. While creating a data frame on read, we can explicitly define the schema with StructType. Instead of writing the schema in the notebook, I want to create the schema once, let's say for all my CSVs I have one schema like csv_schema, and store ...

Latest Reply
Nakeman
New Contributor II
  • 0 kudos

@shyampsr big thanks, I was searching for the solution for almost 3 hours.

6 More Replies
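
One common way to do what this thread asks, sketched below with illustrative paths and field names: serialize the StructType to JSON once, store it in cloud storage, and rebuild it wherever the CSV is read. The dbutils calls assume a Databricks notebook; everything path-related is an assumption.

import json
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

# Define the shared schema once (fields are illustrative).
csv_schema = StructType([
    StructField("id", IntegerType(), True),
    StructField("name", StringType(), True),
])

# Persist it as JSON in cloud storage (hypothetical mount path).
dbutils.fs.put("/mnt/config/csv_schema.json",
               json.dumps(csv_schema.jsonValue()), True)

# Later, in any notebook: load the JSON and rebuild the StructType.
schema_json = dbutils.fs.head("/mnt/config/csv_schema.json")
loaded_schema = StructType.fromJson(json.loads(schema_json))

df = spark.read.schema(loaded_schema).csv("/mnt/data/a.csv", header=True)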
NEERAJRATHORE19
by New Contributor
  • 10752 Views
  • 3 replies
  • 1 kudos

org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree: Exchange SinglePartition : Error

I am creating a DataFrame using SQL in which all the underlying tables are actually temp views based on DataFrames. I am getting the error below every time. Can anyone help me understand the issue here? Thanks in advance. An error occurred while calling o183....

Latest Reply
htinhk
New Contributor II
  • 1 kudos

I also encountered the same problem... It's weird that I can run the query but not the count.

2 More Replies
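
The root cause is hard to pin down from the excerpt alone, but the reply's observation (query works, count fails) has a plausible mechanic: a global count() is planned as partial aggregates followed by an Exchange SinglePartition, so count() forces the whole plan to run while showing a few rows may not. A minimal sketch of the setup described, with hypothetical view names:

# Temp views built from DataFrames, as in the question.
df_a = spark.range(100).withColumnRenamed("id", "key")
df_b = spark.range(100).withColumnRenamed("id", "key")
df_a.createOrReplaceTempView("view_a")
df_b.createOrReplaceTempView("view_b")

joined = spark.sql("SELECT a.key FROM view_a a JOIN view_b b ON a.key = b.key")
joined.show(5)   # may succeed: only part of the plan is evaluated
joined.count()   # runs the full plan, including the final Exchange SinglePartition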
Dee
by New Contributor
  • 10578 Views
  • 2 replies
  • 0 kudos

Resolved! How to change the schema of a Spark SQL DataFrame

I am new to Spark and just started an online PySpark tutorial. I uploaded the JSON data in Databricks and wrote the commands as follows: df = sqlContext.sql("SELECT * FROM people_json") df.printSchema() from pyspark.sql.types import * data_schema =...

Latest Reply
bhanu2448
New Contributor II
  • 0 kudos

http://www.bigdatainterview.com/

1 More Replies
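
The accepted answer is not shown in the excerpt, so here is a sketch of the two usual options: a DataFrame's schema cannot be changed in place, so either re-read the JSON with an explicit schema, or cast individual columns. The field names and file path below are assumptions.

from pyspark.sql.types import StructType, StructField, StringType, LongType
from pyspark.sql.functions import col

# Option 1: re-read the JSON with an explicit schema (path hypothetical).
data_schema = StructType([
    StructField("name", StringType(), True),
    StructField("age", LongType(), True),
])
df = spark.read.schema(data_schema).json("/FileStore/tables/people.json")
df.printSchema()

# Option 2: cast columns on the DataFrame built from the existing table.
df2 = spark.table("people_json").withColumn("age", col("age").cast("long"))
df2.printSchema()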
dshosseinyousef
by New Contributor II
  • 9049 Views
  • 2 replies
  • 0 kudos

How to calculate a quantile on grouped data in a Spark DataFrame

I have the following Spark DataFrame: agent_id / payment_amount: a/1000, b/1100, a/1100, a/1200, b/1200, b/1250, a/10000, b/9000. My desired output would be something like agent_id, 95_quantile: a, whatever the 95th quantile is for a...

Latest Reply
Weiluo__David_R
New Contributor II
  • 0 kudos

For those of you who haven't run into this SO thread http://stackoverflow.com/questions/39633614/calculate-quantile-on-grouped-data-in-spark-dataframe, it's pointed out there that one workaround is to use the Hive UDF "percentile_approx". Please see th...

1 More Replies
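
Following the linked workaround, a runnable sketch using percentile_approx on the sample data from the question. In recent Spark versions the function is built into Spark SQL, so no Hive UDF registration is needed.

from pyspark.sql import functions as F

# Sample data from the question.
df = spark.createDataFrame(
    [("a", 1000), ("b", 1100), ("a", 1100), ("a", 1200),
     ("b", 1200), ("b", 1250), ("a", 10000), ("b", 9000)],
    ["agent_id", "payment_amount"],
)

# 95th percentile per agent_id, as the question asks.
result = df.groupBy("agent_id").agg(
    F.expr("percentile_approx(payment_amount, 0.95)").alias("95_quantile")
)
result.show()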
johnmcauley
by New Contributor II
  • 11410 Views
  • 2 replies
  • 0 kudos

How do I escape a query string in Spark SQL?

Hey all, I am trying to filter on a string, but the string has a single quote - how do I escape the string in Scala? I have tried an old version of StringEscapeUtils but no luck. Sorry if this is a silly question - new to Scala. import org.apache.commons.lan...

Latest Reply
antoniosarco
New Contributor II
  • 0 kudos

Generally, when you deal with an apostrophe, you replace the single quote (') with two (''). More about handling single quotes. Antonio

1 More Replies
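
To illustrate the doubling rule from the reply (shown in PySpark for consistency with the other snippets here; the table name is hypothetical):

from pyspark.sql.functions import col

# Escape by doubling the single quote when building SQL text.
name = "O'Brien"
escaped = name.replace("'", "''")
df = spark.sql("SELECT * FROM people WHERE name = '{}'".format(escaped))

# Or sidestep escaping entirely: the DataFrame API treats the value
# as a literal, not as SQL text.
df2 = spark.table("people").filter(col("name") == "O'Brien")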
Sri1
by New Contributor II
  • 11363 Views
  • 5 replies
  • 0 kudos

Create an in-memory table in Spark and insert data into it

Hi, my requirement is to create a Spark in-memory table (not pushing a Hive table into memory), insert data into it, and finally write that back to a Hive table. The idea here is to avoid disk IO while writing into the target Hive table. There are a lot ...

Latest Reply
vida
Databricks Employee
  • 0 kudos

Got it - how about using a unionAll? I believe this code snippet does what you'd want: from pyspark.sql import Row array = [Row(value=1), Row(value=2), Row(value=3)] df = sqlContext.createDataFrame(sc.parallelize(array)) array2 = [Row(value=4), Ro...

4 More Replies
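
The snippet above is cut off by the excerpt; here is a reconstruction along the same lines (the second array's values and the target table name are illustrative, not from the original reply):

from pyspark.sql import Row

array = [Row(value=1), Row(value=2), Row(value=3)]
df = sqlContext.createDataFrame(sc.parallelize(array))

array2 = [Row(value=4), Row(value=5)]
df2 = sqlContext.createDataFrame(sc.parallelize(array2))

# unionAll combines both in-memory DataFrames; the result is then
# written to the Hive table in one pass (table name hypothetical).
combined = df.unionAll(df2)
combined.write.mode("append").saveAsTable("target_hive_table")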