Hello,
I'm learning Scala / Spark and try to understand what's wrong with my function:
I have a spark.sql query, stored in a variable:
val uViewName = spark.sql("""
SELECT
v.Data_View_Name
FROM
apoHierarchy AS h
INNER JOIN apoView AS v ON h.View_Name = v.Context_View_Name
WHERE
1 = 1
AND h.Step_Number = 1
""")
uViewName.show()
This gives me the right result:
+-------------------------------+
| Data_View_Name|
+-------------------------------+
|APO_S_VN_E1_APR_U...|
+-------------------------------+
I then want to pack it into a function like this:
def getUniverse(x: Int): String = {
var sn = x
var stepNumber = x.toString()
var uViewName = spark.sql(s"""
SELECT
v.Data_View_Name
FROM
apoHierarchy AS h
INNER JOIN apoView AS v ON h.View_Name = v.Context_View_Name
WHERE
1 = 1
AND h.Step_Number = $stepNumber
""")
return uViewName
}
So, what I try to achieve is:
- the StepNumber should be given as parameter
- the query takes this parameter and queries the db accordingly
But, why do I get this error message?
command-4245198:14: error: type mismatch;
found : org.apache.spark.sql.DataFrame
(which expands to) org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]
required: String
return uViewName
^
And when I change the return statement to:
return uViewName.toString()
it seems to work:
getUniverse: (x: Int)String
And, finally when running the def, shouldn't I get a string "APO_S_VN_E1_APR_U.."?
val x = getUniverse(1)
println(x)
[Data_View_Name: string]
x: String = [Data_View_Name: string]
Thanks for any advice / hint!