Here is the error that I am getting when I run the following query
statement=sqlContext.sql("SELECT count(*) FROM ARDATA_2015_09_01").show()
---------------------------------------------------------------------------Py4JJavaError Traceback (most rec...
How can I convert a column from decimal to date in Spark SQL when the format is not yyyy-MM-dd?
A table contains a column declared as decimal(38,0) whose data is in yyyyMMdd format, and I am unable to run date queries on it in a Databricks notebook.
...
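In recent Spark versions the usual approach (a sketch, not the only one) is to cast the decimal to a string and parse it, e.g. to_date(cast(date_col AS STRING), 'yyyyMMdd') in Spark SQL, where date_col stands in for your actual column name. The same parsing logic, outside Spark, in plain Python:

```python
from datetime import datetime

def yyyymmdd_to_date(raw):
    """Convert a decimal/integer value like 20150901 into a date.

    Mirrors what to_date(cast(col AS STRING), 'yyyyMMdd') would do in
    Spark SQL; the names here are illustrative only.
    """
    # int() drops any decimal scale, str() gives the 8-digit form to parse
    return datetime.strptime(str(int(raw)), "%Y%m%d").date()

print(yyyymmdd_to_date(20150901))  # 2015-09-01
```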
Hi,
In a SQL notebook, using this link: https://docs.databricks.com/spark/latest/spark-sql/language-manual/set.html I managed to figure out how to set values and how to get them back.
SET my_val=10; -- saves the value 10 for key my_val
SET my_val; -- dis...
Hi @Mike K.., you can do this with widgets and getArgument. Here's a small example of what that might look like: https://community.databricks.com/s/feed/0D53f00001HKHZfCAP
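A minimal sketch of the widget approach in a SQL notebook (the widget, table, and column names here are made up):

```sql
CREATE WIDGET TEXT my_val DEFAULT "10";
SELECT * FROM my_table WHERE my_col = getArgument("my_val");
```

REMOVE WIDGET my_val cleans the widget up again when you're done.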
I have a csv file with the first column containing data in dictionary form (keys: value). [see below]
I tried to create a table by uploading the csv file directly to databricks but the file can't be read. Is there a way for me to flatten or conver...
This is apparently a known issue; Databricks has its own CSV format handler which can handle this:
https://github.com/databricks/spark-csv
SQL API
CSV data source for Spark can infer data types:
CREATE TABLE cars
USING com.databricks.spark.csv
OP...
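Separately from the spark-csv route, if the first column really holds dictionary-shaped text, you could flatten it before uploading. A rough sketch in plain Python, assuming the dict literals use Python syntax (the sample keys and values are made up):

```python
import ast

def flatten_dict_column(rows):
    """Turn rows whose first field is a dict literal like "{'a': 1}"
    into flat dicts with one entry per key, keeping the other fields."""
    out = []
    for first, *rest in rows:
        d = ast.literal_eval(first)  # safely parse the dict-shaped string
        out.append({**d, "rest": rest})
    return out

rows = [("{'make': 'Tesla', 'year': 2015}", "blue")]
print(flatten_dict_column(rows))  # [{'make': 'Tesla', 'year': 2015, 'rest': ['blue']}]
```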
When I try to run the command
spark.sql("DROP TABLE IF EXISTS table_to_drop")
and the table does not exist, I get the following error:
AnalysisException: "Table or view 'table_to_drop' not found in database 'null';;\nDropTableCommand `table_to_drop...
I agree that this is a usability bug. The documentation clearly states that if the optional "IF EXISTS" flag is provided, the statement will do nothing: https://docs.databricks.com/spark/latest/spark-sql/language-manual/drop-table.html Drop Table ...
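Until that's fixed, one workaround sketch is to make sure a current database is selected (or to qualify the table name) so Spark isn't looking in database 'null'; default is an assumed database name here:

```sql
USE default;
DROP TABLE IF EXISTS table_to_drop;
```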
Hi,
I am trying to split a record in a table into 2 records based on a column value. Please refer to the sample below. The input table displays the 3 types of Product and their price. Notice that for a specific Product (row) only its corresponding col...
Hi @rishigc
You can use something like below.
SELECT explode(arrays_zip(split(Product, '\\+'), split(Price, '\\+'))) AS product_and_price FROM df
or
from pyspark.sql.functions import explode, arrays_zip, split
df.withColumn("product_and_price", explode(arrays_zip(split(df.Product, '\\+'), split(df.Price, '\\+')))).select(
...
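The effect of split + arrays_zip + explode can be sketched outside Spark in plain Python (the '+' delimiter follows the answer above; the function and column names are illustrative):

```python
def explode_zipped(product, price, sep="+"):
    """Pair the i-th product with the i-th price, one output row per pair,
    mimicking explode(arrays_zip(split(...), split(...))) in Spark SQL."""
    return list(zip(product.split(sep), price.split(sep)))

print(explode_zipped("A+B+C", "10+20+30"))  # [('A', '10'), ('B', '20'), ('C', '30')]
```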
Hello databricks people, I started working with databricks today. I have a sql script which I developed with sqlite3 on a laptop. I want to port the script to databricks. I started with two sql statements: select count(prop_id) from prop0; del...
Hey Dan, good to hear you're getting started with Databricks. This is not a limitation of Databricks; it's a restriction built into Spark itself. Spark is not a data store, it's a distributed computation framework. Therefore deleting data would be un...
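Since rows can't be deleted in place, one common workaround (a sketch; the table name and predicate are placeholders) is to materialize only the rows you want to keep into a new table:

```sql
CREATE TABLE prop0_kept AS
SELECT * FROM prop0 WHERE prop_id IS NOT NULL;
```

You can then drop the old table and rename, or point downstream queries at the new one.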
I'd like to access a table on a MS SQL Server (Microsoft). Is it possible from Databricks?
To my understanding, the syntax is something like this (in a SQL Notebook):
CREATE TEMPORARY TABLE jdbcTable
USING org.apache.spark.sql.jdbc
OPTIONS ( url...
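A fuller sketch of that syntax, with placeholder host, database, and credentials (check the exact JDBC URL form against your SQL Server setup before relying on it):

```sql
CREATE TEMPORARY TABLE jdbcTable
USING org.apache.spark.sql.jdbc
OPTIONS (
  url "jdbc:sqlserver://<host>:1433;databaseName=<database>",
  dbtable "dbo.my_table",
  user "<username>",
  password "<password>"
);
```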
I imported a large csv file into databricks as a table.
I am able to run sql queries on it in a databricks notebook.
In my table, I have a column that contains date information in the mm/dd/yyyy format:
12/29/2015
12/30/2015 etc...
Databricks impo...
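If Databricks imported the column as a string, one option (a sketch; date_col is a placeholder column name) is to parse it explicitly, which in Spark SQL is typically to_date(date_col, 'MM/dd/yyyy'). The same parse in plain Python:

```python
from datetime import datetime

def mmddyyyy_to_date(s):
    """Parse 'MM/dd/yyyy' strings like '12/29/2015', mirroring
    to_date(col, 'MM/dd/yyyy') in Spark SQL."""
    return datetime.strptime(s, "%m/%d/%Y").date()

print(mmddyyyy_to_date("12/29/2015"))  # 2015-12-29
```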
I've got a partitioned table I want to add some data to. I want to use dynamic partitioning but I get this error:
org.apache.spark.SparkException: Dynamic partition strict mode requires at least one static partition column. To turn this off ...
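The truncated part of that error points at a Hive setting; assuming dynamic partitioning is really what you want, the usual fix is to relax strict mode before the insert:

```sql
SET hive.exec.dynamic.partition.mode=nonstrict;
```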
Bricklayers,
I want to port this sql statement from sqlite to databricks:
select cast(myage as number) as my_integer_age from ages;
Does databricks allow me to do something like this?
@dan11
We don't support the number type in Spark SQL. Try using int, double, or float, and your query should be fine. To run SQL in a notebook, just prepend any cell with %sql.
%sql
select cast(myage as int) as my_integer_age from ages;
I'm trying to display() the results from calling first() on a DataFrame, but display() doesn't work with pyspark.sql.Row objects. How can I display this result?