11-03-2022 08:17 AM
Hi,
I would like to be able to do something like this...
create table if not exists table1
using parquet
location = '/mnt/somelocation
some location needs to be a concatenation of static and code generated string. Documentation suggests that location only accepts a string literal and based on my attempts to do this, seems to be the case. Is there any way round this?
Thanks in advance
11-03-2022 02:09 PM
Hi @Brian Labrom ,
One you can do it with Spark.sql
%python
tables = ["table1", "table2"]
for table in tables:
spark.sql(f"""
create table if not exists {table}
using parquet
location = '/mnt/{table}'""")
or maybe with sql:
create table if not exists table1
using parquet
location = '/mnt/${somelocation}'
You can use widget for example to get somelocation:
11-03-2022 02:09 PM
Hi @Brian Labrom ,
One you can do it with Spark.sql
%python
tables = ["table1", "table2"]
for table in tables:
spark.sql(f"""
create table if not exists {table}
using parquet
location = '/mnt/{table}'""")
or maybe with sql:
create table if not exists table1
using parquet
location = '/mnt/${somelocation}'
You can use widget for example to get somelocation:
11-03-2022 07:03 PM
FString Python can be used. example > spark.sql(f"CREATE TABLE {table_name} (id INT, name STRING, value DOUBLE, state STRING)")
11-04-2022 02:53 AM
Hi, I will check the suggestions this morning and share the results. Thanks @Pat Sienkiewicz and @Prasanth Mathesh for the suggestions
11-07-2022 03:56 AM
The end solution looked like this
first compute the variable part of the location (one month back from today in my case)
#first compute the archive date (or whatever else you need to be dynamic)
import datetime
from dateutil.relativedelta import relativedelta
start_date = datetime.date.today()
archive_date = start_date - relativedelta(months=1)
archive_date = archive_date.strftime('%Y-%m')
#build the table using the previously computed variable
import pyspark.sql.session
folder = 'container/folder'+archive_date+'.parquet'
spark.sql(f"CREATE TABLE Archive using parquet location '/mnt/{folder}'")
couldn't get the SQL approach to work for some reason, but this is absolutely fine for what I need.
Many thanks for your help, this will prove v useful in the future.
As an observation might it be worth updating Databricks documentation as it currently states location must be a string literal? Just a thought.
Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.
If there isn’t a group near you, start one and help create a community that brings people together.
Request a New Group