โ11-03-2022 08:17 AM
Hi,
I would like to be able to do something like this...
create table if not exists table1
using parquet
location = '/mnt/somelocation
some location needs to be a concatenation of static and code generated string. Documentation suggests that location only accepts a string literal and based on my attempts to do this, seems to be the case. Is there any way round this?
Thanks in advance
โ11-03-2022 02:09 PM
Hi @Brian Labromโ ,
One you can do it with Spark.sql
%python
tables = ["table1", "table2"]
for table in tables:
spark.sql(f"""
create table if not exists {table}
using parquet
location = '/mnt/{table}'""")
or maybe with sql:
create table if not exists table1
using parquet
location = '/mnt/${somelocation}'
You can use widget for example to get somelocation:
โ11-03-2022 02:09 PM
Hi @Brian Labromโ ,
One you can do it with Spark.sql
%python
tables = ["table1", "table2"]
for table in tables:
spark.sql(f"""
create table if not exists {table}
using parquet
location = '/mnt/{table}'""")
or maybe with sql:
create table if not exists table1
using parquet
location = '/mnt/${somelocation}'
You can use widget for example to get somelocation:
โ11-03-2022 07:03 PM
FString Python can be used. example > spark.sql(f"CREATE TABLE {table_name} (id INT, name STRING, value DOUBLE, state STRING)")
โ11-04-2022 02:53 AM
Hi, I will check the suggestions this morning and share the results. Thanks @Pat Sienkiewiczโ and @Prasanth Matheshโ for the suggestions
โ11-07-2022 03:56 AM
The end solution looked like this
first compute the variable part of the location (one month back from today in my case)
#first compute the archive date (or whatever else you need to be dynamic)
import datetime
from dateutil.relativedelta import relativedelta
start_date = datetime.date.today()
archive_date = start_date - relativedelta(months=1)
archive_date = archive_date.strftime('%Y-%m')
#build the table using the previously computed variable
import pyspark.sql.session
folder = 'container/folder'+archive_date+'.parquet'
spark.sql(f"CREATE TABLE Archive using parquet location '/mnt/{folder}'")
couldn't get the SQL approach to work for some reason, but this is absolutely fine for what I need.
Many thanks for your help, this will prove v useful in the future.
As an observation might it be worth updating Databricks documentation as it currently states location must be a string literal? Just a thought.
Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโt want to miss the chance to attend and share knowledge.
If there isnโt a group near you, start one and help create a community that brings people together.
Request a New Group