DAB: NameError: name '__file__' is not defined
10-26-2023 02:58 AM - edited 10-26-2023 03:01 AM
Hi Everyone,
I am running a job task using a Databricks Asset Bundle (DAB).
The bundle has been validated and deployed according to: https://learn.microsoft.com/en-us/azure/databricks/dev-tools/bundles/work-tasks
Here is the relevant part of databricks.yml:
bundle:
  name: etldatabricks
resources:
  jobs:
    etldatabricks-job:
      name: etldatabricks-job
      tasks:
        - task_key: etldatabricks-python-script-task
          existing_cluster_id: xxxx-17xx21-roxx
          spark_python_task:
            python_file: ./ingestion/my_script_dir/my_script.py
targets:
  dev:
    mode: development
    default: true
I receive the following error message when my_script.py runs as a task in the dev target via the asset bundle:
NameError: name '__file__' is not defined
# my_script.py
import sys
from pathlib import Path
from databricks.connect import DatabricksSession
from pyspark.sql import DataFrameWriter
from pyspark.sql.types import StructType, StructField, StringType
# add root dir to path
root_dir_path = Path(__file__).parent.parent.parent
sys.path.append(str(root_dir_path))
I cannot understand why the Python dunder variable __file__ cannot be resolved when the script is run via DAB.
The file of course works without any issues during a standard Databricks job run.
Thank you for any help.
Rafal
- Labels: Workflows
10-26-2023 03:20 AM
Hi Kaniz,
Thank you for the answer.
Does this mean it is a limitation of asset bundles? If so, is it listed as a limitation anywhere?
Additionally, what does it mean that "__file__ is not always defined in Python scripts that are run using an asset bundle"?
Is it possible for it to be defined during a run using DAB?
Regards,
Rafal
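
One possible workaround, sketched here under stated assumptions rather than as a confirmed fix: look up __file__ defensively and fall back to other hints when it is missing. The helper names script_path and add_root_to_sys_path are hypothetical, and whether sys.argv[0] holds the script path inside a DAB-launched task is an assumption that has not been verified on Databricks.

```python
import sys
from pathlib import Path


def script_path(namespace: dict) -> Path:
    """Best-effort location of the running script (hypothetical helper)."""
    file_attr = namespace.get("__file__")
    if file_attr:
        # Normal case: the interpreter set __file__ for this module.
        return Path(file_attr).resolve()
    # Assumption: the runner may still pass the script path as argv[0].
    if sys.argv and sys.argv[0] and Path(sys.argv[0]).is_file():
        return Path(sys.argv[0]).resolve()
    # Last resort: treat the current working directory as the script's folder.
    return Path.cwd() / "unknown_script.py"


def add_root_to_sys_path(namespace: dict, levels_up: int = 3) -> Path:
    """Append the directory `levels_up` levels above the script to sys.path."""
    root = script_path(namespace)
    for _ in range(levels_up):
        # Path.parent of the filesystem root is the root itself, so this is safe.
        root = root.parent
    if str(root) not in sys.path:
        sys.path.append(str(root))
    return root


if __name__ == "__main__":
    # Replaces Path(__file__).parent.parent.parent from my_script.py.
    root_dir_path = add_root_to_sys_path(globals())
    print(f"Added to sys.path: {root_dir_path}")
```

Passing globals() instead of reading __file__ directly avoids the NameError, because a missing key in the dict simply triggers the fallbacks. If neither fallback points to the right place in the DAB run, the root directory would have to be supplied some other way, for example as a task parameter.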