Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

vscode python project for development

Alexandru
New Contributor III

Hi,

I'm trying to set up a local development environment using Python, VS Code, and Poetry. Linting is enabled (Microsoft Pylance extension) and python.analysis.typeCheckingMode is set to strict.

We use Python files (.py) for our code, with the "# Databricks notebook source" comment at the beginning. Databricks handles such files as notebooks.

I don't want to connect to Databricks and execute commands from the VS Code terminal.

The question is: which modules should I import in order to have a valid Python file, without VS Code reporting "problems"?

For instance: 

from databricks.sdk import runtime as KR
histogram = KR.spark.sql("some sql")

raises the following problem:

Type of "sql" is partially unknown
Type of "sql" is "(sqlQuery: str, args: Dict[str, Any] | List[Unknown] | None = None, **kwargs: Any) -> DataFrame" Pylance (reportUnknownMemberType)
(variable) spark: SparkSession
Regards,
Alexandru Stan

 

1 ACCEPTED SOLUTION

Accepted Solutions

Alexandru
New Contributor III

After some testing, these are the modules I ended up using:

[tool.poetry.group.dev.dependencies]
autopep8 = "^2.1.0"
mypy = "^1.9.0"
pyspark = "=3.5.1"
databricks-sdk = "^0.25.1"
databricks-connect = "^14.3.1"

Ignore autopep8, which is only there for code quality.

mypy: used only because it installs the stubgen utility, which I use to generate some missing internal stubs
pyspark: for the standard PySpark functionality
databricks-sdk: for the runtime
databricks-connect: extends the SDK

Using these modules, I can set the type checking mode to strict without VS Code raising any problems.
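
As an illustration (not taken from the thread), here is a minimal sketch of what a notebook-style .py file might look like once these dev dependencies are installed. The function name load_histogram is hypothetical, and the TYPE_CHECKING guard is an assumed pattern that keeps the pyspark import visible to the type checker only. Whether Pylance in strict mode stays completely quiet on the sql call likely depends on the installed pyspark version and its type annotations.

```python
# Databricks notebook source
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by the type checker only; this module does not need pyspark at runtime.
    from pyspark.sql import DataFrame, SparkSession


def load_histogram(session: SparkSession, query: str) -> DataFrame:
    """Run a SQL query on the given session and return the resulting DataFrame."""
    return session.sql(query)
```

With the import used in the question (from databricks.sdk import runtime as KR), a call would look like load_histogram(KR.spark, "some sql"). Passing the session in as a parameter keeps the function easy to exercise locally without a cluster.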


3 REPLIES

artsheiko
Honored Contributor

Hi Alexandru,

Take a look at the VS Code extension for Databricks: https://marketplace.visualstudio.com/items?itemName=databricks.databricks

Alexandru
New Contributor III

Sorry @artsheiko,

As I already said, I'm not interested in connecting to Databricks; I just want a valid Python workspace when using VS Code. As a matter of fact, using the extension is not possible in my projects, because the clusters are not interactive and we are not able to use PATs (blocked by our security team).

Regards,

Alex

