Hey @carlos_tasayco, it seems like you are running a test here. If you are running a test outside of a Databricks environment (like in a CI pipeline), then you need to define a Spark session manually. In the DBR (Databricks Runtime) the Spark session ...
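As a rough sketch of defining the session manually for a test run outside Databricks: the helper name `get_spark`, the `local[1]` master, and the app name are illustrative assumptions, not something from the original thread, and `pyspark` is assumed to be installed in the test environment.

```python
def get_spark():
    """Return a SparkSession, creating a local one if none exists yet.

    Hypothetical helper for tests that run outside Databricks (e.g. CI).
    """
    # Imported lazily so this module can be loaded even where pyspark is absent.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master("local[1]")      # assumption: a single local core is enough for tests
        .appName("unit-tests")   # assumption: any descriptive name works here
        .getOrCreate()           # reuses an already active session if there is one
    )
```

In a test you would call `spark = get_spark()` before building any DataFrames. Because `getOrCreate()` reuses an active session, the same helper should also be harmless inside a notebook.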
Ah, I've seen this issue many times. The Databricks SDK here is trying to authenticate with the Databricks API, but the environment variables are set for multiple types of authentication. If you remove the DATABRICKS_CLIENT_ID, DATABRICKS_CLIENT_SECRET,env...
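One way to clear the conflicting variables from inside the Python process, as a minimal sketch: the helper `drop_env_vars` is hypothetical, and it only covers the two variable names mentioned above. Which variables you actually need to remove depends on the authentication type you want to keep.

```python
import os

# Only the variables named in the post; extend this if your setup needs it.
OAUTH_VARS = ("DATABRICKS_CLIENT_ID", "DATABRICKS_CLIENT_SECRET")


def drop_env_vars(names=OAUTH_VARS, environ=None):
    """Remove the given variables and return the names that were actually set."""
    env = os.environ if environ is None else environ
    removed = []
    for name in names:
        # pop() with a default does not raise when the variable is missing
        if env.pop(name, None) is not None:
            removed.append(name)
    return removed
```

Call `drop_env_vars()` before creating the SDK client so that only one set of credentials is visible to it. This only affects the current process, not your shell or CI configuration.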
Hey @ABINASH, since the JSON file is being flattened to 620 million records, the area to optimize is probably the structure of the JSON file itself. My initial thought is that the JSON file is extremely nested, which is causing a large amount of redundant...
Hey @kaushalshelat, I don't think you need to call '.show()' in order to print the DF out. You should just be able to print it by writing the df variable name. Docs
Hey @pemidexx, this may be a dumb question, but have you set your DATABRICKS_HOST env variable?
os.environ["DATABRICKS_HOST"] = "https://dbc-1234567890123456.cloud.databricks.com" # set to your server URI
os.environ["DATABRICKS_TOKEN"] = "dapixxxxxxxx...
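If it helps to confirm the variables are really set before connecting, a quick check could look like the sketch below. The helper `missing_env_vars` is hypothetical and only looks at the two variables shown above.

```python
import os

# The two variables from the snippet above.
REQUIRED_VARS = ("DATABRICKS_HOST", "DATABRICKS_TOKEN")


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]
```

Running `print(missing_env_vars())` right before the connection attempt shows which of the two, if any, the process cannot see.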