Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
How can I fix the IP address of my Azure cluster so that I can whitelist it to run my job daily from my Python notebook? Or how can I find out the IP address to perform the whitelisting? Thanks
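A minimal sketch of finding the cluster's current outbound public IP from a notebook, assuming an external echo service (api.ipify.org here) is reachable; note the IP is generally not stable across cluster restarts unless your workspace routes egress through a fixed NAT or firewall:

import requests

# Ask an external echo service which public IP the cluster egresses from;
# this is the address a whitelist on the receiving side would see.
print(requests.get("https://api.ipify.org").text)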
When I use the following code:

df.coalesce(1)
  .write.format("com.databricks.spark.csv")
  .option("header", "true")
  .save("/path/mydata.csv")

it writes several files, and when used with .mode("overwrite"), it will overwrite everything in th...
Hi Daniel, may I know how you fixed this issue? I am facing a similar issue while writing csv/parquet to Blob/ADLS: it creates a separate folder with the filename and creates a partition file within that folder. I need to write just a file on to the b...
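A minimal sketch of getting a single named CSV file, with hypothetical paths under /mnt/out: write the coalesced output to a temporary folder, then copy the lone part file to the target filename with dbutils.fs.

# Write a single-partition CSV to a temporary folder.
tmp_dir = "/mnt/out/_tmp_mydata"
(df.coalesce(1)
   .write.format("csv")
   .option("header", "true")
   .mode("overwrite")
   .save(tmp_dir))

# Copy the single part file to the desired filename, then clean up.
part_file = [f.path for f in dbutils.fs.ls(tmp_dir) if f.name.startswith("part-")][0]
dbutils.fs.cp(part_file, "/mnt/out/mydata.csv")
dbutils.fs.rm(tmp_dir, True)  # recursive delete of the temporary folder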
I’ve had success with R magic (R cells in a Python notebook) and running an R script from a Python notebook, up to the point of connecting R to a Spark cluster. In either case, I can’t get a `SparkSession` to initialize. 2-cell (Python) notebook exa...
What I can suggest to make this work is to call the R notebooks from your Python notebook. Just save each DataFrame as a Delta table to pass data between the languages. How to call a notebook from another notebook? Here is a link.
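A minimal sketch of that pattern from the Python side, with hypothetical notebook and Delta paths; the R notebook is assumed to write its result as a Delta table before returning:

# Run the R notebook (path and timeout seconds are hypothetical).
dbutils.notebook.run("/notebooks/r/prepare_data", 600)

# Read back whatever the R notebook saved as Delta.
df = spark.read.format("delta").load("/mnt/delta/prepared")
display(df)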
I need to configure a pip config file to include login credentials to allow libraries to download from our corporate Artifactory. I'm trying to learn how to open a config file within Databricks and add my credentials and package information. I will then have ...
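A minimal sketch of one way to do this from a notebook, assuming a hypothetical Artifactory URL and a Databricks secret scope holding the credentials, and using pip's --index-url flag rather than editing a config file on disk:

import subprocess, sys

# Hypothetical secret scope/keys; keep credentials in secrets, not in code.
user = dbutils.secrets.get(scope="artifactory", key="user")
token = dbutils.secrets.get(scope="artifactory", key="token")
index_url = f"https://{user}:{token}@artifactory.mycorp.com/artifactory/api/pypi/pypi-virtual/simple"

# Install a package from the corporate index (package name is a placeholder).
subprocess.check_call([sys.executable, "-m", "pip", "install",
                       "--index-url", index_url, "my-package"])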
Hi @Samy Syed Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!
I have a repo that has Python files that use the built-in logging module. Additionally, in some of the notebooks of the repo I want to use logging.debug()/logging.info() instead of print statements everywhere. However, when I use the root logger or cr...
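A minimal sketch of a workaround, assuming a hypothetical logger name: use a named logger with its own stdout handler and disable propagation, so records are not swallowed by whatever handlers Databricks attaches to the root logger:

import logging, sys

logger = logging.getLogger("my_project")  # hypothetical name
logger.setLevel(logging.DEBUG)
if not logger.handlers:  # avoid duplicate handlers when the cell is re-run
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
logger.propagate = False  # keep records away from the root logger's handlers

logger.debug("visible in the notebook cell output")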
Hi @Yusuf Khan Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we...
Hi, I'm trying to use a magic command (to change to Python in a notebook with SQL as the default language) in a DLT pipeline. When starting the pipeline, cells containing magic commands are ignored, with the warning message below: "Magic commands (e.g. %py, ...
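Since DLT ignores magic commands, one workaround is to put the Python logic in its own notebook and add both notebooks to the pipeline. A minimal sketch of such a Python DLT notebook, with hypothetical table names (the dlt module is only importable inside a pipeline run):

import dlt
from pyspark.sql import functions as F

@dlt.table(name="clean_events")  # hypothetical table
def clean_events():
    # Reads a table defined elsewhere in the pipeline (hypothetical name).
    return dlt.read("raw_events").where(F.col("event_type").isNotNull())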
Hi @Yassine Dehbi Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Than...
I have a notebook named ecom_sellout.sql under the path notebooks/python/dataloader/queries. I have another notebook (named dataloader, under the path notebooks/python/dataloader) in which I am calling this SQL notebook. My code runs perfectly fine on re...
Hello, when working in a Python notebook and using tab-complete to navigate the file system, I find that pressing Enter on a partially completed path will add the full path to the cell in the notebook. This is annoying behaviour, since you end up with...
I'm attempting to mount a volume using dbutils.fs.mount in a Python notebook. In the exception handling for this statement, I have found an exception that doesn't get caught using standard try/except handling; for example, if passing through a contai...
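A minimal sketch of a pragmatic fallback, with hypothetical container, account, and secret names: since dbutils errors surface as Py4J-wrapped Java exceptions, catching a broad Exception and inspecting the message is often more reliable than matching a specific exception class:

try:
    dbutils.fs.mount(
        source="wasbs://mycontainer@myaccount.blob.core.windows.net",
        mount_point="/mnt/mycontainer",
        extra_configs={"fs.azure.account.key.myaccount.blob.core.windows.net":
                       dbutils.secrets.get(scope="storage", key="account-key")})
except Exception as e:
    if "already mounted" in str(e):  # inspect the wrapped Java error message
        print("Mount point already exists; skipping")
    else:
        raise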
Hi @Stuart Parker Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Than...
Hi, I would like to import a Python notebook to my Databricks workspace from my local machine using a Python script. I managed to create the folder, but then I get a status code 400 when I try to import a file:

create_folder = requests.post(
    '{}/api/...
Hi, thanks for your answer. Actually, both your code and mine are working. However, I cannot write in the Repos directory, which is reserved (but I can create subdirectories...). Thanks to your code I got an error message which helped me to understand. Wi...
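For reference, a minimal sketch of the import call against the Workspace API (POST /api/2.0/workspace/import), with hypothetical host, token, and paths; note the target path must be outside the reserved /Repos directory:

import base64, requests

host = "https://<your-workspace>.azuredatabricks.net"
token = "<personal-access-token>"

# Base64-encode the local notebook source, as the API requires.
with open("my_notebook.py", "rb") as f:
    content = base64.b64encode(f.read()).decode("utf-8")

resp = requests.post(
    f"{host}/api/2.0/workspace/import",
    headers={"Authorization": f"Bearer {token}"},
    json={"path": "/Users/me@example.com/my_notebook",
          "format": "SOURCE",
          "language": "PYTHON",
          "content": content,
          "overwrite": True})
resp.raise_for_status()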
Following on @Werner Stinckens' response, if you can give an example that would be good. Ideally you can read each row from the Excel file in Python and pass each column as a parameter to a function. E.g.: def apply_mapping_logic(SourceTable, SourceColumn,...
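A minimal sketch of that idea, assuming a hypothetical mapping spreadsheet with columns SourceTable, SourceColumn, TargetTable, TargetColumn (pandas.read_excel needs openpyxl installed on the cluster):

import pandas as pd

def apply_mapping_logic(source_table, source_column, target_table, target_column):
    # Placeholder for the real mapping logic.
    print(f"map {source_table}.{source_column} -> {target_table}.{target_column}")

mappings = pd.read_excel("/dbfs/FileStore/mappings.xlsx")  # hypothetical path
for row in mappings.itertuples(index=False):
    apply_mapping_logic(row.SourceTable, row.SourceColumn,
                        row.TargetTable, row.TargetColumn)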
In the release notes of May 2022 it says that we are now able to investigate our SQL results in Python in a Python notebook. (See also the documentation here: Use notebooks - Azure Databricks | Microsoft Docs.) So I created a simple query (select * from ...
This feature was delayed and will be rolled out over Databricks platform releases 3.74 through 3.76. You can check the release notes for more info --> https://docs.databricks.com/release-notes/product/2022/may.html
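Once the feature reaches your workspace, the last %sql cell's result is exposed to Python as a DataFrame named _sqldf. A minimal sketch, assuming a table my_table exists:

# Cell 1 (SQL):
# %sql
# select * from my_table

# Cell 2 (Python): the previous SQL result is available as _sqldf.
_sqldf.printSchema()
display(_sqldf.limit(10))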
I am currently using a Python notebook with a defined schema to import fairly unstructured documents from MongoDB. Some of these documents have spaces in their field names. I define the schema for the MongoDB PySpark connector like the following: Struct...
Solution: it turns out the issue is not the schema reading in, but the fact that I am writing to Delta tables, whose column names do not currently support spaces. So I need to transform them prior to dumping. I've been following a pattern of reading in raw data,...
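A minimal sketch of that transformation, with a hypothetical raw_df and output path: replace spaces in column names before writing, since Delta column names cannot contain spaces:

# Rename every column, replacing spaces with underscores.
clean_df = raw_df.toDF(*[c.replace(" ", "_") for c in raw_df.columns])
clean_df.write.format("delta").mode("overwrite").save("/mnt/delta/my_table")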
Our customer is using Azure's Blob Storage service to save big files so that we can work with them using an Azure online service, like Databricks. We want to read and work with these files with a computing resource obtained from Azure directly, without d...
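A minimal sketch of reading such files directly from a Databricks cluster, with hypothetical storage account, container, and secret names, using an account key set on the Spark session:

# Authenticate to the storage account with a key held in a secret scope.
spark.conf.set(
    "fs.azure.account.key.myaccount.dfs.core.windows.net",
    dbutils.secrets.get(scope="storage", key="account-key"))

# Read a file straight from the container over abfss.
df = (spark.read.format("csv")
      .option("header", "true")
      .load("abfss://mycontainer@myaccount.dfs.core.windows.net/path/to/file.csv"))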
Hi all, I hope you're doing well. I am facing an issue while installing a Python library on an ADB cluster. Lib: PyCaret (latest version). It's not getting installed and shows me a 'Failed' status. It would be great if you can help here! Thanks