Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
How can I fix the IP address of my Azure cluster so that I can whitelist it to run my job daily from my Python notebook? Or how can I find out the IP address to perform the whitelisting? Thanks
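A minimal sketch of finding the cluster's current outbound public IP from a notebook, assuming an external echo service (api.ipify.org here) is reachable; note the IP is generally not stable across cluster restarts unless your workspace routes egress through a fixed NAT or firewall:

import requests

# Ask an external echo service which public IP the cluster egresses from;
# this is the address a whitelist on the receiving side would see.
print(requests.get("https://api.ipify.org").text)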
When I use the following code:

df.coalesce(1)
  .write.format("com.databricks.spark.csv")
  .option("header", "true")
  .save("/path/mydata.csv")

it writes several files, and when used with .mode("overwrite"), it will overwrite everything in th...
Hi Daniel, may I know how you fixed this issue? I am facing a similar issue while writing csv/parquet to Blob/ADLS: it creates a separate folder with the filename and creates a partition file within that folder. I need to write just a file on to the b...
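A minimal sketch of getting a single named CSV file, with hypothetical paths under /mnt/out: write the coalesced output to a temporary folder, then copy the lone part file to the target filename with dbutils.fs.

# Write a single-partition CSV to a temporary folder.
tmp_dir = "/mnt/out/_tmp_mydata"
(df.coalesce(1)
   .write.format("csv")
   .option("header", "true")
   .mode("overwrite")
   .save(tmp_dir))

# Copy the single part file to the desired filename, then clean up.
part_file = [f.path for f in dbutils.fs.ls(tmp_dir) if f.name.startswith("part-")][0]
dbutils.fs.cp(part_file, "/mnt/out/mydata.csv")
dbutils.fs.rm(tmp_dir, True)  # recursive delete of the temporary folder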
I’ve had success with R magic (R cells in a Python notebook) and running an R script from a Python notebook, up to the point of connecting R to a Spark cluster. In either case, I can’t get a `SparkSession` to initialize. 2-cell (Python) notebook exa...
What I can suggest to make this work is to call the R notebooks from your Python notebook. Just save each DataFrame as a Delta table to pass data between the languages. How to call a notebook from another notebook? Here is a link.
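A minimal sketch of that pattern from the Python side, with hypothetical notebook and Delta paths; the R notebook is assumed to write its result as a Delta table before returning:

# Run the R notebook (path and timeout seconds are hypothetical).
dbutils.notebook.run("/notebooks/r/prepare_data", 600)

# Read back whatever the R notebook saved as Delta.
df = spark.read.format("delta").load("/mnt/delta/prepared")
display(df)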
I need to configure a pip config file to include login credentials to allow libraries to download from our corporate Artifactory. I'm trying to learn how to open a config file within Databricks and add my credentials and package information. I will then have ...
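A minimal sketch of one way to do this from a notebook, assuming a hypothetical Artifactory URL and a Databricks secret scope holding the credentials, and using pip's --index-url flag rather than editing a config file on disk:

import subprocess, sys

# Hypothetical secret scope/keys; keep credentials in secrets, not in code.
user = dbutils.secrets.get(scope="artifactory", key="user")
token = dbutils.secrets.get(scope="artifactory", key="token")
index_url = f"https://{user}:{token}@artifactory.mycorp.com/artifactory/api/pypi/pypi-virtual/simple"

# Install a package from the corporate index (package name is a placeholder).
subprocess.check_call([sys.executable, "-m", "pip", "install",
                       "--index-url", index_url, "my-package"])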
Hi @Samy Syed Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!
I have a repo that has Python files that use the built-in logging module. Additionally, in some of the notebooks of the repo I want to use logging.debug()/logging.info() instead of print statements everywhere. However, when I use the root logger or cr...
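A minimal sketch of a workaround, assuming a hypothetical logger name: use a named logger with its own stdout handler and disable propagation, so records are not swallowed by whatever handlers Databricks attaches to the root logger:

import logging, sys

logger = logging.getLogger("my_project")  # hypothetical name
logger.setLevel(logging.DEBUG)
if not logger.handlers:  # avoid duplicate handlers when the cell is re-run
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
logger.propagate = False  # keep records away from the root logger's handlers

logger.debug("visible in the notebook cell output")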
Hi @Yusuf Khan Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we...
Hi, I'm trying to use a magic command (to change to Python in a notebook with SQL as the default language) in a DLT pipeline. When starting the pipeline, cells containing magic commands are ignored, with the warning message below: "Magic commands (e.g. %py, ...
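Since DLT ignores magic commands, one workaround is to put the Python logic in its own notebook and add both notebooks to the pipeline. A minimal sketch of such a Python DLT notebook, with hypothetical table names (the dlt module is only importable inside a pipeline run):

import dlt
from pyspark.sql import functions as F

@dlt.table(name="clean_events")  # hypothetical table
def clean_events():
    # Reads a table defined elsewhere in the pipeline (hypothetical name).
    return dlt.read("raw_events").where(F.col("event_type").isNotNull())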
Hi @Yassine Dehbi Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Than...
I have a notebook named ecom_sellout.sql under the path notebooks/python/dataloader/queries. I have another notebook (named dataloader, under the path notebooks/python/dataloader) in which I am calling this SQL notebook. My code runs perfectly fine on re...
Hello, when working in a Python notebook and using tab-complete to navigate the file system, I find that pressing Enter on a partially completed path will add the full path to the cell in the notebook. This is annoying behaviour, since you end up with...
I'm attempting to mount a volume using dbutils.fs.mount in a Python notebook. In the exception handling for this statement, I have found an exception that doesn't get caught using standard try/except handling; for example, if passing through a contai...
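A minimal sketch of a pragmatic fallback, with hypothetical container, account, and secret names: since dbutils errors surface as Py4J-wrapped Java exceptions, catching a broad Exception and inspecting the message is often more reliable than matching a specific exception class:

try:
    dbutils.fs.mount(
        source="wasbs://mycontainer@myaccount.blob.core.windows.net",
        mount_point="/mnt/mycontainer",
        extra_configs={"fs.azure.account.key.myaccount.blob.core.windows.net":
                       dbutils.secrets.get(scope="storage", key="account-key")})
except Exception as e:
    if "already mounted" in str(e):  # inspect the wrapped Java error message
        print("Mount point already exists; skipping")
    else:
        raise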
Hi @Stuart Parker Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Than...
Hi, I would like to import a Python notebook to my Databricks workspace from my local machine using a Python script. I managed to create the folder, but then I get a status code 400 when I try to import a file:

create_folder = requests.post(
    '{}/api/...
Hi, thanks for your answer. Actually, both your code and mine are working. However, I cannot write in the Repos directory, which is reserved (but I can create subdirectories...). Thanks to your code I got an error message which helped me to understand. Wi...
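For reference, a minimal sketch of the import call against the Workspace API (POST /api/2.0/workspace/import), with hypothetical host, token, and paths; note the target path must be outside the reserved /Repos directory:

import base64, requests

host = "https://<your-workspace>.azuredatabricks.net"
token = "<personal-access-token>"

# Base64-encode the local notebook source, as the API requires.
with open("my_notebook.py", "rb") as f:
    content = base64.b64encode(f.read()).decode("utf-8")

resp = requests.post(
    f"{host}/api/2.0/workspace/import",
    headers={"Authorization": f"Bearer {token}"},
    json={"path": "/Users/me@example.com/my_notebook",
          "format": "SOURCE",
          "language": "PYTHON",
          "content": content,
          "overwrite": True})
resp.raise_for_status()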
Following on @Werner Stinckens' response, if you can give an example that would be good. Ideally you can read each row from the Excel file in Python and pass each column as a parameter to a function. E.g.: def apply_mapping_logic(SourceTable, SourceColumn,...
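A minimal sketch of that idea, assuming a hypothetical mapping spreadsheet with columns SourceTable, SourceColumn, TargetTable, TargetColumn (pandas.read_excel needs openpyxl installed on the cluster):

import pandas as pd

def apply_mapping_logic(source_table, source_column, target_table, target_column):
    # Placeholder for the real mapping logic.
    print(f"map {source_table}.{source_column} -> {target_table}.{target_column}")

mappings = pd.read_excel("/dbfs/FileStore/mappings.xlsx")  # hypothetical path
for row in mappings.itertuples(index=False):
    apply_mapping_logic(row.SourceTable, row.SourceColumn,
                        row.TargetTable, row.TargetColumn)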
In the release notes of May 2022 it says that we are now able to investigate our SQL results in Python in a Python notebook. (See also the documentation here: Use notebooks - Azure Databricks | Microsoft Docs.) So I created a simple query (select * from ...
This feature was delayed and will be rolled out over Databricks platform releases 3.74 through 3.76. You can check the release notes for more info --> https://docs.databricks.com/release-notes/product/2022/may.html
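Once the feature reaches your workspace, the last %sql cell's result is exposed to Python as a DataFrame named _sqldf. A minimal sketch, assuming a table my_table exists:

# Cell 1 (SQL):
# %sql
# select * from my_table

# Cell 2 (Python): the previous SQL result is available as _sqldf.
_sqldf.printSchema()
display(_sqldf.limit(10))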
I am currently using a Python notebook with a defined schema to import fairly unstructured documents from MongoDB. Some of these documents have spaces in their field names. I define the schema for the MongoDB PySpark connector like the following: Struct...
Solution: it turns out the issue is not the schema reading in, but the fact that I am writing to Delta tables, whose column names do not currently support spaces. So I need to transform them prior to dumping. I've been following a pattern of reading in raw data,...
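A minimal sketch of that transformation, with a hypothetical raw_df and output path: replace spaces in column names before writing, since Delta column names cannot contain spaces:

# Rename every column, replacing spaces with underscores.
clean_df = raw_df.toDF(*[c.replace(" ", "_") for c in raw_df.columns])
clean_df.write.format("delta").mode("overwrite").save("/mnt/delta/my_table")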
Our customer is using Azure's Blob Storage service to save big files so that we can work with them using an Azure online service, like Databricks. We want to read and work with these files with a computing resource obtained from Azure directly, without d...
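A minimal sketch of reading such files directly from a Databricks cluster, with hypothetical storage account, container, and secret names, using an account key set on the Spark session:

# Authenticate to the storage account with a key held in a secret scope.
spark.conf.set(
    "fs.azure.account.key.myaccount.dfs.core.windows.net",
    dbutils.secrets.get(scope="storage", key="account-key"))

# Read a file straight from the container over abfss.
df = (spark.read.format("csv")
      .option("header", "true")
      .load("abfss://mycontainer@myaccount.dfs.core.windows.net/path/to/file.csv"))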
Hi all, I hope you're doing well. I am facing an issue while installing a Python library on an ADB cluster. Lib: PyCaret (latest version). It's not getting installed and shows me a 'Failed' status. It would be great if you can help here! Thanks