Data Engineering

Forum Posts

by MattPython • New Contributor

02-01-2023 5:20:15 AM

23937 Views
4 replies
0 kudos

How do you read files from the DBFS with OS and Pandas Python libraries?

I created translations for decoded values and want to save the dictionary object to the DBFS for mapping. However, I am unable to access the DBFS without using dbutils or the PySpark library. Is there a way to access the DBFS with OS and Pandas Python libra...

Latest Reply

User16789202230
Databricks Employee

12-21-2023 2:38:02 AM

0 kudos

db_path = 'file:///Workspace/Users/l<xxxxx>@databricks.com/TITANIC_DEMO/tested.csv'
df = spark.read.csv(db_path, header="True", inferSchema="True")

3 More Replies
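
For the question as asked (plain `os` and pandas, no dbutils or PySpark), one option that may apply is the local `/dbfs` mount that many Databricks clusters expose, which lets ordinary Python file APIs reach DBFS paths. Whether that mount is available depends on the cluster configuration, so treat the following as a minimal sketch rather than a confirmed answer from this thread. The helper names `save_mapping` and `load_mapping`, the sample dictionary, and the file names are hypothetical. A temporary directory stands in for a DBFS location so the snippet runs anywhere; on a cluster you would pass a directory under `/dbfs/` instead.

```python
import json
import os
import tempfile

import pandas as pd


def save_mapping(mapping, directory, filename="mapping.json"):
    """Write a dictionary to <directory>/<filename> as JSON and return the path."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, filename)
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(mapping, fh)
    return path


def load_mapping(path):
    """Read the JSON file back into a dictionary."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


# Assumption: on a cluster with the local mount, this would be a directory
# under "/dbfs/". A temp directory is used here so the sketch runs anywhere.
target_dir = tempfile.mkdtemp()

mapping = {"A": "Alpha", "B": "Beta"}  # hypothetical translation table
mapping_path = save_mapping(mapping, target_dir)
restored = load_mapping(mapping_path)

# pandas reads and writes the same kind of local-style path.
csv_path = os.path.join(target_dir, "codes.csv")
pd.DataFrame({"code": ["A", "B"]}).to_csv(csv_path, index=False)
df = pd.read_csv(csv_path)
df["label"] = df["code"].map(restored)

# os functions work on the directory as they would on any local folder.
files = sorted(os.listdir(target_dir))
```

JSON is used here because the dictionary is a simple string-to-string mapping; a dictionary with non-string keys would need a different format or a key conversion step.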

by Braxx • Contributor II

10-17-2021 11:52:14 AM

7855 Views
6 replies
4 kudos

Resolved! Object of type bool_ is not JSON serializable

I am converting a data frame to a nested dict/JSON. One of the columns, called "Problematic__c", is of boolean type. For some reason json does not accept this data type and returns the error: "Object of type bool_ is not JSON serializable" I need this as...

Latest Reply

Braxx
Contributor II

10-22-2021 1:16:15 AM

4 kudos

Thanks Dan, that makes sense!

5 More Replies
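
The accepted answer is not visible in this preview, so the following is not a reconstruction of it. It is a minimal sketch of one common way to handle this error: the `json` module only knows Python's built-in types, and a NumPy boolean scalar is a separate type, so passing a `default` function that converts NumPy scalars with `.item()` lets the dump succeed. The helper name `to_builtin` and the sample data are hypothetical; only the column name "Problematic__c" comes from the question.

```python
import json

import numpy as np
import pandas as pd


def to_builtin(obj):
    """Fallback for json.dumps: convert NumPy scalars to plain Python values."""
    if isinstance(obj, np.generic):
        return obj.item()
    raise TypeError(f"Cannot serialize object of type {type(obj).__name__}")


# Hypothetical sample frame with a boolean column.
df = pd.DataFrame({"Name": ["a", "b"], "Problematic__c": [True, False]})

# Iterating the NumPy array yields NumPy boolean scalars, not Python bools.
nested = {
    name: {"Problematic__c": flag}
    for name, flag in zip(df["Name"], df["Problematic__c"].to_numpy())
}

# Without a fallback, json.dumps rejects the NumPy boolean.
try:
    json.dumps(nested)
    failed = False
except TypeError:
    failed = True

# With the fallback, each NumPy scalar is converted before encoding.
encoded = json.dumps(nested, default=to_builtin)
```

An alternative is to cast the values before building the dict, for example with `bool(flag)`, which avoids the custom fallback when only one column is affected.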

by omsas • New Contributor

10-15-2021 4:48:38 AM

2729 Views
2 replies
0 kudos

How to add Columns for Automatic Fill on Pandas Python

1. I have data x, and I would like to create a new column whose value is 1, 2 or 3.
2. The name of the column is SHIFT; this SHIFT column will be filled automatically if the TIME_CREATED column meets the conditions.
3. the conditi...

Latest Reply

Ryan_Chynoweth
Esteemed Contributor

10-15-2021 12:59:20 PM

0 kudos

You can do something like this in pandas. Note there could be a more performant way to do this too.

import pandas as pd
import numpy as np

df = pd.DataFrame({'a':[1,2,3,4]})
df.head()
>    a
> 0  1
> 1  2
> 2  3
> 3  4

conditions = [(df['a'] <=2...

1 More Replies
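
Both the question's conditions and the reply's code are cut off in this preview, so the following does not continue either of them. It is a minimal sketch of one way to fill a SHIFT column from TIME_CREATED with `np.select`, which picks a value per row from the first condition that matches. The shift boundaries (06:00 to 13:59 for shift 1, 14:00 to 21:59 for shift 2, everything else shift 3) and the sample timestamps are assumptions made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame(
    {
        "TIME_CREATED": pd.to_datetime(
            [
                "2021-10-15 07:30:00",
                "2021-10-15 15:00:00",
                "2021-10-15 23:10:00",
                "2021-10-15 02:00:00",
            ]
        )
    }
)

hour = df["TIME_CREATED"].dt.hour

# Assumed shift boundaries (not from the original post).
conditions = [
    (hour >= 6) & (hour < 14),   # shift 1
    (hour >= 14) & (hour < 22),  # shift 2
]
choices = [1, 2]

# Rows matching neither condition fall through to the default, shift 3.
df["SHIFT"] = np.select(conditions, choices, default=3)
```

Because `np.select` works on whole columns at once, it avoids a row-by-row `apply` and keeps the rules in one readable list.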

Databricks Community