Data Engineering

Forum Posts

by lenonlmsv • New Contributor II

01-17-2023 4:58:56 AM

1014 Views
3 replies
0 kudos

Query API Result

Hi, I'm new here. I currently need to read the results of a query in Databricks. I've used the Query API to get the query definition, but so far I haven't been able to run the query and get the results. Is this possible? Thanks

Latest Reply

daniel_sahal
Esteemed Contributor

01-17-2023 5:12:36 AM

0 kudos

When using the Jobs API, you need to call dbutils.notebook.exit("returnValue") to pass the results once the notebook has finished its job (https://docs.databricks.com/notebooks/notebook-workflows.html#notebook-workflows-exit). Then you can get notebook_...
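
A minimal sketch of the client side of this approach, assuming the notebook runs as a job and ends with dbutils.notebook.exit(...). It relies on the Jobs API 2.1 `runs/get-output` endpoint, whose response carries the exit value under `notebook_output.result`. The host, token, and run ID are placeholders, and the helper names are hypothetical, not part of any Databricks SDK.

```python
import json
import urllib.parse
import urllib.request


def build_get_output_request(host: str, token: str, run_id: int) -> urllib.request.Request:
    """Build the GET request for the Jobs API runs/get-output endpoint."""
    query = urllib.parse.urlencode({"run_id": run_id})
    url = f"{host.rstrip('/')}/api/2.1/jobs/runs/get-output?{query}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def extract_exit_value(response_body: dict):
    """Return the string passed to dbutils.notebook.exit(), or None if absent."""
    return response_body.get("notebook_output", {}).get("result")


def fetch_exit_value(host: str, token: str, run_id: int):
    """Call the endpoint and return the notebook's exit value (needs network access)."""
    request = build_get_output_request(host, token, run_id)
    with urllib.request.urlopen(request) as response:
        return extract_exit_value(json.load(response))
```

Because the exit value is a single string, a notebook that needs to return several fields would typically serialize them (for example with json.dumps) before calling dbutils.notebook.exit.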

by databicky • Contributor II

01-16-2023 5:29:49 PM

2503 Views
6 replies
1 kudos

Resolved! How to check a DataFrame column value

My DataFrame has a column named count. If any value in that column is greater than zero, the job needs to fail. How can I do that?

Latest Reply

Hubert-Dudek
Esteemed Contributor III

01-17-2023 1:44:03 AM

1 kudos

Code without collect, which should not be used in production:
if df.filter("count > 0").count() > 0: dbutils.notebook.exit('Notebook Failed')
You can also use a more aggressive version:
if df.filter("count > 0").count() > 0: raise Exception("count bigge...
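
The check in this reply can be wrapped in a small helper, sketched below under a few assumptions: `df` is a Spark DataFrame (or anything exposing the same filter/limit/count methods), and the function name is hypothetical. Raising an exception is the variant that marks the job run as failed; as far as I know, dbutils.notebook.exit ends the run as successful.

```python
def fail_if_any_positive(df, column: str = "count") -> None:
    """Raise if at least one row has a value greater than zero in `column`."""
    # Backticks keep a column called `count` from being read as the SQL function.
    # limit(1) lets Spark stop after the first matching row instead of counting all of them.
    matches = df.filter(f"`{column}` > 0").limit(1).count()
    if matches > 0:
        raise ValueError(f"Column '{column}' contains a value greater than zero")
```

Calling fail_if_any_positive(df) as the last step of the notebook leaves the run untouched when no row matches and fails it otherwise.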

by 151640 • New Contributor III

11-02-2022 9:45:21 AM

1476 Views
5 replies
3 kudos

Resolved! Is there a known issue regarding Databricks JDBC driver character values such as Japanese etc?

A Parquet file contains character data in various languages, which the Data Explorer UX displays correctly. A simple "select *" query using the Databricks JDBC driver (version 2.6.29) with a tool such as SQuirreL SQL displays invalid characters.

Latest Reply

151640
New Contributor III

01-17-2023 4:15:24 AM

3 kudos

The issue encountered has been confirmed to be a defect in the Databricks JDBC driver.

by JD410993 • New Contributor II

01-16-2023 11:26:46 PM

1313 Views
3 replies
2 kudos

Job runs indefinitely after integrating with PyDeequ

I'm using PyDeequ data quality checks in one of our jobs. After adding this check, I noticed that the job does not complete and keeps running indefinitely after the PyDeequ checks are completed and results are returned. As stated in the PyDeequ documentation ...

Latest Reply

-werners-
Esteemed Contributor III

01-17-2023 2:26:58 AM

2 kudos

Hm, deequ certainly works, as I have read about multiple people using it. And when reading the issues (open/closed) on the GitHub pages of pydeequ, Databricks is mentioned in some of them, so it might be possible after all. But I think you need to check y...

by KVNARK • Honored Contributor II

01-16-2023 10:25:38 PM

1612 Views
4 replies
6 kudos

Resolved! How to parameterize key of spark config in the job cluster linked service from ADF

How can we parameterize the key of the spark-config in the job cluster linked service from Azure Data Factory? We can parameterize the values, but any idea how we can parameterize the key so that when deploying to further environments it takes the PROD/QA v...

Latest Reply

daniel_sahal
Esteemed Contributor

01-17-2023 12:07:20 AM

6 kudos

@KVNARK, you can use Databricks Secrets (create a Secret scope from AKV https://learn.microsoft.com/en-us/azure/databricks/security/secrets/secret-scopes) and then reference a secret in spark configuration (https://learn.microsoft.com/en-us/azure/d...
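
For illustration only, here is a sketch of what a secret reference in a Spark configuration value looks like, using the documented {{secrets/<scope-name>/<secret-name>}} syntax. The scope name, secret name, storage account naming convention, and both helper functions are hypothetical; whether the config key itself can be parameterized from ADF is not something this sketch settles.

```python
def secret_ref(scope: str, key: str) -> str:
    """Format a Spark config value that Databricks resolves from a secret scope."""
    return "{{secrets/" + scope + "/" + key + "}}"


def build_spark_conf(env: str) -> dict:
    """Build a per-environment Spark config (all names here are made up)."""
    env = env.lower()
    account = f"mystorage{env}"  # hypothetical storage account naming convention
    return {
        f"fs.azure.account.key.{account}.dfs.core.windows.net": secret_ref(f"{env}-scope", "storage-key"),
    }
```

With this shape, the secret value never appears in the linked service definition; only the reference string does.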

by Orianh • Valued Contributor II

01-10-2023 8:05:09 AM

2341 Views
3 replies
1 kudos

Resolved! Attach instance profile to service principal.

Hey guys, I'm having some permission issues using a service principal and an instance profile, and I hope you can help me. I created a service principal and attached an instance profile to it: databricks-my-profile. I have an S3 bucket with a policy that all...

Latest Reply

Orianh
Valued Contributor II

01-17-2023 1:40:55 AM

1 kudos

Hey @Kaniz Fatma, @Debayan Mukherjee, thanks for your answers. Actually, Databricks does not support using the DBFS API with a service principal and an attached instance profile on a mounted S3 bucket. I'm not sure if this is in the docs (I might have missed it), but thi...

by chanansh • Contributor

01-11-2023 9:42:02 AM

4113 Views
3 replies
0 kudos

Relative path in absolute URI when reading a folder with files containing ":" colons in filename

I am trying to read a folder with partition files where each partition is date/hour/timestamp.csv, where timestamp is the exact timestamp in ISO format, e.g. 09-2022-12-05T20:35:15.2786966Z. It seems that Spark has issues reading files with col...

Latest Reply

Kaniz
Community Manager

01-16-2023 3:22:29 AM

0 kudos

Hi @Hanan Shteingart (Customer), we haven’t heard from you since the last response from @Debayan Mukherjee (Customer), and I was checking back to see if his suggestions helped you. Otherwise, if you have a solution, please share it with the co...

by databicky • Contributor II

01-10-2023 4:34:21 AM

654 Views
2 replies
1 kudos

How to add a title to an Excel sheet with Python

I want to write a title built from some combination of rows in a pandas DataFrame and write it into an Excel sheet. I tried some methods, but I get the error that the Styler object is not subscriptable.

Latest Reply

Kaniz
Community Manager

01-17-2023 1:32:07 AM

1 kudos

Hi @Mohammed sadamusean (Customer), we haven’t heard from you since the last response from @Ratna Chaitanya Raju Bandaru, and I was checking back to see if his suggestions helped you. Otherwise, if you have a solution, please share it with the com...

by tariq • New Contributor III

10-25-2022 10:43:55 PM

5605 Views
5 replies
7 kudos

Databricks Azure Blob Storage access

I am trying to access files stored in Azure Blob Storage and have followed the documentation linked below: https://docs.databricks.com/external-data/azure-storage.html I was successful in mounting the Azure Blob Storage on DBFS, but it seems that the me...

Latest Reply

Debayan
Esteemed Contributor III

01-15-2023 8:35:08 PM

7 kudos

Hi @Ravindra Ch, could you please check the firewall settings in Azure networking?

by wim_schmitz_per • New Contributor II

01-03-2023 9:53:04 AM

1766 Views
2 replies
2 kudos

Transforming/Saving Python Class Instances to Delta Rows

I'm trying to reuse a Python package to do a very complex series of steps parsing binary files into workable data in Delta format. I have made the first part (binary file parsing) work with a UDF: asffileparser = F.udf(File()._parseBytes,AsfFileDelta.getSch...

Latest Reply

Debayan
Esteemed Contributor III

01-05-2023 9:45:32 PM

2 kudos

Hi, did you try to follow "Fix it by registering a custom IObjectConstructor for this class."? Also, could you please provide us with the full error?

by ramravi • Contributor II

01-12-2023 12:38:58 AM

1493 Views
1 reply
0 kudos

Unable to connect to databricks cluster from Windows using databricks-connect

I am trying to set up databricks-connect on my Windows machine. While running databricks-connect test, I get the error below complaining that the Java certificate is not found: "Caused by: sun.security.validator.ValidatorException: PKIX path building fail...

Latest Reply

ramravi
Contributor II

01-16-2023 10:39:45 PM

0 kudos

Adding the certificate from the root level worked for me. This problem is solved.

by dotan • New Contributor II

11-02-2022 11:00:08 AM

853 Views
4 replies
2 kudos

Poor Auto Loader performance with CSV files on S3

I set up a notebook to ingest data using Auto Loader from an S3 bucket that contains over 500K CSV files into a Hive table. Recently the number of rows (and input files) in the table grew from around 150M to 530M, and now each batch takes around an hour...

Latest Reply

Anonymous
Not applicable

01-16-2023 10:10:40 PM

2 kudos

Hi @Dotan Schachter, hope all is well! Just wanted to check in to see if you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Th...

by SQL_DB • New Contributor II

11-01-2022 3:19:56 PM

1212 Views
2 replies
2 kudos

Sharing CSV export from a dashboard

Is it possible to schedule a refresh and share a CSV export of a table visual in a dashboard? Also, is it possible to share only one visual from a dashboard that contains more than one?

Latest Reply

Anonymous
Not applicable

01-16-2023 9:54:01 PM

2 kudos

Hi @Sujitha Bommayan, hope everything is going great. Does @Kaniz Fatma's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

by Abhijeet • New Contributor III

01-07-2023 6:01:59 AM

1608 Views
5 replies
5 kudos

How to Read Terabytes of data in Databricks

I want to read 1000 GB of data. Since Spark does transformations in memory, do I need worker nodes with a combined memory of 1000 GB? Also, I want to understand whether reading stores all 1000 GB in memory. So how is a cached DataFrame different from the a...
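
As a rough back-of-the-envelope illustration of the sizing question, the sketch below assumes file-based input split at Spark's default spark.sql.files.maxPartitionBytes of 128 MB. Spark processes the input partition by partition, so the raw input held by running tasks at any moment depends on how many tasks run concurrently, not on the total dataset size. The function names and the 64-core cluster are hypothetical, and the numbers ignore decompression, shuffles, and caching, which all need additional memory.

```python
DEFAULT_MAX_PARTITION_BYTES = 128 * 1024 * 1024  # Spark default for spark.sql.files.maxPartitionBytes


def estimate_input_partitions(total_bytes: int, max_partition_bytes: int = DEFAULT_MAX_PARTITION_BYTES) -> int:
    """Approximate number of input partitions, rounding up with integer arithmetic."""
    return -(-total_bytes // max_partition_bytes)


def bytes_in_flight(concurrent_tasks: int, max_partition_bytes: int = DEFAULT_MAX_PARTITION_BYTES) -> int:
    """Approximate raw input being read at once when every task slot is busy."""
    return concurrent_tasks * max_partition_bytes


one_thousand_gib = 1000 * 1024**3
print(estimate_input_partitions(one_thousand_gib))  # 8000 partitions of 128 MiB
print(bytes_in_flight(64) / 1024**3)                # 8.0 GiB in flight on 64 cores
```

Caching is the case where the memory question changes: a cached DataFrame is kept across actions, so its footprint scales with the data that is actually cached.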

Latest Reply

Ajay-Pandey
Esteemed Contributor III

01-07-2023 11:24:59 PM

5 kudos

Hi @Abhijeet Singh, the blog below might help you: Link

by Ulf • New Contributor II

01-16-2023 2:40:12 AM

698 Views
1 reply
0 kudos

Github and task integration

I have the same problem as described in this post (https://community.databricks.com/s/question/0D58Y00009ObQgdSAF/running-jobs-using-notebooks-in-a-remote-azure-devops-services-repos-git-repository-is-generating-notebook-not-found-error) and get this...

Latest Reply

Debayan
Esteemed Contributor III

01-16-2023 8:30:20 PM

0 kudos

Hi, could you please check and let us know if this helps? https://community.databricks.com/s/question/0D53f00001GHVTNCA5/notebook-path-cant-be-in-dbfs
