Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

bonyfus
by New Contributor II
  • 2412 Views
  • 3 replies
  • 0 kudos

Error when accessing the file from azure blob storage

I am getting the following error when accessing the file in Azure Blob Storage: java.io.FileNotFoundException: File /10433893690638/mnt/22200/22200Ver1.sps does not exist. Code: ves_blob = dbutils.widgets.get("ves_blob") try: dbutils.fs.ls(ves_blob) e...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

That is certainly an invalid path, as the error shows. With %fs ls /mnt you can show the directory structure of the /mnt directory, assuming the blob storage is mounted. If not, you need to define the access (URL etc.).
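
The check suggested in this reply can be sketched as a small helper. This is a minimal sketch, assuming it runs in a Databricks notebook where `dbutils` is available; the name `path_exists` is hypothetical, and the helper takes the `dbutils.fs` object as an argument so it can also be exercised outside a notebook.

```python
def path_exists(fs, path):
    """Return True if `fs.ls(path)` succeeds, False if the path is missing.

    `fs` is expected to be `dbutils.fs` (any object with an `ls` method works).
    Assumption: a missing path raises an exception whose message mentions
    FileNotFoundException, as in the error quoted in the question.
    Any other exception is re-raised unchanged.
    """
    try:
        fs.ls(path)
        return True
    except Exception as exc:
        if "FileNotFoundException" in str(exc):
            return False
        raise
```

In a notebook this would be called as `path_exists(dbutils.fs, "/mnt")` before trying to read anything below the mount point.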

  • 0 kudos
2 More Replies
lenonlmsv
by New Contributor II
  • 1639 Views
  • 3 replies
  • 0 kudos

Query API Result

Hi, I'm new here. Currently I have to read information from a query in Databricks. I've used the query API to get the query definition, but so far I'm not able to run the query and get the results. Is it possible? Thanks

Latest Reply
daniel_sahal
Esteemed Contributor
  • 0 kudos

When using the Jobs API you need to call dbutils.notebook.exit("returnValue") to pass the results once the notebook has finished its job (https://docs.databricks.com/notebooks/notebook-workflows.html#notebook-workflows-exit). Then you can get notebook_...
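
A minimal sketch of how the exit value might be read back, assuming the Jobs API 2.1 `runs/get-output` endpoint and a personal access token. The helper names, the `host`, `token`, and `run_id` parameters are hypothetical, and the reply above is truncated, so this is not necessarily the exact call the author had in mind.

```python
import json
import urllib.request


def build_get_output_request(host, token, run_id):
    """Build a GET request for the Jobs API `runs/get-output` endpoint."""
    url = f"{host.rstrip('/')}/api/2.1/jobs/runs/get-output?run_id={run_id}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def extract_notebook_result(payload):
    """Return the string passed to dbutils.notebook.exit, or None if absent."""
    return payload.get("notebook_output", {}).get("result")


def fetch_notebook_result(host, token, run_id):
    """Call the endpoint and return the notebook's exit value."""
    request = build_get_output_request(host, token, run_id)
    with urllib.request.urlopen(request) as response:
        return extract_notebook_result(json.load(response))
```

For a job with several tasks, `run_id` would presumably have to be the ID of the individual notebook task run rather than of the parent job run.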

  • 0 kudos
2 More Replies
databicky
by Contributor II
  • 4911 Views
  • 6 replies
  • 1 kudos

Resolved! how to check dataframe column value

My dataframe has a column named count. If that column's value is greater than zero, the job needs to fail. How can I do that?

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

Code without collect (which should not be used in production): if df.filter("count > 0").count() > 0: dbutils.notebook.exit('Notebook Failed') You can also use a more aggressive version: if df.filter("count > 0").count() > 0: raise Exception("count bigge...
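
The "more aggressive" variant can be wrapped in a reusable helper. This is a sketch under assumptions: `fail_if_any` is a hypothetical name, `df` is expected to be a PySpark DataFrame, and the `limit(1)` call is an addition not present in the reply, intended to let Spark stop after the first matching row.

```python
def fail_if_any(df, condition, message):
    """Raise ValueError if at least one row of `df` satisfies `condition`.

    Only `filter`, `limit` and `count` are called on `df`, so no rows are
    collected to the driver.
    """
    if df.filter(condition).limit(1).count() > 0:
        raise ValueError(message)
```

Usage would look like `fail_if_any(df, "count > 0", "count is greater than zero")`. Raising an exception marks the run as failed, whereas `dbutils.notebook.exit` ends the notebook without an error.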

  • 1 kudos
5 More Replies
151640
by New Contributor III
  • 2715 Views
  • 5 replies
  • 3 kudos

Resolved! Is there a known issue regarding Databricks JDBC driver character values such as Japanese etc?

A Parquet file contains character data for various languages and is shown by the Data Explorer UX. A simple "select *" query using the Databricks JDBC driver (version 2.6.29) with a tool such as SQuirreL SQL displays invalid characters.

Latest Reply
151640
New Contributor III
  • 3 kudos

The issue encountered has been confirmed to be a defect in the Databricks JDBC driver.

  • 3 kudos
4 More Replies
JD410993
by New Contributor II
  • 2000 Views
  • 3 replies
  • 2 kudos

Job runs indefinitely after integrating with PyDeequ

I'm using PyDeequ data quality checks in one of our jobs. After adding this check, I noticed that the job does not complete and keeps running indefinitely after the PyDeequ checks are completed and results are returned. As stated in the PyDeequ documentation ...

Latest Reply
-werners-
Esteemed Contributor III
  • 2 kudos

Hm, deequ certainly works, as I have read about multiple people using it. And when reading the issues (open/closed) on the GitHub pages of pydeequ, Databricks is mentioned in some issues, so it might be possible after all. But I think you need to check y...

  • 2 kudos
2 More Replies
Orianh
by Valued Contributor II
  • 3497 Views
  • 3 replies
  • 1 kudos

Resolved! Attach instance profile to service principal.

Hey guys, I'm having some permission issues using a service principal and an instance profile, and I hope you can help me. I created a service principal and attached an instance profile to it: databricks-my-profile. I have an S3 bucket with a policy that all...

Latest Reply
Orianh
Valued Contributor II
  • 1 kudos

Hey @Kaniz Fatma, @Debayan Mukherjee, thanks for your answers. Actually, Databricks does not support using the DBFS API with a service principal and attached instance profile on a mounted S3 bucket. I'm not sure if this exists in the docs (I might have missed it), but thi...

  • 1 kudos
2 More Replies
chanansh
by Contributor
  • 6283 Views
  • 3 replies
  • 0 kudos

Relative path in absolute URI when reading a folder with files containing ":" colons in filename

I am trying to read a folder with partition files where each partition is date/hour/timestamp.csv, where timestamp is the exact timestamp in ISO format, e.g. 09-2022-12-05T20:35:15.2786966Z. It seems like Spark is having issues with reading files with col...
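
One workaround sometimes used for this class of error is to rename the files so that their names no longer contain colons before pointing Spark at the folder. The sketch below only computes the new names; the helper names and the `-` replacement character are assumptions, and the thread itself (truncated here) may have settled on a different fix.

```python
def sanitize_filename(name, replacement="-"):
    """Replace every colon in a file name with `replacement`."""
    return name.replace(":", replacement)


def plan_renames(names, replacement="-"):
    """Map each name that contains a colon to its colon-free replacement."""
    return {n: sanitize_filename(n, replacement) for n in names if ":" in n}
```

The resulting mapping could then be applied with whatever rename operation the storage layer offers.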

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Hanan Shteingart, we haven’t heard from you since the last response from @Debayan Mukherjee, and I was checking back to see if his suggestions helped you. Or else, if you have any solution, please share it with the co...

  • 0 kudos
2 More Replies
databicky
by Contributor II
  • 1186 Views
  • 2 replies
  • 1 kudos

How to add a title to an Excel sheet with Python

I want to write a title with some combination of rows in a pandas df and write it into an Excel sheet. I tried some methods, but I get an error that the Styler object is not subscriptable.

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @Mohammed sadamusean, we haven’t heard from you since the last response from @Ratna Chaitanya Raju Bandaru, and I was checking back to see if his suggestions helped you. Or else, if you have any solution, please share it with the com...

  • 1 kudos
1 More Replies
tariq
by New Contributor III
  • 8172 Views
  • 5 replies
  • 7 kudos

Databricks Azure Blob Storage access

I am trying to access files stored in Azure Blob Storage and have followed the documentation linked below: https://docs.databricks.com/external-data/azure-storage.html I was successful in mounting the Azure blob storage on DBFS, but it seems that the me...

Latest Reply
Debayan
Esteemed Contributor III
  • 7 kudos

Hi @Ravindra Ch, could you please check the firewall settings in Azure networking?

  • 7 kudos
4 More Replies
wim_schmitz_per
by New Contributor II
  • 3201 Views
  • 2 replies
  • 2 kudos

Transforming/Saving Python Class Instances to Delta Rows

I'm trying to reuse a Python package to do a very complex series of steps that parse binary files into workable data in Delta format. I have made the first part (binary file parsing) work with a UDF: asffileparser = F.udf(File()._parseBytes,AsfFileDelta.getSch...

Latest Reply
Debayan
Esteemed Contributor III
  • 2 kudos

Hi, did you try to follow "Fix it by registering a custom IObjectConstructor for this class."? Also, could you please provide us with the full error?

  • 2 kudos
1 More Replies
ramravi
by Contributor II
  • 2099 Views
  • 1 reply
  • 0 kudos

Unable to connect to databricks cluster from Windows using databricks-connect

I am trying to set up databricks-connect on my Windows machine. While running databricks-connect test, I get the error below complaining that a Java certificate is not found: Caused by: sun.security.validator.ValidatorException: PKIX path building fail...

Latest Reply
ramravi
Contributor II
  • 0 kudos

Adding the certificate from the root level worked for me. This problem is solved.

  • 0 kudos
dotan
by New Contributor II
  • 1617 Views
  • 4 replies
  • 2 kudos

Poor Auto Loader performance with CSV files on S3

I set up a notebook to ingest data using Auto Loader from an S3 bucket that contains over 500K CSV files into a Hive table. Recently the number of rows (and input files) in the table grew from around 150M to 530M, and now each batch takes around an hour...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Dotan Schachter, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Th...

  • 2 kudos
3 More Replies
SQL_DB
by New Contributor II
  • 1998 Views
  • 2 replies
  • 2 kudos

Sharing CSV export from a dashboard

Is it possible to schedule a refresh and share a CSV export of a table visual in a dashboard? Also, is it possible to share only one visual in a dashboard when there is more than one?

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Sujitha Bommayan, hope everything is going great. Does @Kaniz Fatma's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

  • 2 kudos
1 More Replies
Abhijeet
by New Contributor III
  • 3038 Views
  • 5 replies
  • 5 kudos

How to Read Terabytes of data in Databricks

I want to read 1000 GB of data. Since Spark does in-memory transformations, do I need worker nodes with a combined memory of 1000 GB? Also, I want to understand whether reading will store all 1000 GB in memory. So how is a cached DataFrame different from the a...
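
A rough back-of-the-envelope calculation can make the question concrete. Spark reads file-based input in partitions and processes them in tasks, so the whole dataset does not have to sit in memory at once unless it is explicitly cached. The sketch below assumes splittable files and the default `spark.sql.files.maxPartitionBytes` of 128 MB; the function names and the 64-core cluster are hypothetical, and the real partition count also depends on file sizes and formats.

```python
DEFAULT_MAX_PARTITION_BYTES = 128 * 1024**2  # default spark.sql.files.maxPartitionBytes


def estimate_input_partitions(total_bytes, max_partition_bytes=DEFAULT_MAX_PARTITION_BYTES):
    """Approximate number of input partitions (ceiling division)."""
    return (total_bytes + max_partition_bytes - 1) // max_partition_bytes


def estimate_waves(partitions, total_cores):
    """Approximate number of task waves needed to process all partitions."""
    return (partitions + total_cores - 1) // total_cores


partitions = estimate_input_partitions(1000 * 1024**3)  # 1000 GB of input
waves = estimate_waves(partitions, total_cores=64)       # hypothetical 64-core cluster
print(partitions, waves)
```

Under these assumptions, 1000 GB splits into 8000 partitions, which a 64-core cluster would work through in 125 waves of tasks, each wave holding only its own partitions in memory.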

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 5 kudos

Hi @Abhijeet Singh, the blog below might help you: Link

  • 5 kudos
4 More Replies
Ulf
by New Contributor II
  • 1052 Views
  • 1 reply
  • 0 kudos

Github and task integration

I have the same problem as described in this post (https://community.databricks.com/s/question/0D58Y00009ObQgdSAF/running-jobs-using-notebooks-in-a-remote-azure-devops-services-repos-git-repository-is-generating-notebook-not-found-error) and get this...

Latest Reply
Debayan
Esteemed Contributor III
  • 0 kudos

Hi, could you please check and let us know if this helps: https://community.databricks.com/s/question/0D53f00001GHVTNCA5/notebook-path-cant-be-in-dbfs

  • 0 kudos
