Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Vibhor
by Contributor
  • 5271 Views
  • 5 replies
  • 2 kudos

Resolved! Databricks Data Type Conversion error

In Databricks, while writing data to the curated layer, I see this error: Failed to execute user defined function (Double => decimal(38,18)). Has anyone faced this issue, and does anyone know how to resolve it?

Latest Reply
-werners-
Esteemed Contributor III
  • 2 kudos

What happens if you explicitly cast it? I remember having such issues with implicit casting when going from Spark 2.x to 3.x, but these were solved by using explicit casting (not round()).

  • 2 kudos
4 More Replies
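
The thread does not show the offending data, so what follows is only a diagnostic sketch under an assumption: that the failure comes from values that cannot be represented as decimal(38,18), which leaves 20 digits for the integer part and has no representation for NaN or Infinity. The helper name `fits_decimal_38_18` is hypothetical, and this is plain Python rather than Spark. In Spark, the explicit cast mentioned in the reply would typically be written as something like `col("x").cast("decimal(38,18)")`.

```python
import math
from decimal import Decimal

PRECISION = 38
SCALE = 18
# decimal(38,18) keeps 18 fractional digits, leaving 20 for the integer part.
MAX_INTEGER_DIGITS = PRECISION - SCALE


def fits_decimal_38_18(value: float) -> bool:
    """Return True if a double fits in decimal(38,18) without overflow.

    Hypothetical helper for inspecting sample values; not part of Spark.
    """
    # NaN and +/-Infinity have no decimal representation at all.
    if math.isnan(value) or math.isinf(value):
        return False
    # Decimal(float) converts exactly, so the comparison is not affected
    # by string formatting or rounding.
    return abs(Decimal(value)) < Decimal(10) ** MAX_INTEGER_DIGITS


if __name__ == "__main__":
    for sample in (123.456, 1e19, 1e20, float("nan")):
        print(sample, fits_decimal_38_18(sample))
```

Running a check like this over a sample of the double column may show whether specific rows are the problem before changing the cast.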
Anonymous
by Not applicable
  • 750 Views
  • 1 replies
  • 2 kudos

The Next Databricks Office Hours: Our next Office Hours session is scheduled for March 23 2022 - 8:00 am PDT. Do you have questions about how to set up o...

The Next Databricks Office Hours: Our next Office Hours session is scheduled for March 23 2022 - 8:00 am PDT. Do you have questions about how to set up or use Databricks? Do you want to get best practices for deploying your use case or tips on data archi...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

Signed in!

  • 2 kudos
bchaubey
by Contributor II
  • 1605 Views
  • 1 replies
  • 0 kudos
Latest Reply
User16764241763
Honored Contributor
  • 0 kudos

@Bhagwan Chaubey Maybe you can give this a try, if this is a Blob Storage account: https://docs.microsoft.com/en-us/azure/storage/blobs/storage-quickstart-blobs-python?tabs=environment-variable-windows For Data Lake storage, please try the following: https://do...

  • 0 kudos
Santosh09
by New Contributor II
  • 6114 Views
  • 4 replies
  • 3 kudos

Resolved! Writing a Spark data frame to ADLS takes a huge amount of time when the data frame contains text data.

When a Spark data frame holds text data and its schema is a struct type, Spark takes too long to write, save, or push the data to ADLS or SQL DB, or to download it as CSV.

Latest Reply
User16764241763
Honored Contributor
  • 3 kudos

@shiva Santosh Have you checked the count of the data frame that you are trying to save to ADLS? As @Joseph Kambourakis mentioned, the explode can turn one row into many, so it is better to check the data frame count and see whether Spark runs out of memory (OOM) in the workspace.

  • 3 kudos
3 More Replies
pawelmitrus
by Contributor
  • 1849 Views
  • 1 replies
  • 2 kudos


Do I always need to manually invite members of my AAD tenant to the ADB workspace if I don't have SCIM integration configured? EDIT: solved, it works when you go through the Azure Portal and get in with the "Launch Workspace" button on the ADB resource overview ...

Latest Reply
User16764241763
Honored Contributor
  • 2 kudos

Hello @pawelmitrus Users with the Owner or Contributor role should click the "Launch Workspace" button in the Azure portal. Other users must be explicitly granted access to the workspace to be able to log in. Regards, Arvind

  • 2 kudos
rachelk05
by New Contributor II
  • 1996 Views
  • 1 replies
  • 4 kudos

Resolved! Databricks Community: Cluster Terminated Reason: Unexpected Launch Failure

Hi, I've been encountering the following error when I try to start a cluster, but the status page says everything is fine. Is something happening or are there other steps I can try? Time: 2022-03-13 14:40:51 EDT. Message: Cluster terminated. Reason: Unexpected...

Latest Reply
User16753724663
Valued Contributor
  • 4 kudos

Hi @Rachel Kelley We had some internal service interruptions, which caused this issue. Our engineering team has applied a fix, and cluster startup now works as expected. Sincere apologies for the inconvenience caused here. Regards, Darshan

  • 4 kudos
Anonymous
by Not applicable
  • 5678 Views
  • 2 replies
  • 0 kudos

How to read a compressed file in spark if the filename does not include the file extension for that compression format?

For example, let's say I have a file called some-file, which is a gzipped text file. If I try spark.read.text('some-file'), it will return a bunch of gibberish since it doesn't know that the file is gzipped. I'm looking to manually tell spark the fil...

Latest Reply
Francie
New Contributor II
  • 0 kudos

The community is field for the approval of the terms. The struggle of a great site is recommend for the norms. The value is suggested for the top of the vital paths for the finding members.

  • 0 kudos
1 More Replies
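
The replies visible here do not give a working answer, so the following is only a sketch of one possible approach, not a confirmed solution from the thread: decide whether content is gzipped from its first two bytes (the gzip magic number) instead of from the filename. The function name `read_maybe_gzipped` is hypothetical and the code is plain Python. In Spark, a function like this could be applied per file to raw bytes, for example those returned by `sparkContext.binaryFiles`, though that wiring is an assumption and is not shown.

```python
import gzip

# Every gzip stream starts with these two bytes, regardless of filename.
GZIP_MAGIC = b"\x1f\x8b"


def read_maybe_gzipped(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode raw file bytes as text, decompressing first if they are gzipped.

    Hypothetical helper: detection relies on content, not on a file extension.
    """
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return raw.decode(encoding)


if __name__ == "__main__":
    # Simulate a gzipped text file named "some-file" with no extension.
    payload = gzip.compress("line one\nline two\n".encode("utf-8"))
    print(read_maybe_gzipped(payload).splitlines())
```

This reads each file fully into memory, so it suits many small files better than a few very large ones.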
Doaa_Rashad
by New Contributor III
  • 4092 Views
  • 4 replies
  • 3 kudos

Resolved! databricks cli

I installed databricks, but I get an error that databricks is not recognized.

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

It is probably not in PATH. You can add it to PATH as described here: https://ganeshchandrasekaran.com/how-to-install-databricks-cli-and-get-the-path-of-databricks-executable-on-windows-74f83040dde7

  • 3 kudos
3 More Replies
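
To check the PATH suggestion above before editing environment variables, a small script like the one below may help. It is a sketch using only the Python standard library; the function names are hypothetical, and it assumes the executable is called `databricks`, as in the question.

```python
import os
import shutil
from typing import List, Optional


def find_on_path(command: str) -> Optional[str]:
    """Return the full path of an executable found via PATH, or None."""
    return shutil.which(command)


def path_entries() -> List[str]:
    """Return the directories currently listed in the PATH variable."""
    return [p for p in os.environ.get("PATH", "").split(os.pathsep) if p]


if __name__ == "__main__":
    location = find_on_path("databricks")
    if location is None:
        print("databricks was not found in any of these PATH directories:")
        for entry in path_entries():
            print("  ", entry)
    else:
        print("databricks found at", location)
```

If the command is not found, the directory that holds the installed executable needs to be added to PATH, and the terminal usually has to be reopened for the change to take effect.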
rajib76
by New Contributor II
  • 2318 Views
  • 1 replies
  • 2 kudos

Resolved! DBFS with Google Cloud Storage(GCS)

Does DBFS support GCS?

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

Yes, you just need to create a service account for Databricks and then assign the Storage Admin role on the bucket. After that you can mount GCS the standard way: bucket_name = "<bucket-name>" mount_name = "<mount-name>" dbutils.fs.mount("gs://%s" % bucket_name, "/m...

  • 2 kudos
gbrueckl
by Contributor II
  • 13687 Views
  • 9 replies
  • 2 kudos

Setup Git Integration via REST API

We are currently setting up CI/CD for our Databricks workspace using Databricks Repos, following the approach described in the official docs: https://docs.databricks.com/repos.html#best-practices-for-integrating-databricks-repos-with-cicd-workflows Obvi...

Latest Reply
New1
New Contributor II
  • 2 kudos

Hi, how can I trigger a job externally using GitHub Actions?

  • 2 kudos
8 More Replies
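
No answer to the last question is visible in this listing. One possible approach, offered here as an assumption and not as something from the thread, is to have a workflow step run a small script that calls the Databricks Jobs REST API `run-now` endpoint. The sketch below only builds the HTTP request. The workspace URL, token, and job ID are placeholders, and the function name is hypothetical.

```python
import json
import urllib.request


def build_run_now_request(host: str, token: str, job_id: int) -> urllib.request.Request:
    """Build a POST request for the Jobs API run-now endpoint.

    host, token and job_id are placeholders supplied by the caller,
    for example from CI secrets. Nothing is sent by this function.
    """
    url = f"{host.rstrip('/')}/api/2.1/jobs/run-now"
    body = json.dumps({"job_id": job_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values; a real run would read these from the environment.
    req = build_run_now_request("https://example.cloud.databricks.com", "<token>", 123)
    print(req.get_method(), req.full_url)
    # To actually trigger the job: urllib.request.urlopen(req)
```

In a GitHub Actions workflow, the host and token would normally come from repository secrets exposed to the step as environment variables.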
gzenz
by New Contributor II
  • 2163 Views
  • 1 replies
  • 1 kudos

Resolved! concat_ws() throws AnalysisException when too many columns are supplied

Hi, I'm using concat_ws in Scala to calculate a checksum for the dataframe, i.e.: df.withColumn("CHECKSUM", sha2(functions.concat_ws("", dataframe.columns.map(col): _*), 512)) I have one example here with just 24 columns that already throws the followin...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

At least one of the column names may contain a strange character, whitespace, or something similar, or at least one of the column types is not compatible (for example StructType). You can separate your code into two or more steps. First generate the list of columns as some variable...

  • 1 kudos
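
As a first step in the direction the reply suggests, the column names can be checked on their own before building the expression. The sketch below is plain Python and is not taken from the thread. The helper name is hypothetical, and the rule for a "plain" name (letters, digits, and underscores, not starting with a digit) is an assumption chosen for illustration.

```python
import re
from typing import List

# Assumed definition of an unproblematic column name.
_PLAIN_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def suspicious_columns(names: List[str]) -> List[str]:
    """Return column names containing whitespace, dots or other odd characters."""
    return [name for name in names if not _PLAIN_NAME.fullmatch(name)]


if __name__ == "__main__":
    # Example list; in practice this would be the dataframe's column names.
    columns = ["id", "first name", "amount$", "_ok", "a.b"]
    print(suspicious_columns(columns))
```

Any names it reports are candidates for renaming or quoting. Checking column types such as StructType still has to be done against the dataframe schema itself.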
SG_
by New Contributor II
  • 3180 Views
  • 2 replies
  • 2 kudos

Resolved! How do I change the font and color of a widget's title and the background color of the widget?

Currently there is no documentation on how to change the fonts and background color of a widget. Is there a way to do so?

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

I saw in a roadmap presentation that there will be more widget options in the future, but for now it is as @Kavya Manohar Parag said.

  • 2 kudos
1 More Replies
Anuj93
by New Contributor III
  • 2259 Views
  • 2 replies
  • 2 kudos

Resolved! A user has been deleted from a Databricks workspace. Is there any way to find out who deleted the user?

A user has been deleted from a Databricks workspace. Is there any way to find out who deleted the user?

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

To do that you need to have audit logs enabled (if the event has already happened and they were not on, I am afraid it is now too late). For Azure: https://docs.microsoft.com/en-us/azure/databricks/administration-guide/account-settings/azure-diagnostic-logs For A...

  • 2 kudos
1 More Replies
Rb29
by New Contributor
  • 745 Views
  • 0 replies
  • 0 kudos

Image Display in Dockerized Cluster

I am using a Docker recipe to configure my Databricks cluster. It works fine for everything else; however, when I try to display any image data using a Python utility such as matplotlib, PIL, or OpenCV, the image does not get displayed o...

