Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

Doaa_Rashad
by New Contributor III
  • 3720 Views
  • 4 replies
  • 3 kudos

Resolved! databricks cli

I installed the Databricks CLI, but the databricks command is not recognized.

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

It is probably not in your PATH. You can add it to PATH as described here: https://ganeshchandrasekaran.com/how-to-install-databricks-cli-and-get-the-path-of-databricks-executable-on-windows-74f83040dde7
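One quick way to confirm the PATH diagnosis is to ask Python where it would resolve the executable. This is a minimal sketch using only the standard library; the helper name find_cli is made up for illustration.

```python
import shutil


def find_cli(name="databricks"):
    """Return the full path of the executable if it is on PATH, else None."""
    # shutil.which performs the same lookup the shell does when resolving a command.
    return shutil.which(name)


if __name__ == "__main__":
    path = find_cli()
    if path is None:
        print("databricks is not on PATH; add its install directory to PATH")
    else:
        print("databricks found at", path)
```

If this prints that the command is not on PATH even though the CLI is installed, the install directory still needs to be added to PATH.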

  • 3 kudos
3 More Replies
rajib76
by New Contributor II
  • 2089 Views
  • 1 replies
  • 2 kudos

Resolved! DBFS with Google Cloud Storage(GCS)

Does DBFS support GCS?

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

Yes. You just need to create a service account for Databricks and then assign the Storage Admin role on the bucket. After that you can mount GCS the standard way:
bucket_name = "<bucket-name>"
mount_name = "<mount-name>"
dbutils.fs.mount("gs://%s" % bucket_name, "/m...

  • 2 kudos
gbrueckl
by Contributor II
  • 12683 Views
  • 9 replies
  • 2 kudos

Setup Git Integration via REST API

We are currently setting up CI/CD for our Databricks workspace using Databricks Repos, following the approach described in the official docs: https://docs.databricks.com/repos.html#best-practices-for-integrating-databricks-repos-with-cicd-workflows Obvi...

Latest Reply
New1
New Contributor II
  • 2 kudos

Hi, how can I trigger a job externally using GitHub Actions?

  • 2 kudos
8 More Replies
gzenz
by New Contributor II
  • 1962 Views
  • 1 replies
  • 1 kudos

Resolved! concat_ws() throws AnalysisException when too many columns are supplied

Hi, I'm using concat_ws in Scala to calculate a checksum for the DataFrame, i.e.:
df.withColumn("CHECKSUM", sha2(functions.concat_ws("", dataframe.columns.map(col): _*), 512))
I have one example here with just 24 columns that already throws the followin...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

At least one of the column names may contain a strange character, whitespace, or something similar, or at least one of the column types is not compatible (for example StructType). You can split your code into two or more steps. First generate the list of columns as some variable...

  • 1 kudos
SG_
by New Contributor II
  • 2597 Views
  • 2 replies
  • 2 kudos

Resolved! How do I change the font and color of a widget's title and the background color of the widget?

Currently there is no documentation on how to change the font and background color of a widget. Is there a way to do so?

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

I saw in a roadmap presentation that there will be more widget options in the future, but for now it is as @Kavya Manohar Parag said.

  • 2 kudos
1 More Replies
Anuj93
by New Contributor III
  • 2068 Views
  • 2 replies
  • 2 kudos

Resolved! A user has been deleted from a Databricks workspace. Is there any way to find out who deleted the user?

A user has been deleted from a Databricks workspace. Is there any way to find out who deleted the user?

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

To do that you need to have audit logs enabled (if the event already happened and they were not on, I am afraid it is now too late). For Azure: https://docs.microsoft.com/en-us/azure/databricks/administration-guide/account-settings/azure-diagnostic-logs For A...

  • 2 kudos
1 More Replies
Rb29
by New Contributor
  • 677 Views
  • 0 replies
  • 0 kudos

Image Display in Dockerized Cluster

I am using a Docker recipe to configure my Databricks cluster. It works fine for everything else; however, when I try to display image data using a Python utility such as matplotlib, PIL, or OpenCV, the image does not get displayed o...

AmanSehgal
by Honored Contributor III
  • 5524 Views
  • 4 replies
  • 15 kudos

Resolved! What's the best way to run a Databricks notebook from AWS Lambda?

I have a Lambda function that is triggered when a new file arrives in S3. I want this file to be processed straightaway by a notebook that upserts all the data into a Delta table. I'm looking for a solution with minimum latency.

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 15 kudos

There are two possible solutions: Auto Loader (cloudFiles), preferably with a "File notification" queue to avoid unnecessary scans, OR sending a POST request from Lambda to /api/2.1/jobs/run-now. Additionally, in both solutions it is important to have private link and...
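The second option (a POST to /api/2.1/jobs/run-now from Lambda) could look roughly like the sketch below. It uses only the Python standard library. The environment variable names DATABRICKS_HOST, DATABRICKS_TOKEN and DATABRICKS_JOB_ID, the helper build_run_now_request, and the notebook parameter name input_path are assumptions for illustration, not something given in the thread.

```python
import json
import os
import urllib.request


def build_run_now_request(host, token, job_id, notebook_params=None):
    """Build (but do not send) a POST request for the Jobs API run-now endpoint."""
    payload = {"job_id": job_id}
    if notebook_params:
        # notebook_params are passed to the notebook task as widget values.
        payload["notebook_params"] = notebook_params
    return urllib.request.Request(
        url=host.rstrip("/") + "/api/2.1/jobs/run-now",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def lambda_handler(event, context):
    """Lambda entry point: trigger the job for the S3 object that arrived."""
    record = event["Records"][0]["s3"]
    s3_path = "s3://{}/{}".format(record["bucket"]["name"], record["object"]["key"])
    request = build_run_now_request(
        host=os.environ["DATABRICKS_HOST"],          # hypothetical variable name
        token=os.environ["DATABRICKS_TOKEN"],        # hypothetical variable name
        job_id=int(os.environ["DATABRICKS_JOB_ID"]),  # hypothetical variable name
        notebook_params={"input_path": s3_path},     # hypothetical parameter name
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read())
```

Keeping the request construction separate from the handler makes it possible to check the URL, headers, and payload locally without calling the workspace.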

  • 15 kudos
3 More Replies
imgaboy
by New Contributor III
  • 2708 Views
  • 4 replies
  • 3 kudos

Resolved! PySpark DataFrame to deep learning model

I have a large time series with many measuring stations recording the same 5 measurements (Temperature, Humidity, etc.). I want to predict a future moment with a time series model, for which I pass the data from all the measuring stations to the Deep Learning...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

df.groupBy("date").pivot("Node").agg(first("Temp"))
This is a conversion to a classic crosstab, so pivot will help; see the example above.
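To show what shape the pivot produces without needing a Spark session, here is a plain-Python sketch of the same reshaping: one row per date, one column per node, keeping the first temperature seen for each pair. The sample rows and the function name pivot_first are made up for illustration; this is not the PySpark implementation.

```python
def pivot_first(rows, index_key, column_key, value_key):
    """Reshape long-format rows into {index: {column: first value seen}}."""
    table = {}
    for row in rows:
        cells = table.setdefault(row[index_key], {})
        # setdefault keeps the first value for a cell, mirroring agg(first(...)).
        cells.setdefault(row[column_key], row[value_key])
    return table


# Hypothetical long-format measurements: one row per (date, station).
rows = [
    {"date": "2022-03-01", "Node": "A", "Temp": 20.5},
    {"date": "2022-03-01", "Node": "B", "Temp": 19.0},
    {"date": "2022-03-02", "Node": "A", "Temp": 21.0},
    {"date": "2022-03-02", "Node": "B", "Temp": 18.5},
]

crosstab = pivot_first(rows, "date", "Node", "Temp")
for date, temps in sorted(crosstab.items()):
    print(date, temps)
```

Each date now carries the readings of all stations side by side, which is the wide layout a time series model typically expects as one input step.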

  • 3 kudos
3 More Replies
Sarvagna_Mahaka
by New Contributor III
  • 4004 Views
  • 3 replies
  • 1 kudos

Resolved! Unable to clone GitLab Enterprise Edition repo in Databricks

Below are the steps that I followed. I still get an error message. Create a repo in GitLab Enterprise Edition. In GitLab, create a personal access token that allows access to your repositories (with read_repository and write_repository permissions). Save...

Latest Reply
User16725394280
Contributor II
  • 1 kudos

Hi @Sarvagna Mahakali, the repository you are trying to add might be behind a VPN; our service cannot reach it since it has no access to the VPN network. You may need the Enterprise Git / VPC to connect to the repository. Kindly check and let...

  • 1 kudos
2 More Replies
shan_chandra
by Esteemed Contributor
  • 16858 Views
  • 1 replies
  • 3 kudos

Resolved! DataFrame - casting a string to decimal returns 0E-16 for zeros

The user is trying to cast a string to decimal when encountering zeros. The cast function displays '0' as '0E-16'. Could you please let us know your thoughts on whether 0s can be displayed as 0s? from pyspark.sql import functions as F df = spark.s...

Latest Reply
shan_chandra
Esteemed Contributor
  • 3 kudos

If the scale of the decimal type is greater than 6, scientific notation kicks in, hence the 0E-16. This behavior is described in the existing OSS Spark issue: https://issues.apache.org/jira/browse/SPARK-25177 Kindly cast the column to a decimal type les...
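The threshold at scale 6 can be reproduced outside Spark. The sketch below uses Python's standard decimal module, which happens to apply the same string-conversion rule for zero values; it is only an analogy for the display behavior, not Spark code, and the variable names are illustrative.

```python
from decimal import Decimal

# A zero carrying 16 decimal places versus a zero carrying 6 decimal places.
zero_scale_16 = Decimal("0").quantize(Decimal("1E-16"))
zero_scale_6 = Decimal("0").quantize(Decimal("1E-6"))

# Default string conversion switches to scientific notation past 6 places.
print(str(zero_scale_16))  # 0E-16
print(str(zero_scale_6))   # 0.000000

# Fixed-point formatting shows the same value without the exponent.
print(format(zero_scale_16, "f"))  # 0.0000000000000000
```

The value is the same zero in every case; only the textual rendering changes, which is why casting to a smaller scale or formatting explicitly makes the 0E-16 disappear.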

  • 3 kudos
LukaszJ
by Contributor III
  • 3771 Views
  • 7 replies
  • 1 kudos

Resolved! Starting another notebook takes a long time

Hello, I want to run some notebooks from notebook "A". Regardless of the contents of the called notebook, it takes a long time (20 seconds) to run. The delay is constant and I do not know why it takes so long. I tried running a simple notebook with one input p...

Latest Reply
LukaszJ
Contributor III
  • 1 kudos

Okay, I am not able to set the same session for both notebooks (parent and child). So my solution is to use %run ./notebook_name. I put all the code into functions and now I can use them. Example: # Children notebook def do_something(param1, param2): ...

  • 1 kudos
6 More Replies
Anonymous
by Not applicable
  • 5153 Views
  • 8 replies
  • 2 kudos

Resolved! Issue in creating workspace - Custom AWS Configuration

We tried to create a new workspace using "Custom AWS Configuration" with our own VPC (customer-managed VPC), but the workspace failed to launch. We are getting the error below and cannot tell where the issue is. Workspace...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

@Mitesh Patel - As Atanu thinks the issue may be resolved, I wanted to check in with you, also. How goes it?

  • 2 kudos
7 More Replies
dzlab
by New Contributor
  • 704 Views
  • 0 replies
  • 0 kudos

Determine the interval in a timestamp column

OK so I'm trying to determine whether a timestamp column has a regular interval, i.e. whether the difference between each pair of consecutive values is the same across the entire column. I tried something like this: val timeColumn: String =   val groupByColumn: String...

