Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

imgaboy
by New Contributor III
  • 5838 Views
  • 4 replies
  • 3 kudos

Resolved! pySpark Dataframe to DeepLearning model

I have a large time series with many measuring stations recording the same 5 variables (temperature, humidity, etc.). I want to predict a future moment with a time series model, for which I pass the data from all the measuring stations to the Deep Learning...

Latest Reply
Hubert-Dudek
Databricks MVP
  • 3 kudos

df.groupBy("date").pivot("Node").agg(first("Temp"))
This is a conversion to a classic crosstab, so pivot will help. Example above.

3 More Replies
Sarvagna_Mahaka
by New Contributor III
  • 7153 Views
  • 3 replies
  • 1 kudos

Resolved! Unable to clone GitLab Enterprise Edition repo in Databricks

Below are the steps that I followed. I still get an error message.
1. Create a repo in GitLab Enterprise Edition.
2. In GitLab, create a personal access token that allows access to your repositories (with read_repository and write_repository permissions).
3. Save...

Latest Reply
User16725394280
Databricks Employee
  • 1 kudos

Hi @Sarvagna Mahakali, the repository you are trying to add might be behind a VPN; our service cannot access it because it has no access to the VPN network. You may need the Enterprise Git / VPC to connect to the repository. Kindly check and let...

2 More Replies
shan_chandra
by Databricks Employee
  • 30510 Views
  • 1 replies
  • 3 kudos

Resolved! dataframe - cast string to decimal when encountering zeros returns 0E-16

The user is trying to cast a string to a decimal when encountering zeros. The cast function displays '0' as '0E-16'. Could you please let us know your thoughts on whether 0s can be displayed as 0s?
from pyspark.sql import functions as F
df = spark.s...

Latest Reply
shan_chandra
Databricks Employee
  • 3 kudos

If the scale of the decimal type is greater than 6, scientific notation kicks in, which is why you see 0E-16. This behavior is described in the existing OSS Spark issue: https://issues.apache.org/jira/browse/SPARK-25177
Kindly cast the column to a decimal type les...
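The formatting rule behind 0E-16 can be reproduced outside Spark. The following is a minimal sketch, assuming Python's standard `decimal` module is an acceptable stand-in: its default string conversion switches to scientific notation once the adjusted exponent drops below -6, which mirrors the behavior described in the reply. It is an analogy, not Spark's own code path, and the scale of 16 is only inferred from the '0E-16' output.

```python
from decimal import Decimal

# A zero carrying 16 decimal places, analogous to a zero in a decimal column
# with scale 16 (the scale is an assumption inferred from '0E-16').
zero_scale_16 = Decimal("0E-16")
# A zero carrying 6 decimal places, which stays within the plain-notation range.
zero_scale_6 = Decimal("0E-6")

# Default string conversion: scientific notation appears only for the larger scale.
print(str(zero_scale_16))
print(str(zero_scale_6))

# Explicit fixed-point formatting shows plain zeros regardless of scale.
print(format(zero_scale_16, "f"))
```

This suggests two ways out in general: keep the scale at 6 or below, or format the value explicitly for display.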

LukaszJ
by Contributor III
  • 6037 Views
  • 7 replies
  • 2 kudos

Resolved! Long time turning on another notebook

Hello, I want to run some notebooks from notebook "A". Regardless of the contents of the called notebook, it takes a long time to run (20 seconds). This is a constant value, and I do not know why it takes so long. I tried to run a simple notebook with one in...

Latest Reply
LukaszJ
Contributor III
  • 2 kudos

Okay, I am not able to set the same session for both notebooks (parent and child). So my solution is to use %run ./notebook_name. I put all the code into functions and now I can use them. Example:
# Child notebook
def do_something(param1, param2): ...

6 More Replies
dzlab
by New Contributor
  • 1506 Views
  • 0 replies
  • 0 kudos

Determine what is the interval in a timestamp column

OK, so I'm trying to determine if a timestamp column has a regular interval or not, i.e. the difference between each consecutive value is the same across the entire column. I tried something like this:
val timeColumn: String =
val groupByColumn: String...
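The check described in this question can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the poster's Spark code: the helper name `has_regular_interval` and the sample data are hypothetical, and the values are assumed to fit in memory. In Spark, the same idea would typically be expressed with a lag over an ordered window followed by a distinct count of the differences.

```python
from datetime import datetime, timedelta

def has_regular_interval(timestamps):
    """Return True if all consecutive (sorted) timestamps are equally spaced."""
    ordered = sorted(timestamps)
    # Collect the distinct gaps between each value and the next one.
    deltas = {later - earlier for earlier, later in zip(ordered, ordered[1:])}
    # Zero or one distinct gap means the interval is regular.
    return len(deltas) <= 1

# Hypothetical sample data: four readings spaced 5 minutes apart.
start = datetime(2022, 3, 1)
regular = [start + timedelta(minutes=5 * i) for i in range(4)]
# Adding a reading at minute 22 breaks the 5-minute spacing.
irregular = regular + [start + timedelta(minutes=22)]

print(has_regular_interval(regular))    # True
print(has_regular_interval(irregular))  # False
```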

Tahseen0354
by Valued Contributor
  • 3883 Views
  • 3 replies
  • 2 kudos

Databricks Ganglia

Hi, is there any way to get alerts automatically from Databricks Ganglia? That is, a developer would not need to review the logs manually but would get a notification that, for example, resources are underutilized.

Latest Reply
Hubert-Dudek
Databricks MVP
  • 2 kudos

The Databricks implementation of Ganglia is so limited that it makes me laugh (PNG snapshots and non-working tabs), so I think not. You can install Datadog via a global init script and get the stats in your Datadog account.

2 More Replies
Jed
by New Contributor II
  • 5600 Views
  • 2 replies
  • 0 kudos

Enabled 2.1 jobs api feature and unable to create a shared jobs cluster.

Hello, we enabled the 2.1 Jobs API feature, and when I attempt to create a "shared" job cluster in the configuration I always get this response:
{'error_code': 'FEATURE_DISABLED', 'message': 'Shared job cluster feature is not enabled.'}
Please could you...

Latest Reply
Jed
New Contributor II
  • 0 kudos

I am able to access it now. To summarize the problem from my perspective: the shared cluster API did not work as expected, and some direct manual intervention by Databricks support was required.

1 More Replies
shreyag
by New Contributor II
  • 3200 Views
  • 2 replies
  • 0 kudos

scheduling tasks through CLI

Is there a way to schedule tasks or jobs through the Databricks CLI instead of the GUI? I want to be able to create a job flow with different notebooks through the CLI.

Latest Reply
Atanu
Databricks Employee
  • 0 kudos

I agree with @Kaniz Fatma. @Shreya Gupta, this is the jobs CLI we currently support: https://docs.databricks.com/dev-tools/cli/jobs-cli.html?_ga=2.101966982.684786035.1646666830-480220406.1638459894
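A job flow with several notebooks and a schedule is usually described as a JSON document that the jobs CLI then submits. The sketch below builds such a document in Python, assuming the Jobs API 2.1 multi-task format; the job name, task keys, notebook paths, cluster ID, and cron expression are all hypothetical placeholders rather than values from this thread.

```python
import json

# Hypothetical job definition: two notebook tasks, the second depending on the
# first, run daily at 06:00 UTC. Every concrete value here is a placeholder.
job_settings = {
    "name": "example-notebook-flow",
    "tasks": [
        {
            "task_key": "ingest",
            "existing_cluster_id": "<cluster-id>",
            "notebook_task": {"notebook_path": "/Shared/ingest"},
        },
        {
            "task_key": "transform",
            # Run only after the "ingest" task has finished.
            "depends_on": [{"task_key": "ingest"}],
            "existing_cluster_id": "<cluster-id>",
            "notebook_task": {"notebook_path": "/Shared/transform"},
        },
    ],
    "schedule": {
        "quartz_cron_expression": "0 0 6 * * ?",
        "timezone_id": "UTC",
    },
}

# Serialize the definition; the result can be saved to a file such as job.json.
job_json = json.dumps(job_settings, indent=2)
print(job_json)
```

With the legacy CLI, a file like this would typically be passed to the jobs create command (for example via its JSON file option); check the linked documentation for the exact command and flags for your CLI version.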

1 More Replies
Alx
by Databricks Partner
  • 4300 Views
  • 1 replies
  • 0 kudos

Problem with network security group (NSG) rules in case of VNet injection

Hi everyone, our internal company security policy for cloud infrastructure requires a custom outbound NSG rule that denies all traffic. The rule's attributes should be as follows:
Priority: 4096
Port: Any
Protocol: Any
Source: Any
Destination: An...

Latest Reply
Atanu
Databricks Employee
  • 0 kudos

Hello @Alexey Tyulyaev, please check https://docs.microsoft.com/en-us/azure/virtual-network/manage-network-security-group

alejandrofm
by Valued Contributor
  • 6500 Views
  • 3 replies
  • 3 kudos

Resolved! Delta, the specified key does not exist error

Hi, I'm getting this error too frequently on a few tables. I checked on S3, and the partition exists and the file is there in the partition.
error: Spectrum Scan Error: DeltaManifest
code: 15005
context: Error fetching Delta Lake manifest delta/product/sub_...

Latest Reply
alejandrofm
Valued Contributor
  • 3 kudos

@Hubert Dudek, I'll add that sometimes just running:
GENERATE symlink_format_manifest FOR TABLE schema.table
solves it. But how can the symlink get broken? Thanks!

2 More Replies
study_community
by New Contributor III
  • 17877 Views
  • 8 replies
  • 3 kudos

Not able to move files from local to dbfs through dbfs CLI

Hi Folks, I have installed and configured the Databricks CLI on my local machine. I tried to move a local file from my personal computer to a dbfs:/ path using dbfs cp. I can see that the file is copied from the local machine, but it is only visible locally. I am not able to ...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi, could you try to save the file from your local machine to the dbfs:/FileStore location?
# Put local file test.py to dbfs:/FileStore/test.py
dbfs cp test.py dbfs:/FileStore/test.py

7 More Replies
shrewdTurtle
by New Contributor II
  • 4125 Views
  • 2 replies
  • 3 kudos

Cannot open Jobs tab in Databricks Community edition.

Hi, I get the following exception when I try to open the Jobs tab.
Uncaught TypeError: Cannot read properties of undefined (reading 'apply')
Reload the page and try again. If the error persists, contact support. Reference error code: fd9ae37c18c1400cb15...

Latest Reply
shrewdTurtle
New Contributor II
  • 3 kudos

@Kaniz Fatma, @Werner Stinckens, thanks for the clarification. I agree with @Werner Stinckens: the error message should be more useful.

1 More Replies
Jan_A
by New Contributor III
  • 7392 Views
  • 3 replies
  • 5 kudos

Resolved! Move/Migrate database from dbfs root (s3) to other mounted s3 bucket

Hi, I have a Databricks database that has been created in the DBFS root S3 bucket, containing managed tables. I am looking for a way to move/migrate it to a mounted S3 bucket instead, and keep the database name. Any good ideas on how this can be done? T...

Latest Reply
User16753724663
Databricks Employee
  • 5 kudos

Hi @Jan Ahlbeck, we can use the property below to set the default location:
"spark.sql.warehouse.dir": "S3 URL/dbfs path"
Please let me know if this helps.

2 More Replies
databrick_comm
by New Contributor II
  • 6816 Views
  • 3 replies
  • 0 kudos

Not able to connect to Denodo VDP from Databricks

I would like to connect to Denodo VDP from a Databricks workspace. I installed the ODBC client and the Denodo JAR on the cluster, but I do not understand the remaining steps. Could you please help me?

Latest Reply
User16753724663
Databricks Employee
  • 0 kudos

Hi @sathyanarayan kokku, are you trying to install the Denodo VDP server in Databricks?

2 More Replies
NAS
by New Contributor III
  • 3065 Views
  • 1 replies
  • 1 kudos

Resolved! "import pandas as pd" => [Errno 5]

When I type import pandas as pd from a notebook in a Repo, I get:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
/usr/lib/python3.8/importlib/_boots...

Latest Reply
NAS
New Contributor III
  • 1 kudos

Thanks to Elliott Hertz, I found out that the ML Experiments cannot be stored in the repo. After I moved them to my Workspace everything seems to work.
