Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

shan_chandra
by Databricks Employee
  • 27219 Views
  • 1 replies
  • 3 kudos

Resolved! DataFrame - casting string to decimal returns 0E-16 when encountering zeros

The user is trying to cast a string to a decimal, and the cast function displays '0' as '0E-16'. Could you please let us know your thoughts on whether 0s can be displayed as 0s? from pyspark.sql import functions as F df = spark.s...

Latest Reply
shan_chandra
Databricks Employee
  • 3 kudos

If the scale of the decimal type is greater than 6, scientific notation kicks in, which is why you see 0E-16. This behavior is described in the existing OSS Spark issue https://issues.apache.org/jira/browse/SPARK-25177. Kindly cast the column to a decimal type les...
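The threshold mentioned in the reply can be reproduced without a cluster. The sketch below uses Python's standard `decimal` module as a stand-in, on the assumption that Spark displays decimals the way Java's `BigDecimal.toString` does; both switch to scientific notation once the adjusted exponent drops below -6. The helper names `zero_with_scale` and `plain_string` are illustrative and not part of any Spark API.

```python
from decimal import Decimal


def zero_with_scale(scale: int) -> Decimal:
    """Return zero carrying `scale` fractional digits, like a value in a decimal(p, scale) column."""
    # Tuple form: (sign, digits, exponent), so the exponent is exactly -scale.
    return Decimal((0, (0,), -scale))


def plain_string(value: Decimal) -> str:
    """Format a decimal in fixed-point notation, never with an exponent."""
    return format(value, "f")


print(str(zero_with_scale(6)))    # 0.000000
print(str(zero_with_scale(16)))   # 0E-16
print(plain_string(zero_with_scale(16)))  # 0.0000000000000000
```

Keeping the scale at 6 or below avoids the exponent entirely; when a larger scale is needed, formatting the value as a fixed-point string for display is one way around it.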

  • 3 kudos
LukaszJ
by Contributor III
  • 4951 Views
  • 7 replies
  • 2 kudos

Resolved! Long time turning on another notebook

Hello, I want to run some notebooks from notebook "A". Regardless of the contents of the called notebook, it runs for a long time (20 seconds). It is a constant value, and I do not know why it takes so long. I tried running a simple notebook with one in...

Latest Reply
LukaszJ
Contributor III
  • 2 kudos

Okay, I am not able to set the same session for both notebooks (parent and children). So my solution is to use %run ./notebook_name. I put all the code into functions, and now I can use them. Example: # Children notebook def do_something(param1, param2): ...

  • 2 kudos
6 More Replies
dzlab
by New Contributor
  • 995 Views
  • 0 replies
  • 0 kudos

Determine what is the interval in a timestamp column

OK, so I'm trying to determine whether a timestamp column has a regular interval, i.e. whether the difference between each pair of consecutive values is the same across the entire column. I tried something like this: val timeColumn: String =   val groupByColumn: String...
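The check described in the question can be sketched in plain Python, independent of the truncated Scala code above. This is only an illustration of the logic: sort the values, take consecutive differences, and see whether exactly one distinct gap remains. In Spark the same differences would presumably come from a `lag` window function partitioned by the group-by column; that mapping is an assumption, and `regular_interval` is a hypothetical helper name.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional


def regular_interval(timestamps: Iterable[datetime]) -> Optional[timedelta]:
    """Return the common gap between consecutive timestamps, or None if the gaps differ.

    Fewer than two timestamps also yields None, because no gap can be measured.
    """
    ordered = sorted(timestamps)
    # Set of distinct differences between each value and its predecessor.
    gaps = {later - earlier for earlier, later in zip(ordered, ordered[1:])}
    return gaps.pop() if len(gaps) == 1 else None


start = datetime(2022, 3, 1)
hourly = [start + timedelta(hours=i) for i in range(5)]
uneven = hourly + [start + timedelta(hours=4, minutes=30)]

print(regular_interval(hourly))  # 1:00:00
print(regular_interval(uneven))  # None
```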

Tahseen0354
by Valued Contributor
  • 3135 Views
  • 3 replies
  • 2 kudos

Databricks Ganglia

Hi, is there any way to get alerts automatically from Databricks Ganglia? That is, a developer would not need to review the logs manually but would get a notification that, for example, resources are underutilized.

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

The Databricks implementation of Ganglia is so limited (PNG snapshots and non-working tabs) that it makes me laugh, so I think not. You can install Datadog via a global init script and get the stats in your Datadog account.

  • 2 kudos
2 More Replies
Jed
by New Contributor II
  • 5020 Views
  • 2 replies
  • 0 kudos

Enabled 2.1 jobs api feature and unable to create a shared jobs cluster.

Hello, we enabled the 2.1 Jobs API feature, and when I attempt to create a "shared" job cluster in the configuration I always get this response: {'error_code': 'FEATURE_DISABLED', 'message': 'Shared job cluster feature is not enabled.'} Please could you...

Latest Reply
Jed
New Contributor II
  • 0 kudos

I am able to access it now. To summarize the problem from my perspective: the shared cluster API did not work as expected, and some direct manual intervention by Databricks support was required.
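For context, a shared job cluster in a Jobs API 2.1 request is declared once under `job_clusters` and then referenced from each task by `job_cluster_key`. The sketch below only builds such a payload; it does not call the API, and it would not by itself resolve the FEATURE_DISABLED response, which per the reply needed Databricks support. The function name, cluster key, runtime version, and node type are placeholders chosen for illustration.

```python
import json


def shared_cluster_job(name, notebook_paths, cluster_key="shared_cluster"):
    """Build a Jobs API 2.1 create payload whose tasks all reuse one job cluster."""
    return {
        "name": name,
        "job_clusters": [
            {
                "job_cluster_key": cluster_key,
                # Placeholder cluster spec; adjust to your cloud and runtime.
                "new_cluster": {
                    "spark_version": "10.4.x-scala2.12",
                    "node_type_id": "i3.xlarge",
                    "num_workers": 2,
                },
            }
        ],
        "tasks": [
            {
                "task_key": f"task_{index}",
                # Every task points at the same cluster definition.
                "job_cluster_key": cluster_key,
                "notebook_task": {"notebook_path": path},
            }
            for index, path in enumerate(notebook_paths)
        ],
    }


payload = shared_cluster_job("example-job", ["/Shared/ingest", "/Shared/report"])
print(json.dumps(payload, indent=2))
```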

  • 0 kudos
1 More Replies
shreyag
by New Contributor II
  • 2667 Views
  • 2 replies
  • 0 kudos

scheduling tasks through CLI

Is there a way to schedule tasks or jobs through the Databricks CLI instead of the GUI? I want to be able to create a job flow with different notebooks through the CLI.

Latest Reply
Atanu
Databricks Employee
  • 0 kudos

I agree with @Kaniz Fatma. @Shreya Gupta, this is the jobs CLI we currently support: https://docs.databricks.com/dev-tools/cli/jobs-cli.html
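As a rough sketch of what a scheduled, multi-notebook job definition for the jobs CLI might look like: the settings are plain JSON, with a `schedule` block (Quartz cron expression plus time zone) and tasks chained through `depends_on`. The helper name, notebook paths, and the `<cluster-id>` placeholder are hypothetical, and multi-task settings assume the CLI is configured for Jobs API 2.1.

```python
import json


def scheduled_job(name, notebook_paths, cron, timezone="UTC", cluster_id="<cluster-id>"):
    """Build job settings that run the notebooks in order on a cron schedule."""
    tasks = []
    for index, path in enumerate(notebook_paths):
        task = {
            "task_key": f"step_{index}",
            "existing_cluster_id": cluster_id,  # placeholder, replace with a real ID
            "notebook_task": {"notebook_path": path},
        }
        if index > 0:
            # Each step waits for the previous one to finish.
            task["depends_on"] = [{"task_key": f"step_{index - 1}"}]
        tasks.append(task)
    return {
        "name": name,
        "schedule": {"quartz_cron_expression": cron, "timezone_id": timezone},
        "tasks": tasks,
    }


# Daily at 06:00 UTC; Quartz fields are sec, min, hour, day-of-month, month, day-of-week.
settings = scheduled_job("nightly-flow", ["/Shared/extract", "/Shared/load"], "0 0 6 * * ?")
print(json.dumps(settings, indent=2))
# Save the output as job.json, then with the legacy CLI:
#   databricks jobs create --json-file job.json
```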

  • 0 kudos
1 More Replies
Alx
by New Contributor
  • 3539 Views
  • 1 replies
  • 0 kudos

Problem with network security group (NSG) rules in case of VNet injection

Hi everyone, our internal company security policy for cloud infrastructure requires a custom outbound NSG rule that denies all traffic. The rule's attributes should be as follows: Priority: 4096, Port: Any, Protocol: Any, Source: Any, Destination: An...

Latest Reply
Atanu
Databricks Employee
  • 0 kudos

Hello @Alexey Tyulyaev, please check https://docs.microsoft.com/en-us/azure/virtual-network/manage-network-security-group

  • 0 kudos
alejandrofm
by Valued Contributor
  • 5279 Views
  • 3 replies
  • 3 kudos

Resolved! Delta, the specified key does not exist error

Hi, I'm getting this error too frequently on a few tables. I checked on S3, and the partition exists and the file is there in the partition. error: Spectrum Scan Error: DeltaManifest code: 15005 context: Error fetching Delta Lake manifest delta/product/sub_...

Latest Reply
alejandrofm
Valued Contributor
  • 3 kudos

@Hubert Dudek, I'll add that sometimes just running GENERATE symlink_format_manifest FOR TABLE schema.table solves it. But how can the symlink get broken? Thanks!
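The thread excerpt does not confirm a cause, but one plausible explanation is that a symlink manifest is a snapshot: writes or file rewrites made after the last GENERATE are not reflected until it is regenerated, unless Delta's automatic manifest update property is enabled on the table. The sketch below only assembles the two SQL statements; `manifest_statements` is a hypothetical helper, and in a notebook each statement would be passed to `spark.sql`.

```python
from typing import List


def manifest_statements(table: str, auto_update: bool = True) -> List[str]:
    """Return SQL that regenerates the manifest and optionally keeps it updated on write."""
    statements = [f"GENERATE symlink_format_manifest FOR TABLE {table}"]
    if auto_update:
        # Delta table property that regenerates the manifest after each write.
        statements.append(
            f"ALTER TABLE {table} SET TBLPROPERTIES "
            "(delta.compatibility.symlinkFormatManifest.enabled = true)"
        )
    return statements


for statement in manifest_statements("schema.table"):
    print(statement)  # in a notebook: spark.sql(statement)
```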

  • 3 kudos
2 More Replies
study_community
by New Contributor III
  • 14638 Views
  • 8 replies
  • 3 kudos

Not able to move files from local to dbfs through dbfs CLI

Hi folks, I have installed and configured the Databricks CLI on my local machine. I tried to move a local file from my personal computer to a dbfs:/ path using dbfs cp. I can see the file is copied from local, but it is only visible locally. I am not able to ...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi, could you try saving the file from your local machine to the dbfs:/FileStore location?
# Put local file test.py to dbfs:/FileStore/test.py
dbfs cp test.py dbfs:/FileStore/test.py

  • 3 kudos
7 More Replies
shrewdTurtle
by New Contributor II
  • 3494 Views
  • 2 replies
  • 3 kudos

Cannot open Jobs tab in Databricks Community edition.

Hi, I get the following exception when I try to open the Jobs tab. Uncaught TypeError: Cannot read properties of undefined (reading 'apply') Reload the page and try again. If the error persists, contact support. Reference error code: fd9ae37c18c1400cb15...

Latest Reply
shrewdTurtle
New Contributor II
  • 3 kudos

@Kaniz Fatma, @Werner Stinckens, thanks for the clarification. I agree with @Werner Stinckens: the error message should be more useful.

  • 3 kudos
1 More Replies
Jan_A
by New Contributor III
  • 5961 Views
  • 3 replies
  • 5 kudos

Resolved! Move/Migrate database from dbfs root (s3) to other mounted s3 bucket

Hi, I have a Databricks database that was created in the DBFS root S3 bucket and contains managed tables. I am looking for a way to move/migrate it to a mounted S3 bucket instead while keeping the database name. Any good ideas on how this can be done? T...

Latest Reply
User16753724663
Valued Contributor
  • 5 kudos

Hi @Jan Ahlbeck, we can use the property below to set the default location: "spark.sql.warehouse.dir": "S3 URL/dbfs path". Please let me know if this helps.

  • 5 kudos
2 More Replies
databrick_comm
by New Contributor II
  • 5818 Views
  • 3 replies
  • 0 kudos

Not able to connect to Denodo VDP from Databricks

I would like to connect to Denodo VDP from a Databricks workspace. I installed the ODBC client and the Denodo JAR on the cluster, but I do not understand the remaining steps. Could you please help me?

Latest Reply
User16753724663
Valued Contributor
  • 0 kudos

Hi @sathyanarayan kokku, are you trying to install the Denodo VDP server in Databricks?

  • 0 kudos
2 More Replies
NAS
by New Contributor III
  • 2370 Views
  • 1 replies
  • 1 kudos

Resolved! "import pandas as pd" => [Errno 5]

When I type import pandas as pd from a notebook in a Repo, I get: --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) /usr/lib/python3.8/importlib/_boots...

Latest Reply
NAS
New Contributor III
  • 1 kudos

Thanks to Elliott Hertz, I found out that the ML Experiments cannot be stored in the repo. After I moved them to my Workspace everything seems to work.

  • 1 kudos
RohanB
by New Contributor III
  • 6381 Views
  • 8 replies
  • 3 kudos

Resolved! Spark Streaming - Checkpoint State EOF Exception

I have a Spark Structured Streaming job that reads from 2 Delta tables as streams, processes the data, and then writes to a 3rd Delta table. The job runs on the Databricks service on GCP. Sometimes the job fails with the following exception...

Latest Reply
RohanB
New Contributor III
  • 3 kudos

Hi @Jose Gonzalez, do you require any more information regarding the code? Any idea what could be the cause of the issue? Thanks and regards, Rohan

  • 3 kudos
7 More Replies
SCOR
by New Contributor II
  • 2719 Views
  • 3 replies
  • 4 kudos

SparkJDBC42.jar Issue ?

Hi there! I am using SparkJDBC42.jar in my Java application to access my Delta Lake tables. The connection is made through a Databricks SQL endpoint, where I created a database and stored my Delta tables in it. I have simple code to open a connection...

Latest Reply
jose_gonzalez
Databricks Employee
  • 4 kudos

Hi @Seifeddine SNOUSSI, are you still having this issue, or were you able to resolve it? Please let us know.

  • 4 kudos
2 More Replies
