Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

alejandrofm
by Valued Contributor
  • 2372 Views
  • 4 replies
  • 2 kudos

Resolved! Orphan (?) files on Databricks S3 bucket

Hi, I'm seeing a lot of empty (and not) directories on routes like: xxxxxx.jobs/FileStore/job-actionstats/, xxxxxx.jobs/FileStore/job-result/, xxxxxx.jobs/command-results/. Can I create a lifecycle to delete old objects (files/directories)? How many days? w...

Latest Reply
alejandrofm
Valued Contributor

Hi! I didn't know that, purging right now. Is there a way to schedule that so logs are retained for less time? Maybe I want to maintain only the last 7 days for everything? Thanks!
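For reference, a retention rule like the one being asked about can also be set directly on the bucket. Below is a minimal boto3 sketch, assuming an AWS S3 lifecycle rule, a placeholder bucket name, and the 7-day window mentioned above (the prefix mirrors one of the directories listed in the post):

```python
import boto3

# Placeholder bucket name; the prefix mirrors one of the directories from the post.
# Assumption: this targets the workspace's S3 bucket with a 7-day retention window.
s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="my-databricks-workspace-bucket",  # placeholder
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-old-job-results",
                "Filter": {"Prefix": "xxxxxx.jobs/FileStore/job-result/"},
                "Status": "Enabled",
                "Expiration": {"Days": 7},  # the 7-day retention asked about above
            }
        ]
    },
)
```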

3 More Replies
rt2
by New Contributor III
  • 1401 Views
  • 2 replies
  • 3 kudos

Resolved! Fundamentals of Databricks Lakehouse Badge not received.

I passed the Databricks fundamentals exam and, like many others, I did not receive my badge. I am very interested in putting this badge on my LinkedIn profile, please help. My email id is: rahul.psit.ec@gmail.com, which Databricks is resolving as: ...

Latest Reply
rt2
New Contributor III

I got the badge now. Thanks.

1 More Replies
r-g-s-j
by New Contributor
  • 2223 Views
  • 1 reply
  • 0 kudos

How to Configure PySpark Jobs Using PEX

Issue: I am attempting to create a PySpark job via the Databricks UI (with spark-submit) using the parameters below (dependencies are in the PEX file), but I am getting an exception that the pex file does not exist. It's my understanding that the -...

Latest Reply
franck
New Contributor II

Hi, I'm facing the same issue trying to execute a PySpark job with spark-submit. I have explored the same solutions as you: the --files option, spark.pyspark.driver.python, and spark.executorEnv.PEX_ROOT. Have you made any progress in resolving the problem?
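For readers following along, the generic PEX-on-Spark recipe these settings come from looks roughly like the sketch below, expressed as Spark session configs in Python. The pex path is illustrative and this is not confirmed as the fix for the original error:

```python
from pyspark.sql import SparkSession

# Illustrative path; the original post's actual pex location is not shown.
pex_on_dbfs = "/dbfs/FileStore/pex/myjob.pex"

spark = (
    SparkSession.builder
    # Ship the pex archive to the executors.
    .config("spark.files", pex_on_dbfs)
    # The driver uses the local pex path as its Python interpreter...
    .config("spark.pyspark.driver.python", pex_on_dbfs)
    # ...while executors use the shipped copy in their working directory.
    .config("spark.pyspark.python", "./myjob.pex")
    # PEX needs a writable cache directory on the executors.
    .config("spark.executorEnv.PEX_ROOT", "/tmp/pex")
    .getOrCreate()
)
```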

draculla1208
by New Contributor
  • 1086 Views
  • 0 replies
  • 0 kudos

Able to read .hdf files but not able to write to .hdf files from worker nodes and save to dbfs

I have a set of .hdf files that I want to distribute and read on Worker nodes under Databricks environment using PySpark. I am able to read .hdf files on worker nodes and get the data from the files. The next requirement is that now each worker node ...
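One possible shape for the write side, sketched under explicit assumptions: the files are HDF5 readable by h5py, h5py is installed on the cluster, the /dbfs FUSE mount is reachable from the workers, and the target directory already exists. Each partition writes to executor-local disk first and then copies the finished file to DBFS.

```python
import os
import shutil
import tempfile
import uuid

import h5py  # assumption: h5py is installed on the cluster and the files are HDF5
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(i,) for i in range(1000)], ["value"])  # illustrative data

def write_partition(rows):
    # Write to executor-local disk first; HDF5 needs random writes, which the
    # /dbfs FUSE mount may not handle well (assumption based on the symptom).
    local_path = os.path.join(tempfile.gettempdir(), f"part-{uuid.uuid4().hex}.hdf")
    with h5py.File(local_path, "w") as f:
        f.create_dataset("values", data=[r["value"] for r in rows])
    # Copy the finished file onto DBFS; assumes dbfs:/FileStore/hdf_out/ already
    # exists (e.g. created beforehand with dbutils.fs.mkdirs on the driver).
    shutil.copy(local_path, f"/dbfs/FileStore/hdf_out/{os.path.basename(local_path)}")
    os.remove(local_path)

df.foreachPartition(write_partition)
```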

karthik_p
by Esteemed Contributor
  • 2832 Views
  • 3 replies
  • 8 kudos

ODBC connectivity issues with Databricks when we are off the VPN in GCP

Hi Team, we are getting the below error when we try to connect our tool over an ODBC connection without logging in to the VPN; when we are on the VPN we do not get this issue. [Simba][ThriftExtension] (14) Unexpected response from server during a HTTP ...
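For context, a DSN-less ODBC connection to a Databricks cluster typically looks like the sketch below; all values are placeholders taken from a cluster's JDBC/ODBC settings page, not from the original post. The VPN-only behaviour itself may point at network restrictions (for example IP access lists) rather than the connection string.

```python
import pyodbc

# All values are placeholders; the driver name depends on how the
# Simba Spark ODBC driver is registered on the client machine.
conn = pyodbc.connect(
    "Driver=Simba Spark ODBC Driver;"
    "Host=1234567890.8.gcp.databricks.com;"
    "Port=443;"
    "HTTPPath=sql/protocolv1/o/0/0000-000000-abcd123;"
    "SSL=1;"
    "ThriftTransport=2;"  # HTTP transport
    "AuthMech=3;"         # user/password auth, with UID fixed to 'token'
    "UID=token;"
    "PWD=<personal-access-token>;",
    autocommit=True,
)
print(conn.cursor().execute("SELECT 1").fetchall())
```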

Latest Reply
karthik_p
Esteemed Contributor

@Kaniz Fatma our team is working with Databricks on this; we can close this thread.

2 More Replies
tinendra
by New Contributor III
  • 4587 Views
  • 2 replies
  • 2 kudos

How to read a file in pandas in a Databricks environment?

Hi, when I was trying to read a CSV file using pandas I got the error mentioned below. df=pd.read_csv("/dbfs/FileStore/tables/badrecord-1.csv") Error: FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/FileStore/tables...

Latest Reply
tinendra
New Contributor III

dbutils.fs.ls("/FileStore/tables/badrecord-1.csv") shows the file is there in that particular location, but I am still getting the same error.
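For anyone hitting the same error, two common ways to read that file are sketched below; the path is the one from the post, and spark is the session already available in a Databricks notebook. Note that the /dbfs FUSE prefix is not available in every environment, which can produce exactly this FileNotFoundError even when dbutils.fs.ls finds the file.

```python
import pandas as pd

# Option 1: pandas through the /dbfs FUSE mount (the path from the post).
pdf = pd.read_csv("/dbfs/FileStore/tables/badrecord-1.csv")

# Option 2: read with Spark (dbfs:/ path, no /dbfs prefix) and convert;
# `spark` is the session already defined in a Databricks notebook.
pdf = spark.read.csv("dbfs:/FileStore/tables/badrecord-1.csv", header=True).toPandas()
```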

1 More Replies
Sujeeth
by New Contributor III
  • 2178 Views
  • 5 replies
  • 5 kudos

Cluster not getting created

Hi, when I try to create a cluster it just keeps loading and the cluster is not created.

Latest Reply
karthik_p
Esteemed Contributor

@Sujit Tibe Have you configured Databricks in your own VPC, and are all the port-related prerequisites enabled? Another reason can be cloud provider quota issues. If you click on the termination reason, you can see more information.

4 More Replies
labromb
by Contributor
  • 992 Views
  • 0 replies
  • 1 kudos

Capturing notebook return codes in Databricks jobs

Hi, I am currently running a number of notebook jobs from Azure Data Factory. A new requirement has come up where I need to capture, in ADF, a return code generated from the notebook. I tried using dbutils.notebook.exit(json.dumps({"return_v...
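For reference, the pattern being attempted is sketched below; the "return_value" key and the payload are assumptions that complete the truncated snippet above for illustration only.

```python
import json

# Last cell of the Databricks notebook called from ADF.
# "return_value" is an assumed key completing the truncated snippet above.
result = {"return_value": 0, "message": "completed"}  # illustrative payload
dbutils.notebook.exit(json.dumps(result))
```

On the ADF side the exit value then surfaces on the Notebook activity as output.runOutput, e.g. an expression such as @activity('Notebook1').output.runOutput.return_value (activity name illustrative).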

TheKnightCoder
by New Contributor II
  • 1033 Views
  • 1 reply
  • 1 kudos

How to set parameters in the dashboard?

In Databricks SQL, I have a query that has a dropdown parameter. How do I get this in the Databricks dashboard? I see the option of adding filters, but there is nothing for dashboards.

Latest Reply
TheKnightCoder
New Contributor II

To answer my own question, it seems that if I add the visualisation from the query page then the parameters are automatically added. Adding it from the dashboard page seems to be bugged.

Mradul07
by New Contributor II
  • 786 Views
  • 0 replies
  • 1 kudos

Spark behavior while dealing with Actions & Transformations?

Hi, my question is: what happens to the initial RDD after an action is performed on it? Does it disappear, stay in memory, or does it need to be explicitly cached() if we want to use it again? For example, if I execute this in sequence: df_outp...
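A small illustration of the behaviour in question, with illustrative data (the name df_outp just echoes the truncated snippet above): without cache(), every action re-runs the lineage from the source; with cache(), the first action materialises the data and later actions reuse it until it is unpersisted or evicted.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df_outp = spark.range(1_000_000).withColumnRenamed("id", "value")

# Without cache(): each action below re-runs the full lineage from the source.
df_outp.count()
df_outp.filter("value > 10").count()

# With cache(): the first action materialises the data; later actions reuse it
# until the cached copy is evicted or unpersisted.
df_outp.cache()
df_outp.count()                        # computes and caches
df_outp.filter("value > 10").count()   # served from the cached data
df_outp.unpersist()
```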

karthik_p
by Esteemed Contributor
  • 1415 Views
  • 1 reply
  • 6 kudos

Do we have any API or way to find out how we have mounted an S3 bucket in AWS?

Hi Team, we mounted our S3 buckets 2-3 years back and we don't have a maintained record of the config used to mount the S3 buckets on the Databricks instance. Can we get the complete config and keys/profiles that were used to mount the S3 buck...
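For reference, the mount metadata that the workspace still holds can be listed from any notebook; a short sketch (it returns each mount point and its source bucket/prefix, though not the original keys or profiles):

```python
# Lists every mount in the workspace with its source bucket/prefix
# (credentials/instance profiles used at mount time are not returned).
for m in dbutils.fs.mounts():
    print(m.mountPoint, "->", m.source)
```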

Trung
by Contributor
  • 1376 Views
  • 2 replies
  • 1 kudos

Resolved! Databricks best practices for managing resources corresponding to a deleted user

Currently I have a problem with my Databricks workspace: when a user is deleted it causes some issues. Applications or scripts that use the tokens generated by the user will no longer be able to access the Databricks API, and jobs owned by the user wi...
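As a starting point for inventorying what a departed user still owns, below is a hedged sketch against the Jobs REST API (Jobs API 2.1 list endpoint; host, token, and user are placeholders, not taken from the original post):

```python
import requests

HOST = "https://<workspace-url>"          # placeholder
TOKEN = "<admin-personal-access-token>"   # placeholder
DELETED_USER = "user@example.com"         # placeholder

# First page only; production code would paginate with page_token.
resp = requests.get(
    f"{HOST}/api/2.1/jobs/list",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"limit": 25},
)
resp.raise_for_status()
for job in resp.json().get("jobs", []):
    if job.get("creator_user_name") == DELETED_USER:
        print(job["job_id"], job["settings"]["name"])
```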

Latest Reply
Trung
Contributor

@Vivian Wilfred it is really useful for my case, many thanks!

1 More Replies
Nath
by New Contributor II
  • 2139 Views
  • 3 replies
  • 2 kudos

Resolved! Error with multiple FeatureLookup calls outside Databricks

I access the Databricks Feature Store outside Databricks with databricks-connect from my IDE, PyCharm. The problem occurs only outside Databricks, not with a notebook inside Databricks. I use the FeatureLookup mechanism to pull data from Feature Store tables in my cus...
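For context, the FeatureLookup pattern being described has roughly the shape sketched below; table names, keys, and the label column are placeholders, not taken from the original post.

```python
from databricks.feature_store import FeatureLookup, FeatureStoreClient
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
fs = FeatureStoreClient()

# Illustrative label DataFrame; in practice this comes from your own data.
training_df = spark.createDataFrame([(1, 0), (2, 1)], ["customer_id", "label"])

lookups = [
    FeatureLookup(
        table_name="feature_store.customer_features",   # placeholder table
        feature_names=["age", "total_spend"],            # placeholder features
        lookup_key="customer_id",
    ),
    FeatureLookup(
        table_name="feature_store.account_features",     # placeholder table
        feature_names=["account_tenure"],
        lookup_key="customer_id",
    ),
]

training_set = fs.create_training_set(
    df=training_df, feature_lookups=lookups, label="label"
)
features_df = training_set.load_df()
```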

Latest Reply
shan_chandra
Databricks Employee

Also, please refer to the below KB for additional resolution: https://learn.microsoft.com/en-us/azure/databricks/kb/dev-tools/dbconnect-protoserializer-stackoverflow

2 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group