Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

User15787040559
by Databricks Employee
  • 2032 Views
  • 1 reply
  • 1 kudos

What is the equivalent command for constructing the filepath in Databricks on AWS? filepath = f"{working_dir}/keras_checkpoint_weights.ckpt"

dbutils.fs.mkdirs("/foobar/"); see https://docs.databricks.com/data/databricks-file-system.html

Latest Reply
User16752239289
Databricks Employee
  • 1 kudos

To access DBFS via local file APIs, you can try /dbfs/<foobar>. See https://docs.databricks.com/data/databricks-file-system.html#local-file-apis
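Putting the question and the reply together, the local-file-API path is the DBFS path prefixed with `/dbfs`. A minimal sketch of that mapping (the `working_dir` value is a made-up example, and `dbutils.fs.mkdirs` itself only runs on Databricks):

```python
# Sketch: map a DBFS path to the /dbfs fuse-mount path so local file APIs
# (like Keras checkpointing) can write to it. Pure string logic; the
# working_dir value below is a hypothetical example.
def to_local_path(dbfs_path: str) -> str:
    """Convert 'dbfs:/x' or '/x' into the '/dbfs/x' local-file-API path."""
    if dbfs_path.startswith("dbfs:/"):
        dbfs_path = dbfs_path[len("dbfs:"):]
    return "/dbfs" + dbfs_path

working_dir = "dbfs:/tmp/keras_demo"  # hypothetical directory
filepath = f"{to_local_path(working_dir)}/keras_checkpoint_weights.ckpt"
print(filepath)  # /dbfs/tmp/keras_demo/keras_checkpoint_weights.ckpt
```

On a real cluster you would first create the directory with `dbutils.fs.mkdirs(working_dir)` and then pass `filepath` to the local file API.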

caleyfeli85
by New Contributor
  • 1085 Views
  • 0 replies
  • 0 kudos

Databricks-connect: is it safe to store my PAT in plaintext?

I am getting started with databricks-connect to connect to my Azure Databricks cluster using a personal access token. It looks like this token is being stored in a raw text file. If that is right, is there a more r...

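The thread has no replies, but one common mitigation for the concern above is to keep the token out of plaintext files and read it from the environment at run time. A minimal sketch (this is a general pattern, not databricks-connect's own configuration mechanism; the variable name and token value are made up):

```python
# Sketch: read a personal access token from an environment variable instead
# of a plaintext config file. The env-var assignment below is only a stand-in
# so the demo runs; in practice the variable would be set outside the code.
import os

def get_token(env_var: str = "DATABRICKS_TOKEN") -> str:
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set {env_var} instead of storing the PAT in a file")
    return token

os.environ["DATABRICKS_TOKEN"] = "dapiEXAMPLE"  # stand-in value for the demo
print(get_token()[:4])  # dapi
```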
phylialyn47
by New Contributor
  • 1348 Views
  • 0 replies
  • 0 kudos

How can I connect to Databricks via a local IDE?

I want to run some unit tests on my code, but Databricks can't seem to handle running formal unit testing libraries due to the lack of command line. From Googling, it appears it's possible to run notebooks and such from IntelliJ if using Scala, rath...

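While this thread has no replies, a widely used pattern for the unit-testing part of the question is to keep transformation logic in plain, importable functions so a local runner (pytest, unittest) can exercise it without a cluster. A small sketch; `clean_record` is a hypothetical example function, not anything from the original post:

```python
# Sketch: logic extracted from a notebook into a plain function so it can be
# unit-tested locally, with no Spark session or cluster required.
def clean_record(record: dict) -> dict:
    """Normalize keys and drop null values."""
    return {k.strip().lower(): v for k, v in record.items() if v is not None}

def test_clean_record():
    assert clean_record({" Name ": "Ada", "age": None}) == {"name": "Ada"}

test_clean_record()  # with pytest, this would be collected automatically
```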
Anonymous
by Not applicable
  • 19746 Views
  • 3 replies
  • 0 kudos
Latest Reply
User16857281974
Databricks Employee
  • 0 kudos

@Ryan Chynoweth​ and @Sean Owen​ are both right, but I have a different perspective on this. Quick side note: you can also configure your cluster to execute with only a driver, thus reducing the cost to the cheapest single VM available. In the cl...

2 More Replies
Anonymous
by Not applicable
  • 3470 Views
  • 3 replies
  • 0 kudos

Resolved! Cluster Sizing

How big should my cluster be? How do I know how many nodes to use or the kind of instance to use?

Latest Reply
sajith_appukutt
Databricks Employee
  • 0 kudos

> How big should my cluster be?

This would really depend on the use case. Some general guiding principles can be found here: https://docs.databricks.com/clusters/cluster-config-best-practices.html

2 More Replies
Anonymous
by Not applicable
  • 3089 Views
  • 2 replies
  • 0 kudos

Issue loading spark Scala library

We have a proprietary spark scala library, which is necessary for me to do my work. We build a release version once a week and store it in a specific s3 location (so the most up-to-date prod version is always stored in the same place). But so far I c...

Latest Reply
User16857281974
Databricks Employee
  • 0 kudos

Databricks' curriculum team solved this problem by creating our own Maven repo, and it's easier than it sounds. To do this, we took an S3 bucket, converted it to a public website (allowing for standard file downloads), and then within that bucket creat...
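Once such an S3-website repo exists, a build can resolve against it like any other Maven repository. A minimal sbt sketch; the bucket URL, organization, and artifact name are made-up placeholders:

```scala
// build.sbt fragment (sketch): point the build at the S3-website Maven repo.
// All names and the URL below are hypothetical examples.
resolvers += "internal-s3-repo" at
  "https://my-bucket.s3-website-us-east-1.amazonaws.com/maven"

libraryDependencies += "com.example" %% "internal-lib" % "1.0.0"
```

With the weekly release published to a fixed path in the bucket, updating the version string is the only change consumers need.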

1 More Replies
User16844444140
by Databricks Employee
  • 5012 Views
  • 3 replies
  • 0 kudos

Why does the display name of widgets not match the specified name in SQL?

However, I have no problem accessing the widget with the specified name.

Latest Reply
User16844444140
Databricks Employee
  • 0 kudos

Yep, I figured out the issue now. Both of you gave the right information to solve the problem. My first mistake was, as Jacob mentioned, that `date` is actually a dataframe object here. To get the string date, I had to do something similar to what Amine suggested. S...
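The underlying confusion in threads like this is that a widget's display label and the name used to read it are separate things. An illustrative stand-in (not the real dbutils API, just a toy class showing the name-vs-label distinction):

```python
# Toy stand-in (NOT the real dbutils.widgets API) illustrating why a widget's
# display label can differ from the name used to look its value up.
class Widgets:
    def __init__(self):
        self._values, self._labels = {}, {}

    def text(self, name, default, label=None):
        self._values[name] = default
        self._labels[name] = label or name  # the label is only cosmetic

    def get(self, name):
        return self._values[name]  # lookup is always by name, never by label

w = Widgets()
w.text("run_date", "2021-03-18", "Run date")  # "Run date" is what is displayed
print(w.get("run_date"))  # 2021-03-18
```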

2 More Replies
Anonymous
by Not applicable
  • 5026 Views
  • 2 replies
  • 0 kudos

Resolved! Is there a way to validate the values of spark configs?

We can set, for example, spark.conf.set('aaa.test.junk.config', 99999) and then run spark.conf.get('aaa.test.junk.config'), which will return a value. The problem occurs when incorrectly setting a similar matching property: spark.conf.set('spark.sql....

Latest Reply
User16857281974
Databricks Employee
  • 0 kudos

You would solve this just like we solve it for all loose string references: create a constant that represents the key you want to ensure doesn't get mistyped. Naturally, if you type it wrong the first time, it will be...
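A small sketch of the constant pattern the reply describes, plus an optional whitelist check; the key names are real Spark configs, but the `set_conf` helper is a hypothetical wrapper (here over a plain dict standing in for `spark.conf`):

```python
# Sketch: define each config key once so a typo fails loudly at one site,
# and optionally refuse keys outside an agreed-upon whitelist.
SHUFFLE_PARTITIONS = "spark.sql.shuffle.partitions"  # single canonical spelling

KNOWN_KEYS = {SHUFFLE_PARTITIONS, "spark.sql.adaptive.enabled"}

def set_conf(conf, key, value):
    """Set a config value, rejecting unrecognized keys."""
    if key not in KNOWN_KEYS:
        raise KeyError(f"Unrecognized config key: {key!r}")
    conf[key] = str(value)

conf = {}  # stand-in for spark.conf in this sketch
set_conf(conf, SHUFFLE_PARTITIONS, 200)
print(conf[SHUFFLE_PARTITIONS])  # 200
```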

1 More Replies
User16752241457
by Databricks Employee
  • 18108 Views
  • 2 replies
  • 2 kudos

How can I programmatically get my notebook path?

I'm writing some code that trains a ML model using MLflow and a given set of hyperparameters. This code is going to be run by several folks on my team and I want to make sure that the experiment that gets created is created in the same directory as ...

Latest Reply
User16857281974
Databricks Employee
  • 2 kudos

In Scala the call is dbutils.notebook.getContext.notebookPath.get. In Python the call is dbutils.entry_point.getDbutils().notebook().getContext().notebookPath().getOrElse(None). If you need it in another language, a common practice would be to pass it thr...

1 More Replies
User16790091296
by Databricks Employee
  • 1474 Views
  • 1 reply
  • 0 kudos
Latest Reply
Ryan_Chynoweth
Databricks Employee
  • 0 kudos

You have a couple options to write data into a Data Warehouse. Some DWs have special connectors that allow for high performance between Databricks and the DW (for example there is a Spark connector for Snowflake and for Azure Synapse DW). Some data w...
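For the generic JDBC route the reply mentions, the write reduces to a set of connection options passed to `df.write.format("jdbc")`. A sketch; the host, database, table, and credentials below are made-up placeholders:

```python
# Sketch: assemble the options for a generic JDBC write to a data warehouse.
# All connection values are hypothetical examples.
def jdbc_options(host: str, database: str, table: str,
                 user: str, password: str) -> dict:
    return {
        "url": f"jdbc:sqlserver://{host};database={database}",
        "dbtable": table,
        "user": user,
        "password": password,
    }

opts = jdbc_options("dw.example.com", "analytics", "dbo.sales", "etl_user", "***")
# On Databricks (needs a live DataFrame and warehouse, so not runnable here):
# df.write.format("jdbc").options(**opts).mode("append").save()
```

Dedicated connectors (Snowflake, Azure Synapse) follow the same shape but with their own format name and option keys.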

Anonymous
by Not applicable
  • 3368 Views
  • 1 reply
  • 1 kudos

Auto-deletion of unused jobs

Is there a setting that will auto-cleanup/delete jobs that are of a certain age (say 90 days old for example)?

Latest Reply
Ryan_Chynoweth
Databricks Employee
  • 1 kudos

It is not available natively in Databricks, but you can write an administration script that analyzes your jobs data and automatically cleans up older jobs as needed. It would be easiest to do this with the Jobs API. List your jobs to get all the ...
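A sketch of the script the reply outlines. The age filter is pure and testable; the actual Jobs API calls (list and delete, which return `created_time` in epoch milliseconds) are shown in comments since they need a live workspace, and `host`/`auth` are placeholders:

```python
# Sketch: select jobs older than a cutoff for cleanup. Only the filter runs
# here; the commented lines show how it would plug into the Jobs API.
import time

def jobs_older_than(jobs, days, now_ms=None):
    """jobs: iterable of dicts with 'job_id' and 'created_time' (epoch ms)."""
    now_ms = now_ms if now_ms is not None else int(time.time() * 1000)
    cutoff = now_ms - days * 24 * 60 * 60 * 1000
    return [j["job_id"] for j in jobs if j["created_time"] < cutoff]

# jobs = requests.get(f"{host}/api/2.1/jobs/list", headers=auth).json()["jobs"]
# for job_id in jobs_older_than(jobs, days=90):
#     requests.post(f"{host}/api/2.1/jobs/delete", headers=auth,
#                   json={"job_id": job_id})
```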

ZeykUtra
by New Contributor
  • 1160 Views
  • 0 replies
  • 0 kudos

java.io.IOException: While processing file s3://test/abc/request_dt=2021-07-28/someParquetFile. [XYZ] BINARY is not in the store

Hi Team, I am facing the issue "java.io.IOException: While processing file s3://test/abc/request_dt=2021-07-28/someParquetFile. [XYZ] BINARY is not in the store". The things I did before getting the above exception: 1. Alter table tableName1 add colum...

sandip_yadav
by New Contributor
  • 1236 Views
  • 0 replies
  • 0 kudos

Databricks secrets visible in cleartext

I have a requirement where I need a secret when starting a cluster in Databricks, and I found the following way of providing the secret to my init script: https://docs.databricks.com/security/secrets/secrets.html#store-the-path-to-a-secret-in-an-environm...

