Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

User16752239289
by Databricks Employee
  • 3021 Views
  • 1 reply
  • 2 kudos

Resolved! EC2 instances are not stopped after the cluster is terminated

I found that Databricks did not stop and delete the EC2 instances of clusters. After the cluster terminates, the EC2 instances are still running.

Latest Reply
User16752239289
Databricks Employee
  • 2 kudos

Please make sure the Databricks IAM role has all the required permissions mentioned at https://docs.databricks.com/administration-guide/account-api/iam-role.html. Also make sure you did not change the EC2 tags, especially...

Vu_QuangNguyen
by New Contributor
  • 2850 Views
  • 0 replies
  • 0 kudos

Structured streaming from an overwrite delta path

Hi experts, I need to ingest data from an existing delta path into my own delta lake. The dataflow is as shown in the diagram: the data team reads a full snapshot of a database table and overwrites it to a delta path. This is done many times per day, but...
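There was no reply posted here, but a common pattern for this situation is to stream from the overwritten path with the option that tolerates rewrite commits. A minimal sketch, assuming hypothetical source and target paths and a Databricks/Delta runtime (not runnable standalone); on newer runtimes `skipChangeCommits` plays a similar role, and downstream logic must tolerate duplicate rows:

```python
# Sketch (assumed paths): stream from a Delta path that an upstream job
# periodically overwrites. "ignoreChanges" tells the stream not to fail
# on the overwrite commits; rewritten rows may be re-delivered.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

stream = (
    spark.readStream.format("delta")
    .option("ignoreChanges", "true")   # tolerate upstream overwrites
    .load("/mnt/source/delta_path")    # hypothetical source path
)

(
    stream.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/target/_checkpoint")  # hypothetical
    .outputMode("append")
    .start("/mnt/target/delta_lake")
)
```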

RajuNagarajan
by New Contributor
  • 839 Views
  • 0 replies
  • 0 kudos

GroupBy in a multi node environment

I have a group of rows with information on nested product calls. Example: Trxn1-product1-caller1-local1, Trxn1-Product1-local1-local2, Trxn1-Product1-local2-local3. Here are the expected calls for a product: product1-caller1-local1, Product1-local1-loc...
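There was no reply posted here, but the per-group logic can be sketched in pure Python before porting it to a multi-node environment. This is a sketch under an assumed column layout (transaction, product, source, target); in Spark the same `chain` function would run once per group, e.g. via `groupBy("trxn", "product")` with `applyInPandas` or after a `collect_list`:

```python
# Pure-Python sketch of the per-group chain reconstruction.
from collections import defaultdict

rows = [
    ("Trxn1", "Product1", "caller1", "local1"),
    ("Trxn1", "Product1", "local1", "local2"),
    ("Trxn1", "Product1", "local2", "local3"),
]

def chain(hops):
    """Order (source, target) hops into a single call chain."""
    nxt = dict(hops)                        # source -> target
    start = (set(nxt) - set(nxt.values())).pop()  # start node is never a target
    path = [start]
    while path[-1] in nxt:
        path.append(nxt[path[-1]])
    return path

groups = defaultdict(list)
for trxn, product, src, dst in rows:
    groups[(trxn, product)].append((src, dst))

for key, hops in groups.items():
    print(key, "->", chain(hops))
# ('Trxn1', 'Product1') -> ['caller1', 'local1', 'local2', 'local3']
```

The sketch assumes each group forms a single linear chain; branching call trees would need a different traversal.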

User15787040559
by Databricks Employee
  • 1338 Views
  • 1 reply
  • 1 kudos

What is the equivalent command for constructing the filepath in Databricks on AWS? filepath = f"{working_dir}/keras_checkpoint_weights.ckpt"

dbutils.fs.mkdirs("/foobar/"). See https://docs.databricks.com/data/databricks-file-system.html

Latest Reply
User16752239289
Databricks Employee
  • 1 kudos

To access DBFS via local file APIs, you can try /dbfs/<foobar>. See https://docs.databricks.com/data/databricks-file-system.html#local-file-apis
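The mapping between the two path styles can be sketched as a small helper. This is an illustrative function (not part of any Databricks API): a `dbfs:/` URI used with dbutils/Spark corresponds to a `/dbfs/...` path for local (POSIX) file APIs such as `open()` on the driver:

```python
# Sketch: map a dbfs:/ URI (or bare /path) to its /dbfs mount equivalent.
def dbfs_to_local(path: str) -> str:
    """Return the local-file-API path for a DBFS location."""
    if path.startswith("dbfs:/"):
        path = path[len("dbfs:"):]          # "dbfs:/foo" -> "/foo"
    if not path.startswith("/dbfs/"):
        path = "/dbfs" + path               # "/foo" -> "/dbfs/foo"
    return path

print(dbfs_to_local("dbfs:/foobar/keras_checkpoint_weights.ckpt"))
# /dbfs/foobar/keras_checkpoint_weights.ckpt
```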

phylialyn47
by New Contributor
  • 893 Views
  • 0 replies
  • 0 kudos

How can I connect to Databricks via a local IDE?

I want to run some unit tests on my code, but Databricks can't seem to handle running formal unit testing libraries due to the lack of a command line. From Googling, it appears it's possible to run notebooks and such from IntelliJ if using Scala, rath...
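There was no reply posted here, but one option today is Databricks Connect, which lets a local IDE's test runner execute Spark code against a remote cluster. A minimal sketch, assuming `databricks-connect` is installed and authentication is configured via a Databricks config profile (not runnable without a workspace):

```python
# Sketch: run a local pytest-style unit test against a remote cluster
# via Databricks Connect (assumes prior configuration/auth).
from databricks.connect import DatabricksSession

spark = DatabricksSession.builder.getOrCreate()

def test_row_count():
    df = spark.range(10)        # executes on the remote cluster
    assert df.count() == 10
```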

Anonymous
by Not applicable
  • 13234 Views
  • 3 replies
  • 0 kudos
Latest Reply
User16857281974
Contributor
  • 0 kudos

@Ryan Chynoweth and @Sean Owen are both right, but I have a different perspective on this. Quick side note: you can also configure your cluster to execute with only a driver, thus reducing the cost to the cheapest single VM available. In the cl...

2 More Replies
Anonymous
by Not applicable
  • 2156 Views
  • 3 replies
  • 0 kudos

Resolved! Cluster Sizing

How big should my cluster be? How do I know how many nodes to use or the kind of instance to use?

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

> How big should my cluster be?

This would really depend on the use case. Some general guiding principles can be found here: https://docs.databricks.com/clusters/cluster-config-best-practices.html

2 More Replies
Anonymous
by Not applicable
  • 2072 Views
  • 2 replies
  • 0 kudos

Issue loading spark Scala library

We have a proprietary Spark Scala library, which is necessary for me to do my work. We build a release version once a week and store it in a specific S3 location (so the most up-to-date prod version is always stored in the same place). But so far I c...

Latest Reply
User16857281974
Contributor
  • 0 kudos

Databricks' curriculum team solved this problem by creating our own Maven repo, and it's easier than it sounds. To do this, we took an S3 bucket, converted it to a public website allowing for standard file downloads, and then within that bucket creat...

1 More Replies
User16844444140
by New Contributor II
  • 3384 Views
  • 3 replies
  • 0 kudos

Why does the display name of widgets not match the specified name in SQL?

However, I have no problem accessing the widget with the specified name.

Latest Reply
User16844444140
New Contributor II
  • 0 kudos

Yep, I figured out the issue now. Both of you gave the right information to solve the problem. My first mistake was, as Jacob mentioned, that `date` is actually a dataframe object here. To get the string date, I had to do something similar to what Amine suggested. S...

2 More Replies
Anonymous
by Not applicable
  • 3251 Views
  • 2 replies
  • 0 kudos

Resolved! Is there a way to validate the values of spark configs?

We can set, for example, spark.conf.set('aaa.test.junk.config', 99999) and then run spark.conf.get('aaa.test.junk.config'), which will return a value. The problem occurs when incorrectly setting a similar-looking property: spark.conf.set('spark.sql....

Latest Reply
User16857281974
Contributor
  • 0 kudos

You would solve this just like we solve this problem for all loose string references. Namely, create a constant that represents the key you want to ensure doesn't get mistyped. Naturally, if you type it wrong the first time, it will be...
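The constant-instead-of-loose-string pattern can be sketched in pure Python. The `FakeConf` class below is a stand-in for `spark.conf` so the sketch is runnable without Spark; the point is that a typo in the constant's name fails loudly at the reference site instead of silently setting an ignored config:

```python
# Define the key once, reuse it everywhere.
ADAPTIVE_ENABLED = "spark.sql.adaptive.enabled"

class FakeConf:
    """Stand-in for spark.conf so the pattern is runnable without Spark."""
    def __init__(self):
        self._store = {}
    def set(self, key, value):
        self._store[key] = value
    def get(self, key):
        return self._store[key]

conf = FakeConf()
conf.set(ADAPTIVE_ENABLED, "true")
assert conf.get(ADAPTIVE_ENABLED) == "true"
# conf.get(ADAPTIVE_ENABLD)  # would raise NameError at dev time,
#                            # instead of silently returning nothing
```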

1 More Replies
User16752241457
by New Contributor II
  • 13568 Views
  • 2 replies
  • 2 kudos

How can I programmatically get my notebook path?

I'm writing some code that trains an ML model using MLflow and a given set of hyperparameters. This code is going to be run by several folks on my team, and I want to make sure that the experiment that gets created is created in the same directory as ...

Latest Reply
User16857281974
Contributor
  • 2 kudos

In Scala the call is dbutils.notebook.getContext.notebookPath.get. In Python the call is dbutils.entry_point.getDbutils().notebook().getContext().notebookPath().getOrElse(None). If you need it in another language, a common practice would be to pass it thr...

1 More Replies
User16790091296
by Contributor II
  • 977 Views
  • 1 reply
  • 0 kudos
Latest Reply
Ryan_Chynoweth
Esteemed Contributor
  • 0 kudos

You have a couple options to write data into a Data Warehouse. Some DWs have special connectors that allow for high performance between Databricks and the DW (for example there is a Spark connector for Snowflake and for Azure Synapse DW). Some data w...
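The generic fallback for warehouses without a dedicated connector is Spark's JDBC sink; the dedicated connectors follow the same `df.write.format(...)` shape with their own options. A sketch with hypothetical connection details, assuming `df` is the DataFrame to persist and a Databricks runtime (not runnable standalone):

```python
# Generic JDBC write sketch (hypothetical host, table, and credentials).
(
    df.write.format("jdbc")
    .option("url", "jdbc:postgresql://dw-host:5432/analytics")  # hypothetical
    .option("dbtable", "public.sales_fact")                     # hypothetical
    .option("user", "etl_user")
    .option("password", dbutils.secrets.get("dw", "password"))  # keep secrets out of code
    .mode("append")
    .save()
)
```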

