Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

bruno_duarte
by New Contributor
  • 4728 Views
  • 5 replies
  • 0 kudos

Resolved! Cluster does not start (Coursera Training Distributed Computing with Spark SQL)

Following the instructions in Week 1 > The Databricks Environment, I am supposed to create a new cluster. However, the cluster does not start and I cannot attach a notebook to it, so I cannot continue the tasks/assignments. Related documents not...

Latest Reply
motuchsznajder
New Contributor II
  • 0 kudos

Hey there, creating a cluster in the Community Edition is showing the same problem as last week. This time I'm not getting any error, because the creation process is taking forever. Any suggestions, @databricks?

4 More Replies
vas610
by New Contributor III
  • 3172 Views
  • 5 replies
  • 0 kudos

Error loading h2o model in mlflow

I'm getting the following error when trying to load an H2O model using MLflow for prediction. Error: Error Job with key $03017f00000132d4ffffffff$_990da74b0db027b33cc49d1d90934149 failed with an exception: java.lang.IllegalArgumentException:...

Latest Reply
Dan_Z
Databricks Employee
  • 0 kudos

I ran this in Databricks and it worked with no issues. I suggest you make sure your wget path is correct, because the one you posted downloads HTML, not the raw csv. That may cause the problem. %sh wget https://raw.githubusercontent.com/mlflow/mlflo...
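A minimal sketch of the check suggested above: before handing a downloaded file to a CSV reader, confirm that it is not an HTML page. The function names and the sample payloads are hypothetical and only illustrate the idea; they are not part of MLflow or H2O.

```python
import csv
import io
from typing import List


def looks_like_html(text: str) -> bool:
    """Return True if the payload starts like an HTML page rather than CSV."""
    head = text.lstrip()[:200].lower()
    return head.startswith("<!doctype html") or head.startswith("<html")


def parse_csv_or_fail(text: str) -> List[List[str]]:
    """Parse CSV text, refusing input that is actually an HTML page."""
    if looks_like_html(text):
        raise ValueError(
            "Downloaded HTML, not CSV; check that the URL points at the raw file"
        )
    return list(csv.reader(io.StringIO(text)))


# Hypothetical payloads standing in for the two kinds of download.
raw_payload = "fixed acidity,quality\n7.0,6\n6.3,5\n"
html_payload = "<!DOCTYPE html>\n<html><head><title>page</title></head></html>"

rows = parse_csv_or_fail(raw_payload)
print(rows[0])
```

Running the same check on the text of a file fetched with wget would surface a wrong URL immediately, instead of failing later inside model loading or scoring.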

4 More Replies
GabrieleMuciacc
by New Contributor III
  • 1063 Views
  • 0 replies
  • 0 kudos

Query table access control metadata from Databricks SQL

I'm trying to create a dashboard in Databricks SQL, parameterized by table name. We have a metadata table which contains the names of all the eligible tables, and we use it to populate a drop-down box for the dashboard. This is a simplified version ...

justinbuo53
by New Contributor
  • 819 Views
  • 0 replies
  • 0 kudos

Azure Databricks, how to learn to use practically?

Not sure whether it is better to ask this in an Azure or Spark subject, but I thought I might get responses appropriate to our use cases here. We have Azure Databricks set up and working, and have not had any problems following along the tutorials, but I don't...

hasinketi48
by New Contributor
  • 748 Views
  • 0 replies
  • 0 kudos

How is Databricks Spark different than Spark?

Hey guys, I am looking to create a real-time analytics application and I am pretty new to data engineering. Any advice here would be appreciated. So I have been looking into Spark Streaming for my transformation process, so the ove...

lawregill92
by New Contributor
  • 888 Views
  • 1 replies
  • 0 kudos

Question About Access and Filter Data in Databricks

Hi guys, I'm new to using Databricks and I have a challenge in my new job. I need to access one of the databases (the database is on DBFS), the result of some ETLs, through any service; it can be ODBC or some API. I need to connect there because I...

Latest Reply
Dan_Z
Databricks Employee
  • 0 kudos

Use the Simba ODBC connector: https://docs.databricks.com/integrations/bi/jdbc-odbc-bi.html
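As a rough sketch of what a DSN-less connection through that driver can look like, the helper below assembles a connection string. The helper name, host, HTTP path, and token are hypothetical placeholders, and the key names (`Host`, `HTTPPath`, `ThriftTransport`, `AuthMech`, and so on) are assumptions based on the driver's documented settings; check them against the linked page for your driver version.

```python
def build_odbc_connection_string(
    host: str,
    http_path: str,
    token: str,
    driver: str = "Simba Spark ODBC Driver",  # assumed installed driver name
) -> str:
    """Assemble a DSN-less ODBC connection string for a Databricks cluster."""
    parts = {
        "Driver": driver,
        "Host": host,
        "Port": "443",
        "HTTPPath": http_path,
        "SSL": "1",
        "ThriftTransport": "2",  # HTTP transport
        "AuthMech": "3",         # user name and password authentication
        "UID": "token",          # literal user name for token authentication
        "PWD": token,            # personal access token
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


# Hypothetical workspace values, for illustration only.
conn_str = build_odbc_connection_string(
    host="example-workspace.cloud.databricks.com",
    http_path="sql/protocolv1/o/0/0000-000000-example",
    token="dapi-EXAMPLE",
)
print(conn_str)
```

With the driver installed, a string like this can be passed to an ODBC client, for example `pyodbc.connect(conn_str, autocommit=True)`.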

Skier
by New Contributor
  • 3427 Views
  • 1 replies
  • 1 kudos

Multiple Clusters stuck in pending state during creation.

I have been trying to create a new cluster to use and multiple attempts have gotten stuck in pending: "Finding instances for new nodes, acquiring more instances if necessary" until they time out. Up to today I have had no problems creating clusters ...

Latest Reply
Dan_Z
Databricks Employee
  • 1 kudos

This is typically a cloud provider issue. You can file a support ticket if the issue persists.

maffeenAF
by New Contributor
  • 1297 Views
  • 1 replies
  • 0 kudos

How do I make approxSimilarityJoin work on 25k 300-d vectors?

I’m trying to use LSH approxSimilarityJoin on a dataset with ~25k 300-d vectors of floats. It gets stuck and eventually fails with ’Slave lost’ error. The size of cluster and memory are likely not a problem, the failure happens even with 16 nodes, 1...

Latest Reply
Dan_Z
Databricks Employee
  • 0 kudos

Use a PandasUDF with Arrow enabled. They are improved in Spark 3, but you can use them in Spark 2.4.5.

davidmory38
by New Contributor
  • 1987 Views
  • 1 replies
  • 0 kudos

Best Database for facial recognition/ Fast comparisons of Euclidean distance

Hello people, I'm trying to build a facial recognition application, and I have a working API that takes in an image of a face and spits out a vector that encodes it. I need to run this on a million faces, store them in a db and when the system goes o...

Latest Reply
Dan_Z
Databricks Employee
  • 0 kudos

You could do this with Spark storing in parquet/Delta. For each face you would write out a record with a column for metadata, a column for the encoded vector array, and other columns for hashing. You could use a PandasUDF to do the distributed dista...
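To make the comparison step concrete, here is a single-machine sketch in plain Python of the Euclidean distance search that a distributed UDF would apply to each batch of stored vectors. It is not Spark code; the function name and the tiny three-dimensional vectors are hypothetical stand-ins for real face encodings.

```python
import math
from typing import Dict, List, Tuple


def nearest_face(
    query: List[float], stored: Dict[str, List[float]]
) -> Tuple[str, float]:
    """Return the id and Euclidean distance of the stored vector closest to query."""
    if not stored:
        raise ValueError("no stored vectors to compare against")
    best_id, best_dist = "", math.inf
    for face_id, vector in stored.items():
        dist = math.dist(query, vector)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_id, best_dist = face_id, dist
    return best_id, best_dist


# Hypothetical encodings; real face vectors typically have far more dimensions.
stored_faces = {
    "alice": [0.0, 0.0, 1.0],
    "bob": [1.0, 1.0, 0.0],
    "carol": [0.5, 0.5, 0.5],
}

match_id, distance = nearest_face([0.1, 0.1, 1.0], stored_faces)
print(match_id, round(distance, 4))
```

At a million faces this brute-force scan is the part worth distributing or narrowing with the hashing columns mentioned above, so that each query only compares against a small candidate set.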

austiamel47
by New Contributor
  • 977 Views
  • 1 replies
  • 0 kudos

Databricks delta lake

Can we use Databricks Delta Lake as a kind of data warehouse where business analysts can explore data according to their needs? Delta Lake provides the following features, which I think support this idea: support for SQL syntax; provide ACID guarante...

Latest Reply
Dan_Z
Databricks Employee
  • 0 kudos

@austiamel47, Yes, you can certainly do this. Delta Lake is designed to be competitive with traditional data warehouses and with some tuning can power low-latency dashboards. https://databricks.com/glossary/data-lakehouse

Ryan_Chynoweth
by Esteemed Contributor
  • 1860 Views
  • 1 replies
  • 2 kudos
Latest Reply
Ryan_Chynoweth
Esteemed Contributor
  • 2 kudos

Yes. A new workspace would need to be deployed, because Azure lets you change the VNet CIDR but requires you to remove all the VNet resources first. This includes the Databricks deployment, so this is an Azure restriction on how VNE...

wallystart
by New Contributor III
  • 1352 Views
  • 0 replies
  • 1 kudos

Is it possible to use Jupyter extensions in Databricks?

Hi, we need to create an interactive map with the ipyleaflet library, which uses a JupyterLab extension:
jupyter labextension install @jupyter-widgets/jupyterlab-manager jupyter-leaflet
We managed to display it with displayHTML, but we lose the widget events.

jholder
by New Contributor II
  • 1692 Views
  • 2 replies
  • 1 kudos

Cluster Pending

Hello, Relatively new to Databricks and I've been using the Community Edition for a little bit now. I've recently been having more and more issues with my clusters pending until they time out before ever starting up. I've seen a few other posts here...

Latest Reply
GustavoRocha
New Contributor III
  • 1 kudos

It seems that it's working now. At least for me...

1 More Replies
