Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

109005
by New Contributor III
  • 2396 Views
  • 5 replies
  • 5 kudos

Not able to install geomesa on my Databricks cluster

Hi team, I have been attempting to install the GeoMesa (2.12:3.4.1) library on my cluster, but it keeps failing with the error below: Library installation attempted on the driver node of cluster 0824-052900-76icyj32 and failed. Please refer to the following err...

Latest Reply
Prabakar
Esteemed Contributor III
  • 5 kudos

Hi @Ayushi Pandey, I can see the package is available in the Maven repo: https://mvnrepository.com/artifact/org.locationtech.geomesa/geomesa_2.12/3.4.1 Have you tried downloading the package to a DBFS location and installing it on the cluster?

  • 5 kudos
4 More Replies
Ank
by New Contributor II
  • 1079 Views
  • 1 replies
  • 2 kudos

Why am I getting a FileNotFoundError after providing the file path?

I used "Copy file path" to get the path of the notebook I am trying to run from another notebook.
file_path = "/Users/ankur.lohiya@workday.com/PAS/Training/Ingest/TrainingQueries-Cloned.py/"
ddi = DatabricksDataIngestion(file_path=file_path,        ...

Latest Reply
Vidula
Honored Contributor
  • 2 kudos

Hello @Ankur Lohiya, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Th...

  • 2 kudos
umarkhan
by New Contributor II
  • 3188 Views
  • 2 replies
  • 1 kudos

Driver context not found for python spark for spark_submit_task using Jobs API submit run endpoint

I am trying to run a multi-file Python job in Databricks without using notebooks. I have tried setting this up by: (1) creating a Docker image using DBRT 10.4 LTS as a base and adding the zipped Python application to it; (2) making a call to the run submit...

Latest Reply
Vidula
Honored Contributor
  • 1 kudos

Hi @Umar Khan, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

  • 1 kudos
1 More Replies
data_boy_2022
by New Contributor III
  • 2437 Views
  • 2 replies
  • 1 kudos

Resolved! What are the options to offer a low latency API for small tables derived from big tables?

I have a big dataset which gets divided into smaller datasets. For some of these smaller datasets I'd like to offer a low latency API (*** ms) to query them. Big dataset: 1B entries. Smaller dataset: 1 million entries. What's the best way to do it? I thought ab...

Latest Reply
Vidula
Honored Contributor
  • 1 kudos

Hi @Jan R, does @Tian Tan's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

  • 1 kudos
1 More Replies
data_boy_2022
by New Contributor III
  • 2458 Views
  • 2 replies
  • 0 kudos

Resolved! Writing transformed DataFrame to a persistent table is unbearably slow

I want to transform a DF with a simple UDF. Afterwards I want to store the resulting DF in a new table (see code below).
key = "test_key"   schema = StructType([ StructField("***", StringType(), True), StructField("yyy", StringType(), True), StructF...

Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hello @Jan R, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

  • 0 kudos
1 More Replies
komplex
by New Contributor
  • 992 Views
  • 2 replies
  • 1 kudos

I need help finding the right mode for my course

How do I find the Databricks Community Edition?

Latest Reply
Vidula
Honored Contributor
  • 1 kudos

Hi @Kester Truman, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Than...

  • 1 kudos
1 More Replies
Jessevds
by New Contributor II
  • 2453 Views
  • 2 replies
  • 2 kudos

Create dropdown-list in Markdown

In the first cell of my notebooks, I record a changelog in Markdown for all changes made to the notebook. However, as this list becomes longer and longer, I want to implement a dropdown list. Is there any way to do this in Markdown in Databricks? For t...

Latest Reply
Vidula
Honored Contributor
  • 2 kudos

Hi @Jesse vd S, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

  • 2 kudos
1 More Replies
mghildiy
by New Contributor
  • 1056 Views
  • 1 replies
  • 0 kudos

A basic DataFrame transformation query

I want to know how DataFrame transformations work. Suppose I have a DataFrame instance df1. I apply some operation on it, say a filter. As every operation gives a new DataFrame, let's say we now have df2. So we have two DataFrame instances now, df1 ...

Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hi @mghildiy, does @Kaniz Fatma's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

  • 0 kudos
Erik
by Valued Contributor II
  • 1745 Views
  • 2 replies
  • 2 kudos

Resolved! Where is Databricks Tunnel (and is Databricks connect cool again?)

Two related questions: 1: There have been several mentions in this forum of "Databricks Tunnel", which should allow us to connect from our local IDE to a remote Databricks cluster and develop locally. The rumors said early 2022; is there some...

Latest Reply
Vidula
Honored Contributor
  • 2 kudos

Hi there @Erik Parmann, does @Youssef Mrini's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks

  • 2 kudos
1 More Replies
dimsh
by Contributor
  • 1426 Views
  • 3 replies
  • 1 kudos

Any plans to provide Databricks SQL / Alerts API

Hi, Databricks! You are my favorite Big Data tool, but I've recently faced an issue I didn't expect to have. For our agriculture customers, we're trying to use Databricks SQL Platform to keep our data accurate all day. We use Alerts to validate our d...

Latest Reply
Vidula
Honored Contributor
  • 1 kudos

Hi @Dmytro Imshenetskyi, does @Hubert Dudek's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

  • 1 kudos
2 More Replies
Cosimo_F_
by Contributor
  • 1246 Views
  • 3 replies
  • 0 kudos

Autoloader schema inference

Hello, is it possible to turn off schema inference with Auto Loader? Thank you, Cosimo

2 More Replies
satishnamu
by New Contributor II
  • 986 Views
  • 1 replies
  • 0 kudos

Cannot sign in at databricks partner-academy portal

Hi there, I used my company email to register an account for customer-academy.databricks.com a while back. Now I need to create an account with partner-academy.databricks.com using my company email too. However, when I register at partner-...

ronaldolopes
by New Contributor
  • 2378 Views
  • 2 replies
  • 1 kudos

Resolved! Error deleting a table

I'm trying to delete a table that was created from a CSV file. Because the file has since been deleted, the table deletion fails with the following error: I'm new to Databricks and I don't know how to fix this. Can anyone help?

Screenshot from 2022-08-29 14-35-04
Latest Reply
AmanSehgal
Honored Contributor III
  • 1 kudos

To delete the table, it looks for the underlying Delta log file, and because the file doesn't exist, it throws that error. Just drop the table: DROP TABLE <table_name>
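
The same statement can be issued from a Python notebook cell. The following is a minimal sketch, assuming a `spark` session object like the one Databricks notebooks provide; the helper name `drop_table` and the table name `my_table` are hypothetical, and `IF EXISTS` is an addition here so that the call does not fail when the table is already gone.

```python
def drop_table(spark, table_name: str) -> str:
    """Drop a table through Spark SQL and return the statement that was run.

    Hypothetical helper for illustration; `spark` is assumed to be a
    SparkSession (or any object exposing a compatible `sql` method).
    """
    if not table_name.strip():
        raise ValueError("table_name must not be empty")
    # IF EXISTS is an assumption added for safety; the reply uses a plain DROP TABLE.
    statement = f"DROP TABLE IF EXISTS {table_name.strip()}"
    spark.sql(statement)
    return statement
```

In a notebook this would be called as `drop_table(spark, "my_table")`.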

  • 1 kudos
1 More Replies
RohitKulkarni
by Contributor II
  • 2033 Views
  • 4 replies
  • 7 kudos

Resolved! Azure Databricks Delta tables issue

Hello Team, I have written a Spark SQL query in Databricks: DROP TABLE IF EXISTS Salesforce.Location; CREATE EXTERNAL TABLE Salesforce.Location (Id STRING, OwnerId STRING, IsDeleted bigint, Name STRING, CurrencyIsoCode STRING, CreatedDate bigint, CreatedById ...

Latest Reply
AmanSehgal
Honored Contributor III
  • 7 kudos

You need to provide one of the following values for 'data_source': TEXT, AVRO, CSV, JSON, PARQUET, ORC, DELTA. E.g.: USING PARQUET. If you skip the USING clause, the default data source is DELTA. https://docs.databricks.com/sql/language-manual/sql-ref-syntax-ddl-create-t...
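
To make the reply concrete, here is a small sketch that assembles a `CREATE TABLE ... USING` statement and rejects a data source that is not among those named in the reply. The function name `create_table_sql` and the sample table and columns are hypothetical; only the seven data source names and the DELTA default come from the reply. It builds a string and does not talk to a cluster.

```python
# Data sources named in the reply above.
SUPPORTED_SOURCES = ("TEXT", "AVRO", "CSV", "JSON", "PARQUET", "ORC", "DELTA")


def create_table_sql(table, columns, data_source=None):
    """Build a CREATE TABLE statement with an optional USING clause.

    Hypothetical helper for illustration. `columns` is a list of
    (name, type) pairs.
    """
    column_list = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    statement = f"CREATE TABLE {table} ({column_list})"
    if data_source is None:
        # No USING clause: per the reply, the default data source is DELTA.
        return statement
    source = data_source.upper()
    if source not in SUPPORTED_SOURCES:
        raise ValueError(f"Unsupported data source: {data_source}")
    return f"{statement} USING {source}"
```

For example, `create_table_sql("Salesforce.Location", [("Id", "STRING")], "parquet")` yields a statement ending in `USING PARQUET`, which can then be passed to `spark.sql`.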

  • 7 kudos
3 More Replies
LearnerShahid
by New Contributor II
  • 4964 Views
  • 6 replies
  • 4 kudos

Resolved! Lesson 6.1 of Data Engineering. Error when reading stream - java.lang.UnsupportedOperationException: com.databricks.backend.daemon.data.client.DBFSV1.resolvePathOnPhysicalStorage(path: Path)

Below function executes fine: def autoload_to_table(data_source, source_format, table_name, checkpoint_directory):  query = (spark.readStream         .format("cloudFiles")         .option("cloudFiles.format", source_format)         .option("cloudFile...

I have verified that source data exists.
Latest Reply
Anonymous
Not applicable
  • 4 kudos

Autoloader is not supported on community edition.

  • 4 kudos
5 More Replies
