Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Megan05
by New Contributor III
  • 3795 Views
  • 4 replies
  • 1 kudos

Trying to write to S3 bucket but executed code not showing any progress

I am trying to write data from Databricks to an S3 bucket, but when I submit the code, it runs and runs and does not make any progress. I am not getting any errors and the logs don't seem to recognize I've submitted anything. The cluster also looks un...

Latest Reply
User16753725469
Contributor II
  • 1 kudos

Can you please check the driver log4j to see what is happening?

3 More Replies
Graham
by New Contributor III
  • 5387 Views
  • 4 replies
  • 8 kudos

Resolved! [Databricks SQL] Commented Escape Character (\) Causes Unexpected Behavior

The Problem: I've observed erratic behavior when I add a comment containing a trailing escape character (\) to a CREATE TABLE statement. For example, this query returns data (though it shouldn't): CREATE TABLE example_table SELECT 1 -- This comment has ...

Latest Reply
BilalAslamDbrx
Databricks Employee
  • 8 kudos

@Graham Carman​ we're tracking this as a defect / issue on our side. For now, please don't include the escape character in comments.

3 More Replies
sh23
by New Contributor II
  • 2262 Views
  • 1 reply
  • 1 kudos

Need help loading 11 TB of data into a Spark DataFrame using managed Databricks on GCP.

I am using managed Databricks on GCP. I have 11 TB of data with 5B rows. The source data is not partitioned. I'm having trouble loading the data into a DataFrame and doing further processing. I have tried a couple of executor configurations, none of t...

Athar
by New Contributor
  • 2354 Views
  • 3 replies
  • 3 kudos

How to import a blob storage container with sub-directories as a database in Databricks SQL?

I am trying to upload blob storage to a Databricks SQL warehouse. I followed this document, https://docs.databricks.com/data/data-sources/azure/azure-storage.html, but it doesn't seem to work. The query executed fine, but the created schema was empty. An...

Latest Reply
BilalAslamDbrx
Databricks Employee
  • 3 kudos

@Athar Abbas​ the simplest thing would be to create a SAS token to the ADLS Gen 2 container and then use the COPY INTO command with the AZURE_SAS_TOKEN credential: https://docs.microsoft.com/en-us/azure/databricks/data/data-sources/azure/adls-gen2/az...

2 More Replies
chris_kimmel
by New Contributor II
  • 1065 Views
  • 0 replies
  • 2 kudos

Bug report: Switching branches duplicates cells

I'm using Databricks' support for GitHub repos. When I switch from one branch to another while a notebook is open, it messes up my notebook. Specifically, every notebook cell appears twice after switching branches.

Rahul_Samant
by Contributor
  • 14681 Views
  • 9 replies
  • 2 kudos

SSL Error While Setting Up Databricks CLI or Installing Library

How do I fix the SSL error below when setting up the Databricks CLI or installing a library on a cluster? Library installation attempted on the driver node of cluster *** and failed. Please refer to the following error message to fix the library or contact Databric...

Latest Reply
Megan05
New Contributor III
  • 2 kudos

I was getting an SSL error when trying to set up secrets using the Databricks CLI. To fix the CLI SSL error I went to %USERPROFILE%\.databrickscfg (~/.databrickscfg on Unix, Linux, or macOS) from the file explorer on my local machine and added the insecure...

8 More Replies
pawelmitrus
by Contributor
  • 1760 Views
  • 1 reply
  • 1 kudos

Resolved! Shutting down a job cluster, when streaming is over

Hi, As of now, we already know that our application will be running 24/7, streaming constantly incoming data. The stream pipeline is very basic; however, for now it's enough to run this pipeline once per day (to save the costs of constantly running clu...

Latest Reply
Shasidhar_ES
Databricks Employee
  • 1 kudos

Use the .trigger(once=True) or .trigger(availableNow=True) option, which will pick up only the new files: https://docs.databricks.com/structured-streaming/triggers.html#configuring-incremental-batch-processing
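
A minimal sketch of the suggestion in this reply, assuming the caller already has a PySpark streaming DataFrame. The function name `run_incremental_batch`, the Delta sink, and both path arguments are hypothetical and chosen only for illustration; the reply itself only names the `.trigger(...)` option.

```python
def run_incremental_batch(stream_df, output_path, checkpoint_path):
    """Process the input that is new since the last checkpoint, then stop.

    stream_df is assumed to be a PySpark streaming DataFrame
    (for example, the result of spark.readStream...load(...)).
    """
    query = (
        stream_df.writeStream
        .format("delta")                                # assumed sink format
        .option("checkpointLocation", checkpoint_path)  # tracks what was already processed
        .trigger(availableNow=True)                     # or .trigger(once=True)
        .start(output_path)
    )
    # Block until the available data is processed; the query then terminates
    # on its own, so a scheduled job can finish instead of running 24/7.
    query.awaitTermination()
    return query
```

Because the query ends once the backlog is consumed, a job scheduled once per day would let a job cluster shut down after each run, which appears to be what the original question is after.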

155647
by New Contributor II
  • 1622 Views
  • 3 replies
  • 2 kudos

Databricks unmanaged table from Snowflake

Is there a way to create a Databricks unmanaged table that's actually a Snowflake table, not some S3 or DBFS location? The documentation is rather vague about whether this is possible: "You can create an unmanaged table with your data in data sources such as Cassandra...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hey @Stefan Stojanovic, hope everything is going great! Does @Hubert Dudek's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? Else please let us know if you need mo...

2 More Replies
NathanLaw
by New Contributor III
  • 5284 Views
  • 5 replies
  • 1 kudos

Model Training Data Adapter Error.

We are converting a PySpark DataFrame to TensorFlow using Petastorm and have encountered a “data adapter” error. What do you recommend for diagnosing and fixing this error? https://docs.microsoft.com/en-us/azure/databricks/applications/machine-learning/...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hey @Nathan Law, thank you so much for getting back to us. We will await your response. We really appreciate your time.

4 More Replies
bearys
by New Contributor II
  • 3186 Views
  • 1 reply
  • 2 kudos

Illegal character in partition path when attempting REORG ... (PURGE)

I have a large Delta table partitioned by an identifier column that I have now discovered has blank spaces in some of the identifiers, e.g. one partition can be defined by "Identifier=first identifier". Most partitions do not have these blank space...

Latest Reply
bearys
New Contributor II
  • 2 kudos

FYI similar issue with partitions with "%" in the identifier. Used the filter clause of the REORG to exclude partitions with " " or "%" to be able to move forward with my work but will continue looking for a solution. I've never seen any pointers not...
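
The workaround described here (excluding partitions whose identifier contains " " or "%") can be illustrated in plain Python. This is only a sketch of the selection logic, not the REORG filter syntax itself, which the post does not show; the function name `safe_partition_values` and the sample values are made up for illustration.

```python
def safe_partition_values(values, bad_chars=(" ", "%")):
    """Return only the partition values that contain none of bad_chars."""
    return [v for v in values if not any(c in v for c in bad_chars)]


# Hypothetical identifier values; the first and third would be excluded.
partitions = ["first identifier", "second", "50%off", "third"]
print(safe_partition_values(partitions))  # ['second', 'third']
```

The excluded values are the partitions that would still need a separate fix, since the post reports no solution for them yet.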

Dicer
by Valued Contributor
  • 24221 Views
  • 12 replies
  • 13 kudos

Resolved! Failed to convert Spark.sql to Pandas Dataframe using .toPandas()

I wrote the following code: data = spark.sql (" SELECT A_adjClose, AA_adjClose, AAL_adjClose, AAP_adjClose, AAPL_adjClose FROM deltabase.a_30min_delta, deltabase.aa_30min_delta, deltabase.aal_30min_delta, deltabase.aap_30min_delta, deltabase.aapl_30m...

Latest Reply
Dicer
Valued Contributor
  • 13 kudos

I just discovered a solution. Today, I opened Azure Databricks. When I imported Python libraries, Databricks told me that toPandas() was deprecated and suggested that I use toPandas. The following solution works: Use toPandas instead of toPandas() da...

11 More Replies
AlbinLindmark
by New Contributor II
  • 4344 Views
  • 3 replies
  • 3 kudos

Resolved! Git integration for enterprises with a private git server behind VPN

The documentation states that Databricks does not support private Git servers behind a VPN. The forum does, however, state in two places (place1, place2) that enterprise customers can reach out to their 'account team' and request to be added to somethi...

Latest Reply
derft102
New Contributor II
  • 3 kudos

Hey all, what do you think about the post below? I am a little bit confused about it. If someone could help me, it would be appreciated. https://community.databricks.com/s/question/0D53f00001GHVYnCAP/will-databricks-support-selfservice-web-application-fire...

2 More Replies
cchalc
by New Contributor III
  • 12824 Views
  • 2 replies
  • 5 kudos

How to understand what dropDuplicates is doing?

Smashed our heads against this one for a while and though I think it’s more of a Spark question than a Databricks one, wanting to get your thoughts on it. Essentially the gist is this: We select into a DF from a Delta table. We display the DF and see 2 ...

Latest Reply
cchalc
New Contributor III
  • 5 kudos

Great answer @Aman Sehgal. I also received another answer from @Ryan Chynoweth, which I will paste here: 1) Have you seen anything like this before and if so, can you provide any insight on it? Yes, this does happen due to the lazy execution of Spark and due...

1 More Replies
