Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

drewster
by New Contributor III
  • 7872 Views
  • 17 replies
  • 14 kudos

Resolved! Spark streaming autoloader slow second batch - checkpoint issues?

I am running a massive history of about 250 GB (~6 million phone call transcriptions, JSON read in as raw text) through a raw -> bronze pipeline in Azure Databricks using PySpark. The source is mounted storage and is continuously having files added and we do n...

Latest Reply
Brooksjit
New Contributor III
  • 14 kudos

Thank you for the explanation.

  • 14 kudos
16 More Replies
sarveshpandey23
by New Contributor II
  • 1588 Views
  • 5 replies
  • 3 kudos

There is some issue in Column

I have loaded a CSV file from my local PC to Databricks. Now, when I try to query the table by column, it throws an error. Note: select * from table_name works, but select column_name from table_name throws an error.

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Thanks for the information; I will try to figure out more. Please keep sharing informative posts like this.

  • 3 kudos
4 More Replies
Scorpius
by New Contributor II
  • 1203 Views
  • 2 replies
  • 1 kudos

Unregistering / Removing A SparkListener

Hey, so currently I'm using PySpark and utilizing SparkListener to track metrics of my Spark jobs. The problem I can't seem to solve is why in Databricks the listener cannot be removed with removeSparkListener, as it still is attached to the context.#...

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @CJ W, please go through this GitHub link which resolves this patch: https://github.com/apache/spark/pull/16382/commits/347bb0e74e16cf19511ccfa3ed763db71c57aa4f Please let us know if it helps you in any way.

  • 1 kudos
1 More Replies
Raymond_Garcia
by Contributor II
  • 1553 Views
  • 5 replies
  • 3 kudos

Migrating from Databricks Notebooks to IDE for Development

Hello, we are developers who have been creating a system in Databricks with Scala. We enabled the Git feature, so the project is in a repository. The project has a lot of notebooks and a lot of calls to other notebooks. Sometimes it is a little overw...

Latest Reply
Raymond_Garcia
Contributor II
  • 3 kudos

It is true that we can't work without Databricks, but we can develop in an IDE and send the jar to Databricks. This will allow us to create unit tests and use the IDE capabilities (e.g. fast navigation among classes).

  • 3 kudos
4 More Replies
Kody_Devl
by New Contributor II
  • 15249 Views
  • 2 replies
  • 2 kudos

Resolved! Export to Excel xlsx

Hi All, does anyone have some code or an example of how to export my Databricks SQL results directly to an existing spreadsheet? Many thanks, Kody_Devl

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hey there @Ross Crill, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? It would be really helpful for the other members too. We'd love to hear fro...

  • 2 kudos
1 More Replies
Phani1
by Valued Contributor
  • 2984 Views
  • 5 replies
  • 4 kudos

Resolved! Is it possible to have multiple tabs in a Dashboard? If not, is there any workaround?

Is it possible to have multiple tabs in a Dashboard? If not, is there any workaround for this?

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hey @Janga Reddy, hope you are well. Just wanted to see if you were able to find an answer to your question and would you like to mark an answer as best? It would be really helpful for the other members too. Cheers!

  • 4 kudos
4 More Replies
griffinw
by New Contributor III
  • 945 Views
  • 2 replies
  • 1 kudos

Plotly in Python notebook - not seeing all graphs

Hello, I have written code that produces Plotly charts in a Python notebook (by facet) - each output is 4 line charts side-by-side with a few points highlighted. When I run this in a loop to produce more than 7 or 8 of my graphs, some of the features ...

Latest Reply
griffinw
New Contributor III
  • 1 kudos

Unfortunately I'm not able to share, as it is in reference to proprietary company data. Essentially, I can produce any one Plotly graph like: [line graph with scatter] but if I produce, say, 8 of them (looping on a list of inputs), I get something like th...

  • 1 kudos
1 More Replies
irfanijaz
by New Contributor
  • 467 Views
  • 0 replies
  • 0 kudos

Differently named storage accounts in different environments

Hi, I have a solution design question on which I am looking for some help. We have 2 environments in Azure (dev and prod); each env has its own ADLS storage account with a different name, of course. Within Databricks code we are NOT leveraging the mou...
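
One possible way to approach this kind of setup, sketched here under assumptions: the code addresses ADLS Gen2 directly through `abfss://` URIs, and the environment name is passed in from outside (for example a job parameter). The account and container names, `STORAGE_BY_ENV`, and `abfss_uri` are all hypothetical and only illustrate keeping the per-environment difference in a single lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageConfig:
    account: str    # ADLS Gen2 storage account name (hypothetical values below)
    container: str  # container (filesystem) name


# Hypothetical per-environment settings; real names come from your own deployment.
STORAGE_BY_ENV = {
    "dev": StorageConfig(account="examplelakedev", container="data"),
    "prod": StorageConfig(account="examplelakeprod", container="data"),
}


def abfss_uri(env: str, path: str) -> str:
    """Build an abfss:// URI for the given environment and relative path."""
    try:
        cfg = STORAGE_BY_ENV[env]
    except KeyError:
        raise ValueError(f"unknown environment: {env!r}") from None
    # Drop leading slashes so the path joins cleanly after the host part.
    return f"abfss://{cfg.container}@{cfg.account}.dfs.core.windows.net/{path.lstrip('/')}"
```

With this shape, pipeline code would call something like `abfss_uri(env, "bronze/calls")` everywhere, so the storage account name appears in only one place per environment.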

Megan05
by New Contributor III
  • 1941 Views
  • 4 replies
  • 4 kudos

Resolved! Out of Memory/Connection Lost When Writing to External SQL Server from Databricks Using JDBC Connection

I am working on writing a large amount of data from Databricks to an external SQL Server using a JDBC connection. I keep getting timeout errors/connection lost, but digging deeper it appears to be a memory problem. I am wondering what cluster configura...

Latest Reply
hotrabattecom
New Contributor II
  • 4 kudos

Thanks for the answer. I am also running into this problem. Hotrabatt

  • 4 kudos
3 More Replies
Maverick1
by Valued Contributor II
  • 4048 Views
  • 4 replies
  • 8 kudos

Resolved! How to get the list of all jobs available for a particular user?

As of now, if I try to list the jobs via the "list job" API, there is a limit of 25 jobs only. Is there a way to list all the available/visible jobs to a user?

Latest Reply
Kaniz
Community Manager
  • 8 kudos

Hi @Saurabh Verma, we haven't heard from you on the last response from @Arvind Ravish, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community as it can be helpful to other...

  • 8 kudos
3 More Replies
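
A page-size limit like the one described in this thread is usually handled by requesting successive pages until none remain. The following is a minimal sketch of that loop, not the thread's accepted answer. It assumes an offset/limit style list endpoint whose response carries a `jobs` list and a `has_more` flag; `fetch_page` is a hypothetical callable wrapping one HTTP request, so check the Jobs API reference for the exact parameters your workspace supports.

```python
from typing import Callable, Dict, List


def list_all_jobs(fetch_page: Callable[[int, int], Dict], page_size: int = 25) -> List[Dict]:
    """Collect every job by requesting successive pages until none remain.

    fetch_page(offset, limit) is a hypothetical callable that performs one
    list request and returns a dict with "jobs" (a list) and "has_more" (a bool).
    """
    jobs: List[Dict] = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        batch = page.get("jobs", [])
        jobs.extend(batch)
        # Stop when the service reports no further pages or returns an empty page.
        if not page.get("has_more") or not batch:
            break
        offset += len(batch)
    return jobs
```

Passing the request function in as an argument keeps the loop independent of any particular HTTP client and makes it easy to exercise with a fake.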
anmol_deep
by New Contributor III
  • 2428 Views
  • 3 replies
  • 2 kudos

How to restore DatabricksRoot(FileStore) data after Databricks Workspace is decommissioned?

My Azure Databricks workspace was decommissioned. I forgot to copy files stored in the DatabricksRoot storage (dbfs:/FileStore/...). Can the workspace be recommissioned/restored? Is there any way to get my data back? Also, is there any difference betwe...

Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hi @Anmol Deep​, We haven’t heard from you since your last response, and I was checking back to see if you were able to recover your data.

  • 2 kudos
2 More Replies
korilium
by New Contributor III
  • 7891 Views
  • 11 replies
  • 3 kudos

Resolved! Databricks-connect invalid shard address

I want to use Databricks inside VS Code, so I need databricks-connect. I configure my settings using databricks-connect configure as follows: Databricks Host [https://adb-1409757184094616.16.azuredatabricks.net] Databricks Token [<my token>] Cl...

Latest Reply
Justin09
New Contributor II
  • 3 kudos

In case it helps anyone, I ran into this issue and had to remove the trailing / from the host name. It used to work fine with the trailing / so something must have changed.

  • 3 kudos
10 More Replies
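
The trailing-slash fix from the latest reply can be applied defensively before the host value is ever handed to the configuration step. A tiny sketch, where `normalize_host` is a hypothetical helper and not part of databricks-connect:

```python
def normalize_host(host: str) -> str:
    """Return the workspace URL without surrounding whitespace or trailing slashes."""
    return host.strip().rstrip("/")
```

For example, a script that reads the host from an environment variable or a config file could pass it through `normalize_host` first, so a copied URL ending in `/` does not reach the client.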
BeginnerBob
by New Contributor III
  • 1572 Views
  • 4 replies
  • 1 kudos

Loading Dimensions including SCDType2

I have a customer dimension, and for every incremental load I am applying type2 or type1 to the dimension. This dimension is based off a silver table in my delta lake where I am applying a merge statement. What happens if I need to go back and track ad...

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Lloyd Vickery, we haven't heard from you on the last response from @Werner Stinckens, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community as it can be helpful to ot...

  • 1 kudos
3 More Replies