Data Engineering

Forum Posts

Vamsee
by New Contributor II
  • 3757 Views
  • 5 replies
  • 1 kudos
Latest Reply
User16871418122
Contributor III
  • 1 kudos

Hi @Vamsee krishna kanth Arcot​, yes, currently you will have to download the JDBC driver from https://databricks.com/spark/jdbc-drivers-download and connect from other applications with the JDBC URL, just as you mentioned in your example. There is an internal ...
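As an illustration of that JDBC connection from another application, here is a minimal Python sketch using jaydebeapi (one JDBC bridge for Python). The driver class name, URL format, and paths are assumptions and depend on the driver version you download:

```python
# Sketch: query Databricks over JDBC from a Python app via jaydebeapi.
# Driver class, URL format, and paths are assumptions -- check the docs
# bundled with the driver you downloaded from the page above.
import jaydebeapi

conn = jaydebeapi.connect(
    "com.databricks.client.jdbc.Driver",               # class name varies by version
    "jdbc:databricks://<workspace-host>:443/default;"
    "transportMode=http;ssl=1;httpPath=<http-path>;"
    "AuthMech=3;UID=token;PWD=<personal-access-token>",
    jars="/path/to/DatabricksJDBC42.jar",              # the downloaded driver JAR
)
try:
    cur = conn.cursor()
    cur.execute("SELECT 1")
    print(cur.fetchall())
    cur.close()
finally:
    conn.close()
```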

4 More Replies
Jiri_Koutny
by New Contributor III
  • 4071 Views
  • 8 replies
  • 4 kudos

Resolved! Programmatic access to Files in Repos

Hi, we are testing the new Files support in Databricks Repos. Is there a way to programmatically read notebooks? Thanks

Latest Reply
User16871418122
Contributor III
  • 4 kudos

Hi @Jiri Koutny​, these files should be synced to your remote repository anyway (Git, Bitbucket, GitLab, etc.). The APIs from version-control tools (the Git API, for example) might help you achieve what you want. https://stackoverflow.com/questions/38491722/r...
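For example, a hedged sketch of fetching a repo file through GitHub's contents API with plain requests (owner, repo, path, and token are placeholders; GitLab and Bitbucket expose similar endpoints):

```python
# Sketch: read a repo file through GitHub's contents API.
# Owner, repo, path, and token are placeholders.
import base64
import requests

resp = requests.get(
    "https://api.github.com/repos/<owner>/<repo>/contents/notebooks/etl.py",
    headers={"Authorization": "token <personal-access-token>"},
)
resp.raise_for_status()
content = base64.b64decode(resp.json()["content"]).decode("utf-8")  # API base64-encodes files
print(content)
```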

7 More Replies
Anonymous
by Not applicable
  • 495 Views
  • 1 reply
  • 0 kudos

Is there an equivalent of the %debug from Jupyter notebooks in Databricks notebooks for debugging python notebooks?

Latest Reply
Dileep_Vidyadar
New Contributor III
  • 0 kudos

Hi @Nathan Tong​, you can go through the two articles below that I found online for debugging in Databricks: 1. 7 Tips to Debug Apache Spark Code Faster with Databricks; 2. Easier Spark Code Debugging.
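As a rough stand-in for %debug, Python's standard pdb can open a post-mortem session after a failure; a minimal sketch (how well the interactive prompt works depends on the notebook runtime):

```python
# Sketch: post-mortem debugging with the standard library as a %debug stand-in.
import pdb

def buggy(values):
    return sum(values) / len(values)   # raises ZeroDivisionError for []

try:
    buggy([])
except Exception:
    pdb.post_mortem()                  # step through the frames where it failed
```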

ashu208
by New Contributor
  • 1013 Views
  • 4 replies
  • 0 kudos

I am not able to create a cluster

Hi, I am new to the Databricks platform. A few weeks ago I created a Community Edition account and it was working perfectly until two days ago; now I cannot create a cluster anymore - it times out a few minutes after I try to create a new cluster...

Latest Reply
Dileep_Vidyadar
New Contributor III
  • 0 kudos

Hi @Ashwinkumar Jayakumar​ and @Prabakar Ammeappin​, I have been facing the same issue for 3-4 days. Is there something wrong with Community Edition right now, or is my account facing some issue?

3 More Replies
Kaniz
by Community Manager
  • 990 Views
  • 1 reply
  • 1 kudos
Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

Basically all that is needed is to create an API token in Databricks and then use the Jobs API as described here: https://docs.databricks.com/dev-tools/api/latest/jobs.html. The following endpoints are available: POST https://<databricks-instance>/api/2.1/jobs/crea...
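A minimal sketch of calling that first endpoint with such a token, using placeholder workspace and cluster values (field names follow the Jobs API 2.1 documentation linked above):

```python
# Sketch: create a job via POST /api/2.1/jobs/create using a personal access token.
# <databricks-instance>, the cluster id, and the notebook path are placeholders.
import requests

resp = requests.post(
    "https://<databricks-instance>/api/2.1/jobs/create",
    headers={"Authorization": "Bearer <api-token>"},
    json={
        "name": "example-job",
        "tasks": [{
            "task_key": "main",
            "existing_cluster_id": "<cluster-id>",
            "notebook_task": {"notebook_path": "/Repos/me/project/notebook"},
        }],
    },
)
resp.raise_for_status()
print(resp.json())   # contains the new job_id
```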

missyT
by New Contributor III
  • 1533 Views
  • 2 replies
  • 5 kudos

Resolved! How to distinguish arrow-key from escape character with getch in C?

I want to know whether an arrow key or the escape character has been pressed. But in order to check which arrow key has been pressed I need to do multiple blocking getch calls, because the arrow-key sequence is bigger than 1 char. This is a problem when I c...
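The usual fix is a short read timeout after receiving ESC: if more bytes arrive within a few milliseconds, treat them as an escape sequence; otherwise report a bare Escape. Below is a hedged Python analogue of that timeout technique (in C the same idea is termios VMIN/VTIME, or ncurses timeout()/nodelay(); Unix-only, and the key names are illustrative):

```python
# Python analogue of the ESC-timeout technique for telling a bare Escape
# apart from an arrow-key escape sequence. Unix-only sketch.
import os, select, sys, termios, tty

ARROWS = {b"[A": "UP", b"[B": "DOWN", b"[C": "RIGHT", b"[D": "LEFT"}

def read_key(timeout=0.05):
    fd = sys.stdin.fileno()
    old = termios.tcgetattr(fd)
    try:
        tty.setraw(fd)
        ch = os.read(fd, 1)
        if ch != b"\x1b":
            return ch.decode(errors="replace")
        # ESC arrived: wait briefly; arrow keys send ESC '[' 'A'..'D' in one burst.
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            return "ESC"                          # nothing followed: bare Escape
        return ARROWS.get(os.read(fd, 2), "ESC-sequence")
    finally:
        termios.tcsetattr(fd, termios.TCSADRAIN, old)

if __name__ == "__main__":
    print("pressed:", read_key())
```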

Latest Reply
Kaniz
Community Manager
  • 5 kudos

Hi @Missy Trussell​, did @Hubert Dudek​'s reply answer your question? If yes, would you like to mark his reply as the best answer? Thanks.

1 More Replies
User16869510359
by Esteemed Contributor
  • 900 Views
  • 2 replies
  • 0 kudos

Resolved! External metastore version

I am setting up an external metastore to connect to my Databricks cluster. Which is the preferred and recommended Hive metastore version? Also, are there any preferences or recommendations on the database instance size/type?

Latest Reply
prasadvaze
Valued Contributor
  • 0 kudos

@Harikrishnan Kunhumveettil​, we use Databricks Runtime 7.3 LTS and 9.1 LTS, and an external Hive metastore hosted on an Azure SQL DB. Using a global init script I have set spark.sql.hive.metastore.version 2.3.7 and downloaded spark.sql.hive.metastore.jars f...
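For reference, a small snippet for sanity-checking those settings from a notebook once the init script has run, assuming the notebook-provided spark session (key names per the external-metastore docs):

```python
# Sketch: confirm the init-script settings from a Databricks notebook
# (the `spark` session is provided by the notebook runtime).
print(spark.conf.get("spark.sql.hive.metastore.version"))  # expect "2.3.7"
print(spark.conf.get("spark.sql.hive.metastore.jars"))     # path set by the script
```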

1 More Replies
sarvesh
by Contributor III
  • 696 Views
  • 0 replies
  • 0 kudos

Exception in thread "main" org.apache.spark.sql.AnalysisException: Cannot modify the value of a Spark config: spark.executor.memory;

I am trying to read a 16 MB Excel file and was getting a GC overhead limit exceeded error. To resolve that I tried to increase my executor memory with spark.conf.set("spark.executor.memory", "8g"), but I got the following stack: Using Spark's default l...
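The error occurs because spark.executor.memory is fixed when the cluster/session starts and cannot be changed with spark.conf.set at runtime. A hedged sketch of setting it at session build time (on Databricks you would instead put it in the cluster's Spark config):

```python
# Sketch: executor memory must be set before the session starts,
# not mutated later with spark.conf.set.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("excel-read")
    .config("spark.executor.memory", "8g")   # allowed here, at startup
    .getOrCreate()
)
```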

amichel
by New Contributor III
  • 3394 Views
  • 3 replies
  • 2 kudos

Resolved! Is there a way to refresh tokens issued on behalf of service principal?

I want to be able to refresh tokens generated on behalf of a service principal via the Token Management API, just like with any other service where OAuth is used and a refresh-token endpoint is available. Allowing indefinite or very long expiration for acc...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

A refresh option would be useful. In Azure you could use Azure Automation to make a "refresh" script: delete the token if it still exists, create a new one via "databricks tokens create", and put it into Azure Key Vault with an expiration date.
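A hedged sketch of the create step using the Token API directly instead of the CLI, with placeholder values (pushing the result into Key Vault would be a separate step):

```python
# Sketch: mint a short-lived PAT for rotation by an automation job.
# The instance URL and bearer credential are placeholders.
import requests

resp = requests.post(
    "https://<databricks-instance>/api/2.0/token/create",
    headers={"Authorization": "Bearer <service-principal-credential>"},
    json={"lifetime_seconds": 7 * 24 * 3600, "comment": "rotated-by-automation"},
)
resp.raise_for_status()
new_token = resp.json()["token_value"]   # store in Azure Key Vault with its expiry
```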

2 More Replies
AzureDatabricks
by New Contributor III
  • 5747 Views
  • 7 replies
  • 2 kudos

Resolved! Can we store 300 million records, and what is the preferred compute type and config?

How can we persist 300 million records? What is the best option to persist the data: the Databricks Hive metastore, Azure storage, or a Delta table? What limitations do Databricks Delta tables have in terms of data volume? We have a use case where testers should be...

Latest Reply
-werners-
Esteemed Contributor III
  • 2 kudos

You can certainly store 300 million records without any problem. The best option kinda depends on the use case. If you want to do a lot of online querying on the table, I suggest using Delta Lake, which is optimized (using z-order, bloom filter, par...
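To make that concrete, a small sketch of persisting such a table as Delta and Z-ordering it for point queries (table and column names are made up; OPTIMIZE ... ZORDER BY is Databricks SQL, and spark is the notebook-provided session):

```python
# Sketch: persist a large table as Delta, then Z-ORDER it for point lookups.
# Runs in a Databricks notebook; table and column names are made up.
df = spark.range(300_000_000).withColumnRenamed("id", "customer_id")
df.write.format("delta").mode("overwrite").saveAsTable("events")
spark.sql("OPTIMIZE events ZORDER BY (customer_id)")   # co-locate rows by key
```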

6 More Replies
AzureDatabricks
by New Contributor III
  • 2668 Views
  • 8 replies
  • 4 kudos

Resolved! Need to see all the records in DeltaTable. Exception - java.lang.OutOfMemoryError: GC overhead limit exceeded

truncate=False is not working on the Delta table: df_delta.show(df_delta.count(), False). Compute size: Single Node - Standard_F4S - 8 GB memory, 4 cores. How much data can we persist at most in a Delta table as Parquet files, and how fast can we retrieve the data?
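The show(df_delta.count(), False) call asks Spark to render every row on the single 8 GB driver, which is what exhausts the GC. A hedged sketch of two safer alternatives (table name and path are placeholders; spark is the notebook-provided session):

```python
# Sketch: avoid rendering the whole table on the driver at once.
df_delta = spark.table("my_delta_table")      # placeholder table name

# 1) Inspect a bounded sample instead of every row:
df_delta.show(1000, truncate=False)

# 2) If every record is really needed, write it out in parallel
#    instead of funnelling it through the driver:
df_delta.write.format("delta").mode("overwrite").save("/mnt/exports/full_copy")
```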

Latest Reply
AzureDatabricks
New Contributor III
  • 4 kudos

thank you !!!

7 More Replies
Hola1801
by New Contributor
  • 956 Views
  • 3 replies
  • 3 kudos

Resolved! Float Value change when Load with spark? Full Path?

Hello, I have created my table in Databricks, and at this point everything is perfect: I get the same values as in my CSV. For my column "Exposure" I have: 0 0,00 1 0,00 2 0,00 3 0,00 4 0,00 ... But when I load my fi...

Latest Reply
jose_gonzalez
Moderator
  • 3 kudos

Hi @Anis Ben Salem​, how do you read your CSV file? Do you use Pandas or PySpark APIs? Also, how did you create your table? Could you share more details on the code you are trying to run?
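Given values like 0,00, a common culprit is a comma decimal separator being parsed as a string or mangled on load. A hedged PySpark sketch of reading such a file explicitly (separator, path, and column name are assumptions about the file; spark is the notebook-provided session):

```python
# Sketch: read a CSV whose decimals use commas (e.g. "0,00") without mangling them.
# Separator, path, and column name are assumptions about the file.
from pyspark.sql import functions as F

raw = (
    spark.read.option("header", True)
    .option("sep", ";")                 # assumed European-style CSV
    .csv("/mnt/data/exposure.csv")      # placeholder path
)
df = raw.withColumn(
    "Exposure",
    F.regexp_replace("Exposure", ",", ".").cast("decimal(18,2)"),  # exact decimals
)
df.show(5)
```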

2 More Replies
Abela
by New Contributor III
  • 1205 Views
  • 3 replies
  • 3 kudos

Resolved! Specify cluster name in notebook

Is there any way to specify a particular cluster to use in the Python cell of a notebook?

Latest Reply
Anonymous
Not applicable
  • 3 kudos

@Alina Bella​ - If werners' answer solved the issue, would you be happy to mark their answer as best? That will help others find the solution more easily in the future.

2 More Replies
sarvesh
by Contributor III
  • 3273 Views
  • 4 replies
  • 7 kudos

Resolved! Can we use Spark Streaming to read/write data from MySQL? I can't find an example.

If someone can link me to an example where a stream is used to read from or write to MySQL, please do.

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 7 kudos

Writing (a sink) is possible without problems via foreachBatch. I use it in production - the stream auto-loads CSVs from the data lake and writes via foreachBatch to SQL (inside the foreachBatch function you have a temporary dataframe with the records, and you just use w...
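A minimal sketch of that pattern with made-up paths and connection details: foreachBatch hands each micro-batch to a function as an ordinary dataframe, which can then use the regular JDBC writer (the MySQL JDBC driver must be installed on the cluster):

```python
# Sketch: stream files in with Auto Loader and sink each micro-batch to MySQL
# via JDBC. Paths, credentials, and table names are placeholders.
def write_to_mysql(batch_df, batch_id):
    (batch_df.write
        .format("jdbc")
        .option("url", "jdbc:mysql://<host>:3306/<db>")
        .option("dbtable", "events")
        .option("user", "<user>")
        .option("password", "<password>")
        .mode("append")
        .save())

stream = (
    spark.readStream.format("cloudFiles")               # Databricks Auto Loader
    .option("cloudFiles.format", "csv")
    .option("header", True)
    .schema("id INT, payload STRING")                   # streams need an explicit schema
    .load("/mnt/landing/csv/")
)
(stream.writeStream
    .foreachBatch(write_to_mysql)
    .option("checkpointLocation", "/mnt/checkpoints/mysql_sink")
    .start())
```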

3 More Replies