Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

A_Jabbar
by New Contributor
  • 2746 Views
  • 2 replies
  • 2 kudos

Resolved! I am unable to create a Databricks Community Edition account

This is what I am doing: I enter all the details on page 1 and click on "Getting started with Community Edition"; after verification, I get the following error:

Error Message on the second page of Registration
Latest Reply
Anonymous
Not applicable

Hi @Abdul Jabbar​ Thank you for reaching out, and we’re sorry to hear about this log-in issue! We have this Community Edition login troubleshooting post on Community. Please take a look, and follow the troubleshooting steps. If the steps do not resol...

1 More Replies
KKo
by Contributor III
  • 1897 Views
  • 2 replies
  • 5 kudos

Read and write to XMLA from Databricks notebook

I am trying to process a Power BI dataset partition refresh from Azure Databricks, using the XMLA endpoint. I have Power BI Premium capacity and read/write enabled. I tried a few approaches found on Google, but none worked for one reason or another. If any of y...

Latest Reply
Anonymous
Not applicable

Hi @Kris Koirala, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer first; otherwise, Bricksters will get back to you soon. Thanks.

1 More Replies
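Editor's note: since the thread body is truncated, here is a minimal, hedged sketch of one common workaround: triggering a partition-level dataset refresh from a Databricks notebook through the Power BI enhanced refresh REST API rather than raw XMLA. It assumes a service principal with access to the Premium workspace; the tenant/workspace/dataset IDs, table and partition names, and secret scope/key are placeholders.

# Sketch: partition-level Power BI dataset refresh from Databricks via the
# enhanced refresh REST API (alternative to speaking raw XMLA). All IDs,
# names, and the secret scope/key below are placeholders.
import msal
import requests

TENANT_ID = "<tenant-id>"
CLIENT_ID = "<sp-client-id>"
CLIENT_SECRET = dbutils.secrets.get("my-scope", "pbi-sp-secret")  # hypothetical scope/key
GROUP_ID = "<workspace-id>"
DATASET_ID = "<dataset-id>"

app = msal.ConfidentialClientApplication(
    CLIENT_ID,
    authority=f"https://login.microsoftonline.com/{TENANT_ID}",
    client_credential=CLIENT_SECRET,
)
token = app.acquire_token_for_client(
    scopes=["https://analysis.windows.net/powerbi/api/.default"]
)["access_token"]

# "objects" limits the refresh to specific tables/partitions
resp = requests.post(
    f"https://api.powerbi.com/v1.0/myorg/groups/{GROUP_ID}/datasets/{DATASET_ID}/refreshes",
    headers={"Authorization": f"Bearer {token}"},
    json={"type": "full", "objects": [{"table": "Sales", "partition": "Sales-2023"}]},
)
resp.raise_for_status()  # 202 Accepted means the refresh was queued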
KVNARK
by Honored Contributor II
  • 2955 Views
  • 4 replies
  • 6 kudos

Resolved! Best practices for SQL DB authentication from Databricks

I would like to know the best practices for authenticating to a SQL DB from Databricks/Python. I am more interested in hearing about token-based DB authentication methods rather than credential-based (username/password) ones.

Latest Reply
Vivian_Wilfred
Databricks Employee

@KVNARK, have you checked PAT (personal access token) authentication? https://docs.databricks.com/sql/api/authentication.html

3 More Replies
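Editor's note: beyond the PAT suggestion above, here is a hedged sketch of one token-based option for Azure SQL specifically: acquire an Azure AD token with a service principal and pass it to the JDBC reader via the accessToken connection property of the Microsoft SQL Server JDBC driver. It assumes the azure-identity package is installed and the service principal has been granted access to the database; server, database, table, and secret names are placeholders.

# Sketch: Azure AD token-based authentication to Azure SQL from Databricks,
# with no username/password. Names below are placeholders.
from azure.identity import ClientSecretCredential

credential = ClientSecretCredential(
    tenant_id="<tenant-id>",
    client_id="<sp-client-id>",
    client_secret=dbutils.secrets.get("my-scope", "sql-sp-secret"),  # hypothetical scope/key
)
# Azure SQL's token audience
token = credential.get_token("https://database.windows.net/.default").token

jdbc_url = "jdbc:sqlserver://myserver.database.windows.net:1433;database=mydb"
df = (spark.read.format("jdbc")
      .option("url", jdbc_url)
      .option("dbtable", "dbo.my_table")
      .option("accessToken", token)  # passed through as a JDBC connection property
      .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
      .load())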
jm99
by New Contributor III
  • 4006 Views
  • 1 reply
  • 1 kudos

Resolved! ForeachBatch() - Get results from batchDF._jdf.sparkSession().sql('merge stmt')

Most Python examples show the structure of the foreachBatch method as:

def foreachBatchFunc(batchDF, batchId):
    batchDF.createOrReplaceTempView('viewName')
    (batchDF
        ._jdf.sparkSession()
        .sql( ...

Latest Reply
jm99
New Contributor III

Just found a solution... Need to convert the Java DataFrame (jdf) to a DataFrame:

from pyspark import sql

def batchFunc(batchDF, batchId):
    batchDF.createOrReplaceTempView('viewName')
    sparkSession = batchDF._jdf.sparkSession()
    resJdf = sparkSes...

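Editor's note: the accepted reply is truncated, so here is a hedged sketch of the complete pattern it describes, assuming a Delta MERGE and Spark 3.x; the table and column names are placeholders.

# Sketch completing the truncated pattern above: run the MERGE through the
# JVM session, then wrap the returned Java DataFrame back into a PySpark
# DataFrame to inspect the merge metrics.
from pyspark.sql import DataFrame

def batchFunc(batchDF, batchId):
    batchDF.createOrReplaceTempView("viewName")
    jvm_session = batchDF._jdf.sparkSession()
    res_jdf = jvm_session.sql("""
        MERGE INTO target t
        USING viewName s ON t.id = s.id
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED THEN INSERT *
    """)
    # Wrap the Java DataFrame (Spark 3.3+; on older versions pass
    # batchDF.sql_ctx instead of batchDF.sparkSession)
    res_df = DataFrame(res_jdf, batchDF.sparkSession)
    res_df.show()  # columns like num_affected_rows, num_inserted_rows, ...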
ks1248
by New Contributor III
  • 2919 Views
  • 2 replies
  • 5 kudos

Resolved! Autoloader creates columns not present in the source

I have been exploring Autoloader to ingest gzipped JSON files from an S3 source. The notebook fails on the first run due to a schema mismatch; after re-running the notebook, the schema evolves and the ingestion runs successfully. On analysing the schema ...

Latest Reply
ks1248
New Contributor III

Hi @Debayan Mukherjee, @Kaniz Fatma, thank you for replying to my question. I was able to figure out the issue. I was creating the schema and checkpoint folders in the same path as the source location for the autoloader. This caused the schema to ch...

1 More Replies
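Editor's note: for anyone hitting the same thing, a small sketch of the fix described above: keep Auto Loader's schema and checkpoint locations outside the source directory, so its own metadata files are never picked up as input. Bucket, paths, and table name are placeholders.

# Sketch: separate Auto Loader metadata paths from the source path.
source_path = "s3://my-bucket/raw/events/"          # only source data lives here
schema_path = "s3://my-bucket/_autoloader/schema/"  # metadata kept elsewhere
checkpoint_path = "s3://my-bucket/_autoloader/checkpoints/events/"

df = (spark.readStream.format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.schemaLocation", schema_path)
      .load(source_path))

(df.writeStream
   .option("checkpointLocation", checkpoint_path)
   .toTable("bronze.events"))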
Phani1
by Valued Contributor II
  • 3124 Views
  • 2 replies
  • 0 kudos

SUBNET_EXHAUSTED_FAILURE(CLOUD_FAILURE): or No more address space to create NIC within injected virtual network

Currently we are using an all-purpose compute cluster. When we tried to allocate the scheduled jobs to a job cluster, we were blocked by the following error: SUBNET_EXHAUSTED_FAILURE(CLOUD_FAILURE): azure_error_code:SubnetIsFull, azure_error_message:No mo...

Latest Reply
daniel_sahal
Esteemed Contributor

Answering your questions: yes, your VNet/subnet is out of unoccupied IPs, and this can be fixed by allocating more IPs to your network address space. Each cluster requires its own IPs, so if there are none available, it simply cannot start.

1 More Replies
lewit
by New Contributor II
  • 1861 Views
  • 2 replies
  • 1 kudos

Is it possible to create a feature store training set directly from a feature store table?

Rather than joining features from different tables, I just wanted to use a single feature store table and select some of its features, but still log the model in the feature store. The problem I am facing is that I do not know how to create the train...

Latest Reply
Debayan
Databricks Employee

Hi, could you please refer to https://docs.databricks.com/machine-learning/feature-store/train-models-with-feature-store.html#create-a-trainingset-using-the-same-feature-multiple-times and let us know if this helps?

1 More Replies
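Editor's note: to make the linked doc concrete, here is a hedged sketch of a training set built from a single feature table using the classic databricks.feature_store client. The table, key, feature, and label names are placeholders, and label_df stands for a DataFrame holding only the lookup key and the label.

# Sketch: training set from one feature table, still logged via Feature Store.
from databricks.feature_store import FeatureStoreClient, FeatureLookup

fs = FeatureStoreClient()

lookups = [
    FeatureLookup(
        table_name="main.fs.customer_features",
        feature_names=["age", "tenure_days", "avg_spend"],  # subset of the table
        lookup_key="customer_id",
    )
]

# label_df: DataFrame with the lookup key and the label column only
training_set = fs.create_training_set(
    df=label_df,
    feature_lookups=lookups,
    label="churned",
)
training_df = training_set.load_df()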
gpzz
by New Contributor II
  • 1849 Views
  • 2 replies
  • 1 kudos

MEMORY_ONLY not working

val doubledAmount = premiumCustomers.map(x => (x._1, x._2 * 2)).persist(StorageLevel.MEMORY_ONLY)

error: not found: value StorageLevel

Latest Reply
Chaitanya_Raju
Honored Contributor

Hi @Gaurav Poojary, can you please try the below as displayed in the image? It is working for me without any issues. Happy Learning!!

1 More Replies
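Editor's note: for readers without the image, the Scala error "not found: value StorageLevel" is the usual symptom of a missing import: import org.apache.spark.storage.StorageLevel. For reference, a hedged sketch of the equivalent pattern in PySpark, where the import is needed just the same; premium_customers stands in for the RDD of key/value pairs from the question.

# Sketch: PySpark equivalent of the Scala snippet above. StorageLevel must
# be imported before use; premium_customers is an assumed RDD of (key, amount).
from pyspark import StorageLevel

doubled_amount = (premium_customers
                  .map(lambda x: (x[0], x[1] * 2))
                  .persist(StorageLevel.MEMORY_ONLY))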
bozhu
by Contributor
  • 1986 Views
  • 3 replies
  • 3 kudos

Set taskValues in DLT notebooks

Is "setting taskValues in DLT workbooks" supported?I tried setting a task value in a DLT workbook, but it does not seem supported, so downstream workbooks within the same workflows job cannot consume this task value.

Latest Reply
Lê_Ngọc_Lợi
New Contributor III

I have the same issue. I also want to know whether Databricks supports taskValues between a job task and DLT or not.

2 More Replies
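Editor's note: for context, this is how the task values API works between regular notebook tasks in a Databricks job; whether a DLT pipeline task can participate is exactly the open question in this thread. Task and key names are placeholders.

# In the upstream (non-DLT) notebook task of a job:
dbutils.jobs.taskValues.set(key="row_count", value=42)

# In a downstream notebook task; taskKey must match the upstream task's name:
row_count = dbutils.jobs.taskValues.get(taskKey="upstream_task", key="row_count", default=0)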
Vik1
by New Contributor II
  • 9346 Views
  • 3 replies
  • 5 kudos

Some very simple functions in Pandas on Spark are very slow

I have a pandas on spark dataframe with 8 million rows and 20 columns. It took 3.48 minutes to run df.shape, and it also takes a long time to run df.head: 4.55 minutes. By contrast df.var1.value_counts().reset_index() took only 0.18 sec...

Latest Reply
PeterDowdy
New Contributor II

The reason this is slow is that pandas needs an index column to perform `shape` or `head`. If you don't provide one, pyspark pandas enumerates the entire dataframe to create a default one. For example, given columns A, B, and C in dataframe `d...

2 More Replies
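Editor's note: a hedged sketch of the mitigation the reply points at: nominate an existing column as the index so pandas-on-Spark does not have to enumerate all rows to build a default one. Column and path names are placeholders; pandas_api is Spark 3.3+ (older versions use to_pandas_on_spark).

# Sketch: avoid the default-index scan by passing index_col explicitly.
import pyspark.pandas as ps

# Converting an existing Spark DataFrame:
psdf = spark_df.pandas_api(index_col="id")

# The same option exists on the readers:
psdf2 = ps.read_parquet("/path/to/data", index_col="id")

print(psdf.shape)  # no longer pays the cost of building a default index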
sunil_smile
by Contributor
  • 4807 Views
  • 2 replies
  • 1 kudos

VNet peering setting is not enabled in Azure Databricks Premium, even though it's deployed inside my VNet?

Hi All, the VNet peering setting is not enabled in Azure Databricks, even though it's deployed inside my VNet? I have not mentioned my VNet and subnet details here, but I filled these in and created Databricks (without private endpoint - allow public access): virtual...

Latest Reply
Debayan
Databricks Employee

Hi, VNet peering is not supported or possible on VNet-injected workspaces. Please refer to: https://learn.microsoft.com/en-us/azure/databricks/administration-guide/cloud-configurations/azure/vnet-peering#requirements

1 More Replies
patdev
by New Contributor III
  • 2128 Views
  • 2 replies
  • 2 kudos

Load new data into a Delta table

Hello all, I want to know how to load new data into a Delta table from a new CSV file. Here is the code that I have used to create the Delta table from a CSV file and load the data, but I have got a new updated file and am trying to load the new data, but am not able to; any gui...

Latest Reply
patdev
New Contributor III

Thank you, I tried that and it ended in an error. The Delta table was created from a CSV, which must have been converted to a Parquet file, and all the columns are varchar or string. So now if I want to load a new file it ends in an incompatibility error for da...

1 More Replies
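Editor's note: a hedged sketch of one way around the incompatibility error described above: read the new CSV with the target table's schema so the columns arrive with matching types, then append. Paths and table name are placeholders.

# Sketch: append a new CSV file to an existing Delta table with matching types.
target = spark.table("my_catalog.my_schema.my_table")

new_df = (spark.read
          .option("header", "true")
          .schema(target.schema)   # force CSV columns to the table's types
          .csv("/path/to/new_file.csv"))

(new_df.write
   .format("delta")
   .mode("append")
   .saveAsTable("my_catalog.my_schema.my_table"))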
sunil_smile
by Contributor
  • 12783 Views
  • 8 replies
  • 10 kudos

Resolved! How can I add ADLS Gen2 - OAuth 2.0 as cluster scope for my High Concurrency shared cluster (without Unity Catalog)?

Hi All, kindly help me with how I can add ADLS Gen2 OAuth 2.0 authentication to my High Concurrency shared cluster. I want to scope this authentication to the entire cluster, not to a particular notebook. Currently I have added them as Spark configuration o...

Latest Reply
Hubert-Dudek
Esteemed Contributor III

The error is because of missing default settings (create a new cluster and do not remove them). The warning is because secrets should be put in a secret scope, and then you should reference the secrets in the settings.

7 More Replies
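Editor's note: to illustrate the reply, a hedged sketch of the cluster-scoped Spark config (cluster > Advanced options > Spark config) for ADLS Gen2 OAuth, with the secrets referenced from a secret scope via the {{secrets/<scope>/<key>}} syntax instead of pasted inline. The storage account, tenant ID, and scope/key names are placeholders.

fs.azure.account.auth.type.mystorageacct.dfs.core.windows.net OAuth
fs.azure.account.oauth.provider.type.mystorageacct.dfs.core.windows.net org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
fs.azure.account.oauth2.client.id.mystorageacct.dfs.core.windows.net {{secrets/my-scope/sp-client-id}}
fs.azure.account.oauth2.client.secret.mystorageacct.dfs.core.windows.net {{secrets/my-scope/sp-client-secret}}
fs.azure.account.oauth2.client.endpoint.mystorageacct.dfs.core.windows.net https://login.microsoftonline.com/<tenant-id>/oauth2/token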
JesseS
by New Contributor II
  • 6131 Views
  • 2 replies
  • 1 kudos

Resolved! How to extract source data from on-premise databases into a data lake and load with AutoLoader?

Here is the situation I am working with: I am trying to extract source data using the Databricks JDBC connector, with SQL Server databases as my data source. I want to write those into a directory in my data lake as JSON files, then have AutoLoader ing...

Latest Reply
Aashita
Databricks Employee

To add to @werners' point, I would use ADF to load SQL Server data into ADLS Gen2 as JSON, then load these raw JSON files from your ADLS base location into a Delta table using Autoloader. Delta Live Tables can be used in this scenario. You can also reg...

1 More Replies
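Editor's note: a hedged end-to-end sketch of the pattern discussed in this thread (JDBC extract from SQL Server, land as JSON in the lake, ingest incrementally with Auto Loader). Connection details, secret names, and paths are placeholders.

# 1) Pull the SQL Server table over JDBC
jdbc_url = "jdbc:sqlserver://myserver.database.windows.net:1433;database=mydb"

extract_df = (spark.read.format("jdbc")
              .option("url", jdbc_url)
              .option("dbtable", "dbo.orders")
              .option("user", dbutils.secrets.get("my-scope", "sql-user"))
              .option("password", dbutils.secrets.get("my-scope", "sql-password"))
              .load())

# 2) Land it in the lake as JSON
extract_df.write.mode("append").json("abfss://raw@mylake.dfs.core.windows.net/orders/")

# 3) Ingest incrementally with Auto Loader
stream = (spark.readStream.format("cloudFiles")
          .option("cloudFiles.format", "json")
          .option("cloudFiles.schemaLocation", "abfss://meta@mylake.dfs.core.windows.net/schemas/orders/")
          .load("abfss://raw@mylake.dfs.core.windows.net/orders/"))

(stream.writeStream
   .option("checkpointLocation", "abfss://meta@mylake.dfs.core.windows.net/checkpoints/orders/")
   .toTable("bronze.orders"))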
databicky
by Contributor II
  • 1005 Views
  • 1 reply
  • 0 kudos

Resolved! How to create a border for some specific cells?

I tried some code to create a border for an Excel sheet. For a particular cell I am able to write it, but when I try it with a set of cells, it shows an error.

Latest Reply
Chaitanya_Raju
Honored Contributor

Hi @Mohammed sadamusean, can you please try something similar to the below code using loops? I have implemented a similar use case that might be useful; please let me know if you need further help.

top = Side(border_style = 'thin', color = '00000000')
bottom = Side(bor...

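Editor's note: the reply's code is cut off, so here is a hedged completion using openpyxl: build one Border from thin Sides and loop it over a rectangular cell range. The file name and range are placeholders.

# Sketch: apply a thin border to every cell in a rectangular range.
from openpyxl import Workbook
from openpyxl.styles import Border, Side

thin = Side(border_style="thin", color="00000000")
border = Border(top=thin, bottom=thin, left=thin, right=thin)

wb = Workbook()
ws = wb.active

# ws["B2":"D5"] yields a tuple of rows; borders are set cell by cell
for row in ws["B2":"D5"]:
    for cell in row:
        cell.border = border

wb.save("bordered.xlsx")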
