Data Engineering

Forum Posts

by Mado • Valued Contributor II

10-19-2022 6:26:06 AM

4031 Views
4 replies
4 kudos

Resolved! Difference between "spark.table" & "spark.read.table"?

Hi, I want to create a PySpark DataFrame from a table. What is the difference between the following commands: spark.read.table(TableName) and spark.table(TableName)? Both return a PySpark DataFrame and look similar. Thanks.

Latest Reply

Mado
Valued Contributor II

10-20-2022 6:21:17 AM

4 kudos

Hi @Kaniz Fatma, I selected the answer from @Kedar Deshpande as the best answer.

3 More Replies

by 829023 • New Contributor

10-14-2022 12:36:05 AM

1056 Views
2 replies
0 kudos

Faced error using Databricks SQL Connector

I installed databricks-sql-connector in PyCharm. Then I ran the query below, based on these docs: https://docs.databricks.com/dev-tools/python-sql-connector.html
from databricks import sql
import os
w...

Latest Reply

Hubert-Dudek
Esteemed Contributor III

10-20-2022 4:53:55 AM

0 kudos

It seems that one of your environment variables is incorrect. Please print them and compare them with the connection settings from the cluster or SQL warehouse endpoint.
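
The check suggested above can be sketched as follows. This is a minimal example, not the poster's script: the three variable names are an assumption (the original code is truncated), so substitute whatever names your own script reads. The token is never printed in full.

```python
import os

# Assumed variable names; replace them with the names your script actually reads.
REQUIRED_VARS = (
    "DATABRICKS_SERVER_HOSTNAME",
    "DATABRICKS_HTTP_PATH",
    "DATABRICKS_TOKEN",
)


def describe_connection_env(environ=os.environ, required=REQUIRED_VARS):
    """Return {name: status}; status is 'missing', 'empty', the value, or a masked note."""
    report = {}
    for name in required:
        value = environ.get(name)
        if value is None:
            report[name] = "missing"
        elif not value.strip():
            report[name] = "empty"
        elif "TOKEN" in name:
            # Never print a secret in full; report only its length.
            report[name] = f"set ({len(value)} chars, hidden)"
        else:
            report[name] = value
    return report


if __name__ == "__main__":
    for name, status in describe_connection_env().items():
        print(f"{name}: {status}")
```

Compare the printed hostname and HTTP path with the connection details shown for the cluster or SQL warehouse.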

1 More Replies

by Tahseen0354 • Contributor III

10-17-2022 3:03:33 AM

2544 Views
2 replies
4 kudos

Resolved! How do I track Databricks cluster users?

Hi, is there a way to find out or monitor which users have used my cluster, for how long, and how many times, in an Azure Databricks workspace?

Latest Reply

youssefmrini
Honored Contributor III

10-20-2022 3:04:29 AM

4 kudos

Hello, you can activate audit logs (more specifically, cluster logs): https://learn.microsoft.com/en-us/azure/databricks/administration-guide/account-settings/azure-diagnostic-logs They can be very helpful for tracking all the metrics.

1 More Replies

by ramankr48 • Contributor II

10-19-2022 4:01:39 AM

20673 Views
6 replies
8 kudos

Resolved! How to find the size of a table in Python or SQL?

Suppose there is a database db containing many tables, and I want to get the size of each table. How can I do this in SQL, Python, or PySpark? Even if I have to get them one by one, that's fine.

Latest Reply

shan_chandra
Honored Contributor III

10-19-2022 10:54:01 AM

8 kudos

@Raman Gupta - could you please try the below:
%python
spark.sql("describe detail delta-table-name").select("sizeInBytes").collect()
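
A minimal sketch that wraps the command above in a reusable function, assuming the table is a Delta table and that `spark` is an active SparkSession passed in by the caller. The helper names `table_size_bytes` and `human_readable` are hypothetical, introduced only for illustration.

```python
def table_size_bytes(spark, table_name):
    """Return the sizeInBytes value that DESCRIBE DETAIL reports for a Delta table."""
    rows = spark.sql(f"DESCRIBE DETAIL {table_name}").select("sizeInBytes").collect()
    return rows[0]["sizeInBytes"]


def human_readable(num_bytes):
    """Format a byte count using binary units (1 KiB = 1024 bytes)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        # Stop at the first unit that fits, or at the largest unit available.
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024
```

In a notebook this could be called as `human_readable(table_size_bytes(spark, "db.my_table"))`, where `db.my_table` is a placeholder name. To go through a database one table at a time, the same call can be repeated for each table name; tables that are not Delta tables may raise an error on DESCRIBE DETAIL.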

5 More Replies

by User16835756816 • Valued Contributor

09-16-2022 4:20:11 PM

1057 Views
1 reply
6 kudos

How can I simplify my data ingestion by processing the data as it arrives in cloud storage?

This post will help you simplify your data ingestion by utilizing Auto Loader, Delta Optimized Writes, Delta Write Jobs, and Delta Live Tables. Pre-Req: You are using JSON data and Delta Writes commands. Step 1: Simplify ingestion with Auto Loader Delt...

Latest Reply

youssefmrini
Honored Contributor III

10-19-2022 8:15:47 AM

6 kudos

This post will help you simplify your data ingestion by utilizing Auto Loader, Delta Optimized Writes, Delta Write Jobs, and Delta Live Tables. Pre-Req: You are using JSON data and Delta Writes commands. Step 1: Simplify ingestion with Auto Loader Delta...

by ricperelli • New Contributor II

10-19-2022 5:38:26 AM

1499 Views
0 replies
1 kudos

How can I save a Parquet file using pandas with a Data Factory orchestrated notebook?

Hi guys, this is my first question, so feel free to correct me if I'm doing something wrong. Anyway, I'm facing a really strange problem: I have a notebook in which I'm performing some pandas analysis, and after that I save the resulting dataframe in a parque...

by venkad • Contributor

10-19-2022 5:23:27 AM

638 Views
0 replies
4 kudos

Default location for Schema/Database in Unity

Hello Bricksters, We organize the Delta Lake in multiple storage accounts: one storage account per data domain and one container per database. This helps us isolate resources and cost at the business-domain level. Earlier, when a schema/database...

by vizoso • New Contributor III

10-18-2022 2:47:52 AM

710 Views
2 replies
3 kudos

Cluster list in Microsoft.Azure.Databricks.Client fails because ClusterSource enum does not include MODELS. When you have a model serving cluster, Clu...

Cluster list in Microsoft.Azure.Databricks.Client fails because ClusterSource enum does not include MODELS. When you have a model serving cluster, the ClustersApiClient.List method fails to deserialize the API response because that cluster has MODELS as C...

Latest Reply

Kaniz
Community Manager

10-19-2022 1:57:23 AM

3 kudos

Hi @José Fernández Vizoso, may I ask whether you are facing an issue here, or do you want to share some information through this post?

1 More Replies

by parulpaul • New Contributor III

10-18-2022 12:53:53 AM

2305 Views
2 replies
2 kudos

AnalysisException: Multiple sources found for bigquery (com.google.cloud.spark.bigquery.BigQueryRelationProvider, com.google.cloud.spark.bigquery.v2.BigQueryTableProvider), please specify the fully qualified class name.

While reading data from BigQuery into Databricks, I am getting the error: AnalysisException: Multiple sources found for bigquery (com.google.cloud.spark.bigquery.BigQueryRelationProvider, com.google.cloud.spark.bigquery.v2.BigQueryTableProvider), please spe...

Latest Reply

Kaniz
Community Manager

10-19-2022 2:13:35 AM

2 kudos

Hi @Parul Paul, We haven’t heard from you since the last response from @Debayan Mukherjee, and I was checking back to see if you have a resolution yet. If you have any solution, please share it with the community as it can be helpful to others. ...

1 More Replies

by saurabh12521 • New Contributor II

09-19-2022 7:40:15 AM

1252 Views
3 replies
4 kudos

Unity through Terraform

I am working on automating Unity through Terraform. I referred to the link below to get started: https://registry.terraform.io/providers/databricks/databricks/latest/docs/guides/unity-catalog-azure I am facing an issue when I create a metastore using...

Latest Reply

Pat
Honored Contributor III

10-19-2022 1:50:10 AM

4 kudos

Not sure if you got this working, but I noticed you are using the provider `databrickslabs/databricks`, which is why this is not available. You should be using the new provider, `databricks/databricks`: https://registry.terraform.io/providers/databricks/datab...

2 More Replies

by DataBricks_2022 • New Contributor III

10-18-2022 11:42:14 AM

641 Views
2 replies
1 kudos

Resolved! How to get started with Auto Loader using the partner academy portal? Are there any videos and step-by-step materials?

I need videos and step-by-step documentation on Auto Loader, as well as on how to build an end-to-end data pipeline.

Latest Reply

Kaniz
Community Manager

10-18-2022 11:57:19 PM

1 kudos

Hi @raja iqbal, We haven’t heard from you since the last response from @karthik p. Please don't forget to click on the "Select As Best" button whenever the information provided helps resolve your question.

1 More Replies

by cvantassel • New Contributor III

06-08-2022 1:18:28 PM

3742 Views
7 replies
6 kudos

Is there any way to propagate errors from dbutils?

I have a master notebook that runs a few different notebooks on a schedule using the dbutils.notebook.run() function. Occasionally, these child notebooks will fail (due to API connections or whatever). My issue is, when I attempt to catch the errors ...

Latest Reply

wdphilli
New Contributor III

10-18-2022 1:34:00 PM

6 kudos

I have the same issue. I see no reason that Databricks couldn't propagate the internal exception back through their WorkflowException.
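
One possible workaround, sketched here under stated assumptions rather than taken from this thread: the child notebook catches its own exception and hands a JSON description back as its exit value, and the parent re-raises it. The helper names `run_and_report`, `unwrap_child_result`, and `ChildNotebookError` are hypothetical. The sketch also assumes the child's result is JSON-serializable.

```python
import json
import traceback


def run_and_report(task):
    """Child side: run `task` and return a JSON string describing the outcome."""
    try:
        result = task()
        return json.dumps({"status": "ok", "result": result})
    except Exception as exc:
        return json.dumps({
            "status": "error",
            "error_type": type(exc).__name__,
            "message": str(exc),
            "traceback": traceback.format_exc(),
        })


class ChildNotebookError(RuntimeError):
    """Raised in the parent when a child notebook reports a failure."""


def unwrap_child_result(raw):
    """Parent side: parse the child's exit value and re-raise a reported failure."""
    payload = json.loads(raw)
    if payload.get("status") == "error":
        raise ChildNotebookError(f"{payload['error_type']}: {payload['message']}")
    return payload.get("result")


# In Databricks (assumed usage, not runnable outside a notebook):
#   child:  dbutils.notebook.exit(run_and_report(main))
#   parent: unwrap_child_result(dbutils.notebook.run(path, timeout_seconds))
```

A trade-off of this design is that the child run itself finishes normally, so a failure only becomes visible once the parent unwraps the result.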

6 More Replies

by 740209 • New Contributor II

09-27-2022 12:42:12 PM

1123 Views
5 replies
1 kudos

Bug in dbutils.fs

When using dbutils.fs on an S3 bucket titled "${sometext}.${sometext}.${somenumber}${sometext}-${sometext}-${sometext}" we receive an error. PLEASE understand this is an issue with how it encodes the .${somenumber} because we verified with boto3 that...

Latest Reply

740209
New Contributor II

10-03-2022 9:43:40 AM

1 kudos

@Debayan Mukherjee All the information is there; please read it carefully. I am not going to give you the actual bucket name I am using on a public forum. As I said above, here is the command: dbutils.fs.ls("s3a://${bucket_name_here_follow_above_format}"...

4 More Replies

by Trey • New Contributor III

10-10-2022 7:19:10 PM

1348 Views
3 replies
6 kudos

Resolved! Is it a good idea to use a managed delta table as a temporal table?

Hi all! I would like to use a managed delta table as a temporal table, meaning:
- to create a managed table in the middle of the ETL process
- to drop the managed table right after the process
This way I can perform merge, insert, or delete operations better than...

Latest Reply

Kaniz
Community Manager

10-18-2022 4:30:26 AM

6 kudos

Hi @Kwangwon Yi, We haven’t heard from you since the last response from @Werner Stinckens and @karthik p, and I was checking back to see if you have a resolution yet. If you have any solution, please share it with the community, as it can be help...

2 More Replies

by ramankr48 • Contributor II

09-26-2022 6:45:54 AM

6522 Views
5 replies
7 kudos

Resolved! AnalysisException: The schema of your Delta table has changed in an incompatible way since your DataFrame or DeltaTable object was created. Please redefine your DataFrame or DeltaTable object.

Can anybody tell me why this error occurs and what a reliable solution for it is?

Latest Reply

Anonymous
Not applicable

10-18-2022 1:05:19 AM

7 kudos

Hi @Raman Gupta, hope all is well! Just wanted to check in to see if you were able to resolve your issue, and whether you would be happy to share the solution or mark an answer as best. Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks...

4 More Replies
