Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

hari
by Contributor
  • 16592 Views
  • 5 replies
  • 5 kudos

How to add the partition for an existing delta table

We didn't need to set partitions for our delta tables as we didn't have many performance concerns and delta lake out-of-the-box optimization worked great for us. But there is now a need to set a specific partition column for some tables to allow conc...

Latest Reply
Kaniz_Fatma
Community Manager
  • 5 kudos

Hi @Harikrishnan P H​ , We haven’t heard from you since the last response from @Hubert Dudek​ , and I was checking back to see if you have a resolution yet. If you have any solution, please share it with the community as it can be helpful to others. ...

4 More Replies
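For readers landing on this thread: Delta Lake does not support adding a partition column to an existing table in place, so one common approach is to rewrite the table with partitionBy. A minimal sketch, assuming a Databricks notebook where spark is predefined; the table and column names are hypothetical:

    # Rewrite the table, partitioned by a new column (names are hypothetical).
    df = spark.read.table("my_delta_table")
    (df.write
       .format("delta")
       .mode("overwrite")
       .option("overwriteSchema", "true")  # permit the layout change
       .partitionBy("event_date")          # the new partition column
       .saveAsTable("my_delta_table"))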
Anonymous
by Not applicable
  • 653 Views
  • 0 replies
  • 1 kudos


Heads up! November Community Social! On November 17th we are hosting another Community Social - we're doing these monthly! We want to make sure that we all have the chance to connect as a community often. Come network, talk data, and just get social...

Taha_Hussain
by Valued Contributor II
  • 1218 Views
  • 0 replies
  • 8 kudos


Ask your technical questions at Databricks Office Hours. October 26 - 11:00 AM - 12:00 PM PT: Register Here. November 9 - 8:00 AM - 9:00 AM GMT: Register Here (NEW EMEA Office Hours). Databricks Office Hours connects you directly with experts to answer all...

pen
by New Contributor II
  • 1408 Views
  • 2 replies
  • 2 kudos

PySpark errors when I pack the source zip package without directory entries.

If I send the package made by zipfile via spark.submit.pyFiles, zipped by this code: import zipfile, os def make_zip(source_dir, output_filename): with zipfile.ZipFile(output_filename, 'w') as zipf: pre_len = len(os.path....

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

I checked, and your code is OK. If you set source_dir and output_filename, please remember to start the path with /dbfs. If you work on the Community Edition, you can run into problems accessing the underlying filesystem.

1 More Replies
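The code preview in the post is cut off mid-line; a self-contained version of such a zip helper, reconstructed under the assumption that it walks source_dir and stores entries with relative names (a sketch, not the poster's exact code):

    import os
    import zipfile

    def make_zip(source_dir, output_filename):
        # Store archive entries relative to source_dir's parent so the
        # package keeps its top-level directory name.
        # Per the reply above, on Databricks both paths should start with /dbfs.
        pre_len = len(os.path.dirname(source_dir))
        with zipfile.ZipFile(output_filename, "w") as zipf:
            for root, _, files in os.walk(source_dir):
                for name in files:
                    full_path = os.path.join(root, name)
                    arcname = full_path[pre_len:].lstrip(os.sep)
                    zipf.write(full_path, arcname)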
witnessthee
by New Contributor II
  • 5820 Views
  • 4 replies
  • 2 kudos

Resolved! Error when using pyflink on Databricks: "An error occurred while trying to connect to the Java server"

Hi, right now I am trying to run a pyflink script that can connect to a Kafka server. When I run that script, I get the error "An error occurred while trying to connect to the Java server 127.0.0.1:35529". Do I need to install an extra JDK for that? er...

Latest Reply
Kaniz_Fatma
Community Manager
  • 2 kudos

Hi @Tam daad​, We haven’t heard from you since the last response from @Werner Stinckens​, and I was checking back to see if you have a resolution yet. If you have any solution, please share it with the community as it can be helpful to others. Otherw...

3 More Replies
Mado
by Valued Contributor II
  • 8761 Views
  • 4 replies
  • 4 kudos

Resolved! Difference between "spark.table" & "spark.read.table"?

Hi, I want to make a PySpark DataFrame from a table. I would like to ask about the difference between the following commands: spark.read.table(TableName) and spark.table(TableName). Both return a PySpark DataFrame and look similar. Thanks.

Latest Reply
Mado
Valued Contributor II
  • 4 kudos

Hi @Kaniz Fatma​ I selected answer from @Kedar Deshpande​ as the best answer.

3 More Replies
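For anyone comparing the two calls later: they are equivalent. spark.read.table is simply the DataFrameReader spelling of spark.table; both resolve the name through the catalog and return the same DataFrame. A quick sketch (the table name is illustrative):

    df1 = spark.table("samples.nyctaxi.trips")
    df2 = spark.read.table("samples.nyctaxi.trips")
    # Same catalog table behind both handles, so the schemas match.
    assert df1.schema == df2.schema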
829023
by New Contributor
  • 1429 Views
  • 2 replies
  • 0 kudos

Faced error using Databricks SQL Connector

I installed databricks-sql-connector in PyCharm. Then I ran the query below, based on these docs (https://docs.databricks.com/dev-tools/python-sql-connector.html): from databricks import sql import os w...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 0 kudos

It seems that one of your environment variables is incorrect. Please print them and compare them with the connection settings from the cluster or SQL warehouse endpoint.

1 More Replies
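Following that reply, a sketch of the documented connection pattern with the settings pulled from environment variables; the variable names here are assumptions, and their values must match the cluster or SQL warehouse endpoint details:

    import os
    from databricks import sql

    # Hypothetical env var names; print and compare them against the
    # connection details shown for your cluster or SQL warehouse.
    with sql.connect(
        server_hostname=os.getenv("DATABRICKS_SERVER_HOSTNAME"),
        http_path=os.getenv("DATABRICKS_HTTP_PATH"),
        access_token=os.getenv("DATABRICKS_TOKEN"),
    ) as connection:
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1")
            print(cursor.fetchall())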
Tahseen0354
by Contributor III
  • 3231 Views
  • 2 replies
  • 4 kudos

Resolved! How do I track Databricks cluster users?

Hi, is there a way to find out or monitor which users have used my cluster, for how long, and how many times in an Azure Databricks workspace?

Latest Reply
youssefmrini
Honored Contributor III
  • 4 kudos

Hello, you can activate audit logs (more specifically, cluster logs): https://learn.microsoft.com/en-us/azure/databricks/administration-guide/account-settings/azure-diagnostic-logs. It can be very helpful for tracking all the metrics.

1 More Replies
ramankr48
by Contributor II
  • 25236 Views
  • 6 replies
  • 10 kudos

Resolved! How to find the size of a table in Python or SQL?

Let's suppose there is a database db with many tables inside it, and I want to get the size of those tables. How do I get it in either SQL, Python, or PySpark? Even if I have to get them one by one, that's fine.

Latest Reply
shan_chandra
Esteemed Contributor
  • 10 kudos

@Raman Gupta - could you please try the below?
%python
spark.sql("describe detail delta-table-name").select("sizeInBytes").collect()

5 More Replies
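Extending that reply to every table in the database, a minimal sketch (the database name is hypothetical, and DESCRIBE DETAIL reports sizeInBytes for Delta tables only):

    db = "db"
    for row in spark.sql(f"SHOW TABLES IN {db}").collect():
        detail = spark.sql(f"DESCRIBE DETAIL {db}.{row.tableName}").first()
        print(row.tableName, detail["sizeInBytes"])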
User16835756816
by Valued Contributor
  • 6725 Views
  • 1 reply
  • 6 kudos

How can I simplify my data ingestion by processing the data as it arrives in cloud storage?

This post will help you simplify your data ingestion by utilizing Auto Loader, Delta Optimized Writes, Delta Write Jobs, and Delta Live Tables. Pre-Req: You are using JSON data and Delta Writes commands. Step 1: Simplify ingestion with Auto Loader. Delt...

Latest Reply
youssefmrini
Honored Contributor III
  • 6 kudos

This post will help you simplify your data ingestion by utilizing Auto Loader, Delta Optimized Writes, Delta Write Jobs, and Delta Live Tables. Pre-Req: You are using JSON data and Delta Writes commands. Step 1: Simplify ingestion with Auto Loader. Delta...

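As a companion to the post, a minimal Auto Loader sketch for the JSON pre-req it names; the paths, checkpoint location, and table name are hypothetical:

    (spark.readStream
        .format("cloudFiles")                               # Auto Loader source
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", "/mnt/schemas/events")
        .load("/mnt/raw/events")
        .writeStream
        .option("checkpointLocation", "/mnt/checkpoints/events")
        .trigger(availableNow=True)                         # process new files, then stop
        .toTable("bronze_events"))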
ricperelli
by New Contributor II
  • 1792 Views
  • 0 replies
  • 1 kudos

How can I save a parquet file using pandas from a Data Factory-orchestrated notebook?

Hi guys, this is my first question, so feel free to correct me if I'm doing something wrong. Anyway, I'm facing a really strange problem: I have a notebook in which I'm performing some pandas analysis, and after that I save the resulting dataframe in a parque...

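A common cause of this kind of failure is that plain pandas writes through the local file API rather than Spark, so on Databricks the path needs the /dbfs fuse prefix. A minimal sketch (the mount path is hypothetical):

    import pandas as pd

    df = pd.DataFrame({"a": [1, 2, 3]})
    # pandas bypasses Spark, so write through the DBFS fuse mount.
    df.to_parquet("/dbfs/mnt/output/result.parquet")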
venkad
by Contributor
  • 878 Views
  • 0 replies
  • 4 kudos

Default location for Schema/Database in Unity

Hello Bricksters, we organize the delta lake across multiple storage accounts: one storage account per data domain and one container per database. This helps us to isolate resources and cost at the business domain level. Earlier, when a schema/database...

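For reference, Unity Catalog supports pinning a schema to its own storage at creation time with MANAGED LOCATION, which fits the one-container-per-database layout described here. A minimal sketch (catalog, schema, and abfss path are hypothetical):

    spark.sql("""
        CREATE SCHEMA IF NOT EXISTS main.sales
        MANAGED LOCATION 'abfss://sales@mystorageaccount.dfs.core.windows.net/'
    """)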
vizoso
by New Contributor III
  • 1051 Views
  • 2 replies
  • 3 kudos


Cluster list in Microsoft.Azure.Databricks.Client fails because the ClusterSource enum does not include MODELS. When you have a model serving cluster, the ClustersApiClient.List method fails to deserialize the API response because that cluster has MODELS as C...

Latest Reply
Kaniz_Fatma
Community Manager
  • 3 kudos

Hi @José Fernández Vizoso​, may I know whether you are facing an issue here, or would you like to share some information through this post?

1 More Replies
parulpaul
by New Contributor III
  • 3020 Views
  • 2 replies
  • 2 kudos

AnalysisException: Multiple sources found for bigquery (com.google.cloud.spark.bigquery.BigQueryRelationProvider, com.google.cloud.spark.bigquery.v2.BigQueryTableProvider), please specify the fully qualified class name.

While reading data from BigQuery to Databricks getting the error : AnalysisException: Multiple sources found for bigquery (com.google.cloud.spark.bigquery.BigQueryRelationProvider, com.google.cloud.spark.bigquery.v2.BigQueryTableProvider), please spe...

Latest Reply
Kaniz_Fatma
Community Manager
  • 2 kudos

Hi @Parul Paul​ ​, We haven’t heard from you since the last response from @Debayan Mukherjee​ , and I was checking back to see if you have a resolution yet. If you have any solution, please share it with the community as it can be helpful to others. ...

1 More Replies
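The error message itself points at the fix: pass the fully qualified provider class instead of the short "bigquery" name. A minimal sketch (the table name is hypothetical):

    df = (spark.read
            .format("com.google.cloud.spark.bigquery.BigQueryRelationProvider")
            .option("table", "my-project.my_dataset.my_table")
            .load())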
saurabh12521
by New Contributor II
  • 1807 Views
  • 3 replies
  • 4 kudos

Unity through terraform

I am working on automation of Unity through Terraform. I have referred to the link below to get started: https://registry.terraform.io/providers/databricks/databricks/latest/docs/guides/unity-catalog-azure. I am facing an issue when I create the metastore using...

Latest Reply
Pat
Honored Contributor III
  • 4 kudos

Not sure if you got this working, but I noticed you are using the provider `databrickslabs/databricks`, which is why this is not available. You should be using the new provider `databricks/databricks`: https://registry.terraform.io/providers/databricks/datab...

2 More Replies