Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

greyfine
by New Contributor II
  • 11483 Views
  • 5 replies
  • 5 kudos

Hi everyone, I was wondering whether it is possible to set up query-level alerts for PySpark notebooks that run on a schedule in Databricks, so that we receive an email alert when a query returns some expected result?

In the image you can see we have three workspaces. The alert option is available in the SQL workspace but not in our Data Science and Engineering workspace. Is there any way we can incorporate this in our Data Science and Engineering workspace?

image.png
Latest Reply
JKR
Contributor
  • 5 kudos

How can I receive a call on Teams, a phone number, or Slack if any job fails?

  • 5 kudos
4 More Replies
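
One way to approach the original question, sketched here under assumptions the thread does not confirm: have the scheduled notebook raise an exception when a check on the query result fails, so the job run fails and the job's own failure email notification goes out. The function name `check_row_count` and the thresholds are hypothetical; in a notebook the count would typically come from something like `df.count()`.

```python
def check_row_count(actual: int, minimum: int) -> None:
    """Raise if the result has fewer rows than expected.

    An uncaught exception fails the notebook task, which is what
    lets a job-level failure notification fire.
    """
    if actual < minimum:
        raise ValueError(f"Expected at least {minimum} rows, got {actual}")


# Hypothetical values; in a scheduled notebook `actual` might be df.count().
check_row_count(actual=120, minimum=100)  # passes silently
```

The same pattern works for any condition that can be expressed as a boolean check on the query result.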
Aidzillafont
by New Contributor II
  • 877 Views
  • 1 replies
  • 0 kudos

How to pick the right cluster for your workflow

Hi All, I am attempting to execute a workflow on various job clusters, including general-purpose and memory-optimized clusters. My main bottleneck is that data is being written to disk because I’m running out of RAM. This is due to the large dataset t...

Latest Reply
Ravivarma
Databricks Employee
  • 0 kudos

Hello @Aidzillafont, Greetings! Please find below the document that explains compute configuration best practices. Doc: https://docs.databricks.com/en/compute/cluster-config-best-practices.html I hope this helps you! Regards, Ravi

  • 0 kudos
Sadam97
by New Contributor III
  • 518 Views
  • 0 replies
  • 0 kudos

Databricks (GCP) Cluster not resolving Hostname into IP address

We have #mongodb hosts that must be resolved to private internal load balancer IPs (of another cluster), and we are unable to add host aliases in the Databricks GKE cluster so that Spark can connect to MongoDB and resolve t...

Enrique1987
by New Contributor III
  • 2191 Views
  • 1 replies
  • 3 kudos

Resolved! When to activate Photon and when not to?

Photon appears as an option to check and uncheck as appropriate. The use of Photon leads to higher consumption of DBUs and higher costs. At what point does it pay off, and when should it not be enabled? More costs for the use of Photon, but at the same time less...

Latest Reply
jacovangelder
Honored Contributor
  • 3 kudos

This is my own experience: For SQL workloads, with not too many joins, it will speed things up. For building facts and dimensions using many joins, I found Photon to increase costs by a lot, while not bringing much better performance. The only real w...

  • 3 kudos
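
The trade-off raised in the question can be framed as simple arithmetic: Photon pays off for a given job when the run gets faster by more than the hourly cost goes up. A minimal sketch with made-up numbers; the rates and runtimes are hypothetical and are not Databricks pricing.

```python
def job_cost(hourly_rate: float, runtime_hours: float) -> float:
    """Total cost of one run: hourly cluster cost times runtime."""
    return hourly_rate * runtime_hours


def photon_pays_off(base_rate: float, photon_rate: float,
                    base_hours: float, photon_hours: float) -> bool:
    """True when the Photon run is cheaper overall than the baseline run."""
    return job_cost(photon_rate, photon_hours) < job_cost(base_rate, base_hours)


# Hypothetical job: Photon doubles the hourly rate.
# At 1.5 h instead of 4 h the Photon run is cheaper; at 3 h it is not.
print(photon_pays_off(10.0, 20.0, base_hours=4.0, photon_hours=1.5))
print(photon_pays_off(10.0, 20.0, base_hours=4.0, photon_hours=3.0))
```

Measuring both runtimes on a representative run of the actual workload is what makes the comparison meaningful, which matches the experience described in the reply.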
feliximmanuel
by New Contributor
  • 920 Views
  • 0 replies
  • 0 kudos

Error: oidc: fetch .well-known: Get "https://%E2%80%93host/oidc/.well-known/oauth-authorization-serv

I'm trying to authenticate Databricks using WSL but am suddenly getting this error. /databricks-asset-bundle$ databricks auth login –host https://<XXXXXXXXX>.12.azuredatabricks.net Databricks Profile Name: <XXXXXXXXX> Error: oidc: fetch .well-known: Get "ht...
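
A note on the error text: `%E2%80%93` is the percent-encoded UTF-8 form of an en dash (U+2013). That suggests the CLI received `–host` typed with an en dash rather than `--host` with two ASCII hyphens, and treated it as the host name. This is an inference from the error message, not a confirmed diagnosis. The decoding itself can be checked in Python:

```python
from urllib.parse import unquote

# The host fragment as it appears in the error URL.
encoded = "%E2%80%93host"

decoded = unquote(encoded)      # percent-decoding, UTF-8 by default
first_char = decoded[0]

print(decoded)                  # the literal text the CLI used as the host
print(hex(ord(first_char)))     # code point of the leading character
print(first_char == "\u2013")   # is it an en dash?
```

If that is the cause, retyping the flag by hand as `--host` (rather than pasting it from a formatted document) would be the thing to try.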

Sudheer_DB
by New Contributor II
  • 748 Views
  • 3 replies
  • 0 kudos

DLT SQL schema definition

Hi All, While defining a schema when creating a table with Auto Loader and DLT in SQL, I am getting a schema mismatch error between the defined schema and the inferred schema. CREATE OR REFRESH STREAMING TABLE csv_test(a0 STRING, a1 STRING, a2 STRING, a3 STRI...

Latest Reply
daniel_sahal
Esteemed Contributor
  • 0 kudos

@Sudheer_DB You can specify your own _rescued_data column name by setting the rescuedDataColumn option. https://docs.databricks.com/en/ingestion/auto-loader/schema.html#what-is-the-rescued-data-column

  • 0 kudos
2 More Replies
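
A minimal sketch of the option mentioned in the reply, written for the Python reader API rather than the SQL in the question. The column name `_my_rescued_data` and the source path placeholder are hypothetical; the reader call is left as a comment because it only runs on Databricks, where `spark` is predefined.

```python
def autoloader_csv_options(rescued_column: str) -> dict:
    """Build Auto Loader reader options with a custom rescued data column name."""
    return {
        "cloudFiles.format": "csv",
        "rescuedDataColumn": rescued_column,
    }


options = autoloader_csv_options("_my_rescued_data")  # hypothetical column name

# On Databricks (assumption: `spark` exists and the path is filled in):
# df = (spark.readStream.format("cloudFiles")
#       .options(**options)
#       .load("<source_path>"))
```

Whether this resolves the mismatch in the question depends on the rest of the schema definition, which is truncated above.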
hr959
by New Contributor II
  • 938 Views
  • 1 replies
  • 0 kudos

Access Control/Management Question

I have two workspaces made with the same account, using the same metastore and region, and I want the second workspace to be able to access only certain rows of tables held in the first workspace, based on a user group condition. Is this possible...

Latest Reply
hr959
New Contributor II
  • 0 kudos

Sorry, forgot to mention! When I tried delta sharing, all my workspaces have the same sharing identifier so the data never actually showed up in the "shared with me", and then I wasn't able to access the data I shared. It was in "shared by me" in bot...

  • 0 kudos
pm71
by New Contributor II
  • 1526 Views
  • 4 replies
  • 3 kudos

Issue with os and sys Operations in Repo Path on Databricks

Hi, Starting from today, I have encountered an issue when performing operations using the os and sys modules within the Repo path in my Databricks environment. Specifically, any operation that involves these modules results in a timeout error. However...

Latest Reply
mgradowski
New Contributor III
  • 3 kudos

https://status.azuredatabricks.net/pages/incident/5d49ec10226b9e13cb6a422e/667c08fa17fef71767abda04 "Degraded performance" is a pretty mild way of saying almost nothing productive can be done ATM...

  • 3 kudos
3 More Replies
hfyhn
by New Contributor
  • 663 Views
  • 0 replies
  • 0 kudos

DLT, combine LIVE table with data masking and row filter

I need to apply data masking and row filters to my table. At the same time I would like to use DLT Live tables. However, as far as I can see, DLT Live tables are not compatible with Live tables. What are my options? Move the tables from out of the mat...

Hertz
by New Contributor II
  • 961 Views
  • 1 replies
  • 0 kudos

System Tables / Audit Logs action_name createWarehouse/createEndpoint

I am creating a cost dashboard across multiple accounts. I am working to get SQL warehouse names and warehouse IDs so I can combine them with system.access.billing on warehouse_id. But the only action_names that include both the warehouse_id and warehouse_n...

Data Engineering
Audit Logs
cost monitor
createEndpoint
createWarehouse
Latest Reply
Hertz
New Contributor II
  • 0 kudos

I just wanted to circle back to this. It appears that the ID is returned in the response column of the create action_name.

  • 0 kudos
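
A rough sketch of pulling the ID out of a create action's response, as the reply describes. The shape assumed here (a `result` field holding a JSON string with an `id` key) and the sample values are hypothetical and should be checked against real audit log rows.

```python
import json
from typing import Optional


def warehouse_id_from_response(response: dict) -> Optional[str]:
    """Extract the warehouse ID from an audit log response, if present.

    Assumes response["result"] is a JSON string such as '{"id": "..."}'.
    """
    result = response.get("result")
    if not result:
        return None
    return json.loads(result).get("id")


# Hypothetical row, for illustration only.
sample = {"statusCode": 200, "result": '{"id": "abc123"}'}
print(warehouse_id_from_response(sample))
```

Joining that ID with the warehouse name from the request parameters of the same row would give the name-to-ID mapping the dashboard needs.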
HASSAN_UPPAL123
by New Contributor II
  • 1191 Views
  • 1 replies
  • 0 kudos

SPARK_GEN_SUBQ_0 WHERE 1=0, Error message from Server: Configuration schema is not available

Hi Community, I'm trying to read data from the nation table in the sample schema of the Databricks catalog via Spark, but I'm getting this error. com.databricks.client.support.exceptions.GeneralException: [Databricks][JDBCDriver](500051) ERROR processing q...

Data Engineering
pyspark
python
Latest Reply
HASSAN_UPPAL123
New Contributor II
  • 0 kudos

Hi Community, I'm still facing the issue. Can someone please provide a solution to fix the above error?

  • 0 kudos
Phani1
by Valued Contributor II
  • 2713 Views
  • 1 replies
  • 0 kudos

Resolved! Databricks with Private cloud

Hi Databricks Team, Is it possible for Databricks to offer support for private cloud environments other than Azure, GCP, and AWS? The client intends to utilize Databricks in their own cloud for enhanced security. If this is feasible, what is the proce...

Latest Reply
holly
Databricks Employee
  • 0 kudos

Hi Janga, Providing your own cloud is not a service we offer at this time. I can't say for certain, but it's unlikely we'll ever offer this.  You mentioned you have a 'client' so I'm assuming you're part of a consulting firm. I understand it's diffic...

  • 0 kudos
Zume
by New Contributor II
  • 831 Views
  • 1 replies
  • 0 kudos

Unity Catalog Shared compute Issues

Am I the only one experiencing challenges in migrating to Databricks Unity Catalog? I observed that in Unity Catalog-enabled compute, the "Shared" access mode is still tagged as a Preview feature. This means it is not yet safe for use in production w...

Latest Reply
jacovangelder
Honored Contributor
  • 0 kudos

Have you tried creating a volume on top of the external location, and using the volume in spark.read.parquet? i.e. spark.read.parquet('/Volumes/<volume_name>/<folder_name>/<file_name.parquet>') Edit: also, not sure why the Databricks community mana...

  • 0 kudos
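
Unity Catalog volume paths take the form `/Volumes/<catalog>/<schema>/<volume>/<path>`, so the read suggested in the reply would normally include the catalog and schema as well as the volume name. A minimal sketch of building such a path; the helper `volume_path` and all names passed to it are hypothetical.

```python
def volume_path(catalog: str, schema: str, volume: str, *parts: str) -> str:
    """Join catalog, schema, volume, and sub-path into a /Volumes/... path."""
    return "/".join(["/Volumes", catalog, schema, volume, *parts])


path = volume_path("main", "default", "landing", "raw", "file.parquet")
print(path)

# On Databricks (assumption: `spark` exists and the volume is readable):
# df = spark.read.parquet(path)
```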
Martin_Pham
by New Contributor III
  • 587 Views
  • 1 replies
  • 1 kudos

Resolved! Is Databricks-Salesforce already available to use?

Reference: Salesforce and Databricks Announce Strategic Partnership to Bring Lakehouse Data Sharing and Shared ... I was going through this article and wanted to know if this is already released. My assumption is that there’s no need to use third-part...

Latest Reply
Martin_Pham
New Contributor III
  • 1 kudos

Looks like it has been released - Salesforce BYOM

  • 1 kudos
