Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

RobsonNLPT
by Contributor
  • 406 Views
  • 2 replies
  • 0 kudos

DBFS /mounts permissions with clusters in shared mode / serverless

Hi all, I've used mounts based on service principals, but users on shared clusters or the new serverless compute have problems with permissions when accessing resources on DBFS. Right now we have used clusters in single-user mode. What should be the best approach to...

1 More Replies
Manish1231
by New Contributor
  • 295 Views
  • 0 replies
  • 0 kudos

how to migrate features from azure databricks workspace to gcp

I’m in the process of migrating feature tables from Azure Databricks to GCP Databricks and am having trouble listing all feature tables from Azure Databricks. I’ve tried using the FeatureStoreClient API, but it doesn’t have a function to list all feat...

Data Engineering
data engineering
ptambe
by New Contributor III
  • 4404 Views
  • 6 replies
  • 3 kudos

Resolved! Are concurrent writes from multiple Databricks clusters to the same Delta table on S3 supported?

Does Databricks have support for writing to the same Delta table from multiple clusters concurrently? I am specifically interested to know if there is any solution for https://github.com/delta-io/delta/issues/41 implemented in Databricks, OR if you have a...

Latest Reply
dennyglee
Contributor
  • 3 kudos

Please note, the issue noted above ([Storage System] Support for AWS S3 (multiple clusters/drivers/JVMs)) is for Delta Lake OSS. As noted in that issue as well as Issue 324, as of this writing, S3 lacks putIfAbsent transactional consistency. For Del...

  • 3 kudos
5 More Replies
talenik
by New Contributor III
  • 1182 Views
  • 2 replies
  • 1 kudos

Resolved! Ingesting logs from Databricks (GCP) to Azure log Analytics

Hi everyone, I wanted to ask if there is any way we can ingest logs from GCP Databricks into Azure Log Analytics in a store-sync fashion. Meaning we would save logs into some cloud bucket, let's say, and from there we should be able to send l...

Data Engineering
azure log analytics
Databricks
GCP databricks
google cloud
Latest Reply
talenik
New Contributor III
  • 1 kudos

Hi @Retired_mod, thanks for the help. We decided to develop our own library for logging to Azure Log Analytics, using a buffer. We are currently on timer-based logging, but in future versions we want to move to memory-based. Thanks, Nikhil

  • 1 kudos
1 More Replies
Gary_Irick
by New Contributor III
  • 7876 Views
  • 9 replies
  • 10 kudos

Delta table partition directories when column mapping is enabled

I recently created a table on a cluster in Azure running Databricks Runtime 11.1. The table is partitioned by a "date" column. I enabled column mapping, like this: ALTER TABLE {schema}.{table_name} SET TBLPROPERTIES('delta.columnMapping.mode' = 'nam...

Latest Reply
talenik
New Contributor III
  • 10 kudos

Hi @Retired_mod, I have a few queries on directory names with column mapping. I have this Delta table on ADLS and I am trying to read it, but I am getting the below error. How can we read Delta tables with column mapping enabled with PySpark? Can you pleas...

  • 10 kudos
8 More Replies
kodexolabs
by New Contributor
  • 326 Views
  • 0 replies
  • 0 kudos

Federated Learning for Decentralized, Secure Model Training

Federated learning allows you to train machine learning models on decentralized data while preserving data privacy and security: the data stays on local devices and only model updates are shared. This approach ensures that raw data never leaves its source...

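The idea described above can be sketched as a toy federated-averaging loop in plain Python. This is only an illustration of the technique, not any real Databricks or framework API; the model, datasets, and function names are all made up:

```python
# Minimal federated-averaging (FedAvg) sketch: each client trains a toy
# linear model on its own private data and shares only updated weights;
# the server averages the weights. Raw data never leaves a client.

def local_update(weights, data, lr=0.1):
    """One gradient-descent step on a client's private (x, y) pairs."""
    grad = [0.0] * len(weights)
    for x, y in data:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            grad[i] += err * xi
    n = max(len(data), 1)
    return [w - lr * g / n for w, g in zip(weights, grad)]

def federated_average(client_updates):
    """Server-side aggregation: element-wise mean of client weight vectors."""
    n = len(client_updates)
    return [sum(ws) / n for ws in zip(*client_updates)]

# Two clients with private datasets; only weights cross the boundary.
global_weights = [0.0, 0.0]
client_a = [([1.0, 0.0], 2.0)]
client_b = [([0.0, 1.0], 4.0)]
for _ in range(200):
    updates = [local_update(global_weights, d) for d in (client_a, client_b)]
    global_weights = federated_average(updates)
```

After enough rounds the averaged model fits both clients' data without the server ever seeing it.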
venkateshp
by New Contributor II
  • 534 Views
  • 3 replies
  • 3 kudos

How to reliably get the databricks run time version as part of init scripts in aws/azure databricks

We currently use the script below, but it is not working in some environments. The environment variable used in the script is not listed in this link: Databricks Environment Variables.

```bash
#!/bin/bash
echo "Databricks Runtime Version: $DATABRICKS_RUNTI...
```

Data Engineering
init scripts
Latest Reply
szymon_dybczak
Contributor III
  • 3 kudos

If the environment variable doesn't work for you, then maybe try the REST API or the Databricks CLI?

  • 3 kudos
2 More Replies
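The REST-API fallback suggested above can be sketched as follows. This assumes a workspace URL and personal access token are available; the endpoint is the standard Clusters API (`GET /api/2.0/clusters/get`), while the helper names are mine:

```python
import json
import os
import urllib.request

def get_spark_version(host, token, cluster_id):
    """Ask the Clusters API for a cluster's spark_version
    (e.g. '14.3.x-scala2.12') when the env var is unavailable."""
    req = urllib.request.Request(
        f"{host}/api/2.0/clusters/get?cluster_id={cluster_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["spark_version"]

def runtime_version_from_env(default="unknown"):
    """Inside a running cluster the runtime usually exports
    DATABRICKS_RUNTIME_VERSION; fall back gracefully when it is
    missing, as the post reports for some environments."""
    return os.environ.get("DATABRICKS_RUNTIME_VERSION", default)
```

Inside an init script, trying the environment variable first and only calling the API as a fallback avoids a network round trip on clusters where the variable is set.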
guangyi
by Contributor II
  • 618 Views
  • 1 reply
  • 0 kudos

Resolved! How exactly to create cluster policy via Databricks CLI ?

I tried these ways; none of them are working: Save the JSON config into a JSON file locally and run `databricks cluster-policies create --json cluster-policy.json`. Error message: Error: invalid character 'c' looking for beginning of value. Save the json ...

Latest Reply
szymon_dybczak
Contributor III
  • 0 kudos

Hi @guangyi, try to add @ before the name of the JSON file: `databricks cluster-policies create --json @policy.json`. Also make sure that you're escaping quotation marks like they do in the below documentation: Create a new policy | Cluster Policies API | REST API ...

  • 0 kudos
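A minimal sketch of building the policy file and invoking the CLI with the `@` prefix, per the reply above. The policy contents and file names here are made up; one detail worth noting is that the Cluster Policies API expects `definition` to be a JSON-encoded string, not a nested object:

```python
import json

# Hypothetical policy: pin the runtime and restrict node types.
policy_definition = {
    "spark_version": {"type": "fixed", "value": "14.3.x-scala2.12"},
    "node_type_id": {"type": "allowlist", "values": ["i3.xlarge", "i3.2xlarge"]},
}

# The API wants "definition" as a JSON *string*, hence the nested dumps.
with open("policy.json", "w") as f:
    json.dump(
        {"name": "my-policy", "definition": json.dumps(policy_definition)}, f
    )

# The CLI reads a file argument only when it is prefixed with '@';
# passing the bare filename makes it parse the literal string
# "cluster-policy.json" as JSON, which is exactly the
# "invalid character 'c' looking for beginning of value" error:
#
#   databricks cluster-policies create --json @policy.json
```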
ruoyuqian
by New Contributor II
  • 477 Views
  • 0 replies
  • 0 kudos

dbt writing parquet from Volumes to Catalog schema

I have run into a weird situation. I uploaded a few parquet files (about 10) for my sales data into the Volume in my catalog and ran dbt against it; dbt was successful and the table was created. However, when I upload a lot more parquet files...

mddheeraj
by New Contributor
  • 301 Views
  • 0 replies
  • 0 kudos

Streaming Kafka data without duplication

Hello, we are creating an application that reads data from a Kafka topic sent by a source. After we get the data, we do some transformations and send it to another Kafka topic. In this process the source may send the same data twice. Our questions are: 1. How can we contr...

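One common answer, sketched here in plain Python under the assumption that each message carries a stable key or ID: deduplicate on the consumer side with a bounded cache of recently seen keys. In Spark Structured Streaming, `dropDuplicates` combined with a watermark plays the same role; the class and message names below are illustrative:

```python
from collections import OrderedDict

class Deduplicator:
    """Drop messages whose key was already seen, with bounded memory."""

    def __init__(self, max_keys=100_000):
        self._seen = OrderedDict()
        self._max = max_keys

    def is_new(self, key):
        if key in self._seen:
            return False          # duplicate: same key seen recently
        self._seen[key] = True
        if len(self._seen) > self._max:
            self._seen.popitem(last=False)  # evict the oldest key
        return True

# Simulated stream where the source re-sends "order-1".
dedup = Deduplicator()
incoming = [("order-1", "a"), ("order-2", "b"), ("order-1", "a")]
unique = [msg for msg in incoming if dedup.is_new(msg[0])]
```

The bounded cache means very old duplicates can slip through; the window size (or, in Structured Streaming, the watermark delay) should be chosen from how late the source can realistically re-send.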
suqadi
by New Contributor
  • 218 Views
  • 1 reply
  • 0 kudos

systems table predictive_optimization_operations_history stays empty

Hi, for our lakehouse with Unity Catalog enabled, we enabled the predictive optimization feature for several catalogs to clean up storage with VACUUM. When we describe the catalogs, we can see that predictive optimization is enabled. The system table for ...

Latest Reply
Walter_C
Honored Contributor
  • 0 kudos

Hello, as per the docs, data could take up to 24 hours to appear. Can you confirm whether the below requirement is met? Your region must support predictive optimization (see Databricks clouds and regions).

  • 0 kudos
anh-le
by New Contributor
  • 530 Views
  • 1 reply
  • 2 kudos

Image disappears after notebook export to HTML

Hi everyone, I have an image saved in DBFS which I want to include in my notebook. I'm using the standard markdown syntax ![my image](/files/my_image.png), which works and the image shows. However, when I export the notebook to HTML, the image disappear...

Latest Reply
Walter_C
Honored Contributor
  • 2 kudos

The issue you're experiencing might be due to the fact that when you export your notebook to HTML, the image from DBFS isn't accessible in the same way as it is within the Databricks environment. The DBFS path isn't accessible from outside Databricks...

  • 2 kudos
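One workaround consistent with the reply above is to embed the image bytes directly as a base64 data URI, so the exported HTML no longer references a DBFS path at all. The file paths and the `displayHTML` usage shown are illustrative:

```python
import base64

def image_data_uri(path, mime="image/png"):
    """Read an image file and return it as a self-contained data URI,
    so the rendered HTML carries the pixels instead of a DBFS path."""
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{encoded}"

# In a Databricks notebook one would then render it with something like:
# displayHTML(f'<img src="{image_data_uri("/dbfs/files/my_image.png")}"/>')
```

Because the bytes travel inside the HTML, the image survives export; the trade-off is a larger notebook/HTML file for big images.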
prasadvaze
by Valued Contributor II
  • 401 Views
  • 1 reply
  • 2 kudos

Resolved! Grant permission on catalog but revoke from schema for the same user

I have a catalog (in Unity Catalog) containing multiple schemas. I need an AD group to have SELECT permission on all the schemas, so at the catalog level I granted SELECT to the AD group. Then I need to revoke the permission on one particular schema in this cat...

Latest Reply
Walter_C
Honored Contributor
  • 2 kudos

This unfortunately is not possible due to the hierarchical permission mechanism in UC; you will need to grant permissions on the specific schemas directly rather than granting a broader permission at the catalog level.

  • 2 kudos
Abhot
by New Contributor II
  • 5396 Views
  • 4 replies
  • 0 kudos

Temp table vs temp view vs temp table function: which one is better for large Databricks data processing?

Hello, 1) Which one is better during large data processing: a temp table, a temporary view, or a temp table function? 2) How does lazy evaluation help processing, and which of the above supports lazy evaluation?

Latest Reply
Abhot
New Contributor II
  • 0 kudos

Does anyone have any suggestions regarding the question above?

  • 0 kudos
3 More Replies
