Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

Alex_Persin
by New Contributor III
  • 5371 Views
  • 4 replies
  • 6 kudos

How can the shared memory size (/dev/shm) be increased on databricks worker nodes with custom docker images?

PyTorch uses shared memory to efficiently share tensors between its dataloader workers and its main process. However, in a Docker container the default size of the shared memory (a tmpfs file system mounted at /dev/shm) is 64 MB, which is too small to ...

Latest Reply
OxFF
New Contributor II
  • 6 kudos

I recently stumbled on this problem. It seems to make compute with custom Docker images basically unusable for any PyTorch-based, real-life computer vision ML experiments, which is unfortunate. +1 for requesting followup and possible al...

  • 6 kudos
3 More Replies
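
A minimal sketch for checking how much shared memory a container actually has before starting PyTorch dataloader workers. It only reports the size of the filesystem mounted at a given path (by default /dev/shm, as named in the post); it does not change the size. The function names and the `required_mib` threshold are illustrative assumptions, not part of any Databricks or PyTorch API.

```python
import shutil


def shared_memory_report(path="/dev/shm"):
    """Return total and free size, in MiB, of the filesystem mounted at `path`."""
    usage = shutil.disk_usage(path)
    mib = 1024 * 1024
    return {"total_mib": usage.total // mib, "free_mib": usage.free // mib}


def is_large_enough(path, required_mib):
    """True if the filesystem at `path` has at least `required_mib` MiB in total."""
    return shared_memory_report(path)["total_mib"] >= required_mib
```

On a worker node, calling `shared_memory_report()` with no arguments shows whether the container is still on the 64 MB default mentioned above.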
KristiLogos
by New Contributor III
  • 1247 Views
  • 9 replies
  • 4 kudos

Resolved! Load parent columns and not unnest using pyspark? Found invalid character(s) ' ,;{}()\n' in schema

I'm not sure I'm working this correctly, but I'm having some issues with the column names when I try to load to a table in our Databricks catalog. I have multiple .json.gz files in our blob container that I want to load to a table: df = spark.read.opti...

Latest Reply
szymon_dybczak
Contributor III
  • 4 kudos

Hi @KristiLogos, check whether any keys in your JSON contain the characters listed in the error message.

  • 4 kudos
8 More Replies
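
One way to act on that suggestion is to rename offending keys before the data is written to a table. The sketch below is plain Python and assumes the invalid characters are exactly the ones quoted in the post title (' ,;{}()\n'); the helper names are hypothetical, and replacing each character with an underscore is just one possible convention.

```python
import re

# Characters quoted in the error message from the post title: ' ,;{}()\n'
INVALID_CHARS = re.compile(r"[ ,;{}()\n]")


def sanitize_column_name(name, replacement="_"):
    """Replace every invalid character in a column name with `replacement`."""
    return INVALID_CHARS.sub(replacement, name)


def sanitize_keys(record):
    """Recursively sanitize the keys of nested dicts and lists parsed from JSON."""
    if isinstance(record, dict):
        return {sanitize_column_name(k): sanitize_keys(v) for k, v in record.items()}
    if isinstance(record, list):
        return [sanitize_keys(item) for item in record]
    return record
```

Note that two different keys can collapse to the same sanitized name (for example "a b" and "a,b"), so a real pipeline would need to check for collisions.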
valjas
by New Contributor III
  • 2226 Views
  • 1 reply
  • 0 kudos

Warehouse Name in System Tables

Hello. I am creating a table to monitor the usage of All-purpose Compute and SQL Warehouses. From the tables in the 'system' catalog, I can get cluster_name and cluster_id. However, only warehouse_id is available, not the warehouse name. Is there a way to g...

wendyl
by New Contributor II
  • 650 Views
  • 3 replies
  • 0 kudos

Connection Refused: [Databricks][JDBC](11640) Required Connection Key(s): PWD;

Hey, I'm trying to connect to Databricks using a client ID and secret. I'm using JDBC 2.6.38 with the following connection URL: jdbc:databricks://<server-hostname>:443;httpPath=<http-path>;AuthMech=11;Auth_Flow=1;OAuth2ClientId=<service-principal-...

Latest Reply
szymon_dybczak
Contributor III
  • 0 kudos

Hi @wendyl, could you answer the following questions? - Does your workspace have Private Link? - Do you use a Microsoft Entra ID managed service principal? - If you used an Entra ID managed SP, did you use the secret from Entra ID, or Azure Da...

  • 0 kudos
2 More Replies
abhinandan084
by New Contributor III
  • 21593 Views
  • 18 replies
  • 12 kudos

Community Edition signup issues

I am trying to sign up for the community edition (https://databricks.com/try-databricks) for use with a Databricks Academy course. However, I am unable to sign up and I receive the following error (image attached). On going to the login page (link in ora...

Latest Reply
brokeTechBro
New Contributor II
  • 12 kudos

Hello, I get "An error occurred, try again". I am exhausted from trying... and also from solving the puzzle to prove I'm not a robot.

  • 12 kudos
17 More Replies
Himanshu4
by New Contributor II
  • 1905 Views
  • 5 replies
  • 2 kudos

Inquiry Regarding Enabling Unity Catalog in Databricks Cluster Configuration via API

Dear Databricks Community, I hope this message finds you well. I am currently working on automating cluster configuration updates in Databricks using the API. As part of this automation, I am looking to ensure that Unity Catalog is enabled within ...

Latest Reply
Himanshu4
New Contributor II
  • 2 kudos

Hi Raphael, can we fetch job details from one workspace and create a new job in a new workspace with the same "job id" and configuration?

  • 2 kudos
4 More Replies
mayur_05
by New Contributor II
  • 482 Views
  • 3 replies
  • 0 kudos

access cluster executor logs

Hi Team, I want to get real-time logs for the cluster executors and the driver (stderr/stdout) while performing data operations, and save those logs in a catalog volume.

Latest Reply
gchandra
Databricks Employee
  • 0 kudos

You can create it for job cluster compute too. The specific cluster log folder will be under /dbfs/cluster-logs (or whatever you change it to).

  • 0 kudos
2 More Replies
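
Once logs are delivered to a folder such as the one named in the reply, copying the stdout/stderr files into a volume can be sketched with the standard library alone. This is a periodic batch copy, not true real-time streaming, and everything here is an assumption for illustration: the function name, the idea that the relevant files start with "stdout" or "stderr", and whatever source and destination paths are passed in.

```python
import shutil
from pathlib import Path


def copy_cluster_logs(log_root, dest_root, prefixes=("stdout", "stderr")):
    """Copy files whose names start with one of `prefixes` from log_root to dest_root.

    The directory layout below log_root is preserved under dest_root.
    Returns the list of destination paths that were written.
    """
    log_root, dest_root = Path(log_root), Path(dest_root)
    copied = []
    for src in sorted(log_root.rglob("*")):
        if src.is_file() and src.name.startswith(prefixes):
            dest = dest_root / src.relative_to(log_root)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)
            copied.append(dest)
    return copied
```

A call might pass the cluster log folder as `log_root` and a volume path as `dest_root`; both are placeholders to be replaced with the paths configured in your workspace.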
TheManOfSteele
by New Contributor III
  • 743 Views
  • 2 replies
  • 0 kudos

Resolved! Databricks-connect Configure a connection to serverless compute Not working

Following these instructions at https://docs.databricks.com/en/dev-tools/databricks-connect/python/install.html#configure-a-connection-to-serverless-compute, there seems to be an issue with the example code. from databricks.connect import DatabricksSe...

Latest Reply
TheManOfSteele
New Contributor III
  • 0 kudos

Worked! Thank you!

  • 0 kudos
1 More Replies
Dave_Nithio
by Contributor
  • 456 Views
  • 1 reply
  • 0 kudos

Delta Table Log History not Updating

I am running into an issue related to my Delta Log and an old version. I currently have default delta settings for delta.checkpointInterval (10 commits as this table was created prior to DBR 11.1), delta.deletedFileRetentionDuration (7 days), and del...

Latest Reply
jennie258fitz
New Contributor III
  • 0 kudos

@Dave_Nithio wrote: I am running into an issue related to my Delta Log and an old version. I currently have default delta settings for delta.checkpointInterval (10 commits as this table was created prior to DBR 11.1), delta.deletedFileRetentionDuratio...

  • 0 kudos
hpant
by New Contributor III
  • 430 Views
  • 1 reply
  • 0 kudos

"ResourceNotFound" error when connecting a DevOps repo to a Databricks workflow (job)

I have a .py file in a repo in Azure DevOps. I want to add it to a workflow in Databricks, and these are the values I have provided. And the source is this: I have provided all the values correctly but am getting this error: "ResourceNotFound". Can someon...

Latest Reply
nicole_lu_PM
Databricks Employee
  • 0 kudos

Can you try cloning the DevOps repo as a Git folder? The git folder clone interface should ask you to set up a Git credential if it's not already there.

  • 0 kudos
drii_cavalcanti
by New Contributor III
  • 2654 Views
  • 3 replies
  • 0 kudos

DBUtils commands do not work on shared access mode clusters

Hi there, I am trying to upload a file to an S3 bucket. However, none of the dbutils commands seem to work, and neither does the boto3 library. Clusters with the same configuration, except for the shared access mode, seem to work fine. Those are the error m...

Latest Reply
mvdilts1
New Contributor II
  • 0 kudos

I am encountering very similar behavior to drii_cavalcanti. When I use a Shared cluster with an IAM Role specified, I can verify that the AWS CLI is installed, but when I run aws sts get-caller-identity I receive the error "Unable to locate credential...

  • 0 kudos
2 More Replies
ziafazal
by New Contributor II
  • 672 Views
  • 3 replies
  • 0 kudos

How to stop a continuous pipeline which is set to RETRY on FAILURE and failing for some reason

I have created a continuous pipeline that is set to RETRY on FAILURE. For some reason it keeps failing and retrying. Is there any way I can stop it? Hitting the Stop button throws an error.

Latest Reply
ziafazal
New Contributor II
  • 0 kudos

Hi @szymon_dybczak, I already tried to remove it via the REST API but got the same error as in the pipeline logs. Eventually, I had to remove the workspace to get rid of it.

  • 0 kudos
2 More Replies
jen-metaplane
by New Contributor II
  • 598 Views
  • 4 replies
  • 1 kudos

How to get catalog and schema from system query table

Hi, we are querying the system.query table to parse query history. If the table in the query is not fully qualified with its catalog and schema, how can we derive the catalog and schema? Thanks, Jen

Latest Reply
filipniziol
Contributor III
  • 1 kudos

There is no straightforward method to get this data. Run this query to check the defaults: SELECT current_catalog() AS default_catalog, current_schema() AS default_schema; The catalog and schema may be changed in the query, so if you have query text you...

  • 1 kudos
3 More Replies
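
Given default values like the ones returned by the query in the reply, filling in the missing parts of a table name can be sketched as below. This is a simplified illustration under stated assumptions: the function name is hypothetical, the name is split naively on dots (so backtick-quoted identifiers that contain dots are not handled), and the defaults passed in must be the ones that were in effect when the query ran, which the sketch cannot verify.

```python
def qualify_table_name(name, default_catalog, default_schema):
    """Expand a one- or two-part table name to catalog.schema.table.

    Naive: splits on '.', so identifiers with dots inside backticks are unsupported.
    """
    parts = [part.strip("`") for part in name.split(".")]
    if len(parts) == 1:
        parts = [default_catalog, default_schema] + parts
    elif len(parts) == 2:
        parts = [default_catalog] + parts
    elif len(parts) != 3:
        raise ValueError(f"unexpected table name: {name!r}")
    return ".".join(parts)
```

For example, `qualify_table_name("orders", "main", "sales")` yields `main.sales.orders`, while a name that already has three parts is returned unchanged.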
Tom_Greenwood
by New Contributor III
  • 10991 Views
  • 12 replies
  • 3 kudos

UDF importing from other modules

Hi community, I am using a PySpark UDF. The function is imported from a repo (in the Repos section) and registered as a UDF in the notebook. I am getting a PythonException error when the transformation is run. This is coming from the databric...

Latest Reply
Abdul-Mannan
New Contributor III
  • 3 kudos

I faced this issue when I was running data ingestion on a Unity Catalog table where the cluster access mode was shared. I changed it to `Single user` and re-ran it, and now it is working.

  • 3 kudos
11 More Replies
afisl
by New Contributor II
  • 9017 Views
  • 7 replies
  • 5 kudos

Resolved! Apply unitycatalog tags programmatically

Hello, I'm interested in the "Tags" feature of columns/schemas/tables in Unity Catalog (described here: https://learn.microsoft.com/en-us/azure/databricks/data-governance/unity-catalog/tags). I've been able to play with them by hand and would now lik...

Data Engineering
tags
unitycatalog
Latest Reply
Jiri_Koutny
New Contributor III
  • 5 kudos

Hi, running ALTER TABLE SET TAGS works on views too!

  • 5 kudos
6 More Replies
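
To apply tags programmatically, one option is to generate the ALTER TABLE ... SET TAGS statement mentioned in the reply from a dictionary. The sketch below only builds the SQL string; the helper names are hypothetical, and it deliberately rejects quotes and backslashes in tag keys and values rather than guessing at escaping rules.

```python
def _tag_literal(text):
    """Wrap a tag key or value in single quotes, rejecting characters that need escaping."""
    if "'" in text or "\\" in text:
        raise ValueError(f"unsupported character in tag text: {text!r}")
    return f"'{text}'"


def set_tags_statement(table_name, tags):
    """Build an ALTER TABLE ... SET TAGS statement for a table (or view) name."""
    if not tags:
        raise ValueError("at least one tag is required")
    pairs = ", ".join(
        f"{_tag_literal(key)} = {_tag_literal(value)}" for key, value in tags.items()
    )
    return f"ALTER TABLE {table_name} SET TAGS ({pairs})"
```

In a notebook, the resulting string could be passed to `spark.sql(...)`; the table name and tag values in any call are placeholders for your own objects.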
