Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Forum Posts

alwaysmoredata
by New Contributor II
  • 551 Views
  • 1 replies
  • 0 kudos

COPY INTO from Volume failure (rabbit hole)

hey guys, I am stuck on a loading task, and I simply can't spot what is wrong. The following query fails: COPY INTO `test`.`test_databricks_tokenb3337f88ee667396b15f4e5b2dd5dbb0`.`pipeline_state` FROM '/Volumes/test/test_databricks_tokenb3337f88ee6673...

  • 551 Views
  • 1 replies
  • 0 kudos
Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

I see you are reading just 1 file; ensure that there are no zero-byte files in the directory. Zero-byte files can cause schema inference to fail. Double-check that the directory contains valid Parquet files using parquet tools. Sometimes, even if the...
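
A quick way to rule out zero-byte files is to list the Volume directory and check sizes before running COPY INTO. A minimal sketch for a Databricks notebook; the directory path is a hypothetical placeholder standing in for the truncated Volume path above:

# Flag any zero-byte files in the source directory (path is a placeholder)
src_dir = "/Volumes/test/<schema_name>/<volume_name>/pipeline_state/"
files = dbutils.fs.ls(src_dir)
empty = [f.path for f in files if f.size == 0]
print(f"{len(files)} files found, {len(empty)} zero-byte:", empty)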

  • 0 kudos
mrstevegross
by Contributor III
  • 604 Views
  • 1 replies
  • 0 kudos

How to identify the goal of a specific Spark job?

I'm analyzing the performance of a DBR/Spark request. In this case, the cluster is created using a custom image, and then we run a job on it. I've dived into the "Spark UI" part of the DBR interface, and identified 3 jobs that appear to account for an...

  • 604 Views
  • 1 replies
  • 0 kudos
Latest Reply
Lakshay
Databricks Employee
  • 0 kudos

Spark jobs are determined by your Spark code. You can look at the Spark plan to understand what operations each Spark job/stage is executing.
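
If it helps, two lightweight ways to map Spark UI jobs back to your code are printing the plan and labelling jobs before an action runs. A minimal sketch, assuming a DataFrame df already exists in the notebook; the label text is hypothetical:

# Print the physical plan so each job/stage can be matched to an operation
df.explain(mode="formatted")

# Label the next action so it appears with this description in the Spark UI jobs list
spark.sparkContext.setJobDescription("aggregate step for report X")
df.count()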

  • 0 kudos
rhtermaat
by New Contributor II
  • 1414 Views
  • 3 replies
  • 1 kudos

Databricks workspace adjust column width

Hi, is it possible to change the column width in the workspace overview? Currently I have a lot of jobs with names that are too wide for the standard overview, so it is not easy to find certain jobs.

  • 1414 Views
  • 3 replies
  • 1 kudos
Latest Reply
pranav_k1
New Contributor III
  • 1 kudos

Ahh, my mistake! You are right. It can only be done in Workflows.

  • 1 kudos
2 More Replies
pdiamond
by Contributor
  • 1333 Views
  • 2 replies
  • 0 kudos

JDBC Invalid SessionHandle with dbSQL Warehouse

Connecting Pentaho Ctools dashboards to Databricks using JDBC to a serverless dbSQL Warehouse works fine on the initial load, but if we leave it idle for a while and come back we get this error: [Databricks][JDBCDriver](500593) Communication l...

  • 1333 Views
  • 2 replies
  • 0 kudos
Latest Reply
pdiamond
Contributor
  • 0 kudos

I should have mentioned that we're using AuthMech=3 and in the JDBC docs (Databricks JDBC Driver Installation and Configuration Guide) I don't see any relevant timeout settings that would apply in that scenario. Am I missing something?

  • 0 kudos
1 More Replies
ShankarM
by Contributor
  • 1292 Views
  • 6 replies
  • 1 kudos

Unity Catalog for Enterprise level governance

Can we import cataloguing information from other non-Databricks workloads into Unity Catalog? For example, importing metadata from Synapse, Redshift, ADF, etc. into Unity Catalog for end-to-end lineage and tracking?

  • 1292 Views
  • 6 replies
  • 1 kudos
Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Yes, it is possible, but limited at the moment. This is being implemented and is currently in private preview. There is an API called "Bring-your-own Lineage". You can test it, but for that you would need to contact your account team to allow you to use the fea...

  • 1 kudos
5 More Replies
tomvogel01
by New Contributor II
  • 696 Views
  • 1 replies
  • 0 kudos

Understanding Photon Row Group Skipping

Hey guys! I am using Photon to do a simple point query on a Liquid Clustered table with the purpose of understanding the statistics. I see that a significant number of files have been pruned (`files pruned`: 1104, `files read`: …). However I am...

  • 696 Views
  • 1 replies
  • 0 kudos
Latest Reply
Sidhant07
Databricks Employee
  • 0 kudos

Hi @tomvogel01, "row groups skipped via lazy materialization" refers to the process where certain row groups are not physically read into memory during query execution. This is due to the ability of Photon to perform filtering at the row group level...
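
For reference, a minimal sketch of the kind of point query where both effects apply; the table and column names are hypothetical. File-level statistics prune whole files first, and Photon can then skip row groups inside the files it still has to open:

result = spark.sql("""
    SELECT *
    FROM main.sales.events           -- hypothetical Liquid Clustered table
    WHERE event_id = 'abc-123'       -- filter on the clustering key enables pruning
""")
result.show()
# The 'files pruned' / 'row groups skipped' counters for this statement appear in the query profile / Spark UI.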

  • 0 kudos
JCamiloCS
by New Contributor II
  • 10935 Views
  • 2 replies
  • 1 kudos

how to use R in databricks

Hello everyone. I am a new user of Databricks; it was implemented in the company where I work. I am a business analyst and I know a little R, not much, and when I saw that Databricks could use R I was very excited because I thought that the...

  • 10935 Views
  • 2 replies
  • 1 kudos
Latest Reply
Rens
New Contributor II
  • 1 kudos

There are some existing posts about using R in Databricks: https://docs.gcp.databricks.com/en/sparkr/index.html and https://docs.databricks.com/en/dev-tools/databricks-connect/cluster-config.html. Once you have the correct cluster started (this post is about...

  • 1 kudos
1 More Replies
MohsenJ
by Contributor
  • 8195 Views
  • 8 replies
  • 1 kudos

log signature and input data for Spark LinearRegression

I am looking for a way to log my `pyspark.ml.regression.LinearRegression` model with input and signature data. The usual examples that I found are using sklearn, where they can simply do: # Log the model with signature and input example signature =...

Get Started Discussions
mlflow
model_registray
  • 8195 Views
  • 8 replies
  • 1 kudos
Latest Reply
LuluLiu
New Contributor II
  • 1 kudos

I accidentally stumbled upon this ticket when researching a similar issue. Note that starting from MLflow 2.15.0 it supports VectorUDT. https://mlflow.org/releases/2.15.0
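
For anyone landing here later, a minimal sketch of logging a pyspark.ml LinearRegression with a signature on MLflow >= 2.15.0 (where VectorUDT is supported); the train_df with "features"/"label" columns is an assumption, not from the original thread:

import mlflow
from mlflow.models import infer_signature
from pyspark.ml.regression import LinearRegression

lr = LinearRegression(featuresCol="features", labelCol="label")
model = lr.fit(train_df)  # train_df is assumed to exist with a VectorUDT "features" column

# Infer the signature from a small sample of inputs and predictions
sample = train_df.limit(5)
signature = infer_signature(sample.select("features"),
                            model.transform(sample).select("prediction"))

with mlflow.start_run():
    mlflow.spark.log_model(model, artifact_path="model", signature=signature)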

  • 1 kudos
7 More Replies
SreeRam
by New Contributor
  • 472 Views
  • 0 replies
  • 0 kudos

Patient Risk Score based on health history: Unable to create data folder for artifacts in S3 bucket

Hi All, we're using the git project below to build a PoC on the concept of "Patient-Level Risk Scoring Based on Condition History": https://github.com/databricks-industry-solutions/hls-patient-risk. I was able to import the solution into Databricks and ru...

  • 472 Views
  • 0 replies
  • 0 kudos
Iguinrj11
by New Contributor II
  • 718 Views
  • 1 replies
  • 0 kudos

Data Bricks x Power BI Report Server

I connected two .pbix files to the local server. In the first, I used Import connectivity, and in the second, Direct Query connectivity. However, I encountered the following problems: Import connection: The data is viewed successfully, but it is not ...

  • 718 Views
  • 1 replies
  • 0 kudos
Latest Reply
peter598philip
New Contributor II
  • 0 kudos

@Iguinrj11 wrote: I connected two .pbix files to the local server. In the first, I used Import connectivity, and in the second, Direct Query connectivity. However, I encountered the following problems: Import connection: The data is viewed successfull...

  • 0 kudos
Sase
by New Contributor II
  • 1523 Views
  • 5 replies
  • 0 kudos

Building a Custom Usage Dashboard using APIs for Job-Level Cost Insights

Since Databricks does not provide individual cost breakdowns for components like Jobs or Compute, we aim to create a custom usage dashboard leveraging APIs to display the cost of each job run across Databricks, Azure Data Factory (ADF), or serverless...

Get Started Discussions
apis
Cost Analysis
jobs
  • 1523 Views
  • 5 replies
  • 0 kudos
Latest Reply
Isi
Honored Contributor III
  • 0 kudos

Hey, yes. I am not an Azure expert, but the Databricks REST API can help you extract usage data for serverless resources, allowing you to integrate this information into custom dashboards or external tools like Grafana. On the Azure side, costs related to wil...
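
As a complement to the REST API route, the billing system tables can give a per-job breakdown directly in SQL. A minimal sketch, assuming system.billing.usage is enabled for your account; column names follow the documented schema but double-check them in your workspace:

job_costs = spark.sql("""
    SELECT usage_metadata.job_id AS job_id,
           sku_name,
           SUM(usage_quantity)   AS total_dbus
    FROM system.billing.usage
    WHERE usage_metadata.job_id IS NOT NULL
      AND usage_date >= date_sub(current_date(), 30)
    GROUP BY usage_metadata.job_id, sku_name
    ORDER BY total_dbus DESC
""")
job_costs.show(truncate=False)

Joining the result against system.billing.list_prices (if available in your account) converts DBUs into list price for the dashboard.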

  • 0 kudos
4 More Replies
Daan
by New Contributor III
  • 2799 Views
  • 4 replies
  • 1 kudos

Resolved! Permission denied during write

Hey everyone, I have a pipeline that fetches data from S3 and stores it under the Databricks .tmp/ folder. The pipeline is always able to write around 200,000 files before I get a Permission Denied error. This happens in the following code block: os....

  • 2799 Views
  • 4 replies
  • 1 kudos
Latest Reply
Daan
New Contributor III
  • 1 kudos

Thanks for your reply, Walter! The filenames are already unique, retries produce the same result, and I have the necessary permissions, as I was able to write the other 200,000 files (with the same program, which runs continuously). It does make sense...

  • 1 kudos
3 More Replies
Ashishkumar6921
by New Contributor III
  • 15356 Views
  • 12 replies
  • 0 kudos

Resolved! databricks data engineer associate exam

Hello Team, I had a pathetic experience while attempting my 1st Databricks certification. I was taking the exam and abruptly the proctor asked me to show my desk; I showed everything, every corner of my bed. It was neat and clean with no suspiciou...

  • 15356 Views
  • 12 replies
  • 0 kudos
Latest Reply
Cert-Team
Databricks Employee
  • 0 kudos

Hi @gokul2 the badge was issued on Dec 2. We just resent the email. Please check your spam. If you continue to have issues, please file a ticket with our support team: https://help.databricks.com/s/contact-us?ReqType=training

  • 0 kudos
11 More Replies
mrstevegross
by Contributor III
  • 1327 Views
  • 3 replies
  • 0 kudos

Resolved! How to grant custom container AWS credentials for reading init script?

I'm using a custom container *and* init scripts. At runtime, I get this error: Cluster '...' was terminated. Reason: INIT_SCRIPT_FAILURE (CLIENT_ERROR). Parameters: instance_id:i-0440ddd3a2d5cce79, databricks_error_message:Cluster scoped init script...

  • 1327 Views
  • 3 replies
  • 0 kudos
Latest Reply
mrstevegross
Contributor III
  • 0 kudos

Followup: I got the AWS creds working by amending our AWS role to permit read/write access to our S3 bucket. Woohoo!
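
In case it helps the next person, this is roughly what that role change looks like sketched with boto3; the role name, policy name, and bucket are hypothetical placeholders, not the actual setup from this thread:

import json
import boto3

# Hypothetical inline policy granting read/write on the init-script bucket
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::my-init-scripts-bucket",      # placeholder bucket
                "arn:aws:s3:::my-init-scripts-bucket/*",
            ],
        }
    ],
}

iam = boto3.client("iam")
iam.put_role_policy(
    RoleName="databricks-cluster-instance-profile-role",  # placeholder role
    PolicyName="init-script-s3-access",
    PolicyDocument=json.dumps(policy),
)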

  • 0 kudos
2 More Replies
mrstevegross
by Contributor III
  • 1588 Views
  • 3 replies
  • 0 kudos

Resolved! Format when specifying docker_image url?

I am providing a custom Docker image to my Databricks/Spark job. I've created the image and uploaded it to our private ECR registry (the URL is `472542229217.dkr.ecr.us-west-2.amazonaws.com/tectonai/mrstevegross-testing:latest`). Based on the docs (h...

  • 1588 Views
  • 3 replies
  • 0 kudos
Latest Reply
mrstevegross
Contributor III
  • 0 kudos

Thanks, that's pretty much what I did; a lot of terraform configuration to get the AWS account set up properly, and now I'm able to tell DBR to load the container. (FWIW, I'm encountering *new* access issues; I started a thread here (https://communit...
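
For reference, a minimal sketch of where the image URL goes in a cluster spec sent to the Clusters API; the workspace host, token, DBR version, and instance-profile ARN are hypothetical, and with a private ECR repo the instance profile (rather than basic_auth) typically provides the pull credentials:

import requests

cluster_spec = {
    "cluster_name": "custom-image-test",        # hypothetical
    "spark_version": "15.4.x-scala2.12",        # hypothetical DBR version
    "node_type_id": "i3.xlarge",
    "num_workers": 2,
    "aws_attributes": {
        "instance_profile_arn": "arn:aws:iam::123456789012:instance-profile/ecr-pull"  # placeholder
    },
    "docker_image": {
        "url": "472542229217.dkr.ecr.us-west-2.amazonaws.com/tectonai/mrstevegross-testing:latest"
    },
}

resp = requests.post(
    "https://<workspace-host>/api/2.1/clusters/create",  # placeholder host
    headers={"Authorization": "Bearer <token>"},          # placeholder token
    json=cluster_spec,
)
print(resp.json())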

  • 0 kudos
2 More Replies
