Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

sandy311
by New Contributor III
  • 1143 Views
  • 4 replies
  • 3 kudos

if else conditions in databricks asset bundles

Can I use if-else conditions in databricks.yml and parameterize my asset bundles similarly to Azure Pipelines YAML?

Latest Reply
filipniziol
Contributor III
  • 3 kudos

Hi @sandy311, could you please provide more details on what you're trying to achieve? It seems like you are looking to use Databricks Asset Bundles as complete CI/CD pipelines. While Databricks Asset Bundles are a crucial part of the CI/CD process, th...

3 More Replies
cm04
by New Contributor III
  • 516 Views
  • 2 replies
  • 3 kudos

Resolved! Why does my job run on shared compute instead of job compute?

I have configured a job using `databricks.yml`:
```
resources:
  jobs:
    my_job:
      name: my_job
      tasks:
        - task_key: create_feature_tables
          job_cluster_key: my_job_cluster
          spark_python_task:
            python_file: ../src/c...
```

Latest Reply
szymon_dybczak
Contributor III
  • 3 kudos

Hi @cm04, you can try upgrading the CLI to the newest version. I've seen a similar issue before, and upgrading the CLI was the solution back then.
Solved: Yml file replacing job cluster with all-purpose cl... - Databricks Community - 72248

1 More Replies
Deloitte_DS
by New Contributor II
  • 3349 Views
  • 2 replies
  • 0 kudos

Unable to install poppler-utils

Hi, I'm trying to install the system-level package "Poppler-utils" for the cluster. I added the following line to the init.sh script:
sudo apt-get -f -y install poppler-utils
I got the following error: PDFInfoNotInstalledError: Unable to get page count. Is ...

Latest Reply
dheeraj-cir
New Contributor II
  • 0 kudos

Use a personal cluster and run `!sudo apt-get update` and `!sudo apt-get install -y poppler-utils`.

1 More Replies
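
For the poppler-utils thread above: `PDFInfoNotInstalledError` is raised by the pdf2image package when it cannot run the `pdfinfo` binary, which poppler-utils provides. Below is a minimal sketch, assuming Python is available on the cluster, that checks whether those binaries are on the PATH after the init script or the `apt-get` commands have run. The helper name `missing_poppler_tools` is hypothetical and not part of any library.

```python
import shutil

# Binaries shipped by poppler-utils that PDF tooling commonly calls.
POPPLER_TOOLS = ("pdfinfo", "pdftoppm")


def missing_poppler_tools(tools=POPPLER_TOOLS):
    """Return the names of the given tools that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_poppler_tools()
    if missing:
        print("Not on PATH:", ", ".join(missing))
    else:
        print("poppler-utils binaries found")
```

An empty list suggests the install reached the node where the check ran; a non-empty list suggests the package was installed somewhere else or not at all.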
Graham
by New Contributor III
  • 4566 Views
  • 4 replies
  • 3 kudos

Resolved! Inline comment next to un-tickmarked SET statement = Syntax error

Running this code in Databricks SQL works great:
SET USE_CACHED_RESULT = FALSE;
-- Result:
-- key value
-- USE_CACHED_RESULT FALSE
If I add an inline comment, however, I get a syntax error:
SET USE_CACHED_RESUL...

Latest Reply
rafal_walisko
New Contributor II
  • 3 kudos

Hi, I'm getting the same error when trying to execute a statement through the API: "statement": "SET `USE_CACHED_RESULT` = FALSE; SELECT COUNT(*) FROM TABLE". Every combination fails: "status": { "state": "FAILED", "error": { "e...

3 More Replies
shri0509
by New Contributor II
  • 1112 Views
  • 5 replies
  • 1 kudos

How to avoid iteration/loop in databricks in the given scenario

Hi all, I need your input. I am new to Databricks and working with a dataset that consists of around 10,000 systems, each containing approximately 100 to 150 parts. These parts have attributes such as name, version, and serial number. The dataset size...

Latest Reply
AnnieWhite
New Contributor II
  • 1 kudos

Thank you so much for the link.

4 More Replies
Tico23
by Contributor
  • 13798 Views
  • 12 replies
  • 10 kudos

Connecting SQL Server (on-premise) to Databricks via jdbc:sqlserver

Is it possible to connect to SQL Server on-premise (not Azure) from Databricks? I tried to ping my VirtualBox VM (with Windows Server 2022) from within Databricks and the request timed out.
%sh
ping 122.138.0.14
This is what my connection might look l...

Latest Reply
BharathKumarS
New Contributor II
  • 10 kudos

I tried to connect to a localhost SQL Server from Databricks Community Edition, but it failed. I created an IP rule on port 1433 that allows inbound connections from all public networks, but it still didn't connect. I tried locally using Python; its work...

11 More Replies
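
For the SQL Server thread above: a timed-out `ping` is not conclusive on its own, since ICMP is often blocked even where TCP traffic is allowed, so testing the SQL Server port directly tends to be more informative. Below is a minimal sketch with hypothetical helper names. The URL follows the `jdbc:sqlserver://host:port;property=value` form used by the Microsoft JDBC driver. Whether a workspace can reach an on-premise host at all depends on network setup that the thread does not describe.

```python
import socket
from typing import Optional


def can_reach(host: str, port: int = 1433, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def sqlserver_jdbc_url(host: str, port: int = 1433,
                       database: Optional[str] = None) -> str:
    """Build a jdbc:sqlserver URL; databaseName is appended only if given."""
    url = f"jdbc:sqlserver://{host}:{port}"
    if database:
        url += f";databaseName={database}"
    return url
```

In a notebook, one might first call `can_reach("122.138.0.14")` with the address from the original post, and only if it returns True pass `sqlserver_jdbc_url(...)` as the `url` option of Spark's JDBC data source. Any database name supplied would be a placeholder here, since the post does not give one.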
guangyi
by Contributor III
  • 709 Views
  • 4 replies
  • 2 kudos

Resolved! How to create a DLT pipeline with SQL statement

I need a DLT pipeline to create a materialized view for fetching event logs. All the approaches I tried below failed:
  • Attach a notebook with pure SQL inside, with no magic cells like `%sql`: failed
  • Attach a notebook with `spark.sql` Python code: failed beca...

Latest Reply
guangyi
Contributor III
  • 2 kudos

After finishing my last reply, I realized what's wrong with my code: I should use the "file" property instead of "notebook" in the libraries section. It works now. Thank you guys, you are my rubber duck!

3 More Replies
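
For the DLT thread above: the fix guangyi describes is to reference the SQL source under a `file` entry rather than a `notebook` entry in the pipeline's `libraries` section. Below is a minimal sketch of that choice as a Python helper that produces an entry in the shape pipeline settings use (`{"file": {"path": ...}}` or `{"notebook": {"path": ...}}`). The helper name, the extension-based rule, and the example path are assumptions for illustration and are not taken from the thread.

```python
# Suffixes treated as plain source files rather than notebooks (an assumption).
SOURCE_FILE_SUFFIXES = (".sql", ".py")


def library_entry(path: str) -> dict:
    """Return a pipeline `libraries` entry keyed by 'file' or 'notebook'."""
    kind = "file" if path.lower().endswith(SOURCE_FILE_SUFFIXES) else "notebook"
    return {kind: {"path": path}}


# Hypothetical path to a SQL file that defines the materialized view.
libraries = [library_entry("./src/event_log_view.sql")]
```

The resulting list can be dumped to YAML or JSON when generating pipeline settings. A real project may need a different rule for telling files from notebooks.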
MarkV
by New Contributor II
  • 465 Views
  • 1 reply
  • 0 kudos

Getting PermissionDenied in SDK When Updating External Location Isolation Mode

Using the Databricks SDK for Python in a notebook in a Databricks workspace, I'm creating an external location and then attempting to update the isolation mode and workspace bindings associated with the external location. The step to create the exter...

Latest Reply
MarkV
New Contributor II
  • 0 kudos

Let me clean up these cells for better readability:
%pip install databricks-sdk --upgrade
dbutils.library.restartPython()
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import catalog
w = WorkspaceClient()
# This works without issu...

Agus1
by New Contributor III
  • 3103 Views
  • 2 replies
  • 0 kudos

Obtain the source table version number from checkpoint file when using Structured Streaming

Hello! I'm using Structured Streaming to write to a Delta table. The source is another Delta table written with Structured Streaming as well. In order to datacheck the results I'm attempting to obtain from the checkpoint files of the target table the ...

Latest Reply
Agus1
New Contributor III
  • 0 kudos

Hello @Retired_mod, thank you for your answer. I'm a bit confused here because you seem to be describing the opposite behavior of what I've seen in our checkpoint files. Here I repost my examples to try to understand better.
First checkpoint file:
{"sour...

1 More Replies
simple89
by New Contributor
  • 238 Views
  • 0 replies
  • 0 kudos

Runtime increases exponentially from 11.3 to 13.3

Hello. I am using R on Databricks with the approach below. My Spark version:
Single node: i3.2xlarge · On-demand · DBR: 11.3 LTS (includes Apache Spark 3.3.0, Scala 2.12) · us-east-1a; the job takes 1 hour.
I install all R packages (including a geo...

ae20cg
by New Contributor III
  • 3704 Views
  • 5 replies
  • 9 kudos

Databricks Cluster Web terminal different permissions with tmux and xterm.

I am launching web terminal on my databricks cluster and when I am using the ephemeral xterm instance I am easily able to navigate to desired directory in `Workspace` and run anything... for example `ls ./` When I switch to tmux so that I can preserv...

Latest Reply
alenka
New Contributor III
  • 9 kudos

Hey there, fellow data explorer pals! I totally get your excitement when launching that web terminal on your Databricks cluster and feeling the power of running commands like 'ls ./' in the ephemeral xterm instance. It's like traversing the vast univ...

4 More Replies
kranthi2
by New Contributor III
  • 534 Views
  • 2 replies
  • 2 kudos

Resolved! alter DLT Materialized View alter column set MASK

I am trying to mask a column on a DLT materialized view that was created using DLT syntax. I am not able to set the column mask after creation. I'd appreciate any workaround.
alter DLT Materialized View alter column set MASK

Latest Reply
kranthi2
New Contributor III
  • 2 kudos

Thank you. I will submit the idea.

1 More Replies
prasadvaze
by Valued Contributor II
  • 20405 Views
  • 15 replies
  • 12 kudos

Resolved! How to query delta lake using SQL desktop tools like SSMS or DBVisualizer

Is there a way to use SQL desktop tools? Neither Delta OSS nor Databricks provides a desktop client (similar to Azure Data Studio) to browse and query Delta Lake objects. I currently use Databricks SQL, a web UI in the Databricks workspace, but se...

Latest Reply
prasadvaze
Valued Contributor II
  • 12 kudos

DSR is Delta Standalone Reader. See more here: https://docs.delta.io/latest/delta-standalone.html
It's a crate (and also now a Python library) that allows you to connect to Delta tables without using Spark (e.g. directly from Python and not using pyspa...

14 More Replies
