Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

noorbasha534
by Valued Contributor II
  • 1412 Views
  • 5 replies
  • 4 kudos

Resolved! OPTIMIZE in parallel with actual data load

Dear all, if I understand correctly, OPTIMIZE cannot run in parallel with the actual data load. We see 'concurrent update' errors in our environment when this happens, and as a result we are unable to dedicate a maintenance window for table health. And, I s...

Latest Reply
noorbasha534
Valued Contributor II
  • 4 kudos

@MariuszK @szymon_dybczak thanks, both. I appreciate your support.

4 More Replies
jollymon
by New Contributor II
  • 865 Views
  • 3 replies
  • 1 kudos

Resolved! Access notebooks parameters from a bash cell

How can I access a notebook parameter from a bash cell (%sh)? For Python I use dbutils.widgets.get('param'), and for SQL I can use :param. Is there something similar for bash?

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @jollymon, I believe there is no direct way to do this, but there are some workarounds. You can read the widgets in Python and set their values as environment variables, then read those variables from the shell. Something like...
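
A minimal sketch of the workaround described in this reply, not the reply author's original code. It assumes that a shell started from the Python process inherits that process's environment, which is what makes the approach work. The variable name PARAM and the value are hypothetical. In a notebook the value would come from dbutils.widgets.get('param'); it is hard-coded here so the sketch runs anywhere.

```python
import os
import subprocess


def export_param_to_env(name: str, value: str) -> None:
    """Expose a value to child processes by storing it in the environment."""
    os.environ[name] = value


# Assumption: in a Databricks notebook this would be dbutils.widgets.get("param").
param_value = "2024-01-01"
export_param_to_env("PARAM", param_value)

# Stand-in for a %sh cell: a child shell reads the inherited variable.
result = subprocess.run(
    ["sh", "-c", 'echo "$PARAM"'],
    capture_output=True,
    text=True,
    check=True,
)
shell_output = result.stdout.strip()
print(shell_output)
```

In a notebook, the Python part would go in one cell and a later %sh cell would read $PARAM directly.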

2 More Replies
raypritha
by New Contributor II
  • 827 Views
  • 1 reply
  • 1 kudos

Resolved! Switch from Partner Academy to Customer Academy

I accidentally signed up for the partner academy when I should have signed up for the customer academy. How can I switch to the customer academy? My e-mail is the same as I use for this community platform.

Latest Reply
Advika
Community Manager
  • 1 kudos

Hello @raypritha! Please raise a ticket with the Databricks Support Team. They’ll be able to assist you with switching to the Customer Academy.

yvesbeutler
by New Contributor III
  • 2115 Views
  • 2 replies
  • 5 kudos

Resolved! run_if dependencies configuration within YAML

Hi guys, I have a workflow with various Python wheel tasks and one job task to call another workflow. How can I prevent my original workflow from getting an unsuccessful state if the second workflow fails? These workflows are independent and shouldn't ...

Latest Reply
eniwoke
Contributor II
  • 5 kudos

Hi @yvesbeutler, here is a sample of how I did it using Databricks Asset Bundles for notebook tasks:

resources:
  jobs:
    chained_jobs:
      name: chained-jobs
      tasks:
        - task_key: main
          notebook_task:
            notebook_path: /W...

1 More Replies
pgruetter
by Contributor
  • 35116 Views
  • 8 replies
  • 2 kudos

Resolved! How to use Service Principal to connect Power BI to Databricks SQL Warehouse

Hi all, I'm struggling to connect the Power BI service to a Databricks SQL Warehouse using a service principal. I'm mostly following this guide. I created a new app registration in AAD and created a client secret for it. Now I'm particularly struggling wi...

Latest Reply
Lone
New Contributor II
  • 2 kudos

Hello All, after successfully adding a service principal to Databricks and generating a client ID and client secret, I plan to use these credentials for authentication when configuring Databricks as a data source in Power BI. Could you please clar...

7 More Replies
HariPrasad1
by Databricks Partner
  • 5014 Views
  • 3 replies
  • 0 kudos

Jobs in Spark UI

Is there a way to get the URL where all the Spark jobs created in a specific notebook run can be found? I am creating an audit framework, and the requirement is to get the Spark jobs of a specific task or notebook run so that we can d...

Latest Reply
eniwoke
Contributor II
  • 0 kudos

Hi @HariPrasad1, here is a way to get the job list (note: works for non-serverless clusters):

from dbruntime.databricks_repl_context import get_context
cluster_id = spark.conf.get("spark.databricks.clusterUsageTags.clusterId")
workspaceUrl = spark.conf...

2 More Replies
turagittech
by Contributor
  • 2073 Views
  • 1 reply
  • 0 kudos

Concatenating a row to be able to hash

Hi All, sometimes when loading data we only want to update a row when its values have changed, as in SCD1-type scenarios for a data warehouse. One approach is column-by-column equivalence (A=A, B=B, etc.). Another, which I believe is pretty common, is generating a hash of all rows of interest. ...

Latest Reply
SP_6721
Honored Contributor II
  • 0 kudos

Hi @turagittech, concat_ws() is generally the most practical and reliable option here. It handles mixed datatypes well and safely skips nulls. The only edge cases you'd typically run into are with complex or unsupported custom datatypes or if the sep...
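
To illustrate the idea of hashing a concatenated row, here is a small plain-Python sketch. It does not call Spark. The local concat_ws function only imitates the null-skipping behaviour of Spark's concat_ws, and the helper names, the "||" separator and the "<NULL>" placeholder are all hypothetical choices. It also shows one consequence of skipping nulls: two rows whose null sits in a different column can concatenate to the same string, so substituting a placeholder for nulls before hashing may be worth considering.

```python
import hashlib


def concat_ws(sep, *values):
    """Plain-Python imitation of concat_ws: None values are skipped."""
    return sep.join(str(v) for v in values if v is not None)


def row_hash(*values, sep="||"):
    """SHA-256 hex digest of the separator-joined row values."""
    return hashlib.sha256(concat_ws(sep, *values).encode("utf-8")).hexdigest()


def row_hash_null_safe(*values, sep="||", null_token="<NULL>"):
    """Replace None with a placeholder first so a null keeps its position."""
    filled = [null_token if v is None else v for v in values]
    return row_hash(*filled, sep=sep)


# An unchanged row always produces the same hash, so no update is needed.
same_row_unchanged = row_hash("Alice", 34, "Geneva") == row_hash("Alice", 34, "Geneva")

# Because nulls are skipped, these two different rows both join to "a||b".
collides = row_hash("a", None, "b") == row_hash("a", "b", None)

# With a placeholder the null position is preserved and the hashes differ.
null_safe_differs = row_hash_null_safe("a", None, "b") != row_hash_null_safe("a", "b", None)

print(same_row_unchanged, collides, null_safe_differs)
```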

HoussemBL
by New Contributor III
  • 6592 Views
  • 2 replies
  • 0 kudos

Resolved! Databricks Asset Bundle deploy failure

Hello, I have successfully deployed a Databricks job that contains one task of type DLT using Databricks Asset Bundles. The first deployment works well. For this particular Databricks job, I clicked "disconnect from source" to do some customization....

Latest Reply
thibault
Contributor III
  • 0 kudos

@Walter_C, should this property be set at the same level as name, catalog, channel? I'm getting an error at schema validation (using the template from databricks bundle schema with databricks-cli v0.260.0), and the deployment does not succeed, du...

1 More Replies
sensanjoy
by Contributor II
  • 4529 Views
  • 5 replies
  • 3 kudos

Java SQL Driver Manager not working in Unity Catalog shared mode

Hi All, we are facing an issue establishing a connection to Azure SQL Server through JDBC to perform an UPSERT operation into SQL Server. Please find below the connection statement and the exception received during the run:

conn = spark._sc._jvm.java.sql.DriverMana...

Latest Reply
Karthikcv
New Contributor II
  • 3 kudos

Any update on this? I am also facing the same error.

4 More Replies
Henrik_
by New Contributor III
  • 7609 Views
  • 11 replies
  • 5 kudos

Can use graphframes DBR 14.3

I get the following error when trying to run GraphFrame on DBR 14.3. Does anyone have an idea of how I can solve this?

"""
import pyspark.sql.functions as F
from graphframes import GraphFrame

vertices = spark.createDataFrame([
    ("a", "Alice", 34),
    ("b"...

Latest Reply
Miguel_CP
Databricks Employee
  • 5 kudos

Running pip install graphframes installs version 0.6, which only supports 2.x versions of Spark. For the latest versions of graphframes, use pip install graphframes-py. As of today, this gets version 0.9.2, which is fully compatible with Spark...

10 More Replies
melikaabedi
by Databricks Partner
  • 1410 Views
  • 2 replies
  • 1 kudos

databricks apps

Imagine I develop an app in Databricks with #databricks-apps. Is it possible for someone outside the organization to use it just by accessing a URL, without having a Databricks account? Thank you in advance for your help.

Latest Reply
Gareema
Contributor
  • 1 kudos

Can we query from outside using a service principal or by creating a dummy user? Basically, we have created an app that I want to share with a few users, and we can give them access or onboard them to the Databricks workspace. The major issue is that we are not able to que...

1 More Replies
MrWick
by New Contributor
  • 2015 Views
  • 1 reply
  • 0 kudos

Opt-out of schema evolution with the Lakeflow connect for sql-server?

I am trying to set up Lakeflow Connect to pull from an on-prem SQL Server. I get connected and choose the tables I want to pull data from. Change tracking is set up on SQL Server; however, the DBAs don't want to create the helper tables f...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 0 kudos

Hi MrWick, how are you doing today? This is a great question, and it’s understandable that your DBAs may be cautious about allowing schema evolution helper tables on the SQL Server side. As of now, in Lakeflow Connect, opting out of schema evolution i...

smpa01
by Contributor
  • 2379 Views
  • 1 reply
  • 1 kudos

Resolved! viewing managed delta table files

I am getting an error when trying to view the underlying files of a managed Delta table in Unity Catalog, such as:

from pyspark.sql.functions import *
table_directory = "workspace.db_bronze.test_01"
data = [{"x": 1, "y": 2}]
df = spark.createData...

Latest Reply
mnorland
Valued Contributor II
  • 1 kudos

That is correct. Users cannot directly see the content in the managed paths for the underlying data files of a managed table in Unity Catalog (the _unitystorage subdirectory and below).

NamrataHindujaS
by New Contributor III
  • 2119 Views
  • 2 replies
  • 3 kudos

Resolved! Namrata Hinduja Geneva, Switzerland (Swiss) - Getting Started with Databricks

Hi everyone, I'm Namrata Hinduja Geneva, Switzerland (Swiss). I come from an ETL background and am looking to get started with Databricks. I'd appreciate your guidance on a clear learning roadmap, as well as any industry-recognized certifications t...

Latest Reply
NamrataHindujaS
New Contributor III
  • 3 kudos

Thanks to Vinay_M_R for your valuable reply; it’s a great help. I’ll definitely follow the instructions.

Regards,
Namrata Hinduja Geneva, Switzerland (Swiss)

1 More Replies
turagittech
by Contributor
  • 944 Views
  • 2 replies
  • 0 kudos

DLT pipeline python stop scanning all databases in source

Hi All, I have set up a DLT pipeline for SQL Server to use CDC as per these instructions: https://learn.microsoft.com/en-us/azure/databricks/ingestion/lakeflow-connect/sql-server-pipeline I have it working in principle; however, it scans all databases a...

Latest Reply
turagittech
Contributor
  • 0 kudos

I thought I might follow up on this after getting it all working with the help of my local Databricks office. As the CDC has been created, it scans metadata for the server that you connect to. This may get altered in a future release; I have no idea as to...

1 More Replies