Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

jeremy98
by Contributor
  • 438 Views
  • 1 reply
  • 0 kudos

Resolved! how to pass using DABs the parameters to use

Hello Community, I want to pass parameters to my Databricks job through the DABs CLI. Specifically, I'd like to be able to run a job with parameters directly using the command:  databricks bundle run -t prod --params [for example: table_name="client"...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @jeremy98, you can pass parameters using the CLI in the following way: databricks bundle run -t ENV --params Param1=Value1,Param2=Value2 Job_Name. And in your yml file you should define the parameters in a similar way to the following: You can find more info in the follo...
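
On the receiving side, a minimal sketch (an assumption, not from the thread) of how a notebook task could read such a parameter; the parameter name table_name comes from the question's example, and dbutils/spark are the objects predefined in a Databricks notebook:

```python
# Minimal sketch, assuming a notebook task and a job parameter named "table_name"
# (the name used in the question). Job parameters are exposed to notebook tasks
# as widgets.
table_name = dbutils.widgets.get("table_name")   # e.g. "client" when passed via --params
df = spark.table(table_name)
display(df)
```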

parasupadhyay
by New Contributor
  • 243 Views
  • 1 reply
  • 0 kudos

Pydeequ is not working with shared mode clusters on a Unity Catalog-enabled Databricks account

Hi folks, recently I was doing a PoC on pydeequ. I found that it is not working with shared mode clusters in a Databricks account where Unity Catalog is enabled. It throws the error below. It works fine with single mode clusters. Does anyone have ...

(attached screenshot: parasupadhyay_0-1733142604026.png)
Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @parasupadhyay, I don't think there is much you can do. For security reasons, you can't access the sparkContext in shared access mode on UC-enabled clusters. Your only option to make it work is to use a single user mode cluster.

646901
by New Contributor II
  • 349 Views
  • 2 replies
  • 0 kudos

Get user who ran a job

From the Databricks API/CLI, is it possible to get the user who triggered a job run programmatically? The information can be found in the job "event log" and can be queried in the "audit log", but neither of these seems to have an API option. Is there a w...

Latest Reply
Stefan-Koch
Valued Contributor
  • 0 kudos

With the Databricks CLI you can get all the information about a job run with this command: databricks jobs get-run <run-id> (replace <run-id> with your actual run ID).
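
Beyond the CLI, a hedged sketch with the Databricks SDK for Python (not mentioned in the thread): the get-run response includes a creator_user_name field, and the run ID below is a placeholder.

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()                     # assumes auth via env vars or a CLI profile
run = w.jobs.get_run(run_id=123456789)    # hypothetical run ID
print(run.creator_user_name)              # user who triggered the run
```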

1 More Reply
matmat13
by New Contributor II
  • 17664 Views
  • 19 replies
  • 10 kudos

Resolved! Lakehouse Fundamentals Certificate/Badge not appearing

Hello! I just passed the Lakehouse Fundamentals Accreditation and I haven't received any badge or certificate for it. I understand that I need to go to credentials.databricks.com but it is not there. How long before it appears? Need help

Latest Reply
andy25
New Contributor II
  • 10 kudos

Hello, can you help me? I have the same problem. I don't have the certificate available on credentials.databricks.com.

18 More Replies
dev_puli
by New Contributor III
  • 38731 Views
  • 6 replies
  • 5 kudos

How to read a CSV file from the user's workspace

Hi! I have been carrying out a POC, so I created a CSV file in my workspace and tried to read its content using the techniques below in a Python notebook, but it did not work. Option 1: repo_file = "/Workspace/Users/u1@org.com/csv files/f1.csv" tmp_file_na...

Latest Reply
MujtabaNoori
New Contributor III
  • 5 kudos

Hi @Dev, generally the Spark reader APIs point to DBFS by default. To read a file from the user workspace, we need to prepend 'file:/' to the path. Thanks
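
A minimal sketch of that suggestion in a Databricks notebook; the path is the one from the question, and the header/schema options are assumptions:

```python
repo_file = "/Workspace/Users/u1@org.com/csv files/f1.csv"   # path from the question

# Prefix with "file:" so Spark reads from the workspace filesystem instead of DBFS.
df = spark.read.csv(f"file:{repo_file}", header=True, inferSchema=True)
df.show()
```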

5 More Replies
AH
by New Contributor III
  • 1078 Views
  • 3 replies
  • 0 kudos

Databricks Genie AI Next Level Q&A

Hi Team, could you please answer these three questions to onboard Genie Space for Analytics? Do you know if we can use Genie Space in our web application through an API or SDK? Is there any way to manage access control other than Unity Catalog? Can we ad...

Latest Reply
jennie258fitz
New Contributor III
  • 0 kudos

@AH wrote: Hi Team, could you please answer these three questions to onboard Genie Space for Analytics? Do you know if we can use Genie Space in our web application through an API or SDK? Is there any way to manage access control other than the Uni...

2 More Replies
avnish26
by New Contributor III
  • 11729 Views
  • 5 replies
  • 9 kudos

Spark 3.3.0 connect kafka problem

I am trying to connect to Kafka from Spark but am getting an error. Kafka version: 2.4.1, Spark version: 3.3.0. I am using a Jupyter notebook to execute the PySpark code below: ```from pyspark.sql.functions import * from pyspark.sql.types import * #import libr...

Latest Reply
jose_gonzalez
Databricks Employee
  • 9 kudos

Hi @avnish26, did you add the JAR files to the cluster? Do you still have issues? Please let us know.
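
For reference, a hedged sketch of wiring the Kafka connector into a standalone PySpark session; the broker and topic are placeholders, and the connector version matches the Spark 3.3.0 / Scala 2.12 combination mentioned in the post and may need adjusting:

```python
from pyspark.sql import SparkSession

# Pull the Kafka connector onto the classpath when running outside Databricks
# (e.g. in Jupyter, as in the post).
spark = (
    SparkSession.builder
    .appName("kafka-test")
    .config("spark.jars.packages", "org.apache.spark:spark-sql-kafka-0-10_2.12:3.3.0")
    .getOrCreate()
)

df = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")   # hypothetical broker
    .option("subscribe", "my_topic")                        # hypothetical topic
    .load()
)
```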

4 More Replies
Erik
by Valued Contributor III
  • 584 Views
  • 3 replies
  • 2 kudos

Managing streaming checkpoints with unity catalog

This is partly a question, partly a feature request: how do you guys handle streaming checkpoints in combination with Unity Catalog managed tables? It seems like the only way is to create a volume and manually specify paths in it as streaming checkpo...

Latest Reply
cgrant
Databricks Employee
  • 2 kudos

For Structured Streaming applications, this would be a nice feature. Delta Live Tables manages checkpoints for you out of the box - you don't even have to reason about checkpoints at all; I would recommend checking it out!
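
For the volume-based approach the question describes, a minimal sketch (catalog, schema, volume, and table names are placeholders, not from the thread):

```python
# Checkpoints live in a UC volume; the target is a UC managed table.
checkpoint_path = "/Volumes/my_catalog/my_schema/checkpoints/orders_stream"

query = (
    spark.readStream.table("my_catalog.my_schema.orders_raw")   # hypothetical source
    .writeStream
    .option("checkpointLocation", checkpoint_path)
    .toTable("my_catalog.my_schema.orders_clean")               # hypothetical managed target
)
```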

2 More Replies
labromb
by Contributor
  • 12918 Views
  • 10 replies
  • 4 kudos

How to pass configuration values to a Delta Live Tables job through the Delta Live Tables API

Hi Community, I have successfully run a job through the API, but I need to be able to pass parameters (configuration) to the DLT workflow via the API. I have tried passing JSON in this format: { "full_refresh": "true", "configuration": [ ...

Latest Reply
Edthehead
Contributor III
  • 4 kudos

You cannot pass parameters from a Databricks job to a DLT pipeline. At least not yet. You can see from the DLT REST API that there is no option for it to accept any parameters. But there is a workaround. With the assumption tha...
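
The workaround itself is cut off above; one possible approach (an assumption, not necessarily what the reply describes) is to update the pipeline's configuration via the Pipelines REST API before triggering the refresh. Host, token, pipeline ID, and the configuration key are placeholders:

```python
import requests

host = "https://<workspace-host>"
headers = {"Authorization": "Bearer <token>"}
pipeline_id = "<pipeline-id>"

# Read the current pipeline settings, merge in the new configuration values,
# and write them back.
spec = requests.get(f"{host}/api/2.0/pipelines/{pipeline_id}", headers=headers).json()["spec"]
spec["configuration"] = {**spec.get("configuration", {}), "my_param": "my_value"}
requests.put(f"{host}/api/2.0/pipelines/{pipeline_id}", headers=headers, json=spec)

# Start an update that picks up the new configuration.
requests.post(f"{host}/api/2.0/pipelines/{pipeline_id}/updates",
              headers=headers, json={"full_refresh": True})
```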

9 More Replies
ashraf1395
by Valued Contributor II
  • 289 Views
  • 1 reply
  • 0 kudos

Resolved! Column level and table level tagging in dlt pipeline

Can I set column-level and table-level tags programmatically in a DLT pipeline? I tried the normal way using spark.sql(f"alter table and set tags (key=value)"), a syntax I found in one of the Databricks Community posts. But can we do...
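
For reference, the tagging syntax the post refers to, as a minimal sketch run through spark.sql (table and column names are placeholders; whether this works inside a DLT pipeline is exactly the open question above):

```python
# Table-level tags on a Unity Catalog table.
spark.sql("ALTER TABLE main.sales.orders SET TAGS ('domain' = 'sales', 'layer' = 'silver')")

# Column-level tags.
spark.sql("ALTER TABLE main.sales.orders ALTER COLUMN customer_email SET TAGS ('pii' = 'true')")
```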

Latest Reply
julie598doyle
New Contributor III
  • 0 kudos

@ashraf1395 wrote: Can I set column-level and table-level tags programmatically in a DLT pipeline? I tried the normal way using spark.sql(f"alter table and set tags (key=value)"), a syntax I found in one of the Databricks Community p...

oishimbo
by New Contributor
  • 9506 Views
  • 3 replies
  • 1 kudos

Databricks time travel - how to get ALL changes ever done to a table

Hi time travel gurus, I am investigating creating a reporting solution with AsOf functionality. Users will be able to create a report based on the current data or on the data as of some time ago. Due to the nature of our data, this AsOf feature is qu...

Latest Reply
mcveyroosevelt
New Contributor III
  • 1 kudos

In Databricks, you can use time travel to access historical versions of a table using the versionAsOf or timestampAsOf options in a SELECT query. To retrieve all changes made to a table, you would typically query the table's historical versions, spec...
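
A minimal sketch of those building blocks (table name, version, and timestamp are placeholders):

```python
# List the table's version history (one row per change).
spark.sql("DESCRIBE HISTORY my_catalog.my_schema.my_table").show(truncate=False)

# Read the table as of a specific version or timestamp.
v5 = spark.sql("SELECT * FROM my_catalog.my_schema.my_table VERSION AS OF 5")
asof = spark.sql("SELECT * FROM my_catalog.my_schema.my_table TIMESTAMP AS OF '2024-06-01'")
```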

2 More Replies
ayush_273
by New Contributor
  • 635 Views
  • 4 replies
  • 0 kudos

Migrate transformations from Snowflake to Databricks

I want to migrate my data as well as the transformations I use to convert my raw data into BI data. Is there a way I can move those transformations to Databricks? Some of the transformations use native Snowflake functions. Thanks in advance.

Latest Reply
infinitylearnin
New Contributor III
  • 0 kudos

When you migrate the data and transformations, we map the data to Delta Lake and the transformations into Spark SQL notebooks.

3 More Replies
ashap551
by New Contributor II
  • 653 Views
  • 2 replies
  • 0 kudos

JDBC Connection to NetSuite SuiteAnalytics Using Token-Based-Authentication (TBA)

I'm trying to connect to NetSuite2.com using PySpark from a Databricks notebook utilizing a JDBC driver. I was successful in setting up my DBVisualizer connection by installing the JDBC driver (JAR) and generating the password with the one-time hashin...

Latest Reply
alicerichard65
New Contributor II
  • 0 kudos

It seems like the issue might be related to the password generation or the JDBC URL configuration. Here are a few things you can check: 1. Password Generation: Ensure that the generate_tba_password function is correctly implemented ...
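
For context, a generic Spark JDBC read sketch; the driver class, JDBC URL, and TBA-generated password all come from the NetSuite SuiteAnalytics Connect driver documentation and are left as placeholders here:

```python
df = (
    spark.read.format("jdbc")
    .option("url", "<jdbc-url-from-suiteanalytics-connect-docs>")
    .option("driver", "<netsuite-jdbc-driver-class>")
    .option("query", "SELECT * FROM transaction")        # hypothetical query
    .option("user", "<account-user>")
    .option("password", "<tba-generated-password>")       # regenerated via the one-time hashing step
    .load()
)
df.show()
```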

1 More Reply
Sanjeev
by New Contributor II
  • 321 Views
  • 1 reply
  • 0 kudos

Resolved! Triggering a Databricks job more than once daily

Hi Team, I have a requirement to trigger a Databricks job more than once daily, maybe twice or thrice daily. I have checked Workflows but I couldn't find any option in the UI. Please advise.

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hello @Sanjeev, you might want to try this option: you can use cron syntax to schedule jobs. Please refer to the document below: https://docs.databricks.com/en/jobs/scheduled.html
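
A sketch of the same thing with the Databricks SDK for Python (job ID, hours, and timezone are placeholders); the Quartz expression fires at 06:00 and 18:00 every day:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()
w.jobs.update(
    job_id=123456789,                                   # hypothetical job ID
    new_settings=jobs.JobSettings(
        schedule=jobs.CronSchedule(
            quartz_cron_expression="0 0 6,18 * * ?",    # twice daily
            timezone_id="UTC",
        )
    ),
)
```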

Pingleinferyx
by New Contributor
  • 935 Views
  • 7 replies
  • 0 kudos

JDBC integration returning header as data for read operation

package com.example.databricks;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class DatabricksJDBCApp {
    public static void main(String[] args) {
        // Initialize Spark Ses...

Latest Reply
Dengineer
New Contributor II
  • 0 kudos

After reading through the Driver documentation I've finally found a solution that appears to work for me. I've added .option("UseNativeQuery", 0) to my JDBC connection. The query that was being passed from the Databricks Driver to the Databricks Clus...
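
A hedged PySpark equivalent of that fix (the original post is Java, but the driver option is the same); workspace host, HTTP path, token, and table name are placeholders:

```python
jdbc_url = (
    "jdbc:databricks://<workspace-host>:443/default;"
    "transportMode=http;ssl=1;httpPath=<http-path>;AuthMech=3;"
    "UID=token;PWD=<personal-access-token>"
)

df = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("driver", "com.databricks.client.jdbc.Driver")
    .option("dbtable", "my_schema.my_table")   # hypothetical table
    .option("UseNativeQuery", "0")             # the workaround from the reply
    .load()
)
df.show()
```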

6 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.
