Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

JKR
by Contributor
  • 3553 Views
  • 1 reply
  • 0 kudos

Databricks sql variables and if/else workflow

I have 2 tasks in a Databricks job workflow; the first task is of type SQL, and the SQL task is a query. In that query I've declared 2 variables and SET the values by running a query, e.g.: DECLARE VARIABLE max_timestamp TIMESTAMP DEFAULT '1970-01-01'; SET VARIABLE max_...

Data Engineering
databricks-sql
Workflows
Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Please try with max_timestamp = dbutils.jobs.taskValues.get(taskKey="sql_task_1", key="max_timestamp") in the downstream task, after the upstream task has published the value with dbutils.jobs.taskValues.set(key="max_timestamp", value=max_timestamp). Reference: https://docs.databricks.com/en/jobs/task-values.html

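For the if/else part, a condition task can branch on the value the first task publishes. A minimal sketch of the job JSON, assuming hypothetical task names and a dynamic value reference to the task value (adjust the names, warehouse ID, and comparison to your job):

```json
{
  "tasks": [
    {
      "task_key": "sql_task_1",
      "sql_task": {
        "warehouse_id": "<warehouse-id>",
        "query": { "query_id": "<query-id>" }
      }
    },
    {
      "task_key": "check_max_timestamp",
      "depends_on": [{ "task_key": "sql_task_1" }],
      "condition_task": {
        "op": "GREATER_THAN",
        "left": "{{tasks.sql_task_1.values.max_timestamp}}",
        "right": "1970-01-01"
      }
    }
  ]
}
```

Downstream tasks then depend on check_max_timestamp with outcome "true" or "false".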
willie_nelson
by New Contributor II
  • 1617 Views
  • 3 replies
  • 1 kudos

ABFS Authentication with a SAS token -> 403!

Hi guys, I'm running a streamReader/Writer with Auto Loader from StorageV2 (general purpose v2) over abfss instead of wasbs. My checkpoint location is valid, the reader properly reads the file schema, and Auto Loader is able to sample 105 files to do so....

Latest Reply
BricksGuy
New Contributor III
  • 1 kudos

Would you mind pasting your sample code, please? I am trying to use abfss with Auto Loader and getting an error like yours.

2 More Replies
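For anyone hitting the same 403: with a SAS token this usually points at missing permissions (list/read on the source, write on the checkpoint) or an expired token rather than at Auto Loader itself. Below is a small, self-contained sketch that only builds the fixed-SAS Spark confs documented for ABFS; the storage account name and token are placeholders, and you would apply each pair with spark.conf.set on the cluster:

```python
def abfss_sas_conf(storage_account: str, sas_token: str) -> dict:
    """Build the Spark confs for fixed-SAS authentication against ABFS.

    A 403 often means the SAS token lacks the needed permissions
    (read/list for the reader, write for the checkpoint) or has expired.
    """
    suffix = f"{storage_account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{suffix}": "SAS",
        f"fs.azure.sas.token.provider.type.{suffix}":
            "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider",
        f"fs.azure.sas.fixed.token.{suffix}": sas_token,
    }

# Placeholder account/token; on a cluster: for k, v in conf.items(): spark.conf.set(k, v)
conf = abfss_sas_conf("mystorageacct", "sv=2022-11-02&ss=b&...")
for key in conf:
    print(key)
```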
Vetrivel
by Contributor
  • 3960 Views
  • 3 replies
  • 1 kudos

Resolved! SSIS packages migration to Databricks Workflows

We are doing a POC to migrate SSIS packages to Databricks Workflows as part of our effort to build the analytics layer, including dimension and fact tables. How can we accelerate or automate the SSIS package migration to the Databricks environment?

Latest Reply
BlakeHill
New Contributor II
  • 1 kudos

Thank you so much for the solution.

2 More Replies
GabrieleMuciacc
by New Contributor III
  • 6452 Views
  • 5 replies
  • 2 kudos

Resolved! Support for kwargs parameter in `/2.1/jobs/create` endpoint for `python_wheel_task`

If I create a job from the web UI and I select Python wheel, I can add kwargs parameters. Judging from the generated JSON job description, they appear under a section named `namedParameters`. However, if I use the REST APIs to create a job, it appears...

Latest Reply
manojpatil04
New Contributor III
  • 2 kudos

@GabrieleMuciacc , in the case of a serverless compute job this can be passed as an external dependency; you can't use libraries. "tasks": [{"task_key": task_id, "spark_python_task": {"python_file": py_file, ...

4 More Replies
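For reference, on the REST side the UI's kwargs map to the `named_parameters` field of `python_wheel_task`. A sketch of a `/2.1/jobs/create` payload (package name, entry point, and cluster ID are hypothetical):

```json
{
  "name": "wheel-job",
  "tasks": [
    {
      "task_key": "wheel_task",
      "existing_cluster_id": "<cluster-id>",
      "python_wheel_task": {
        "package_name": "my_package",
        "entry_point": "main",
        "named_parameters": { "env": "prod", "retries": "3" }
      }
    }
  ]
}
```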
radix
by New Contributor II
  • 4023 Views
  • 1 reply
  • 0 kudos

Pool clusters and init scripts

Hey, just trying out pool clusters and providing the instance_pool_type and driver_instance_pool_id configuration to the Airflow new_cluster field. I also pass the init_scripts field with an S3 link as usual, but in this case of pool clusters it doesn't...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

When using a non-pool cluster, are you able to see the init script being deployed? You could enable init script logging to see whether it is being called at all: https://docs.databricks.com/en/init-scripts/logs.html

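For comparison, here is a `new_cluster` fragment that combines an instance pool with an S3 init script, following the Clusters API fields (the pool IDs, bucket path, and region are placeholders):

```json
{
  "new_cluster": {
    "spark_version": "14.3.x-scala2.12",
    "num_workers": 2,
    "instance_pool_id": "<worker-pool-id>",
    "driver_instance_pool_id": "<driver-pool-id>",
    "init_scripts": [
      {
        "s3": {
          "destination": "s3://my-bucket/init/install-libs.sh",
          "region": "us-east-1"
        }
      }
    ]
  }
}
```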
Direo
by Contributor II
  • 5058 Views
  • 1 reply
  • 0 kudos

Managing Secrets for Different Groups in a Databricks Workspace

Hi everyone, I'm looking for some advice on how people are managing secrets within Databricks when you have different groups (or teams) in the same workspace, each requiring access to different sets of secrets. Here's the challenge: we have multiple gro...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Managing secrets within Databricks when you have different groups or teams in the same workspace can be approached in several ways, each with its own advantages. Here are some best practices and methods based on the context provided: Using Azure Key...

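One common pattern is a scope per team with group ACLs, via the Secrets API (`POST /api/2.0/secrets/scopes/create` and `POST /api/2.0/secrets/acls/put`). A minimal sketch that only builds the request payloads - the scope and group names are hypothetical, and actually sending them (with authentication) is left out:

```python
def team_secret_acls(teams: dict) -> list:
    """Build (endpoint, payload) pairs granting each group READ on its own scope.

    `teams` maps a secret scope name to the workspace group that should read it.
    """
    calls = []
    for scope, group in teams.items():
        calls.append(("/api/2.0/secrets/scopes/create", {"scope": scope}))
        calls.append(("/api/2.0/secrets/acls/put",
                      {"scope": scope, "principal": group, "permission": "READ"}))
    return calls

# Hypothetical team-to-scope mapping; POST each payload to your workspace URL.
for endpoint, payload in team_secret_acls({"team-a-secrets": "team-a"}):
    print(endpoint, payload)
```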
mjedy78
by New Contributor II
  • 1408 Views
  • 3 replies
  • 0 kudos

How to enable AQE in foreachbatch mode

I am processing the daily data from checkpoint to checkpoint every day by using foreachBatch in a streaming way: df.writeStream.format("delta").option("checkpointLocation", "dbfs:/loc").foreachBatch(transform_and_upsert).outpu...

Latest Reply
mjedy78
New Contributor II
  • 0 kudos

@MuthuLakshmi any idea?

2 More Replies
niruban
by New Contributor II
  • 3803 Views
  • 3 replies
  • 0 kudos

Databricks Asset Bundle to deploy only one workflow

Hello Community - I am trying to deploy only one workflow from my CI/CD. But whenever I am trying to deploy one workflow using "databricks bundle deploy -t prod", it is deleting all the existing workflows in the target environment. Is there any option av...

Data Engineering
CICD
DAB
Databricks Asset Bundle
DevOps
Latest Reply
nvashisth
New Contributor III
  • 0 kudos

Hi Team, deployment via DAB (Databricks Asset Bundle) reads all the YAML files present, and workflows are generated based on them. In versions of the Databricks CLI prior to 0.236 (or the latest one), it used to delete all the workflows by making dele...

2 More Replies
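`databricks bundle deploy -t prod` manages everything the bundle's `include` picks up, so one way to scope a deployment to a single workflow is a bundle whose `include` lists only that job's definition. A sketch of such a `databricks.yml` (the bundle name, file path, and host are placeholders):

```yaml
bundle:
  name: single-workflow-bundle

include:
  - resources/my_single_job.yml   # only this job is managed by this bundle

targets:
  prod:
    mode: production
    workspace:
      host: <your-workspace-url>
```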
sangwan
by New Contributor
  • 1261 Views
  • 1 reply
  • 0 kudos

Issue: 'Catalog hive_metastore doesn't exist. Create it?' Error When Installing Reconcile

Utility: Remorph (Databricks). Issue: 'Catalog hive_metastore doesn't exist. Create it?' error when installing Reconcile. I am encountering an issue while installing Reconcile on Databricks. Although the hive_metastore catalog is present by default in the Da...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Hi @sangwan , it's not very clear; does the error come with a stack trace? If so, could you please share it? Also, any WARN/ERROR messages in the driver log by any chance?

oliverw
by New Contributor II
  • 1783 Views
  • 3 replies
  • 0 kudos

Structured Streaming QueryProgressEvent Metrics incorrect

Hi All, I've been working on implementing a custom StreamingQueryListener in PySpark to enable integration with our monitoring solution. I've had quite a lot of success with this on multiple different streaming pipelines; however, on the last set I've ...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Hi @oliverw , I believe this will require some logs and information correlation, could you please raise a support ticket for the same? Sharing further details here may expose some sensitive data, hence a ticket would be more appropriate. Looking forw...

2 More Replies
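While waiting on support, it can help to log the raw progress payload (`event.progress.json` in a listener) and compare the fields offline against the Spark UI. A small sketch that pulls out the usually compared fields from a progress dict; the field names follow the StreamingQueryProgress JSON, and the sample values are invented:

```python
import json

def summarize_progress(progress: dict) -> dict:
    """Extract the metrics most often compared against the Spark UI."""
    return {
        "batchId": progress.get("batchId"),
        "numInputRows": progress.get("numInputRows", 0),
        "inputRowsPerSecond": progress.get("inputRowsPerSecond", 0.0),
        "processedRowsPerSecond": progress.get("processedRowsPerSecond", 0.0),
        "sinkNumOutputRows": (progress.get("sink") or {}).get("numOutputRows"),
    }

# Invented sample payload, shaped like event.progress.json from a listener.
sample = json.loads("""{
  "batchId": 42, "numInputRows": 1000,
  "inputRowsPerSecond": 250.0, "processedRowsPerSecond": 200.0,
  "sink": {"description": "DeltaSink", "numOutputRows": 1000}
}""")
print(summarize_progress(sample))
```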
brokeTechBro
by New Contributor II
  • 1062 Views
  • 2 replies
  • 0 kudos

Bug Community Edition Sign Up Error

Please help here. Bug: Community Edition sign-up error - "an error occurred, please try again later". I am frustrated.

Latest Reply
GSam
New Contributor II
  • 0 kudos

@gchandra The issue is still there. Tried it on multiple browsers (Incognito and otherwise) and on multiple devices on different networks. Still unable to sign up after 2 days of trying.

1 More Replies
Miasu
by New Contributor II
  • 4919 Views
  • 2 replies
  • 0 kudos

Unable to analyze external table | FileAlreadyExistsException

Hello experts, there's a CSV file, "nyc_taxi.csv", saved under users/myfolder on DBFS, and I used this file to create 2 tables:1. nyc_taxi: created using the UI, and it appeared as a managed table saved under dbfs:/user/hive/warehouse/mydatabase.db/nyc...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Did you initially want to create an external or a managed table? Just trying to understand what your intent for the file was.

1 More Replies
RantoB
by Valued Contributor
  • 29232 Views
  • 8 replies
  • 7 kudos

Resolved! read csv directly from url with pyspark

I would like to load a CSV file directly into a Spark dataframe in Databricks. I tried the following code: url = "https://opendata.reseaux-energies.fr/explore/dataset/eco2mix-national-tr/download/?format=csv&timezone=Europe/Berlin&lang=fr&use_labels_fo...

Latest Reply
anwangari
New Contributor II
  • 7 kudos

Hello, it's the end of 2024 and I still have this issue with Python. As mentioned, the sc method no longer works. Also, working with volumes within "/databricks/driver/" is not supported in Apache Spark. ALTERNATIVE SOLUTION: use requests to download the file fr...

7 More Replies
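To make the download-then-read approach concrete, here is a self-contained sketch using only the standard library; it demonstrates on a local file:// URL so it runs anywhere. On Databricks you would point the download at a path you control (for example under /Volumes) and then read it with spark.read.csv - that last step is assumed, not shown:

```python
import csv
import io
import tempfile
import urllib.request
from pathlib import Path

def read_csv_from_url(url: str) -> list:
    """Download a CSV over any urllib-supported scheme and parse it into rows."""
    with urllib.request.urlopen(url) as resp:
        text = resp.read().decode("utf-8")
    return list(csv.reader(io.StringIO(text)))

# Demonstrate with a local file:// URL so the sketch needs no network access.
tmp = Path(tempfile.mkdtemp()) / "sample.csv"
tmp.write_text("id,name\n1,alice\n2,bob\n")
rows = read_csv_from_url(tmp.as_uri())
print(rows[0])  # header row
```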
abaghel
by New Contributor II
  • 1232 Views
  • 2 replies
  • 0 kudos

Azure application insights logging not working after upgrading cluster to databricks runtime 14.x

I have a basic code setup to read a stream from a Delta table and write it into another Delta table. I am using logging to send logs to Application Insights. However, within the foreachBatch function, the logs I write are not being sent to Applicatio...

Latest Reply
abaghel
New Contributor II
  • 0 kudos

@MuthuLakshmi  Thank you for getting back to me. I have read the article and understand that "Any files, modules, or objects referenced in the function must be serializable and available on Spark." However, based on the code provided, can you help me...

1 More Replies
None123
by New Contributor III
  • 10980 Views
  • 3 replies
  • 3 kudos

Open a Support Ticket

Anyone know how to submit a support ticket? I keep getting into a loop that takes me back to the community page, but I need to submit an urgent ticket. I'm told our company pays a ridiculous sum for this feature, yet it is impossible to find. Thanks ...

Latest Reply
vickytscv
New Contributor II
  • 3 kudos

Hi Team, we are working with an Adobe tool for campaign metrics, which needs to pull data from AEP using the explode option. When we pass a query it takes a long time and performance is also very poor. Is there any better way to pull data from AEP? Please le...

2 More Replies
