Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

Hertz
by New Contributor II
  • 389 Views
  • 2 replies
  • 0 kudos

System Tables / Audit Logs action_name createWarehouse/createEndpoint

I am creating a cost dashboard across multiple accounts. I am working to get SQL warehouse names and warehouse IDs so I can join with system.access.billing on warehouse_id. But the only action_names that include both the warehouse_id and warehouse_n...

Data Engineering
Audit Logs
cost monitor
createEndpoint
createWarehouse
Latest Reply
Hertz
New Contributor II
  • 0 kudos

I just wanted to circle back to this. It appears that the ID is returned in the response column of the create action_name.
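For anyone landing here later, a minimal sketch of how that lookup might be expressed. The table and field names (system.access.audit, request_params, response.result) are assumptions to verify against your workspace:

# Hedged sketch: map warehouse names to IDs from create events; verify the
# audit table name and the response/result field layout in your workspace.
warehouses = spark.sql("""
    SELECT
      request_params['name']                    AS warehouse_name,
      get_json_object(response.result, '$.id')  AS warehouse_id
    FROM system.access.audit
    WHERE action_name IN ('createWarehouse', 'createEndpoint')
""")
warehouses.show()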

1 More Replies
Bazhar
by New Contributor
  • 314 Views
  • 1 reply
  • 0 kudos

Understanding this Ipython related error in cluster logs

Hi Databricks Community! I'm having this error in a cluster's logs: [IPKernelApp] ERROR | Exception in control handler: Traceback (most recent call last): File "/databricks/python/lib/python3.10/site-packages/ipykernel/kernelbase.py", line 334, in p...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Bazhar, if you're using Databricks Connect, ensure that it can reach your cluster. Verify that your workspace instance name and cluster ID are correct.
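A minimal connectivity check, assuming Databricks Connect v2 (databricks-connect 13+); the host, token, and cluster ID values are placeholders:

# Hedged sketch: open a session against the cluster and run a trivial query.
# If any of the three values is wrong or the cluster is unreachable, this fails fast.
from databricks.connect import DatabricksSession

spark = DatabricksSession.builder.remote(
    host="https://<workspace-instance>",   # placeholder
    token="<personal-access-token>",       # placeholder
    cluster_id="<cluster-id>",             # placeholder
).getOrCreate()

print(spark.range(1).count())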

JamesY
by New Contributor III
  • 260 Views
  • 1 reply
  • 0 kudos

Resolved! Databricks JDBC write to table with PK column, error, key not found.

Hello, I am trying to write data to a table. It worked fine before, but after I recreated the table with one column as the PK, there is an error: Unable to write into the A_Table table....key not found: id. What is the correct way of doing this? PK column: [...

Data Engineering
Databricks
SqlMi
Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @JamesY, If you’re using Databricks with SQL Server, you can use the OUTPUT clause to retrieve the primary key value after an INSERT query. CREATE TABLE A_Table ( ID BIGINT IDENTITY PRIMARY KEY, -- Other columns... ); INSERT INTO A_Table ...
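As a related sketch on the Spark side (an assumption about the root cause, not the confirmed fix): if the id column is an IDENTITY the database generates, one common workaround is to drop it from the DataFrame before the JDBC append so SQL Server assigns it. The connection values below are placeholders:

# Hedged sketch: df and the connection options are placeholders; drop the
# database-generated PK column so the JDBC append does not try to supply it.
(
    df.drop("id")
      .write.format("jdbc")
      .option("url", "jdbc:sqlserver://<server>:1433;databaseName=<db>")  # placeholder
      .option("dbtable", "A_Table")
      .option("user", "<user>")
      .option("password", "<password>")
      .mode("append")
      .save()
)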

HASSAN_UPPAL123
by New Contributor II
  • 283 Views
  • 1 reply
  • 0 kudos

SPARK_GEN_SUBQ_0 WHERE 1=0, Error message from Server: Configuration schema is not available

Hi Community, I'm trying to read data from the table nation in the sample schema of the Databricks catalog via Spark, but I'm getting this error: com.databricks.client.support.exceptions.GeneralException: [Databricks][JDBCDriver](500051) ERROR processing q...

Data Engineering
pyspark
python
Latest Reply
HASSAN_UPPAL123
New Contributor II
  • 0 kudos

Hi Community, I'm still facing the issue. Can someone please suggest how to fix the above error?
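In case it helps while waiting for an answer, a hedged sketch of reading a table through the Databricks JDBC driver with the catalog and schema set explicitly in the URL. The ConnCatalog/ConnSchema properties and all connection values below are assumptions to adapt to your workspace:

# Hedged sketch: all connection values are placeholders.
jdbc_url = (
    "jdbc:databricks://<workspace-host>:443/default;"
    "transportMode=http;ssl=1;AuthMech=3;"
    "httpPath=<sql-warehouse-http-path>;"
    "ConnCatalog=samples;ConnSchema=tpch"
)

df = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("dbtable", "nation")
    .option("user", "token")
    .option("password", "<personal-access-token>")
    .option("driver", "com.databricks.client.jdbc.Driver")
    .load()
)
df.show()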

Phani1
by Valued Contributor
  • 296 Views
  • 1 reply
  • 0 kudos

Resolved! Databricks with Private cloud

Hi Databricks Team, is it possible for Databricks to offer support for private cloud environments other than Azure, GCP, and AWS? The client intends to utilize Databricks in their own cloud for enhanced security. If this is feasible, what is the proce...

Latest Reply
holly
Contributor II
  • 0 kudos

Hi Janga, Providing your own cloud is not a service we offer at this time. I can't say for certain, but it's unlikely we'll ever offer this.  You mentioned you have a 'client' so I'm assuming you're part of a consulting firm. I understand it's diffic...

Zume
by New Contributor II
  • 278 Views
  • 2 replies
  • 0 kudos

Unity Catalog Shared compute Issues

Am I the only one experiencing challenges in migrating to Databricks Unity Catalog? I observed that in Unity Catalog-enabled compute, the "Shared" access mode is still tagged as a Preview feature. This means it is not yet safe for use in production w...

Latest Reply
jacovangelder
Contributor III
  • 0 kudos

Have you tried creating a volume on top of the external location, and using the volume in spark.read.parquet? i.e. spark.read.parquet('/Volumes/<volume_name>/<folder_name>/<file_name.parquet>')  Edit: also, not sure why the Databricks community mana...
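A minimal sketch of that suggestion, assuming an external location already exists; the catalog, schema, volume, and storage path names are placeholders:

# Hedged sketch: create an external volume over the existing external location,
# then read the parquet file through the /Volumes path.
spark.sql("""
    CREATE EXTERNAL VOLUME IF NOT EXISTS my_catalog.my_schema.landing
    LOCATION 'abfss://<container>@<storage-account>.dfs.core.windows.net/<path>'
""")

df = spark.read.parquet("/Volumes/my_catalog/my_schema/landing/<folder_name>/<file_name>.parquet")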

1 More Replies
Martin_Pham
by New Contributor II
  • 95 Views
  • 1 reply
  • 1 kudos

Is Databricks-Salesforce already available to use?

Reference: Salesforce and Databricks Announce Strategic Partnership to Bring Lakehouse Data Sharing and Shared ... I was going through this article and wanted to know if this is already released. My assumption is that there's no need to use third-part...

Latest Reply
Martin_Pham
New Contributor II
  • 1 kudos

Looks like it has been released - Salesforce BYOM

Jackson1111
by New Contributor III
  • 131 Views
  • 1 reply
  • 0 kudos

How to use job.run_id as a run parameter of a JAR job when triggering the job through the REST API

"[,\"\{\{job.run_id\}\}\"]" {"error_code": "INVALID_PARAMETER_VALUE","message": "Legacy parameters cannot contain references."}

Latest Reply
Jackson1111
New Contributor III
  • 0 kudos

How can I get the Job ID and Run ID while the job is running?
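One hedged approach, assuming the job defines job-level parameters whose values are the dynamic value references {{job.id}} and {{job.run_id}}: a JAR task receives the resolved values as main-method arguments, while a notebook task can read them as widgets. Parameter names below are placeholders:

# Hedged sketch: the widget names must match the job-level parameters that were
# configured with {{job.id}} and {{job.run_id}} as their values.
job_id = dbutils.widgets.get("job_id")
run_id = dbutils.widgets.get("run_id")
print(f"job_id={job_id}, run_id={run_id}")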

moulitharan
by New Contributor
  • 187 Views
  • 1 reply
  • 0 kudos

Notebook returns succeeded even if there is a failure in any of the commands

I'm trying to run a notebook that has 14 commands from an ADF Notebook activity. I'm writing the transformed data to a Delta table in the last command. I'm handling the last command in a try/except block to handle errors and raising an error on exception. Wh...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @moulitharan, When running a Databricks Notebook using the Databricks Notebook Activity in Azure Data Factory (ADF), the job status is typically reported as “succeeded” even if individual commands within the notebook fail. However, there are ways ...
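One of those ways, sketched under the assumption that the last command already wraps the write in try/except: re-raise the exception so the notebook run itself fails, which the ADF Notebook activity then reports as failed. The table name and DataFrame are placeholders:

# Hedged sketch: final_df and the target table are placeholders.
try:
    final_df.write.format("delta").mode("append").saveAsTable("catalog.schema.target")
except Exception as e:
    # Swallowing the exception (or calling dbutils.notebook.exit here) leaves the run
    # "succeeded" from ADF's point of view; re-raising makes the activity fail.
    raise RuntimeError(f"Write to Delta table failed: {e}") from e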

Akash_Wadhankar
by New Contributor
  • 311 Views
  • 1 reply
  • 0 kudos

Databricks UniForm

Hi Community members, I tried creating a Delta UniForm table using a Databricks notebook. I created a database without providing a location, so it took the DBFS default storage location. On top of that I was able to create a Delta UniForm table. Then I trie...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

  Hi @Akash_Wadhankar, It appears you’ve encountered an issue while working with Delta tables in Databricks, specifically when trying to create a Delta table in an AWS S3 location. Error Message: Unknown configuration was specified: delta.enableI...
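For reference, a minimal sketch of creating a UniForm (Iceberg-readable) table. The table property names below are the Iceberg-compat settings as I understand them and should be verified against current docs; the catalog and schema names are placeholders:

# Hedged sketch: assumes a Unity Catalog schema backed by your S3 location and a
# runtime that supports UniForm; verify the property names before relying on this.
spark.sql("""
    CREATE TABLE my_catalog.my_schema.uniform_demo (id BIGINT, name STRING)
    TBLPROPERTIES (
      'delta.enableIcebergCompatV2'          = 'true',
      'delta.universalFormat.enabledFormats' = 'iceberg'
    )
""")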

MR07
by New Contributor II
  • 175 Views
  • 1 reply
  • 0 kudos

Optimal Cluster Selection for Continuous Delta Live Tables Pipelines: Bronze and Silver

Hi, I have two Delta Live Tables pipelines. The first one is the Bronze pipeline, which handles bronze tables. These tables are defined as streaming tables, and this pipeline needs to be executed continuously. The second one is the Silver pipeline, wh...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @MR07, When configuring Delta Live Tables pipelines, you’ll need to consider the type of cluster that best suits your requirements.  Bronze Pipeline (Streaming Tables): For the Bronze pipeline, which handles streaming tables, consider using th...
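As a small illustration of the streaming half, a hedged sketch of a bronze streaming table definition; whether it runs continuously is controlled by the pipeline's continuous setting and the cluster size by the pipeline's autoscale configuration. The source path and table name are placeholders:

# Hedged sketch of a streaming bronze table (Auto Loader source path is a placeholder).
import dlt
from pyspark.sql import functions as F

@dlt.table(name="bronze_events", comment="Raw events ingested as a streaming table")
def bronze_events():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/my_catalog/my_schema/landing/events/")
        .withColumn("ingested_at", F.current_timestamp())
    )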

meret
by New Contributor
  • 218 Views
  • 1 reply
  • 0 kudos

Trouble Accessing Trust Store for Oracle JDBC Connection on Shared Compute Cluster

Hi, I am trying to read data from an Oracle DB using the Oracle JDBC driver: df = (spark.read.format("jdbc").option("url", "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCPS)(PORT=xxx)(HOST=xxx))(CONNECT_DATA=(SID=xxx)))").option("dbTable", "schema...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @meret, Distributed File System: Instead of using a local file path, consider storing the trust store file in a distributed file system (e.g., DBFS, HDFS, or ADLS). Custom Initialization Script: You can create an initialization script that runs on...
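A hedged sketch of the first option, assuming the JKS trust store has been uploaded to a Unity Catalog volume and that the Oracle driver picks up the javax.net.ssl.* settings when passed as JDBC options; the host, port, SID, volume path, and password are placeholders:

# Hedged sketch: all connection values and the volume path are placeholders.
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCPS)(PORT=<port>)(HOST=<host>))(CONNECT_DATA=(SID=<sid>)))")
    .option("dbtable", "schema.table")
    .option("driver", "oracle.jdbc.OracleDriver")
    .option("javax.net.ssl.trustStore", "/Volumes/my_catalog/my_schema/certs/truststore.jks")
    .option("javax.net.ssl.trustStorePassword", "<truststore-password>")
    .load()
)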

Sangeethagk
by New Contributor
  • 544 Views
  • 1 reply
  • 0 kudos

TypeError: ColSpec.__init__() got an unexpected keyword argument 'required'

Hi Team, one of my customers is facing the issue below. Has anyone faced this issue before? Any help would be appreciated. import mlflow  mlflow.set_registry_uri("databricks-uc")  catalog_name = "system"  embed = mlflow.pyfunc.spark_udf(spark, f"models:/system...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Sangeethagk, It looks like you're encountering a couple of issues related to mlflow.pyfunc.spark_udf() and model dependencies. TypeError: ColSpec.__init__() got an unexpected keyword argument 'required': This error occurs when you're using mlflo...
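If the cause is a version mismatch (the 'required' argument only exists in newer MLflow releases), one hedged fix is to upgrade the notebook-scoped MLflow before loading the model; the model URI below is a placeholder:

# Hedged sketch: run each step in its own notebook cell, since restartPython()
# restarts the interpreter and stops the current cell.
# Cell 1:
%pip install --upgrade mlflow
# Cell 2:
dbutils.library.restartPython()
# Cell 3:
import mlflow
mlflow.set_registry_uri("databricks-uc")
embed = mlflow.pyfunc.spark_udf(spark, "models:/system.ai.<model_name>/<version>")  # placeholder URI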

ttamas
by New Contributor III
  • 246 Views
  • 2 replies
  • 1 kudos

Get the triggering task's name

Hi, I have tasks that depend on each other. I would like to get variables from task1, which triggers task2. This is how I solved my problem: following the suggestion in https://community.databricks.com/t5/data-engineering/how-to-pass-parameters-to-a-quot-...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @ttamas, Thank you for sharing your approach! It’s true that handling task dependencies and passing values between tasks in Databricks can sometimes be complex. Databricks now supports dynamic value references in notebooks. Instead of using dbutil...
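A small sketch of that pattern, with task names and keys as placeholders: task1 sets a task value, and task2 reads it either directly or through the dynamic value reference {{tasks.task1.values.source_table}} passed as a task parameter:

# Hedged sketch: in task1
dbutils.jobs.taskValues.set(key="source_table", value="catalog.schema.raw_events")

# In task2 (same job run)
source_table = dbutils.jobs.taskValues.get(
    taskKey="task1",
    key="source_table",
    default="",
    debugValue="catalog.schema.raw_events",  # used only when run outside a job
)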

1 More Replies
Kjetil
by New Contributor III
  • 553 Views
  • 3 replies
  • 2 kudos

Resolved! Autoloader to concatenate CSV files that updates regularly into a single parquet dataframe.

I have multiple large CSV files. One or more of these files changes now and then (a few times a day). The changes in the CSV files are both appends and updates (so both new rows and updates of old ones). I want to combine all CSV files into a datafr...

Latest Reply
jose_gonzalez
Moderator
  • 2 kudos

Hi @Kjetil, please let us know if you still have the issue or if @-werners-' response could be marked as the best solution. Thank you.

2 More Replies