Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

kirkj
by New Contributor
  • 5289 Views
  • 1 replies
  • 0 kudos

Can Databricks write query results to s3 in another account via the API

I work for a company where we are trying to create a Databricks integration in Node using the @databricks/sql package to query customers' clusters or warehouses. I see documentation on being able to load data via a query from S3 using STS tokens whe...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Have you been able to get a response on this topic? Based on the information I can see, writing to an S3 bucket outside your account might not be supported.

  • 0 kudos
jeremy98
by Honored Contributor
  • 1209 Views
  • 1 replies
  • 0 kudos

Resolved! Unvalidated primary and foreign key constraints?

Hello community, I'm inserting some records in overwrite mode into a table (defined with a primary key and a foreign key) every time I run the workflow where the task is defined. Why does the DDL schema change after inserting those records? Why I have my pr...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @jeremy98, When you use the "insert overwrite" mode in Databricks, it can lead to the schema being reset, which includes the removal of primary and foreign key constraints. This happens because the "insert overwrite" operation essentially replaces...

  • 0 kudos
Dhanushn
by New Contributor
  • 13388 Views
  • 1 replies
  • 0 kudos

Concurrency issue on Delta Lake insert/update

Hey team! I need your help with Delta Lake; let me explain my scenario. Scenario: I have a table in Delta Lake and two Databricks workflows running in parallel, each with insert and update tasks to do. My Delta table is partitioned by country code M...

Latest Reply
Takuya-Omi
Valued Contributor III
  • 0 kudos

Hi @Dhanushn, in response to your question, the community contains the following information: https://community.databricks.com/t5/community-platform-discussions/concurrent-update-to-delta-throws-error/td-p/65599 https://kb.databricks.com/en_US/delta/in...

  • 0 kudos
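
For the parallel insert/update scenario in the thread above, one common way to cope with write conflicts is to retry the failed write after a short, randomized wait. The snippet in the thread is truncated, so this is not the thread's own solution; it is only a minimal sketch in plain Python. `ConcurrentWriteConflict`, `run_with_retry`, and `flaky_update` are hypothetical names introduced for illustration, and a real job would catch whatever conflict exception its Delta writer actually raises.

```python
import random
import time


class ConcurrentWriteConflict(Exception):
    """Hypothetical stand-in for the conflict error raised by a real writer."""


def run_with_retry(write, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Call write(); on a conflict, back off exponentially with jitter and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return write()
        except ConcurrentWriteConflict:
            if attempt == max_attempts:
                raise  # give up after the last allowed attempt
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay))


# Simulated writer (hypothetical) that conflicts twice before succeeding.
attempts = {"count": 0}


def flaky_update():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConcurrentWriteConflict("another job committed first")
    return "committed"


# The sleep function is replaced with a no-op so the demo runs instantly.
result = run_with_retry(flaky_update, sleep=lambda seconds: None)
print(result, attempts["count"])
```

Retrying only hides conflicts that are transient. If two jobs keep touching the same partition, narrowing each job's write to its own partition values is usually the more durable fix.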
aupres
by New Contributor III
  • 1716 Views
  • 1 replies
  • 0 kudos

how to generate log files on specific folders

Hello! My environment is as follows. OS: Windows 11. Spark: spark-4.0.0-preview2-bin-hadoop3. The Spark configuration files 'spark-defaults.conf' and 'log4j2.properties' are below. spark-defaults.conf: spark.eventLog.enabled true spark.event...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @aupres, do you see any failures in the Spark logs? A few things to validate: it appears that the log files are not being generated in the specified directory due to a misconfiguration in your log4j2.properties file. Check the Appender Configuration: En...

  • 0 kudos
vanshikagupta
by New Contributor II
  • 9717 Views
  • 3 replies
  • 0 kudos

Conversion of code from Scala to Python

Does Databricks Community Edition provide Databricks ML visualization for PySpark, the same as shown in this link for Scala? https://docs.azuredatabricks.net/_static/notebooks/decision-trees.html Also, please help me to convert this lin...

Latest Reply
thelogicplus
Contributor II
  • 0 kudos

You may explore the tools and services from Travinto Technologies. They have very good tools. We explored their tool for our code conversion from Informatica, DataStage, and Ab Initio to Databricks and PySpark. Also we used for SQL queries, stored ...

  • 0 kudos
2 More Replies
LightUp
by New Contributor III
  • 12430 Views
  • 3 replies
  • 4 kudos

Converting SQL Code to SQL Databricks

I am new to Databricks. Please excuse my ignorance. My requirement is to convert the SQL query below into Databricks SQL. The query reads from the EventLog table and its output goes into EventSummary. These queries can be found here. CREATE TABL...

Latest Reply
thelogicplus
Contributor II
  • 4 kudos

You may explore the tools and services from Travinto Technologies. They have very good tools. We explored their tool for our code conversion from Informatica, DataStage, and Ab Initio to Databricks and PySpark. Also we used for SQL queries, stored ...

  • 4 kudos
2 More Replies
MartinIsti
by Databricks Partner
  • 6255 Views
  • 2 replies
  • 0 kudos

Python UDF in Unity Catalog - spark.sql error

I'm trying to utilise the option to create UDFs in Unity Catalog. That would be a great way to have functions available in a fairly straightforward manner without e.g. putting the function definitions in an extra notebook that I %run to make them ava...

Data Engineering
function
udf
Latest Reply
Linglin
New Contributor III
  • 0 kudos

I came across the same problem. Inside Unity Catalog UDF creation, spark.sql and spark.table don't work. Adding "from pyspark.sql import SparkSession" and "spark = SparkSession.builder.getOrCreate()" into the session doesn't work either. Don't know how to sol...

  • 0 kudos
1 More Replies
Tahseen0354
by Valued Contributor
  • 31315 Views
  • 9 replies
  • 5 kudos

Resolved! Getting "Job aborted due to stage failure" SparkException when trying to download full result

I have generated a result using SQL, but whenever I try to download the full result (1 million rows), it throws a SparkException. I can download the preview result but not the full result. Why? What happens under the hood when I try to download ...

Latest Reply
ac567
New Contributor III
  • 5 kudos

Job aborted due to stage failure: Task 6506 in stage 46.0 failed 4 times, most recent failure: Lost task 6506.3 in stage 46.0 (TID 12896) (10.**.***.*** executor 12): java.lang.OutOfMemoryError: Cannot reserve 4194304 bytes of direct buffer memory (a...

  • 5 kudos
8 More Replies
udays22222
by New Contributor II
  • 7593 Views
  • 6 replies
  • 1 kudos

Error writing data to Google Bigquery

Hi, I am able to read data from a BigQuery table, but I am getting an error when writing data to a table in BigQuery. I followed the instructions in this document: Connecting Databricks to BigQuery | Google Cloud. %scala import scala.io.Source val contentCred = "/dbfs/FileSt...

Latest Reply
GeoPer
New Contributor III
  • 1 kudos

@udays22222 did you find a solution for this one? I face the same problem when I use a cluster in Shared access mode. I can read, but I cannot write, and I get the error you mentioned.

  • 1 kudos
5 More Replies
Abdul-Mannan
by New Contributor III
  • 4650 Views
  • 14 replies
  • 2 kudos

Autoloader with file notification mode sleeps for 5000ms multiple times

Using DBR 15.4, I'm ingesting streaming data from ADLS using Auto Loader with file notification mode enabled. This is older code that uses a foreachBatch sink to process the data before merging it into tables in Delta Lake. Issue: Streaming job, is u...

Latest Reply
Abdul-Mannan
New Contributor III
  • 2 kudos

@VZLA I just tested it, and it seems this Auto Loader behaviour with the Available Now trigger and file notification enabled remains the same in a DLT pipeline: it sleeps 7 times, each time for 5000 ms, before finally closing the stream, even tho...

  • 2 kudos
13 More Replies
LearnDB1234
by New Contributor III
  • 2278 Views
  • 3 replies
  • 0 kudos

How to parse an XML column with string data type into multiple SQL columns

Hi, I have a table with XML data stored in a column of STRING data type. Can someone please help me parse this XML into multiple SQL columns? Below is the sample XML table and the desired output data. Select * from default.SampleDat...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @LearnDB1234, are you sure that this column stores XML as a string? To me it looks more like a JSON string. If so, you can use the new VARIANT data type through the parse_json function: %sql WITH src AS ( SELECT parse_json('{ "Status": { "Co...

  • 0 kudos
2 More Replies
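
The thread above turns on whether the STRING column holds XML or JSON before it is split into columns. The idea can be sketched outside Spark with the Python standard library alone: detect the format, then flatten nested fields into one key per output column. This is an illustration under stated assumptions, not the SQL from the reply. The sample payloads and the helper names (`flatten`, `xml_to_dict`, `parse_payload`) are hypothetical and do not come from the thread.

```python
import json
import xml.etree.ElementTree as ET


def flatten(value, prefix=""):
    """Flatten nested dicts into {"parent_child": leaf} pairs, one key per column."""
    if not isinstance(value, dict):
        return {prefix: value}
    columns = {}
    for key, child in value.items():
        name = f"{prefix}_{key}" if prefix else key
        columns.update(flatten(child, name))
    return columns


def xml_to_dict(element):
    """Convert an XML element to nested dicts; leaf elements become their text.

    Simplification: repeated sibling tags overwrite each other and
    attributes are ignored.
    """
    children = list(element)
    if not children:
        return element.text
    return {child.tag: xml_to_dict(child) for child in children}


def parse_payload(raw):
    """Return (format, columns) for a string that holds either XML or JSON."""
    text = raw.strip()
    if text.startswith("<"):
        root = ET.fromstring(text)
        return "xml", flatten({root.tag: xml_to_dict(root)})
    return "json", flatten(json.loads(text))


# Hypothetical sample payloads, not the data from the thread.
json_row = '{"order": {"id": 7, "state": "shipped"}}'
xml_row = "<order><id>7</id><state>shipped</state></order>"

print(parse_payload(json_row))
print(parse_payload(xml_row))
```

Note that the JSON path keeps native types (`7` stays an integer), while the XML path yields text for every leaf, so XML-derived columns would still need casting.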
sakuraDev
by New Contributor II
  • 6222 Views
  • 1 replies
  • 0 kudos

I keep on getting Parse_syntax_error on autoloader run foreachbatch

Hey guys, I keep on getting this error message when trying to call a function with soda DQ's: [PARSE_SYNTAX_ERROR] Syntax error at or near '{'. SQLSTATE: 42601 File <command-81221799516900>, line 4 1 dfBronze.writeStream \ 2 .foreachB...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Hi @sakuraDev, this looks like a Soda syntax issue. Try fixing the "fail" and "warn" fields in your Soda checks. For example, instead of writing:

  - missing_count(site) = 0:
      name: Ensure no null values
      fail: 1
      warn: 0

Use Soda's thres...

  • 0 kudos
Data_Engineer07
by New Contributor II
  • 4452 Views
  • 1 replies
  • 0 kudos

Looking for 75% coupon code for Data Engineering Associate Certification

Hi everyone, I am looking for a 75% coupon code for the Data Engineering Associate Certification. Can anyone guide me on how to get a coupon code for the certification?

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Hi @Data_Engineer07, please reach out through https://www.databricks.com/company/contact regarding such requests. The corresponding team will guide you on this request and let you know whether a coupon is available.

  • 0 kudos
829023
by Databricks Partner
  • 2998 Views
  • 2 replies
  • 1 kudos

Why doesn't Databricks query federation support Oracle Database?

Hi, based on the documentation (https://docs.databricks.com/en/query-federation/index.html), Databricks query federation does not support Oracle as a source. 1. Do you know the reason? (Does it depend on something specific to Oracle?) 2. Is there another way to ru...

Latest Reply
VZLA
Databricks Employee
  • 1 kudos

@829023 There's limited support for pushdown and data type mapping, as documented on our website: https://docs.databricks.com/en/query-federation/oracle.html This was published recently, I believe in October, given your question was ra...

  • 1 kudos
1 More Replies
NhanNguyen
by Contributor III
  • 1408 Views
  • 2 replies
  • 0 kudos

Table properties for liquid clustering differ between Databricks Runtime versions

Dear all, today I tried liquid clustering in Databricks, but after running it with two Databricks Runtime versions, it showed different properties in Catalog Explorer. 1. Run with DBR version 14.3 LTS (includes Apache Spark 3.5.0, Scala 2.12) it...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Correct. As @holly rightly said, this is just an updated, more structured way of representing the columns; it may also be matching a new value type. In both cases the table property reflects that LC was enabled. Our sugges...

  • 0 kudos
1 More Replies