Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Abeeya
by New Contributor II
  • 5696 Views
  • 1 reply
  • 5 kudos

Resolved! How to Overwrite Using pyspark's JDBC without losing constraints on table columns

Hello, my table has a primary key constraint on a particular column. I'm losing the primary key constraint on that column each time I overwrite the table. What can I do to preserve it? Any heads up would be appreciated. Tried the below: df.write.option("truncate", ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 5 kudos

@Abeeya, mode "truncate" is correct to preserve the table. However, when you want to add a new column (mismatched schema), it will drop the table anyway.
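
For reference, a minimal PySpark sketch of the truncate-based overwrite discussed in this thread; df is assumed to be an existing DataFrame, and the JDBC URL, credentials, and table name are hypothetical placeholders:

    # df is assumed to exist; all connection details below are placeholders.
    jdbc_url = "jdbc:postgresql://host:5432/db"  # hypothetical JDBC URL
    (df.write
        .format("jdbc")
        .option("url", jdbc_url)
        .option("dbtable", "my_table")     # hypothetical target table
        .option("user", "user")            # placeholder credentials
        .option("password", "password")
        .option("truncate", "true")        # TRUNCATE instead of DROP + CREATE, keeping constraints
        .mode("overwrite")
        .save())

As noted above, this only holds while the schema is unchanged; a mismatched schema still drops the table.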

Anonymous
by Not applicable
  • 1196 Views
  • 1 reply
  • 0 kudos

How to resolve Quickbooks error 12007

QuickBooks error 12007 occurs when an update times out. QuickBooks may encounter this error when it cannot connect to the internet or is unable to access the server. If you want to know its solutions, check out our latest blog on this.

Latest Reply
willjoe
New Contributor III
  • 0 kudos

How to Resolve QuickBooks Payroll Update Error 12007? For the various possible causes of the QB payroll update error 12007, you need to perform different troubleshooting procedures. Follow the solutions in the given sequence to fix this QuickBooks error...

BasavarajAngadi
by Contributor
  • 5461 Views
  • 11 replies
  • 9 kudos

Resolved! Hi experts, I am new to Databricks and I want to know how Databricks supports real-time reporting needs in business intelligence?

Delta Lake has 3 levels to maintain data quality (bronze, silver, and gold tables), but while this supports reporting and BI solutions, how does it support streaming analytics? Example: I have an app that loads all the operational data in ADLS...

Latest Reply
-werners-
Esteemed Contributor III
  • 9 kudos

@Basavaraj Angadi, why? For simplicity, cost savings, etc. You can make it work with 2 'containers' but it is not necessary.
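
As a rough illustration of the point discussed in this thread, a hedged PySpark sketch of a bronze-to-silver streaming hop on Delta tables; the table names and checkpoint path are hypothetical:

    from pyspark.sql import functions as F

    # Each Delta layer can itself be read as a stream, so the same medallion
    # tables that feed BI can also feed streaming analytics.
    (spark.readStream
        .table("bronze_events")                   # hypothetical bronze table
        .filter(F.col("event_type").isNotNull())  # example cleansing step
        .writeStream
        .option("checkpointLocation", "/tmp/checkpoints/silver_events")
        .outputMode("append")
        .toTable("silver_events"))                # hypothetical silver table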

10 More Replies
Anonymous
by Not applicable
  • 3342 Views
  • 2 replies
  • 3 kudos

Resolved! Play the BIG DATA GAME | By Firebolt

https://www.firebolt.io/big-data-game
The most fun our Bricksters have had in a while at work is thanks to a little BIG DATA thing called The BIG DATA GAME. This game is the cure for the mid-week blues. The Big Data Game is a simple yet awesome online...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

HA! I kept 'dying' there too!

1 More Reply
lukas_vlk
by New Contributor III
  • 8473 Views
  • 3 replies
  • 2 kudos

Resolved! Python Spark Job - error: job failed with error message The output of the notebook is too large.

Hi Databricks experts. I am currently facing a problem with a submitted job run on Azure Databricks. Any help on this is very welcome. See below for details. Problem description: I submitted a Python Spark task via the Databricks CLI (v0.16.4) to Azure...

Latest Reply
lukas_vlk
New Contributor III
  • 2 kudos

Without any further changes from my side, the error has disappeared since 29.03.2022.

2 More Replies
Sudeshna
by New Contributor III
  • 2209 Views
  • 2 replies
  • 3 kudos

How can I pass one of the values from one function to another as an argument in Databricks SQL?

For example:
CREATE OR REPLACE TABLE table2(a INT, b INT);
INSERT INTO table2 VALUES (100, 200);
CREATE OR REPLACE FUNCTION func1() RETURNS TABLE(a INT, b INT) RETURN (SELECT a+b, a*b from table2);
create or replace function calc(p DOUBLE) RETURNS TABLE(val...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

Yes, it is possible, but with different logic. For the scalar case, calc(a) in SELECT calc(a) FROM func1(), it can only be a query, as a table for a scalar is not allowed. So please try something like: CREATE OR REPLACE FUNCTION func_table() RETURNS TABLE(a ...
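
A hedged sketch of that workaround from Python via spark.sql; func_table follows the truncated reply above, while the body of calc is illustrative since the original is cut off:

    # table2 is assumed to exist with columns a and b, as in the question.
    spark.sql("""
        CREATE OR REPLACE FUNCTION func_table()
        RETURNS TABLE(a INT, b INT)
        RETURN SELECT a + b, a * b FROM table2
    """)
    spark.sql("""
        CREATE OR REPLACE FUNCTION calc(p DOUBLE)
        RETURNS DOUBLE
        RETURN p * 2
    """)
    # The scalar function is applied to the table function's output:
    spark.sql("SELECT calc(a) FROM func_table()").show()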

1 More Reply
shan_chandra
by Databricks Employee
  • 14173 Views
  • 2 replies
  • 0 kudos
Latest Reply
shan_chandra
Databricks Employee
  • 0 kudos

%scala
def clearAllCaching(tableName: Option[String] = None): Unit = {
  tableName.map { path =>
    com.databricks.sql.transaction.tahoe.DeltaValidation.invalidateCache(spark, path)
  }
  spark.conf.set("com.databricks.sql.io.caching.bucketedRead.enabled", "f...

1 More Reply
Pragan
by New Contributor
  • 3307 Views
  • 3 replies
  • 1 kudos

Resolved! Cluster doesn't support Photon with Docker Image enabled

I enabled the Photon 9.1 LTS DBR in a cluster that was already using a Docker image of the latest version. When I ran a SQL query using my cluster, I could not see the Photon engine working in my executor, which should actually be running the Photon engine. When...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hello @Praganessh S, Photon is currently in Public Preview. The only way to use it is to explicitly run Databricks-provided runtime images which contain it. Please see: https://docs.databricks.com/runtime/photon.html#databricks-clusters and https://do...

2 More Replies
SimonY
by New Contributor III
  • 2709 Views
  • 3 replies
  • 3 kudos

Resolved! Trigger.AvailableNow does not support maxOffsetsPerTrigger in Databricks runtime 10.3

Hello, I ran a Spark streaming job to ingest data from Kafka to test Trigger.AvailableNow. What environment did the job run in?
1: Databricks runtime 10.3
2: Azure cloud
3: 1 driver node + 3 worker nodes (14 GB, 4 cores)
val maxOffsetsPerTrigger = "500"
spark.conf.set...
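
For context, a hedged PySpark equivalent of the setup in the question; the broker, topic, checkpoint path, and target table are placeholders, and whether availableNow honors maxOffsetsPerTrigger depends on the runtime, which is the subject of this thread:

    (spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
        .option("subscribe", "events")                     # placeholder topic
        .option("maxOffsetsPerTrigger", "500")
        .load()
        .writeStream
        .option("checkpointLocation", "/tmp/checkpoints/events")
        .trigger(availableNow=True)  # requires a runtime where this trigger is supported
        .toTable("events_bronze"))   # placeholder target table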

Latest Reply
Anonymous
Not applicable
  • 3 kudos

You'd be better off with 1 node with 12 cores than 3 nodes with 4 each. Your shuffles are going to be much better on one machine.

2 More Replies
fermin_vicente
by New Contributor III
  • 5246 Views
  • 7 replies
  • 4 kudos

Resolved! Can secrets be retrieved only for the scope of an init script?

Hi there, if I set any secret in an env var to be used by a cluster-scoped init script, it remains available to users attaching any notebook to the cluster and is easily extracted with a print. There's some hint in the documentation about the secret...

Latest Reply
pavan_kumar
Contributor
  • 4 kudos

@Fermin Vicente, good to know that this approach is working well. But please make sure that you use this approach at the end of your init script only.
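
As a point of comparison, secrets read inside a notebook with dbutils go through Databricks' output redaction, unlike plain env vars set for init scripts; a minimal sketch with hypothetical scope and key names:

    # dbutils is available in Databricks notebooks; scope/key are hypothetical.
    token = dbutils.secrets.get(scope="my-scope", key="api-token")
    print(token)  # notebook output shows [REDACTED] instead of the value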

6 More Replies
Hubert-Dudek
by Esteemed Contributor III
  • 944 Views
  • 1 reply
  • 19 kudos

Runtime 10.4 is available and is LTS. From today it is not beta anymore and it is LTS, meaning Long Term Support. So for sure it will be with us for the next...

Runtime 10.4 is available and is LTS. From today it is not beta anymore and it is LTS, meaning Long Term Support. So for sure it will be with us for the next 2 years. 10.4 includes some awesome features like: Auto Compaction rollbacks are now enabled by defaul...

Latest Reply
-werners-
Esteemed Contributor III
  • 19 kudos

I have the same favorite. I am curious how it works under the hood. zipWithIndex?

Hubert-Dudek
by Esteemed Contributor III
  • 16590 Views
  • 23 replies
  • 36 kudos

Resolved! SparkFiles - strange behavior on Azure databricks (runtime 10)

When you use:
from pyspark import SparkFiles
spark.sparkContext.addFile(url)
it adds the file to the non-DBFS /local_disk0/, but then when you want to read the file:
spark.read.json(SparkFiles.get("file_name"))
it wants to read it from /dbfs/local_disk0/. I tried als...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 36 kudos

I confirm that, as @Arvind Ravish said, adding file:/// solves the problem.
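
A minimal sketch of the confirmed fix; the URL and file name are placeholders. SparkFiles.get returns an absolute driver-local path, so prefixing it with file:// produces the file:/// URI that keeps Spark from resolving it against /dbfs:

    from pyspark import SparkFiles

    url = "https://example.com/data.json"  # placeholder URL
    spark.sparkContext.addFile(url)

    # file:// + /local_disk0/... yields file:///local_disk0/..., read locally.
    df = spark.read.json("file://" + SparkFiles.get("data.json"))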

22 More Replies
Vegard_Stikbakk
by New Contributor II
  • 2370 Views
  • 1 reply
  • 3 kudos

Resolved! External functions on a SQL endpoint

I want to create an external function using CREATE FUNCTION (External) and expose it to users of my SQL endpoint. Although this works from a SQL notebook, if I try to use the function from a SQL endpoint, I get "User defined expression is not supporte...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

It is a separate runtime (https://docs.databricks.com/sql/release-notes/index.html#channels), so it seems that it is not yet supported. There is CREATE FUNCTION documentation, but it seems that it supports only SQL syntax: https://docs.databricks.com/sql...

dataguy73
by New Contributor
  • 2808 Views
  • 1 reply
  • 1 kudos

Resolved! spark properties files

I am trying to migrate a Spark job from an on-premises Hadoop cluster to Databricks on Azure. Currently, we are keeping many values in the properties file. When executing spark-submit we pass the parameter --properties /prop.file.txt, and inside t...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

I use JSON files and .conf files which reside on the data lake or in the FileStore of DBFS, then read those files using Python/Scala.
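
A hedged sketch of that approach: keep the values that used to live in the spark-submit properties file in a JSON file on DBFS and load it at job start; the path and keys are hypothetical:

    import json

    # /dbfs/... exposes DBFS through the local filesystem on the driver.
    config_path = "/dbfs/FileStore/configs/job_config.json"
    with open(config_path) as f:
        config = json.load(f)

    # Apply values that used to be passed via --properties.
    spark.conf.set("spark.sql.shuffle.partitions", str(config["shuffle_partitions"]))
    input_path = config["input_path"]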

