Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

erigaud
by Honored Contributor
  • 2741 Views
  • 2 replies
  • 1 kudos

Resolved! Serverless clusters not starting

Hello, I am trying to launch a serverless data warehouse; it used to work fine before, but for some reason it no longer works. I tried creating a brand new serverless cluster and I get the same result. I am the creator of the clusters, and a workspace...

Latest Reply
erigaud
Honored Contributor
  • 1 kudos

For anyone interested, the status page link: https://status.azuredatabricks.net/

tlecomte
by New Contributor III
  • 9483 Views
  • 6 replies
  • 3 kudos

Resolved! Enabling Adaptive Query Execution and Cost-Based Optimizer in Structured Streaming foreachBatch

Dear Databricks community, I am using Spark Structured Streaming to move data from silver to gold in an ETL fashion. The source stream is the change data feed of a Delta table in silver. The streaming dataframe is transformed and joined with a couple ...

Latest Reply
Lingesh
Databricks Employee
  • 3 kudos

It's not recommended to have AQE on a streaming query, for the same reason you shared in the description. It has been documented here.

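The reply's advice can be sketched concretely. Below is a minimal sketch of keeping AQE (and, separately, the cost-based optimizer) pinned off for a streaming job; the `spark` session is assumed to exist, and the dict-then-set shape is purely illustrative:

```python
# Sketch: disabling Adaptive Query Execution for a streaming job.
# AQE re-optimizes plans at runtime between micro-batches, which is why it is
# discouraged for Structured Streaming queries. Assumes a SparkSession `spark`.
streaming_confs = {
    "spark.sql.adaptive.enabled": "false",
    # CBO relies on collected table statistics and has its own switch:
    "spark.sql.cbo.enabled": "false",
}
# for key, value in streaming_confs.items():
#     spark.conf.set(key, value)
```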
ivanychev
by Contributor II
  • 3252 Views
  • 1 reply
  • 2 kudos

Is there a way to avoid using EBS drives on workers with local NVMe SSD?

The Databricks on AWS docs claim that 30G + 150G EBS drives are mounted to every node by default. But if I use an instance type like r5d.2xlarge, it already has a local disk, so I want to avoid mounting the 150G EBS drive to it. Is there a way to do it? We ...

Latest Reply
pabloanzorenac
New Contributor II
  • 2 kudos

Hey Ivan, did you find a way to do this?

Qarol
by New Contributor
  • 2356 Views
  • 2 replies
  • 0 kudos

Workaround for: The method `pd.groupby.GroupBy.prod()` is not implemented yet.

I have a database with two columns: name (str) and probability (float). I am running this command: df[['name','probability']].groupby('name').prod() on a Databricks (runtime 7.3) notebook, and df is a koalas dataframe. The error I get is: PandasNotImpleme...

Latest Reply
alenka
New Contributor III
  • 0 kudos

Same trouble here.

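Since koalas implements groupby `sum()` but not `prod()`, one workaround is the identity prod(x) = exp(sum(log(x))), which holds for positive values such as probabilities. Shown below in plain Python for clarity; the koalas translation in the comments uses hypothetical variable names:

```python
import math

# Workaround sketch for the missing groupby prod(): for positive values,
# prod(x) == exp(sum(log(x))), and sum() IS implemented in koalas.
# A koalas version would look roughly like (hypothetical names):
#   kdf["log_p"] = np.log(kdf["probability"])
#   result = np.exp(kdf.groupby("name")["log_p"].sum())

rows = [("a", 0.5), ("a", 0.4), ("b", 0.9)]

# Sum the logs per group, then exponentiate to recover the product.
log_sums = {}
for name, p in rows:
    log_sums[name] = log_sums.get(name, 0.0) + math.log(p)

products = {name: math.exp(s) for name, s in log_sums.items()}
# products["a"] ≈ 0.5 * 0.4 = 0.2, products["b"] ≈ 0.9
```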
ChinmayU
by Databricks Partner
  • 2449 Views
  • 0 replies
  • 0 kudos

java.time.LocalDate exception when a date column is used with "IN" operator in replace where clause

Hi, we recently made an upgrade to our Databricks warehouse, transitioning from SQL Classic to SQL PRO. However, we encountered the following error message when attempting to execute the "INSERT INTO" table query with a "REPLACE WHERE" predicate that...

Data Engineering
Databricks
LocalDate exception
replace where
Unity Catalog
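No reply was posted, but one approach sometimes used to sidestep date type-inference issues in a `replaceWhere` predicate is to spell out explicit Spark SQL `DATE` literals rather than passing date objects. A sketch with hypothetical table and column names (not from the post), so treat it as illustrative only:

```python
# Sketch: building a replaceWhere predicate with explicit DATE literals
# (column name "event_date" is hypothetical).
dates = ["2023-07-01", "2023-07-02"]
predicate = "event_date IN ({})".format(
    ", ".join("DATE'{}'".format(d) for d in dates)
)
# Usage sketch (hypothetical table name):
# spark.sql(f"INSERT INTO my_table REPLACE WHERE {predicate} SELECT ...")
```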
FabriceDeseyn
by Contributor
  • 2052 Views
  • 1 replies
  • 0 kudos

Databricks-connect VSCode debugging pandas_api not working

Hi, I am using the Databricks extension on VSCode and have been running into an issue for two days; it worked fine before. I receive an error when I want to use Pandas-on-Spark during debugging. from databricks.connect import DatabricksSession spark = Data...

Data Engineering
databricks VSCode extension
databricks-connect
Latest Reply
FabriceDeseyn
Contributor
  • 0 kudos

Additional info: It seems that the issue comes from the 1.1.0 version of the Databricks extension in VSCode. Downgrading to 1.0.0 solves my issue.

  • 0 kudos
PhillT
by New Contributor
  • 4862 Views
  • 1 reply
  • 2 kudos

SQL expr undefined function 'LEN'

Getting this error message on our production cluster when I run a notebook that uses the SQL expr function that calls the LEN() function. Example code: df = df.withColumn("POL", expr("CASE WHEN SRC_SYSTEM = 'X' THEN CONCAT('08' , SUBSTRING(POL, 3, LEN(P...

Latest Reply
daniel_sahal
Databricks MVP
  • 2 kudos

@PhillT There's no "LEN" function. You should use "LENGTH" instead. https://spark.apache.org/docs/2.3.0/api/sql/index.html#length

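Applying the reply's fix to the snippet from the question gives something like the following. The tail of the CASE expression is reconstructed loosely, since the post is truncated, so treat the exact SUBSTRING arguments as illustrative:

```python
# Sketch: the corrected expression uses LENGTH instead of LEN.
# The end of the CASE is an assumption (the original post cuts off).
case_expr = (
    "CASE WHEN SRC_SYSTEM = 'X' "
    "THEN CONCAT('08', SUBSTRING(POL, 3, LENGTH(POL))) "
    "ELSE POL END"
)
# Usage sketch:
# from pyspark.sql.functions import expr
# df = df.withColumn("POL", expr(case_expr))
```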
numersoz
by New Contributor III
  • 6393 Views
  • 3 replies
  • 5 kudos

Resolved! Z-Ordering Timestamp Column

Hi, I have a large Delta table of IoT data for over 10K different sensors, with timestamp, sensor name, and value columns at 1 second precision. The query pattern is usually random 5-100 sensors at a time, but typically involves a specific year/month/day interval...

Latest Reply
Oliver_Angelil
Valued Contributor II
  • 5 kudos

@numersoz did you z-order on the timestamp column, or on less granular columns like Year, Month, or Day? The timestamp column is very granular (high cardinality) since it also includes hour, minute, second...

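Following the reply's suggestion, here is a sketch of z-ordering on the sensor name plus a derived, lower-cardinality date column instead of the raw timestamp. Table and column names are hypothetical, not from the post:

```python
# Sketch: Delta OPTIMIZE with ZORDER BY on lower-cardinality columns.
# Pairing the sensor name with a date column matches the
# "few sensors over a day/month interval" query pattern.
optimize_sql = """
OPTIMIZE sensor_readings
ZORDER BY (sensor_name, reading_date)
"""
# spark.sql(optimize_sql)
```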
DarthObert
by New Contributor II
  • 1515 Views
  • 1 reply
  • 0 kudos

Databricks IntelliSense adds a second alias

Hi all, a couple of weeks ago I noticed that whenever I use IntelliSense to autocomplete my column names in the SQL editor, it adds a second alias. For example, if I have a table (Table1) and I alias it in my query (i.e. Table1 as a), if I use the...

Latest Reply
DarthObert
New Contributor II
  • 0 kudos

And how do I fix it? 

agar08
by New Contributor
  • 626 Views
  • 0 replies
  • 0 kudos

java.net.SocketTimeoutException - ReadTimeOut

A Databricks notebook is connecting to ADLS Gen2 using service principal authentication, and the setup is working fine. The notebook is able to read/write files to ADLS Gen2. However, occasionally, we see the below error in the production environment: ja...

jdhao
by New Contributor II
  • 5018 Views
  • 4 replies
  • 0 kudos

Why can't I query a table from a cluster, but can query from another cluster in the same workspace

I have two clusters A, B under the same azure databricks workspace. Under cluster A, inside my notebook, I tried to query a table: `SELECT * FROM some_table LIMIT 5`.  It shows some permission errors. Under cluster B, if I run the same sql query, it ...

Latest Reply
Lakshay
Databricks Employee
  • 0 kudos

Check for any spark config or init script differences in the two clusters.

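One way to act on that advice is to diff the `spark_conf` sections of the two clusters' JSON definitions, as returned by the Clusters API. The helper below is a generic sketch; the sample conf key is only an example of a setting that can differ between clusters:

```python
# Sketch: find Spark conf keys that differ between two cluster definitions.
def conf_diff(a: dict, b: dict) -> dict:
    """Return keys whose values differ, mapped to (a_value, b_value)."""
    return {
        k: (a.get(k), b.get(k))
        for k in set(a) | set(b)
        if a.get(k) != b.get(k)
    }

# Stand-ins for each cluster's "spark_conf" field (example values only):
cluster_a = {"spark.databricks.acl.dfAclsEnabled": "true"}
cluster_b = {}
diff = conf_diff(cluster_a, cluster_b)
```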
gopikrsna925
by New Contributor
  • 2639 Views
  • 0 replies
  • 0 kudos

Azure Databricks: leading zeros in decimal integer literals are not permitted

Hey team, need your help. I am trying to run the below Python code in a Databricks notebook, which is part of parsing an XML file, exploding the element. This works great for all the other elements with no numbers and elements not starting with a zero...

Data Engineering
leading
zeros
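For context on the error itself: Python 3 rejects integer literals with leading zeros (`0123` is a `SyntaxError`), which often bites generated code that inlines XML values as numbers. A minimal sketch of the usual workaround, keeping the value as a string and converting explicitly:

```python
# Sketch: a bare literal like 0123 is a SyntaxError in Python 3,
# but int() happily parses the same digits from a string.
value = "0123"       # value kept as text, e.g. straight from the XML
as_int = int(value)  # explicit conversion strips the leading zero
```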
Phani1
by Databricks MVP
  • 1710 Views
  • 1 reply
  • 0 kudos

Streaming tables vs DLT

We have a couple of questions about streaming tables; kindly help us with this. 1) Can we create streaming tables without a DLT pipeline? 2) Can we create streaming tables in Databricks SQL? 3) What we observe with streaming tables is that they support Kafka and event lo...

Data Engineering
dlt
sql
Streaming tables
Latest Reply
Lakshay
Databricks Employee
  • 0 kudos

Hi, please refer to the document: https://docs.databricks.com/sql/language-manual/sql-ref-syntax-ddl-create-streaming-table.html#create-streaming-table. I think this should help you answer your questions.

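Per the linked page, streaming tables can be created directly with Databricks SQL statements (a managed pipeline runs them under the hood). A minimal sketch with hypothetical table names:

```python
# Sketch: creating a streaming table in Databricks SQL.
# STREAM(...) reads the source table incrementally; names are hypothetical.
create_st = """
CREATE OR REFRESH STREAMING TABLE gold_events
AS SELECT * FROM STREAM(bronze.raw_events)
"""
# spark.sql(create_st)
```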