Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

arunak
by New Contributor
  • 1207 Views
  • 1 reply
  • 0 kudos

Connecting to Serverless Redshift from a Databricks Notebook

Hello Experts, a new Databricks user here. I am trying to access a Redshift serverless table using a Databricks notebook. Here is what happens when I try the code below: df = spark.read.format("redshift")\.option("dbtable", "public.customer")\.opti...

Latest Reply
shan_chandra
Esteemed Contributor
  • 0 kudos

@arunak - we need to set forward_spark_s3_credentials to true during the read. This lets Spark forward the credentials used to authenticate to the S3 bucket and use them to read from Redshift.
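For illustration, here is a minimal sketch of the read the reply describes, assuming the Databricks Redshift connector; the JDBC URL and tempdir bucket are placeholders, not values from the thread:

```
# Sketch: read a serverless Redshift table; endpoint and staging bucket are hypothetical.
df = (spark.read.format("redshift")
      .option("url", "jdbc:redshift://<workgroup-endpoint>:5439/dev")  # placeholder endpoint
      .option("dbtable", "public.customer")
      .option("tempdir", "s3a://<bucket>/redshift-tmp/")  # S3 staging dir used by the connector
      .option("forward_spark_s3_credentials", "true")     # forward Spark's S3 credentials, per the reply
      .load())
```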

mh_db
by New Contributor III
  • 2283 Views
  • 1 reply
  • 0 kudos

Write to csv file in S3 bucket

I have a pandas dataframe in my PySpark notebook. I want to save this dataframe to my S3 bucket. I'm using the following command to save it: import boto3; import s3fs; df_summary.to_csv(f"s3://dataconversion/data/exclude", index=False) but I keep getting thi...

Latest Reply
shan_chandra
Esteemed Contributor
  • 0 kudos

Hi @mh_db - you can import the botocore library, or if it is not found, do a pip install botocore to resolve this. Alternatively, you can keep the data in a Spark dataframe without converting to a pandas dataframe and write it to a CSV. you ...
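A minimal sketch of the Spark-native alternative the reply suggests, reusing the bucket path from the question:

```
# Sketch: write the data directly from Spark instead of pandas/s3fs.
sdf = spark.createDataFrame(df_summary)  # convert the pandas dataframe to a Spark dataframe
(sdf.coalesce(1)                         # optional: produce a single CSV part file
    .write.mode("overwrite")
    .option("header", True)
    .csv("s3://dataconversion/data/exclude"))
```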

naveenanto
by New Contributor III
  • 973 Views
  • 0 replies
  • 0 kudos

Custom Spark Extension in SQL Warehouse

I understand only a limited set of Spark configurations is supported in SQL Warehouse, but is it possible to add Spark extensions to SQL Warehouse clusters? Use case: we have a few restricted table properties. We prevent that with Spark extensions installed in...

Data Engineering
sql-warehouse
juanc
by New Contributor II
  • 4734 Views
  • 8 replies
  • 2 kudos

Activate Spark extensions on SQL Endpoints

Would it be possible to activate custom extensions like Sedona (https://sedona.apache.org/download/databricks/) in SQL Endpoints? Example error: java.lang.ClassNotFoundException: org.apache.spark.sql.sedona_sql.UDT.GeometryUDT at org.apache.spark....

Latest Reply
naveenanto
New Contributor III
  • 2 kudos

@Retired_mod What is the right way to add a custom Spark extension to SQL Warehouse clusters?

7 More Replies
marcuskw
by Contributor II
  • 13155 Views
  • 1 reply
  • 0 kudos

Resolved! Lakehouse Federation for SQL Server and Security Policy

We've been able to set up a Foreign Catalog using the following documentation: https://learn.microsoft.com/en-us/azure/databricks/query-federation/sql-server However, the tables that have RLS using a Security Policy appear empty. I imagine that this solu...

Latest Reply
marcuskw
Contributor II
  • 0 kudos

Was a bit quick here; found out that the SUSER_NAME() of the query is of course the connection that was set up, i.e. the User/Password defined for the connection (screenshot omitted). Once I added that same user to the RLS logic, I get the correct result.

64883
by New Contributor
  • 850 Views
  • 1 reply
  • 0 kudos

Support for Delta tables multicluster writes in Databricks cluster

Hello, We're using Databricks on AWS and we've recently started using Delta tables. We're using R. While the code below[1] works in a notebook, when running it from RStudio on a Databricks cluster we get the following error: java.lang.IllegalStateExce...

Latest Reply
NandiniN
Honored Contributor
  • 0 kudos

Sorry for being very late here - if you cannot set multi-cluster writes to false, we can try to split this table into separate tables, one for each stream.
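Assuming the reply refers to the standard Delta-on-S3 setting (the thread does not name it), disabling multi-cluster writes would look like this sketch:

```
# Sketch: disable Delta multi-cluster writes (assumed config name).
spark.conf.set("spark.databricks.delta.multiClusterWrites.enabled", "false")
```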

_Raju
by New Contributor II
  • 3300 Views
  • 1 reply
  • 0 kudos

Cast string to decimal

Hello, can anyone help me with the below error. I'm trying to cast the string column into decimal. When I try to do that I'm getting the "Py4JJavaError: An error occurred while calling t.addCustomDisplayData. : java.sql.SQLException: Status of query a...
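For reference, the basic cast itself looks like the sketch below (column names are hypothetical); the Py4JJavaError in the post may have a different root cause:

```
from pyspark.sql.functions import col
from pyspark.sql.types import DecimalType

# Sketch: cast a string column to decimal(18, 2); column names are placeholders.
df = df.withColumn("amount_dec", col("amount_str").cast(DecimalType(18, 2)))
```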

BeginnerBob
by New Contributor III
  • 28488 Views
  • 6 replies
  • 3 kudos

Convert Date to YYYYMMDD in Databricks SQL

Hi, I have a date column in a delta table called ADate. I need this in the format YYYYMMDD. In T-SQL this is easy. However, I can't seem to do this without splitting out the YEAR, MONTH and DAY and concatenating them together. Any ideas?

Latest Reply
JayDoubleYou42
New Contributor II
  • 3 kudos

I'll share I'm having a variant of the same issue. I have a varchar field in the form YYYYMMDD which I'm trying to join to another varchar field from another table in the form of MM/DD/YYYY. Does anyone know of a way to do this in Spark SQL without s...
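A minimal Spark SQL sketch covering both the original question and this variant (table and column names are hypothetical): date_format renders a date as YYYYMMDD, and to_date parses each varchar so the join compares dates rather than strings:

```
# Sketch: format a date column as YYYYMMDD (the original question).
spark.sql("SELECT date_format(ADate, 'yyyyMMdd') AS adate_yyyymmdd FROM my_delta_table")

# Sketch: join a YYYYMMDD varchar to an MM/DD/YYYY varchar by parsing both to dates.
spark.sql("""
    SELECT *
    FROM t1
    JOIN t2
      ON to_date(t1.yyyymmdd_col, 'yyyyMMdd') = to_date(t2.mdy_col, 'MM/dd/yyyy')
""")
```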

5 More Replies
Cami
by Contributor III
  • 1513 Views
  • 1 reply
  • 0 kudos

View JSON result value in a view based on a volume

Hello guys! I have the following case: it has been decided that the JSON file will be read via the following definition (from a volume), which more or less looks like this: CREATE OR REPLACE VIEW [catalog_name].[schema_name].v_[object_name] AS SELECT r...
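For readers with a similar setup, a hedged sketch of one way to define such a view over a JSON file in a Unity Catalog volume (all names are placeholders, and the read_files table-valued function is assumed to be available in the workspace):

```
# Sketch: view over a JSON file in a volume; catalog/schema/volume names are hypothetical.
spark.sql("""
    CREATE OR REPLACE VIEW main.my_schema.v_my_object AS
    SELECT *
    FROM read_files('/Volumes/main/my_schema/my_volume/my_object.json',
                    format => 'json')
""")
```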

lindsey
by New Contributor
  • 1491 Views
  • 0 replies
  • 0 kudos

"Error: cannot read mws credentials: invalid Databricks Account configuration" on TF Destroy

I have a terraform project that creates a workspace in Databricks, assigns it to an existing metastore, then creates external location/storage credential/catalog. The apply works and all expected resources are created. However, without touching any r...

akisugi
by New Contributor III
  • 3673 Views
  • 5 replies
  • 0 kudos

Resolved! Is it possible to control the ordering of the array values created by array_agg()?

Hi! I would be glad to ask you some questions. I have the following data, and I would like to get this kind of result: I want `move` to correspond to the order of `hist`. Therefore, I considered the following query. ```with tmp as (select * from (values(1, ...

[Attachments: Screenshot 2024-04-06 23.08.15.png, Screenshot 2024-04-06 23.07.34.png]
Latest Reply
akisugi
New Contributor III
  • 0 kudos

Hi @ThomazRossito This is a great idea. It can solve my problem. Thank you.
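The accepted suggestion is not quoted above, but one common pattern for ordering array_agg output (a sketch, assuming a tmp relation with id, hist and move columns as in the question) is to aggregate structs and sort by the ordering key:

```
# Sketch: collect `move` values ordered by `hist`.
spark.sql("""
    SELECT id,
           transform(
             array_sort(array_agg(struct(hist, move))),  -- structs sort by hist first
             x -> x.move                                 -- then keep only move
           ) AS moves
    FROM tmp
    GROUP BY id
""")
```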

4 More Replies
cool_cool_cool
by New Contributor II
  • 1517 Views
  • 1 reply
  • 2 kudos

Resolved! Trigger Dashboard Update At The End of a Workflow

Heya I have a workflow that computes some data and writes to a delta table, and I have a dashboard that is based on the table. How can I trigger refresh on the dashboard once the workflow is finished? Thanks!

Latest Reply
ThomazRossito
Contributor
  • 2 kudos

Hello, in your workflow it is possible to end with a SQL task, and in the SQL Task item you can select some options (the screenshot from the original reply is not shown here).
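In Jobs API terms, such a final task looks roughly like the hedged sketch below; the task keys and IDs are placeholders, not values from the thread:

```
# Sketch: final workflow task that refreshes a dashboard (Jobs API 2.1 payload fragment).
final_task = {
    "task_key": "refresh_dashboard",
    "depends_on": [{"task_key": "compute_data"}],  # hypothetical upstream task
    "sql_task": {
        "dashboard": {"dashboard_id": "<dashboard-id>"},
        "warehouse_id": "<warehouse-id>",
    },
}
```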

939772
by New Contributor III
  • 1210 Views
  • 1 reply
  • 0 kudos

Resolved! DLT refresh unexpectedly failing

We're hitting an error with a Delta Live Tables refresh since yesterday; nothing has changed in our system, yet there appears to be a configuration error: { ... "timestamp": "2024-04-08T23:00:10.630Z", "message": "Update b60485 is FAILED.",...

Latest Reply
939772
New Contributor III
  • 0 kudos

Apparently the `custom_tags` of `ResourceClass` is now extraneous -- removing it from config corrected our problem.
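For anyone hitting the same failure, the change amounts to deleting that tag from the pipeline's cluster settings; a hedged sketch with placeholder values:

```
# Sketch: DLT pipeline cluster settings fragment; the custom_tags entry below was removed.
cluster_settings = {
    "label": "default",
    "custom_tags": {"ResourceClass": "<legacy-value>"},  # extraneous; deleting this fixed the update
}
```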

