Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

BradSheridan
by Valued Contributor
  • 2198 Views
  • 0 replies
  • 0 kudos

Workflow parameters

Hey everyone! I'm close but can't seem to figure this out. I'm trying to add 2 notebooks to a Databricks Job. Instead of the first command in both notebooks being a connection to an RDS/Redshift cluster, I'd prefer to make that connection once and ha...

palzor
by New Contributor III
  • 900 Views
  • 0 replies
  • 2 kudos

What is the best practice when loading a Delta table: infer the schema or provide the schema?

I am loading Avro files into Delta tables. I am doing this for multiple tables; some files are big (2-3 GB) and most of them are small (a few MB). I am using Auto Loader to load the data into the Delta tables. My question is: What is the ...

anisha_93
by New Contributor II
  • 4900 Views
  • 2 replies
  • 1 kudos

Error in SQL statement: KeyProviderException: Failure to initialize configuration

I have a source Delta table on which I have selectively granted access to a particular pool ID (it can be thought of as a dummy user). From the pool ID interface, whenever I run a select on any of the tables, even ones it has access to, it is faili...

Latest Reply
alicewong20
New Contributor II
  • 1 kudos

Hello all, I have the same problem. Can anyone help?

1 More Replies
Dicer
by Valued Contributor
  • 4198 Views
  • 4 replies
  • 3 kudos

Resolved! Azure Databricks: Failed to extract data which is between two timestamps within those same dates using Pyspark

Data type:
AAPL_Time: timestamp
AAPL_Close: float
Raw Data:
AAPL_Time AAPL_Close
2015-05-11T08:00:00.000+0000 29.0344
2015-05-11T08:30:00.000+0000 29.0187
2015-05-11T09:00:00.000+0000 29.0346
2015-05-11T09:3...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Another thing to try: the hour() and minute() functions, which return integers.
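
As a rough illustration of that suggestion, the sketch below applies the same idea in plain Python: reduce each timestamp to its integer hour and minute, then compare against a time-of-day window. The sample rows come from the question; the 08:30 to 09:00 window and the helper names `minutes_of_day` and `in_window` are assumptions made for illustration. In PySpark, `pyspark.sql.functions.hour` and `pyspark.sql.functions.minute` return the same integer components for a timestamp column, so the comparison would carry over to a DataFrame filter.

```python
from datetime import datetime

# Sample rows taken from the question: (timestamp string, closing price).
rows = [
    ("2015-05-11T08:00:00.000+0000", 29.0344),
    ("2015-05-11T08:30:00.000+0000", 29.0187),
    ("2015-05-11T09:00:00.000+0000", 29.0346),
]


def minutes_of_day(ts: str) -> int:
    """Return hour * 60 + minute for a timestamp string in the format above."""
    parsed = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%f%z")
    return parsed.hour * 60 + parsed.minute


def in_window(ts: str, start=(8, 30), end=(9, 0)) -> bool:
    """True if the time of day falls inside [start, end], ignoring the date.

    The default window is hypothetical; adjust it to the hours you need.
    """
    lower = start[0] * 60 + start[1]
    upper = end[0] * 60 + end[1]
    return lower <= minutes_of_day(ts) <= upper


# Keep only the rows whose time of day is inside the window.
selected = [(ts, price) for ts, price in rows if in_window(ts)]
for ts, price in selected:
    print(ts, price)
```

Because the comparison only uses integers derived from the time of day, the same window applies to every date in the data.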

3 More Replies
_Orc
by New Contributor
  • 18396 Views
  • 5 replies
  • 3 kudos

Resolved! Precision and scale is getting changed in the dataframe while casting to decimal

When I run the query below in Databricks SQL, the precision and scale of the decimal column change.
Select typeof(COALESCE(Cast(3.45 as decimal(15,6)),0));
o/p: decimal(16,6)
expected o/p: decimal(15,6)
Any reason why the Precision and scale i...

Latest Reply
berserkersap
Contributor
  • 3 kudos

You can use typeof(COALESCE(Cast(3.45 as decimal(15,6)),0.0)); (instead of 0)
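
A possible explanation for why this helps, sketched in plain Python. It assumes that Spark widens the two COALESCE arguments to a common decimal type by keeping the larger scale and the larger number of integer digits, that the integer literal 0 is treated as decimal(10,0), and that the literal 0.0 is treated as decimal(1,1). The function name `wider_decimal` is hypothetical and this is not Spark's actual code.

```python
MAX_PRECISION = 38  # assumed upper bound on decimal precision


def wider_decimal(a, b):
    """Return the (precision, scale) able to hold both decimal types.

    Each argument is a (precision, scale) tuple. The result keeps the
    larger scale and the larger count of digits left of the decimal point.
    """
    (p1, s1), (p2, s2) = a, b
    scale = max(s1, s2)
    integer_digits = max(p1 - s1, p2 - s2)
    return (min(integer_digits + scale, MAX_PRECISION), scale)


COLUMN = (15, 6)           # Cast(3.45 as decimal(15,6))
INT_LITERAL = (10, 0)      # assumed decimal form of the literal 0
DECIMAL_LITERAL = (1, 1)   # assumed type of the literal 0.0

print(wider_decimal(COLUMN, INT_LITERAL))      # (16, 6)
print(wider_decimal(COLUMN, DECIMAL_LITERAL))  # (15, 6)
```

Under these assumptions the integer literal needs 10 digits before the decimal point, one more than decimal(15,6) provides, which would account for the extra digit of precision in the question.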

4 More Replies
Stephen678
by New Contributor II
  • 1211 Views
  • 0 replies
  • 0 kudos

Easy way to debug Databricks code. Are there breakpoints in Databricks or an alternative way to achieve it?

I'm consuming multiple topics from Confluent Kafka and processing each row with business rules using Spark Structured Streaming (.writeStream and .foreach()). While doing that, I call another notebook using %run and call the class via foreach while perform...

sage5616
by Valued Contributor
  • 8825 Views
  • 5 replies
  • 7 kudos

Resolved! SQL Error when querying any tables/views on a Databricks cluster via Dbeaver.

I am able to connect to the cluster, browse its hive catalog, see tables/views and columns/datatypes.
Running a simple select statement from a view on a parquet file produces this error and no other results:
"SQL Error [500540] [HY000]: [Databricks][Dat...

Latest Reply
sage5616
Valued Contributor
  • 7 kudos

Update. I have tried SQL Workbench/J and encountered exactly the same error(s) as with Dbeaver. I have also tried JetBrains DataGrip and it worked flawlessly. Able to connect, browse the databases and query tables/views. https://docs.microsoft.com/en...

4 More Replies
BradSheridan
by Valued Contributor
  • 2821 Views
  • 1 replies
  • 0 kudos

Resolved! Drop/Create tables in Redshift with PySpark

Happy Friday afternoon fellow Bricksters! Got another question for you... I have a pyspark notebook that reads from redshift into a DF, does some 'stuff', then writes back to redshift. All good here. What I'm trying to do with no luck yet is first DR...

Latest Reply
BradSheridan
Valued Contributor
  • 0 kudos

Answered my own question!! Check this out:
dropSQL = ("DROP TABLE IF EXISTS <tablename>;") --note the semicolon at the end!
createSQL = ("CREATE TABLE IF NOT EXISTS <tablename> (field1 int, field2 date, etc...);")
preActionsSQL = dropSQL + createSQL
th...
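
The reply is cut off, so the rest of the author's code is unknown. As a hedged sketch of the visible part, the Python below builds the same semicolon-separated string and places it in a write-options dictionary. The helper `build_preactions`, the table name `my_schema.my_table`, and the column list are hypothetical. Passing the string through a `preactions` option is an assumption based on the Redshift data source for Spark, which documents that option as a semicolon-separated list of SQL commands to run before the load.

```python
def build_preactions(table: str, columns: str) -> str:
    """Build a DROP followed by a CREATE as one semicolon-separated string."""
    # The trailing semicolon on each statement keeps the two separated
    # once they are concatenated.
    drop_sql = f"DROP TABLE IF EXISTS {table};"
    create_sql = f"CREATE TABLE IF NOT EXISTS {table} ({columns});"
    return drop_sql + create_sql


# Hypothetical table and columns, for illustration only.
table_name = "my_schema.my_table"
preactions_sql = build_preactions(table_name, "field1 int, field2 date")

# Options that could be handed to the DataFrame writer, for example
# df.write.format(...).options(**write_options). The format name and
# connection settings depend on your environment and are not shown.
write_options = {
    "dbtable": table_name,
    "preactions": preactions_sql,
}

print(write_options["preactions"])
```

Keeping the statement construction in a small function makes it easy to check the exact SQL before any write runs.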

KarimSegura
by New Contributor III
  • 3189 Views
  • 2 replies
  • 4 kudos

databricks-connect throws an exception when showing a dataframe with json content

I'm facing an issue when I want to show a dataframe with JSON content. All this happens when the script runs in databricks-connect from VS Code. Basically, I would like any help or guidance to get this to run as it should. Thanks in advance. This is how...

Latest Reply
KarimSegura
New Contributor III
  • 4 kudos

The code works fine on a Databricks cluster, but this code is part of a unit test in a local environment, which is then submitted to a branch -> PR -> merged into the master branch. Thanks for the advice on using DBX. I will give DBX another try, even though I've already tried it. I'l...

1 More Replies
Cano
by New Contributor III
  • 786 Views
  • 1 replies
  • 0 kudos

Hi, I'd like to know if it's possible to connect to PostgreSQL RDS from the Databricks SQL Warehouse.

Hi, I'd like to know if it's possible to connect to PostgreSQL RDS from the Databricks SQL Warehouse.

Latest Reply
Cano
New Contributor III
  • 0 kudos

I should have posted this as a question and not a post. Please forgive me, I'm a newbie.

nikgoel95
by New Contributor II
  • 1420 Views
  • 3 replies
  • 1 kudos

What's the best way to define the libraries for a cluster, as it always takes a lot of time for me?

What's the best way to define the libraries for a cluster, as it always takes a lot of time for me?

Latest Reply
Sivaprasad1
Valued Contributor II
  • 1 kudos

@Nikunj Goel: Please refer to the doc below; workspace libraries might help with this: https://docs.databricks.com/libraries/workspace-libraries.html#workspace-libraries

2 More Replies
pshah83
by New Contributor II
  • 1921 Views
  • 0 replies
  • 2 kudos

Use output of SHOW PARTITION commands in Sub-Query/CTE/Function

I am using SHOW PARTITIONS <<table_name>> to get all the partitions of a table. I want to use max() on the output of this command to get the latest partition for the table. However, I am not able to use SHOW PARTITIONS <<table_name>> in a CTE/sub-quer...

christys
by Databricks Employee
  • 619 Views
  • 0 replies
  • 2 kudos

Want to influence the Databricks product roadmap and services? We are looking for feedback from you - our Databricks Community members - to give your...

Want to influence the Databricks product roadmap and services? We are looking for feedback from you - our Databricks Community members - to give your feedback and thoughts about your experience with Databricks over the last 6 months in a ~10 minute s...

