Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

Dicer
by Valued Contributor
  • 3223 Views
  • 4 replies
  • 3 kudos

Resolved! Azure Databricks: Failed to extract data which is between two timestamps within those same dates using Pyspark

Data type:
AAPL_Time: timestamp
AAPL_Close: float

Raw Data:
AAPL_Time                     AAPL_Close
2015-05-11T08:00:00.000+0000  29.0344
2015-05-11T08:30:00.000+0000  29.0187
2015-05-11T09:00:00.000+0000  29.0346
2015-05-11T09:3...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Another thing to try: the hour() and minute() functions return integers, so you can filter on the time of day directly.
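
A minimal sketch of the hour()/minute() idea from the reply above, assuming the goal is to keep rows whose time of day falls inside the same window on every date. The helper names `minutes_of_day` and `filter_intraday` and the 08:30 to 09:30 window in the usage note are illustrative assumptions, not taken from the thread. Note that `hour()` and `minute()` are evaluated in the Spark session time zone.

```python
from datetime import time


def minutes_of_day(t: time) -> int:
    """Convert a time of day to minutes since midnight."""
    return t.hour * 60 + t.minute


def filter_intraday(df, ts_col: str, start: time, end: time):
    """Keep rows whose time of day lies in [start, end], on any date.

    `df` is assumed to be a PySpark DataFrame with a timestamp column `ts_col`.
    """
    # Imported lazily so the pure-Python helper above works without Spark.
    from pyspark.sql import functions as F

    # hour() and minute() return integers, so they combine into one number.
    mins = F.hour(F.col(ts_col)) * 60 + F.minute(F.col(ts_col))
    return df.filter(mins.between(minutes_of_day(start), minutes_of_day(end)))
```

With the column from the question, a call might look like `filter_intraday(df, "AAPL_Time", time(8, 30), time(9, 30))`.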

3 More Replies
_Orc
by New Contributor
  • 13457 Views
  • 6 replies
  • 3 kudos

Resolved! Precision and scale is getting changed in the dataframe while casting to decimal

When I run the query below in Databricks SQL, the precision and scale of the decimal column change.
Select typeof(COALESCE(Cast(3.45 as decimal(15,6)),0));
o/p: decimal(16,6)
expected o/p: decimal(15,6)
Any reason why the Precision and scale i...

Latest Reply
berserkersap
Contributor
  • 3 kudos

You can use typeof(COALESCE(Cast(3.45 as decimal(15,6)),0.0)); (instead of 0)
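
A rough sketch of why the literal matters, assuming Spark picks the common type of two decimals by keeping the larger scale and the larger number of integer digits, that an INT literal such as 0 is treated as decimal(10,0), and that the literal 0.0 is typed decimal(1,1). The function name `wider_decimal` is hypothetical, and the cap at 38 digits is simplified.

```python
MAX_PRECISION = 38  # Spark's maximum decimal precision


def wider_decimal(a, b):
    """Return the smallest (precision, scale) that can hold both inputs."""
    (p1, s1), (p2, s2) = a, b
    scale = max(s1, s2)                 # keep every fractional digit
    int_digits = max(p1 - s1, p2 - s2)  # keep every integer digit
    # Simplified cap: real engines may also reduce the scale at the limit.
    return (min(int_digits + scale, MAX_PRECISION), scale)


INT_AS_DECIMAL = (10, 0)     # assumed type of the integer literal 0
DECIMAL_LITERAL_0_0 = (1, 1)  # assumed type of the literal 0.0

print(wider_decimal((15, 6), INT_AS_DECIMAL))       # (16, 6)
print(wider_decimal((15, 6), DECIMAL_LITERAL_0_0))  # (15, 6)
```

Under these assumptions the integer 0 needs 10 integer digits, one more than decimal(15,6) provides, which would explain the extra digit of precision; 0.0 needs none, so the type stays decimal(15,6).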

5 More Replies
Stephen678
by New Contributor II
  • 952 Views
  • 0 replies
  • 0 kudos

Easy way to debug Databricks code: are there breakpoints in Databricks, or an alternative way to achieve this?

I'm consuming multiple topics from Confluent Kafka and processing each row with business rules using Spark Structured Streaming (.writeStream and .foreach()). While doing that, I call another notebook using %run and call the class via foreach while perform...

sage5616
by Valued Contributor
  • 6040 Views
  • 5 replies
  • 7 kudos

Resolved! SQL Error when querying any tables/views on a Databricks cluster via Dbeaver.

I am able to connect to the cluster, browse its hive catalog, see tables/views and columns/datatypes. Running a simple select statement from a view on a parquet file produces this error and no other results: "SQL Error [500540] [HY000]: [Databricks][Dat...

Latest Reply
sage5616
Valued Contributor
  • 7 kudos

Update. I have tried SQL Workbench/J and encountered exactly the same error(s) as with Dbeaver. I have also tried JetBrains DataGrip and it worked flawlessly. Able to connect, browse the databases and query tables/views. https://docs.microsoft.com/en...

4 More Replies
BradSheridan
by Valued Contributor
  • 1887 Views
  • 1 replies
  • 0 kudos

Resolved! Drop/Create tables in Redshift with PySpark

Happy Friday afternoon fellow Bricksters! Got another question for you... I have a pyspark notebook that reads from redshift into a DF, does some 'stuff', then writes back to redshift. All good here. What I'm trying to do with no luck yet is first DR...

Latest Reply
BradSheridan
Valued Contributor
  • 0 kudos

Answered my own question!! Check this out:
dropSQL = ("DROP TABLE IF EXISTS <tablename>;")  -- note the semicolon at the end!
createSQL = ("CREATE TABLE IF NOT EXISTS <tablename> (field1 int, field2 date, etc...);")
preActionsSQL = dropSQL + createSQL
th...
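
A hedged sketch of how the pre-actions string from the reply above could be passed to a Redshift write. It assumes the Databricks Redshift connector and its `preactions` option, which takes a semicolon-separated list of SQL commands to run before loading. The function names, parameters, and the choice of `append` mode are illustrative assumptions; the authentication options the connector also needs are omitted.

```python
def build_preactions(table: str, columns_ddl: str) -> str:
    """Build the DROP + CREATE statements, each terminated by a semicolon."""
    drop_sql = f"DROP TABLE IF EXISTS {table};"
    create_sql = f"CREATE TABLE IF NOT EXISTS {table} ({columns_ddl});"
    return drop_sql + create_sql


def write_with_preactions(df, jdbc_url: str, temp_dir: str,
                          table: str, columns_ddl: str) -> None:
    """Write a PySpark DataFrame to Redshift, recreating the table first.

    `jdbc_url` and `temp_dir` (an S3 staging path) are supplied by the caller.
    """
    (
        df.write.format("com.databricks.spark.redshift")
        .option("url", jdbc_url)
        .option("dbtable", table)
        .option("tempdir", temp_dir)
        # Runs before the load: drop the old table, then create the new one.
        .option("preactions", build_preactions(table, columns_ddl))
        .mode("append")
        .save()
    )
```

The semicolon after each statement is what lets the connector split the string into separate commands, which matches the note in the reply.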

Cano
by New Contributor III
  • 485 Views
  • 1 replies
  • 0 kudos

Hi, I'd like to know if it's possible to connect to PostgreSQL RDS from the Databricks SQL Warehouse.

Latest Reply
Cano
New Contributor III
  • 0 kudos

I should have posted this as a question and not a post. Please forgive me, I'm a newbie.

nikgoel95
by New Contributor II
  • 937 Views
  • 3 replies
  • 1 kudos

What's the best way to define the libraries for a cluster? It always takes a lot of time for me.

Latest Reply
Sivaprasad1
Valued Contributor II
  • 1 kudos

@Nikunj Goel: Please refer to the doc below; workspace libraries might help with this: https://docs.databricks.com/libraries/workspace-libraries.html#workspace-libraries

2 More Replies
christys
by Community Manager
  • 429 Views
  • 0 replies
  • 2 kudos

Want to influence the Databricks product roadmap and services? We are looking for feedback from you, our Databricks Community members, on your experience with Databricks over the last 6 months in a ~10 minute s...

FD_MR
by New Contributor II
  • 905 Views
  • 0 replies
  • 1 kudos

Delta Live Tables executing repeatedly and returning empty DF

Still relatively new to Spark and even more so to Delta Live Tables, so apologies if I've missed something fundamental, but here goes. We are trying to run a notebook via Delta Live Tables, which contains 2 functions decorated by the `dlt.table` decorat...

Jack
by New Contributor II
  • 2787 Views
  • 2 replies
  • 0 kudos

Applying a formula to list of python dataframes produces error: object of type 'builtin_function_or_method' has no len(). How to fix?

I have a df where I am calculating values by month. When I run this code on my df it generates the desired results:
for i in range(12, len(df.index)):
    df.iloc[i, 1] = df.iloc[i-12, 1] * (((df.iloc[i, 3]/100) + (df.iloc[i, 6]/100)) + 1)
So far so good. I want...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hey there @Jack Homareau​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from y...
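
The post is truncated, so the cause can only be guessed. One plausible source of this error is calling `len(dfs.index)` on the Python list itself: on a list, `.index` is a built-in method, not a row index. The sketch below, with hypothetical function names, wraps the formula from the question in a function and applies it to each DataFrame in the list.

```python
def apply_formula(df):
    """Apply the formula from the question to one DataFrame, in place."""
    for i in range(12, len(df.index)):
        df.iloc[i, 1] = df.iloc[i - 12, 1] * (
            ((df.iloc[i, 3] / 100) + (df.iloc[i, 6] / 100)) + 1
        )
    return df


def apply_formula_to_all(dfs):
    """Apply the formula to every DataFrame in a list."""
    # Iterate over the list; len() is taken on each DataFrame's index,
    # never on the list's `.index` method.
    return [apply_formula(df) for df in dfs]


def reproduce_error() -> str:
    """Return the message raised when `.index` is read from a list."""
    try:
        len([].index)
    except TypeError as exc:
        return str(exc)
    return ""
```

If the list is named `dfs`, then `apply_formula_to_all(dfs)` updates each frame in turn, while `reproduce_error()` shows the message a plain list produces.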

1 More Replies
Sri_H
by New Contributor III
  • 1380 Views
  • 2 replies
  • 1 kudos

Databricks Academy - Access to training recording attended during Data & AI Summit 2022

Hi All, I attended a 2-day ML training during the Data & AI Summit 2022 and received an email from the events team (ataaisummit@typeaevents.com) saying that the recordings for the training and related material will be available in my Databricks Academy...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Sri H​ ! I am checking on this for you - hang tight! I'll try and get an update asap from the Academy Team.

1 More Replies
Data_Engineer3
by Contributor II
  • 5066 Views
  • 4 replies
  • 4 kudos

Resolved! Unable to read file from dbfs location in databricks.

When I try to read a file from DBFS, it throws an error: Caused by: FileReadException: Error while reading file dbfs:/.......................parquet is not a Parquet file. Expected magic number at tail [80, 65, 82, 49] but found [105, 108, 101, 115]. Bu...

Latest Reply
Kaniz_Fatma
Community Manager
  • 4 kudos

Hi @KARTHICK N, what is the exact line of code you're using to read the file, and in particular the path? Can you confirm whether your file is a CSV or a Parquet file? Are you trying to read it in Python or Scala?
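
To help answer the CSV-or-Parquet question, the byte values in the error message can be decoded, and a file can be checked for the Parquet magic bytes. This is a small sketch with hypothetical helper names; it assumes the file is reachable through a local path (on Databricks that is often the `/dbfs/...` mount, which is an assumption about the cluster setup).

```python
PARQUET_MAGIC = b"PAR1"  # Parquet files start and end with these 4 bytes


def decode_magic(byte_values) -> str:
    """Turn the integer list from the error message into readable text."""
    return bytes(byte_values).decode("ascii", errors="replace")


def looks_like_parquet(path: str) -> bool:
    """Check that a local file starts and ends with the Parquet magic bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(0, 2)          # jump to the end to learn the file size
        if f.tell() < 8:      # too short to hold both markers
            return False
        f.seek(-4, 2)         # last four bytes of the file
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


print(decode_magic([80, 65, 82, 49]))      # PAR1
print(decode_magic([105, 108, 101, 115]))  # iles
```

The expected tail decodes to "PAR1", while the tail that was found decodes to "iles", which suggests the file ends in plain text and may be a text or CSV file saved with a .parquet extension.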

3 More Replies