Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Aviral-Bhardwaj
by Esteemed Contributor III
  • 4146 Views
  • 1 replies
  • 36 kudos

Understand Trigger Intervals in Streaming Pipelines in Databricks

When defining a streaming write, the trigger method specifies when the system should process the next set of data. Triggers are specified when defining how data will be written to a...
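
As a rough illustration of the trigger setting this post describes, the Python sketch below builds the keyword arguments accepted by PySpark's `DataStreamWriter.trigger()` and applies them to a streaming write. The helper names `trigger_kwargs` and `start_stream`, the Delta format, and the path parameters are assumptions made for illustration; they do not come from the original post.

```python
def trigger_kwargs(mode, interval=None):
    """Map a readable mode name to keyword arguments for DataStreamWriter.trigger()."""
    if mode == "fixed_interval":
        if not interval:
            raise ValueError("fixed_interval needs an interval such as '10 seconds'")
        # Start a new micro-batch at the given interval.
        return {"processingTime": interval}
    if mode == "available_now":
        # Process everything currently available, then stop (Spark 3.3+).
        return {"availableNow": True}
    if mode == "once":
        # Process a single micro-batch, then stop.
        return {"once": True}
    raise ValueError(f"unknown trigger mode: {mode}")


def start_stream(stream_df, target_path, checkpoint_path, mode, interval=None):
    """Start a streaming write; stream_df is assumed to be a streaming DataFrame."""
    return (
        stream_df.writeStream
        .format("delta")  # assumed sink format
        .option("checkpointLocation", checkpoint_path)
        .trigger(**trigger_kwargs(mode, interval))
        .start(target_path)
    )
```

For example, `start_stream(df, target, checkpoint, "fixed_interval", "10 seconds")` would request a micro-batch every ten seconds, assuming `df` is a streaming DataFrame and both paths are writable.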

Latest Reply
jose_gonzalez
Databricks Employee
  • 36 kudos

Thank you for sharing

rammy
by Contributor III
  • 2087 Views
  • 1 replies
  • 5 kudos

Not able to parse .doc extension file using scala in databricks notebook?

I was able to parse .doc extension files in Java with the help of the POI libraries, but when I converted the Java code to Scala, expecting it to work with the same Java libraries, it fails with the error below...

Latest Reply
UmaMahesh1
Honored Contributor III
  • 5 kudos

Hi @Ramesh Bathini, in PySpark we have a docx module, and I found it works perfectly fine. Can you try using that? Documentation can be found online. Cheers...

pabloaus
by New Contributor III
  • 6319 Views
  • 2 replies
  • 4 kudos

Resolved! How to read sql file from a Repo to string

I am trying to read a SQL file in the repo into a string. I have tried:
with open("/Workspace/Repos/xx@***.com//file.sql","r") as queryFile: queryText = queryFile.read()
And I get the following error:
[Errno 1] Operation not permitted: '/Workspace/Repos/***@*...
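
For reference, reading a file into a string as the question attempts can be sketched with `pathlib`. This is a minimal sketch only: the function name `read_sql_text` is hypothetical, and the code does not address the permission error quoted above, which depends on the workspace and cluster configuration.

```python
from pathlib import Path


def read_sql_text(path):
    """Return the full contents of a .sql file as a single string."""
    # read_text opens, reads, and closes the file in one call.
    return Path(path).read_text(encoding="utf-8")
```

The returned string could then be passed to whatever runs the query, for example `spark.sql(query_text)` in a notebook, assuming the file holds a single statement.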

Latest Reply
Senthil1
Contributor
  • 4 kudos

I checked on my Unity Catalog-enabled cluster, and I am able to access the Repos file to read and display it.

1 More Replies
Ryan_Chynoweth
by Esteemed Contributor
  • 9142 Views
  • 3 replies
  • 7 kudos

Resolved! Best language to use

Databricks supports SQL, Scala, Python, and R. Is there a most performant language to use on Databricks? I know SQL well but would like to get into one of the other languages and don't know which to focus on.

Latest Reply
Anonymous
Not applicable
  • 7 kudos

It totally depends on you. BTW, you can choose Python and SQL.

2 More Replies
Rahul_Tiwary
by New Contributor II
  • 7026 Views
  • 1 replies
  • 4 kudos

Getting error "java.lang.NoSuchMethodError: org.apache.spark.sql.AnalysisException" while writing data to Event Hubs for streaming. It works fine when I write to another Databricks table

import org.apache.spark.sql._
import scala.collection.JavaConverters._
import com.microsoft.azure.eventhubs._
import java.util.concurrent._
import scala.collection.immutable._
import org.apache.spark.eventhubs._
import scala.concurrent.Future
import scala.c...

Latest Reply
Gepap
New Contributor II
  • 4 kudos

The dataframe to write needs to have the following schema:

Column                  | Type
------------------------|-----------------
body (required)         | string or binary
partitionId (*optional) | string
partitionKey...

pret
by New Contributor II
  • 3868 Views
  • 4 replies
  • 0 kudos

How can I run a scala command line in databricks?

I wish to run a scala command, which I believe would normally be run from a scala command line rather than from within a notebook. It happens to be:
scala [-cp scalatest-<version>.jar:...] org.scalatest.tools.Runner [arguments]
(scalatest_2.12__3.0.8.j...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @David Vardy, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks...

3 More Replies
BkP
by Contributor
  • 2648 Views
  • 2 replies
  • 3 kudos

Scala Connectivity to Databricks Bronze Layer Raw Data from a Non-Databricks Spark environment

Hi All, We are developing a new Scala/Java program which needs to read & process the raw data stored in source ADLS (which is a Databricks Environment) in parallel as the volume of the source data is very high (in GBs & TBs). What kind of connection ...

Latest Reply
BkP
Contributor
  • 3 kudos

Hello experts, any advice on this question? Tagging some folks from whom I have received answers before. Please help with this requirement or tag someone who can help: @Kaniz Fatma, @Vartika Nain, @Bilal Aslam

1 More Replies
isaac_gritz
by Databricks Employee
  • 4098 Views
  • 6 replies
  • 8 kudos

Library Dependency

How to Install Libraries on Databricks
You can install libraries in Databricks at the cluster level for libraries commonly used on a cluster, at the notebook level using %pip, or using global init scripts when you have libraries that should be install...

Latest Reply
Chris_Shehu
Valued Contributor III
  • 8 kudos

It can be risky to install libraries without any sort of oversight/security structure to ensure those libraries have no vulnerabilities. I think more caution needs to be added to the wording of these documents to express that. All of the libraries w...

5 More Replies
learnerbricks
by New Contributor II
  • 1580 Views
  • 2 replies
  • 0 kudos

How should I start with Databricks?

Hello Guys, I am new to Databricks. I have tried to read the documentation as much as I can. Now I want to jump in. What I want: I have stored my parquet file in the Databricks storage system. I want to load this file into a Data Lake table. And then want to do ...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Learner bricks, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Tha...

1 More Replies
齐木木
by New Contributor III
  • 2094 Views
  • 1 replies
  • 3 kudos

Resolved! The case class reports an error when running in the notebook

As shown in the figure, the case class and the json string are converted through fasterxml.jackson, but an unexpected error occurred during the running of the code. I think this problem may be related to the loading principle of the notebook. Because...

Latest Reply
齐木木
New Contributor III
  • 3 kudos

code:
var str="{\"app_type\":\"installed-app\"}"
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule
val mapper = new ObjectMapper()
mapper.registerModule(DefaultScalaModule)
...

Sha_1890
by New Contributor III
  • 5415 Views
  • 8 replies
  • 0 kudos

How to execute a series of stored procedures using scala in databricks

I am working on a migration project where the lift-and-shift method is used to migrate a SQL Server DB from on-prem to Azure Cloud. There are a lot of stored procedures used for integration on-prem. Now, on-prem, to process the XML file and exec...

Latest Reply
Noopur_Nigam
Databricks Employee
  • 0 kudos

Hi @shafana Roohi Jahubar, I hope that your queries are answered. Please let me know if you have more questions.

7 More Replies
isaac_gritz
by Databricks Employee
  • 2054 Views
  • 1 replies
  • 6 kudos

Versions of Spark, Python, Scala, R in each Databricks Runtime

What versions of Spark, Python, Scala, and R are included in each Databricks Runtime? What libraries are pre-installed? You can find this info at the Databricks runtime releases page (AWS | Azure | GCP). Let us know if you have any additional questions on t...

Latest Reply
maxdata
Databricks Employee
  • 6 kudos

Wow! Thanks for the help, @Isaac Gritz!

Sunny
by New Contributor III
  • 9209 Views
  • 6 replies
  • 1 kudos

Using Thread.sleep in Scala

We need to hit a REST web service every 5 mins until a success message is received. The Scala object is inside a Jar file and gets invoked by a Databricks task within a workflow. Thread.sleep(5000) is working fine but not sure if it is safe practice or is t...
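
A bounded polling loop of the kind this question describes might look like the Python sketch below. It is only an illustration under stated assumptions: `poll_until_success`, its parameters, and the `check` callable (standing in for the REST call) are hypothetical names, and the same shape carries over to Scala. Note that `Thread.sleep` takes milliseconds, so `Thread.sleep(5000)` pauses for 5 seconds; a 5-minute wait is 300000 ms, or 300 seconds in the sketch.

```python
import time


def poll_until_success(check, interval_seconds=300, max_attempts=12, sleep=time.sleep):
    """Call check() until it returns True or the attempts run out.

    Returns the 1-based attempt number that succeeded, or None on timeout.
    """
    for attempt in range(1, max_attempts + 1):
        if check():
            return attempt
        # Only wait if another attempt will follow.
        if attempt < max_attempts:
            sleep(interval_seconds)
    return None
```

Capping the number of attempts keeps a job from blocking forever if the service never reports success, and passing `sleep` as a parameter lets the loop be exercised without real waiting.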

Latest Reply
Vartika
Databricks Employee
  • 1 kudos

Hey there @Sundeep P, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. C...

5 More Replies