Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
ERROR: No matching distribution found for databricks-smolder

dfoard
New Contributor

I'm trying to follow along with the blog post Gaining Insights Into Your HL7 Data With Smolder and Databricks-#1 of 3. I was able to finally get a jar file built from the repo using Java 17 and it successfully imports into the cluster. 

However, when I run the command: 

import com.databricks.labs.smolder.functions.parse_hl7_message

I get the following error: ModuleNotFoundError: No module named 'com.databricks.labs'

I was offered the following by the Assistant:

It seems that you are trying to import a module that is not recognized by your notebook. It could be that the module is not installed correctly or that it is not available in your notebook's environment.

To fix the issue, try installing the smolder library using the following command in a new cell:

%pip install databricks-smolder

However, when I run this, it fails as well.
 
Can anyone offer any guidance?

 

1 REPLY

Kaniz
Community Manager

Hi @dfoard , 

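It appears that the error comes from running the import in a Python cell. com.databricks.labs.smolder is a JVM (Scala) package shipped in a JAR, and Python's import statement cannot load it, which is why Python reports ModuleNotFoundError. The Smolder library is designed to work with Scala code in a Databricks Notebook environment.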
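To illustrate why the Python import fails, here is a minimal sketch in plain Python (it is not part of Smolder, and the helper name is_importable is hypothetical). Python resolves a dotted import by first looking for a top-level package, in this case one named "com", and a JAR attached to the cluster does not provide such a package:

```python
import importlib.util


def is_importable(top_level_name: str) -> bool:
    """Return True if Python can locate a top-level module or package with this name."""
    return importlib.util.find_spec(top_level_name) is not None


# A standard-library module is always discoverable.
print(is_importable("json"))

# "import com.databricks.labs..." starts by looking up "com". Unless a Python
# package with that name happens to be installed, the lookup finds nothing,
# regardless of which JARs are attached to the cluster.
print(is_importable("com"))
```

If the second check reports that "com" cannot be found, the ModuleNotFoundError is expected behaviour rather than a sign that the JAR was built or attached incorrectly.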

To use the com.databricks.labs.smolder package in Databricks Notebook, you should work with Scala code and ensure that the Smolder library is available in the classpath. Here's a guide on how to use Smolder in Databricks:

  1. Import the Smolder Package in Scala: Start by importing the Smolder package in a Scala cell of your Databricks Notebook using the %scala magic command.

    For example:

    %scala
    import com.databricks.labs.smolder.functions.parse_hl7_message
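  2. Load the Smolder Library in Databricks Notebook: The Smolder JAR file should be either installed or uploaded to your Databricks workspace. To add it to the classpath, you can use the spark.jars configuration option. Replace /path/to/smolder.jar with the actual path to the Smolder JAR in your workspace. Note that a Databricks notebook already has an active SparkSession, so getOrCreate() returns that existing session and the spark.jars setting may not take effect; if the import still fails, installing the JAR as a cluster library is likely the more dependable route. You can set it up like this:

    %scala
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("Smolder Test")
      .config("spark.jars", "/path/to/smolder.jar")
      .getOrCreate()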

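    Alternatively, if your notebook environment supports the %AddJar magic command, you can use it to add the library to the classpath. This magic is not among the standard Databricks notebook magic commands, so it may not be available on your cluster.

    For example:

    %scala
    %AddJar -i databricks -c Maven -g com.databricks.labs -a databricks-smolder_2.12 -v 1.0.0-spark3.0 -u https://mvnrepository.com -f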
  3. Use Smolder Functions in Scala: Once the Smolder library is loaded into the classpath, you can utilize its functions in your notebook. For instance, you can use the parse_hl7_message function to parse HL7 messages and extract fields. Here's a code snippet as an example:

     
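    %scala
    import com.databricks.labs.smolder.functions.parse_hl7_message
    import spark.implicits._

    // Assumption: parse_hl7_message takes a DataFrame Column rather than a plain String,
    // so the message is placed in a single-row DataFrame first.
    val hl7_msg = "MSH|^~\\&|..."
    val parsed = Seq(hl7_msg).toDF("value").select(parse_hl7_message($"value").as("message"))
    parsed.show(false)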

This Scala code demonstrates how to use the Smolder library to parse an HL7 message and work with its fields. You can adapt this example to suit your specific requirements.
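For readers new to HL7 v2, the following plain-Python sketch shows the segment and field layout that a parser such as parse_hl7_message works with. It is not Smolder's implementation: the function name split_hl7 and the sample message are invented for illustration, and the sketch assumes "|" as the field separator, whereas a real message declares its separator in the MSH segment.

```python
def split_hl7(message: str) -> dict[str, list[list[str]]]:
    """Group the fields of each segment under the segment's three-letter ID."""
    segments: dict[str, list[list[str]]] = {}
    # HL7 v2 segments are separated by carriage returns; tolerate newlines too.
    for line in message.replace("\n", "\r").split("\r"):
        if not line:
            continue
        # The first token is the segment ID (MSH, PID, ...); the rest are fields.
        seg_id, *fields = line.split("|")
        segments.setdefault(seg_id, []).append(fields)
    return segments


# Hypothetical sample message, invented for illustration only.
sample = "MSH|^~\\&|SENDER|FACILITY\rPID|1||12345||DOE^JOHN"
parsed = split_hl7(sample)

# Third field after the segment ID in the first PID segment.
print(parsed["PID"][0][2])
```

Components inside a field (for example the "^" in DOE^JOHN) are left unsplit here to keep the sketch short.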
