Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

delta_lake
by New Contributor
  • 1417 Views
  • 3 replies
  • 1 kudos

Delta Lake Python

I have set up a virtual environment inside my existing Hadoop cluster. Since the current cluster does not have Spark > 3, I installed delta-spark in the virtual environment. While trying to access HDFS, which is Kerberized, I get the error below...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @Vasanth P, Just a friendly follow-up. Do you still need help, or did the above responses help you find the solution? Please let us know.

2 More Replies
IkramMecheri
by New Contributor II
  • 8964 Views
  • 5 replies
  • 2 kudos

ImportError: No module named 'bs4'

Hi, I would like to do some web scraping; however, I am unable to import the libraries I traditionally use for that task: `import requests` and `from bs4 import BeautifulSoup`.

Latest Reply
Kaniz_Fatma
Community Manager
  • 2 kudos

Hi @Ikram Mecheri, Just a friendly follow-up. Do you still need help, or do the above responses help you find the solution? Please let us know.

4 More Replies
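
For the `ImportError: No module named 'bs4'` thread above, a minimal sketch of checking which modules are importable before a scraping script runs. The helper name `missing_modules` is hypothetical. The `bs4` module is provided by the `beautifulsoup4` package, so the usual remedy is assumed to be installing that package on the cluster or running `%pip install beautifulsoup4` in a notebook cell.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical usage: report what still needs to be installed.
missing = missing_modules(["requests", "bs4"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All modules are importable.")
```

Checking with `find_spec` avoids actually importing the module, so the check itself cannot fail with an `ImportError`.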
User16868770416
by Contributor
  • 1422 Views
  • 4 replies
  • 2 kudos
Latest Reply
Kaniz_Fatma
Community Manager
  • 2 kudos

Hi @Will Block, Just a friendly follow-up. Do you still need help, or did the above responses help you find the solution? Please let us know.

3 More Replies
Zen
by New Contributor III
  • 3428 Views
  • 9 replies
  • 2 kudos

Resolved! How do I run a scala script from the Terminal

Hello, how do I run a Scala script from a terminal on Databricks (the Web Terminal, or a cell with %sh)? Just running `scala -nc script.scala` is not working. Thanks,

Latest Reply
Kaniz_Fatma
Community Manager
  • 2 kudos

Hi @Zen, Just a friendly follow-up. Do you still need help, or did @DARSHAN BARGAL's response help you find the solution? Please let us know.

8 More Replies
Alex_G
by New Contributor II
  • 1542 Views
  • 3 replies
  • 5 kudos

Resolved! Databricks Feature Store in MLFlow run CLI command

Hello! I am attempting to move some machine learning code from a Databricks notebook into an MLflow Git repository. I am utilizing the Databricks Feature Store to load features that have been processed. Currently I cannot get the Databricks library to ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 5 kudos

Hi @Alex Graff, Just a friendly follow-up. Do you still need help, or did @Sean Owen's response help you find the solution? Please let us know.

2 More Replies
NickGoodfella
by New Contributor
  • 1225 Views
  • 2 replies
  • 1 kudos

DNS_Analytics Notebook Problems

Hello everyone! First post on the forums; I have been stuck on this for a while now and cannot seem to understand why this is happening. Basically, I have been using what seems to be a premade Databricks notebook from Databricks themselves for a DNS Analytics exa...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @NickGoodfella, Just a friendly follow-up. Do you still need help, or did @Sean Owen's response help you find the solution? Please let us know.

1 More Replies
EricOX
by New Contributor
  • 3472 Views
  • 3 replies
  • 3 kudos

Resolved! How to handle configuration for different environment (e.g. DEV, PROD)?

Is there a suggested way to handle different environment variables for the same code base, for example the Data Lake mount point for DEV, UAT, and PROD? Any recommendations or best practices? Also, how should this be handled with Azure DevOps?

Latest Reply
Kaniz_Fatma
Community Manager
  • 3 kudos

Hi @Eric Yeung, Just a friendly follow-up. Do you still need help, or did the above responses help you find the solution? Please let us know.

2 More Replies
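
For the DEV/UAT/PROD configuration question above, one possible approach is a single lookup keyed by environment name, sketched below. It is not taken from the thread's replies. The `APP_ENV` variable, the `EnvConfig` class, and the mount-point paths are all hypothetical placeholders; in a release pipeline the variable would presumably be set per stage.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvConfig:
    name: str
    mount_point: str


# Hypothetical mount points; replace with the real paths for each workspace.
CONFIGS = {
    "DEV": EnvConfig("DEV", "/mnt/datalake-dev"),
    "UAT": EnvConfig("UAT", "/mnt/datalake-uat"),
    "PROD": EnvConfig("PROD", "/mnt/datalake-prod"),
}


def load_config(env=None):
    """Pick the config for `env`, falling back to the APP_ENV variable, then DEV."""
    key = (env or os.environ.get("APP_ENV", "DEV")).upper()
    if key not in CONFIGS:
        raise ValueError(f"Unknown environment: {key!r}")
    return CONFIGS[key]
```

The same code base then calls `load_config()` everywhere and never hard-codes a path, so promoting code from DEV to PROD only changes the value of the environment variable.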
StephanieRivera
by Valued Contributor II
  • 3389 Views
  • 2 replies
  • 5 kudos

Resolved! How to add a select all option in a Databricks SQL parameter? I would like to use a query-based drop-down list.

So I want to create a select-all button in a parameter. The actual parameter has around 200 options because of the size of the database. However, if I want a general summary where you can see all the options, I would have to select them one by one and that...

Latest Reply
StephanieRivera
Valued Contributor II
  • 5 kudos

You could add '--- All Stores ---' to your list. Here is the query I would use to populate the drop-down. S.O. answer here. `SELECT store AS store_name FROM ( SELECT DISTINCT store FROM Table UNION ALL SELECT ...`

1 More Replies
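
The "select all" idea in the reply above can be sketched end to end as follows. This is an illustration only: it uses Python's built-in `sqlite3` instead of Databricks SQL (whose parameter syntax differs), and the `sales` table, its columns, and the function names are hypothetical. It does not reconstruct the truncated query from the reply.

```python
import sqlite3

ALL_LABEL = "--- All Stores ---"


def dropdown_options(conn):
    """Options for the drop-down: the 'all' label first, then each distinct store."""
    rows = conn.execute(
        """
        SELECT store_name FROM (
            SELECT ? AS store_name, 0 AS sort_key
            UNION ALL
            SELECT DISTINCT store AS store_name, 1 AS sort_key FROM sales
        )
        ORDER BY sort_key, store_name
        """,
        (ALL_LABEL,),
    ).fetchall()
    return [row[0] for row in rows]


def total_for(conn, selected):
    """Sum amounts for one store, or for every store when the 'all' label is chosen."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE store = ? OR ? = ?",
        (selected, selected, ALL_LABEL),
    ).fetchone()
    return row[0]


def demo_connection():
    """Build a small in-memory table to try the two queries against."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (store TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("north", 10), ("south", 5), ("north", 7)],
    )
    return conn
```

The filter `store = ? OR ? = ?` is what makes the extra option work: when the selected value equals the "all" label, the second condition is true for every row, so no store is filtered out.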
Jin_Kim
by New Contributor II
  • 3965 Views
  • 4 replies
  • 5 kudos

Resolved! address how to use multiple spark streaming jobs connecting to one job cluster

Hi, we have a scenario where we need to deploy 15 Spark Streaming applications on Databricks, reading from Kafka, to a single job cluster. We tried the following approach: 1. create job 1 with a new job cluster (C1) 2. create job 2 pointing to C1... 3. create job 15...

Latest Reply
Kaniz_Fatma
Community Manager
  • 5 kudos

Hi @Jin Kim, Just a friendly follow-up. Do you still need help, or did the above responses help you find the solution? Please let us know.

3 More Replies
dataslicer
by Contributor
  • 3954 Views
  • 7 replies
  • 2 kudos

Resolved! Exploring additional cost saving options for structured streaming 24x7x365 uptime workloads

I currently have multiple jobs (each running on its own job cluster) for my Spark Structured Streaming pipelines, which run 24x7x365 on DBR 9.x/10.x LTS. My SLAs are 24x7x365 with 1 minute latency. I have already accomplished the following co...

6 More Replies
pantelis_mare
by Contributor III
  • 2679 Views
  • 5 replies
  • 5 kudos

Resolved! Slow imports for concurrent notebooks

Hello all, I have a large number of light notebooks to run, so I am taking the concurrent approach, launching notebook runs with dbutils.notebook.run in parallel. The more I increase parallelism, the more I see the duration of each notebook increasing. I ...

Latest Reply
pantelis_mare
Contributor III
  • 5 kudos

Hello @Kaniz Fatma, yes, it is clear. Following some tests on my side using a ***** notebook that does nothing but import stuff and sleep for 15 secs (so nothing to do with Spark), I figured that even with a 32-core driver, the fatigue point is clo...

4 More Replies
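
The parallel-launch pattern described in the thread above can be sketched with a bounded thread pool that also records how long each run takes, which makes the slowdown at higher parallelism measurable. The function `run_all` and its parameters are hypothetical names, and the notebook runner is passed in as a plain callable so the sketch does not depend on `dbutils`.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_all(paths, run_one, max_workers=4):
    """Run `run_one(path)` for each path with at most `max_workers` in flight.

    Returns {path: (result, seconds)} so per-run duration can be compared
    across different levels of parallelism.
    """
    def timed(path):
        start = time.perf_counter()
        result = run_one(path)
        return path, (result, time.perf_counter() - start)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(timed, paths))
```

On Databricks, `run_one` would presumably wrap `dbutils.notebook.run`, for example `lambda p: dbutils.notebook.run(p, 600)`. Repeating the call with several `max_workers` values shows where per-notebook duration starts to climb.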
Anonymous
by Not applicable
  • 1271 Views
  • 3 replies
  • 2 kudos

Resolved! JOB API KEEPS SAYING THE JOB IS RUNNING

I have a library that waits until the job goes into the "TERMINATED" / "SKIPPED" state before continuing. It polls the Jobs API. Unfortunately, I'm experiencing cases where the job is terminated in the GUI but the API still keeps saying "RUNNING". There i...

Latest Reply
Prabakar
Esteemed Contributor III
  • 2 kudos

@Alessio Palma could you please provide the API that you are using? Also, please share some sample output and logs that would help us with some information.

2 More Replies
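
A polling loop like the one described in the thread above is usually safer with a deadline, so that a state stuck on "RUNNING" raises an error instead of blocking forever. The sketch below is not the poster's library. `wait_for_run` and its parameters are hypothetical, the state lookup is passed in as a callable rather than calling the Jobs API directly, and treating `INTERNAL_ERROR` as final alongside the two states named in the post is an assumption.

```python
import time

# States treated as final; INTERNAL_ERROR is an assumption added to the two
# states named in the post.
FINAL_STATES = {"TERMINATED", "SKIPPED", "INTERNAL_ERROR"}


def wait_for_run(get_state, poll_seconds=10.0, timeout_seconds=3600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll `get_state()` until it returns a final state or the timeout elapses."""
    deadline = clock() + timeout_seconds
    while True:
        state = get_state()
        if state in FINAL_STATES:
            return state
        if clock() >= deadline:
            raise TimeoutError(
                f"Run still in state {state!r} after {timeout_seconds} s"
            )
        sleep(poll_seconds)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting. The caller decides what to do on `TimeoutError`, such as logging the last response for a support request.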
Serhii
by Contributor
  • 1579 Views
  • 4 replies
  • 9 kudos

Resolved! DBFS FileStore html document not showing in the browser

Hello all! I am using the guide https://docs.databricks.com/data/filestore.html to save a folder of static HTML content to the DBFS FileStore directory (as a sub-directory), and I have the "enable DBFS web browsing" setting on, but I still can't view the web p...

Latest Reply
Prabakar
Esteemed Contributor III
  • 9 kudos

@Sergii Ivakhno In FileStore you can save files, such as images and libraries, that are accessible within HTML and JavaScript when you call displayHTML. However, when you try to access the link, it will download the file to your local desktop.

3 More Replies
my_community2
by New Contributor III
  • 4906 Views
  • 10 replies
  • 1 kudos

Running notebooks on DataBricks in Azure blowing up all over since morning of Apr 5 (MST). Was there another poor deployment at DataBricks? This reall...

Notebooks running on Databricks in Azure have been blowing up all over since the morning of Apr 5 (MST). Was there another poor deployment at Databricks? This really needs to stop. We are running premium Databricks on Azure and calling notebooks from ADF. 10.2 (inc...

Latest Reply
Prabakar
Esteemed Contributor III
  • 1 kudos

@Maciej G try using the below init script to increase the REPL timeout.
--------------------------------------
#!/bin/bash
cat > /databricks/common/conf/set_repl_timeout.conf << EOL
{
  databricks.daemon.driver.launchTimeout = 150
}
EOL
----------------...

9 More Replies