- 11698 Views
- 5 replies
- 2 kudos
Trying out Dolly - how to load pytorch_model.bin so it's not downloading it every time the cluster is restarted
Hi, I am new to LLMs and am curious to try one out. I ran the following code from the Databricks website: import torch; from transformers import pipeline; instruct_pipeline = pipeline(model="databricks/dolly-v2-12b", torch_dtype=torch.bfloat16, tr...
- 2 kudos
Just set the HF cache dir to a persistent path on /dbfs: import os; os.environ['TRANSFORMERS_CACHE'] = "/dbfs/..."
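For anyone following along, here is a minimal sketch of that suggestion, assuming a hypothetical persistent cache path under /dbfs (swap in your own location):

```python
import os

# Point the Hugging Face cache at DBFS *before* the model is downloaded,
# so the weights survive a cluster restart. The path below is a placeholder.
os.environ["TRANSFORMERS_CACHE"] = "/dbfs/tmp/hf_cache"  # hypothetical persistent path

import torch
from transformers import pipeline

# Same pipeline call as in the question; trust_remote_code is assumed from the
# Dolly instructions on the model card.
instruct_pipeline = pipeline(
    model="databricks/dolly-v2-12b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
```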
- 911 Views
- 0 replies
- 0 kudos
I found these videos to be beneficial: CI/CD with Azure DevOps and CI/CD with GitHub Actions.
- 3301 Views
- 4 replies
- 3 kudos
Azure Databricks Capabilities: my objective is to evaluate Azure Databricks' capabilities and determine whether I need Azure DevOps or Jenkins, or whether Databricks alone suffices.
Hi, we have a real-time streaming use case where we have to build a pipeline using Azure Databricks. My objective is to evaluate Azure Databricks' capabilities and determine whether I need Azure DevOps or Jenkins, or whether Databricks alone suffices. Can you please provide c...
- 3 kudos
I found these YouTube videos to be beneficial: CI/CD with Azure DevOps and Terraform Enablement - Part 1 of 2.
- 5755 Views
- 6 replies
- 1 kudos
FFmpeg frame extraction explodes memory, how to mitigate?
For a computer vision project, my raw data consists of encrypted videos (60fps) stored in Azure Blob Storage. In order to have the data usable for model training, I need to do some preprocessing and for that I need the video split into individual fra...
- 1 kudos
In the end, I decided to rework the workflow so it is as efficient as I could imagine: extract the frames of the video files in a containerized application running ffmpeg and store the resulting frames in a parquet file in blob storage (...
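As a rough illustration of that reworked flow, here is a hedged sketch (file names, paths, and the Parquet layout are assumptions, not the poster's actual code):

```python
# Hypothetical sketch: dump frames with ffmpeg, then pack them into a Parquet file
# so downstream jobs read one file instead of listing millions of small images.
# Requires ffmpeg on the PATH and pyarrow (for pandas.to_parquet).
import glob
import os
import subprocess

import pandas as pd

video_path = "input_video.mp4"  # assumed local copy of the decrypted video
frames_dir = "frames"
os.makedirs(frames_dir, exist_ok=True)

# One JPEG per frame; -vsync 0 preserves the original frame timing.
subprocess.run(
    ["ffmpeg", "-i", video_path, "-vsync", "0", f"{frames_dir}/frame_%06d.jpg"],
    check=True,
)

# Store the raw JPEG bytes plus the frame file name in a Parquet file.
rows = [
    {"frame_file": os.path.basename(p), "content": open(p, "rb").read()}
    for p in sorted(glob.glob(f"{frames_dir}/frame_*.jpg"))
]
pd.DataFrame(rows).to_parquet("frames.parquet", index=False)
```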
- 2124 Views
- 2 replies
- 0 kudos
Can an HMS-managed table be upgraded to Unity Catalog?
As the question states, I am not getting the option to upgrade managed tables to UC. Is that possible? I can't find anything in the documentation.
- 0 kudos
In case anyone else ever faced the same issue
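The reply above is cut off, but for reference, one commonly used approach for HMS-managed tables is to copy the data into a Unity Catalog table, for example with a deep clone. This is a hypothetical sketch, not the thread's solution; catalog, schema, and table names are placeholders:

```python
# Hypothetical sketch: copy an HMS-managed Delta table into Unity Catalog with a
# deep clone. `spark` is the notebook's SparkSession; replace main.sales.orders /
# hive_metastore.sales.orders with your own names.
spark.sql(
    """
    CREATE TABLE IF NOT EXISTS main.sales.orders
    DEEP CLONE hive_metastore.sales.orders
    """
)
```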
- 7182 Views
- 3 replies
- 2 kudos
Resolved! Issues loading .txt files from DBFS into Langchain TextLoader()
Hello, I am working on building a Langchain QA application in Databricks. I currently have 13 .txt files loaded into DBFS and am trying to read them in iteratively with TextLoader(), load them into the RecursiveCharacterTextSplitter() from Langcha...
- 2 kudos
Hi @David Kersey​ Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...
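For completeness, here is a minimal sketch of the pattern described in the question, assuming LangChain's classic import paths and a hypothetical folder of .txt files under the /dbfs FUSE mount:

```python
# Hedged sketch: read .txt files from DBFS with TextLoader and split them into chunks.
# The folder path, chunk_size, and chunk_overlap are illustrative assumptions.
import glob

from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

docs = []
for path in glob.glob("/dbfs/FileStore/qa_corpus/*.txt"):  # hypothetical location
    # TextLoader expects a local-style path, so use the /dbfs mount rather than dbfs:/ URIs.
    docs.extend(TextLoader(path, encoding="utf-8").load())

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)
```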
- 5822 Views
- 0 replies
- 1 kudos
Failure in Databricks Serving endpoint (build log says pip failed due to a conflicting dependency)
Hello All, we are trying to deploy some models using a Databricks Serving endpoint, but while deploying the artifact created during the experiment run, the serving endpoint build log says pip failed due to a conflicting dependency. The model is logged in experi...
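There are no replies here, but one commonly suggested mitigation is to pin the model's pip requirements explicitly when logging it, so the endpoint's image build does not have to resolve a conflicting set of packages on its own. A hedged sketch (the sklearn flavor and the pinned versions are illustrative assumptions):

```python
import mlflow
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Train a small placeholder model just so the example is self-contained.
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=200).fit(X, y)

with mlflow.start_run():
    mlflow.sklearn.log_model(
        sk_model=model,
        artifact_path="model",
        # Pinning requirements up front keeps the serving image build from
        # resolving a conflicting package set later. Versions are illustrative.
        pip_requirements=["scikit-learn==1.1.1", "pandas==1.4.2"],
    )
```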
- 7563 Views
- 1 replies
- 2 kudos
Resolved! Error running an LLM model in pyfunc.spark_udf
- 2 kudos
Solution: please find the example below. Creating a secret and scope is a one-time activity; once we create a scope and secret, we can access the token from any notebook or cluster in the workspace, as shown below. After creating a secret, if we want ...
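A hedged sketch of that pattern, with placeholder scope, key, and model names (dbutils and spark are the Databricks notebook globals, and input_df is assumed to already exist):

```python
import mlflow.pyfunc

# One-time setup elsewhere: create the scope and secret (e.g. via the Databricks CLI).
# Afterwards any notebook in the workspace can read the token:
token = dbutils.secrets.get(scope="my-scope", key="my-token")  # placeholder names
# `token` can then be passed to whatever the model's load step needs (e.g. an API key).

# Wrap the registered model as a Spark UDF and score a DataFrame column.
predict_udf = mlflow.pyfunc.spark_udf(
    spark, model_uri="models:/my_llm_model/1", result_type="string"
)
scored_df = input_df.withColumn("prediction", predict_udf("text"))
```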
- 11141 Views
- 5 replies
- 2 kudos
Resolved! mlflow down in workspace?
MLflow started failing all of a sudden, for no apparent reason, when logged into Databricks Community Edition. Any idea why this is happening, or is there a way to restart the MLflow server?
- 2 kudos
Hi @Zheng Han​ Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers you...
- 2546 Views
- 2 replies
- 1 kudos
MLflow: loading script failed!
I am using MLflow to track experiments with Databricks, but today I tried to access my experiments in Databricks and I get the error.
- 1 kudos
I didn't manage to solve the error. I guess it is related to the Databricks community cloud, because I tested with another account and it is all the same.
- 8778 Views
- 2 replies
- 2 kudos
Resolved! AutoML with Stratified Sampling
Is it possible to use a stratified sampling strategy for the train/test/validate splits that the automl library does? We are working in a context where we need to segregate certain groups from the training and test sets to see how our models general...
- 2 kudos
Hi @Jared Webb Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers yo...
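While the AutoML side of the question stays open here, stratified splitting itself is straightforward to do up front; a hedged, generic scikit-learn illustration of the technique (not the AutoML API):

```python
# Stratified 60/20/20 train/validate/test split that keeps class proportions
# in every partition. The dataset is a stand-in so the example runs on its own.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=42
)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=42
)
```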
- 6883 Views
- 2 replies
- 1 kudos
Isolation Forest prediction fails in a DLT pipeline; the same model works fine when prediction is done outside the DLT pipeline.
Hey community members, I am new to Databricks and was building a simple DLT pipeline that loads data from S3 and runs an Isolation Forest prediction to detect anomalies. The model has been stored in the Model Registry. Here's the code for the pipeline: @dlt...
- 1 kudos
Hi @Mukul Degweker, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Tha...
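For reference, a hedged sketch of the pattern the question describes: a DLT table that scores incoming records with a Model Registry model through a pyfunc Spark UDF. The source path, model name, and feature columns are placeholders, not the poster's code:

```python
# Runs inside a Delta Live Tables pipeline, where `import dlt` and `spark` are available.
import dlt
import mlflow.pyfunc

# Load the registered model once and wrap it as a Spark UDF.
predict_udf = mlflow.pyfunc.spark_udf(
    spark, model_uri="models:/isolation_forest_model/1"
)

@dlt.table(name="scored_events")
def scored_events():
    # Hypothetical S3 source and feature columns.
    raw = spark.read.format("json").load("s3://example-bucket/events/")
    return raw.withColumn("anomaly", predict_udf("f1", "f2", "f3"))
```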
- 1533 Views
- 1 replies
- 2 kudos
Sample Architecture for Databricks MLOps
Does anyone have sample architectures for MLOps using Databricks, and other possible variations of the architecture?
- 2 kudos
@Saurabh Singh This is well documented here: https://www.databricks.com/blog/2022/06/22/architecting-mlops-on-the-lakehouse.html Please see: Reference architecture for MLOps. Further references: refer to The Big Book of MLOps for more discussion of the a...
- 900 Views
- 0 replies
- 1 kudos
Hello Everyone, I am thrilled to announce that we have our 5th winner for the raffle contest - @Emilia. Please join me in congratulating her on this remarkable achievement! Your dedication and hard work have paid off, and we are delighted to have you ...
- 1678 Views
- 1 replies
- 1 kudos
When should you use directory listing vs. file notification?
We are using Delta Live Tables to run ingestion pipelines and have come across the two options for Auto Loader, "file notification" vs "directory listing"; this is reflected in the option cloudFiles.useIncrementalListing. We are wondering what ...
- 1 kudos
@Bennett Lambert: The choice between using "file notification" vs "directory listing" for Auto Loader in Delta Live Tables depends on your specific use case and requirements. Here are some general guidelines: use file notification if you need real...
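To make the two modes concrete, a hedged sketch of where the option is set (format, options, and path are illustrative; directory listing is the default when the notification option is not set):

```python
# Hypothetical Auto Loader stream using file notification mode.
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    # Switch from the default directory listing to cloud file notifications.
    .option("cloudFiles.useNotifications", "true")
    .load("s3://example-bucket/landing/")  # placeholder path
)
```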