Thanks to everyone who joined the Hassle-Free Data Ingestion webinar. You can access the on-demand recording here. We're sharing a subset of the phenomenal questions asked and answered throughout the session. You'll find Ingestion Q&A listed first, f...
Check out Part 2 of this Data Ingestion webinar to find out how to easily ingest semi-structured data at scale into your Delta Lake, including how to use Databricks Auto Loader to ingest JSON data into Delta Lake.
Thanks to everyone who joined the Best Practices for Your Data Architecture session on Optimizing Data Performance. You can access the on-demand session recording here and the pre-run performance benchmarks using the Spark UI Simulator. Proper cluste...
I created a key in Azure Key Vault to store my secrets. To use it securely in Azure Databricks, I created a secret scope and configured the Azure Key Vault properties. Out of curiosity, I wanted to check whether my key is ...
mapInPandas is one of the most powerful Spark functions. It uses an Arrow-like in-memory data structure to split Spark DataFrames into chunks and feeds them to a function that takes an iterator of pandas DataFrames as input and returns an iterator of pandas DataFrames as output. Check it out here: https://spa...
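As a rough illustration of the pattern described above, the sketch below defines a chunk-wise function and passes it to `mapInPandas`. The column names (`id`, `temp_c`, `temp_f`) and the helper names `add_fahrenheit` and `run_on_spark` are hypothetical, chosen only for this example, and the Spark part assumes PySpark 3.0 or later is installed.

```python
from typing import Iterator

import pandas as pd


def add_fahrenheit(batches: Iterator[pd.DataFrame]) -> Iterator[pd.DataFrame]:
    """Receive pandas chunks of a Spark DataFrame and yield transformed chunks."""
    for batch in batches:
        # Each batch is an ordinary pandas DataFrame, so any pandas code works here.
        yield batch.assign(temp_f=batch["temp_c"] * 9.0 / 5.0 + 32.0)


def run_on_spark():
    """Apply add_fahrenheit to a small Spark DataFrame (hypothetical data)."""
    # Assumption: pyspark >= 3.0 is available. Imported lazily so the
    # pandas-only function above can be tried out without a Spark install.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([(1, 0.0), (2, 100.0)], "id long, temp_c double")
    # The schema string must describe the columns that the function yields.
    return df.mapInPandas(add_fahrenheit, schema="id long, temp_c double, temp_f double")
```

Because the function only sees pandas DataFrames, it can be checked locally with plain pandas input before running it on a cluster.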
Ready to get hands-on? Explore the collaborative notebook environment: This gallery showcases some of the possibilities through Notebooks focused on technologies and use cases which can easily be imported into your own Databricks environment or the f...
2021-09 webinar: Automating the ML Lifecycle With Databricks Machine Learning (Post 2 of 2) Thank you to everyone who joined! You can access the on-demand recording here and the code in this GitHub repo. We're sharing a subset of the questions asked an...
2021-09 webinar: Automating the ML Lifecycle With Databricks Machine Learning (Post 1 of 2) Thank you to everyone who joined the Automating the ML Lifecycle With Databricks Machine Learning webinar! You can access the on-demand recording here and the ...
The MLflow run was probably created either (a) via notebook autologging or (b) via a call to `mlflow.start_run()`. With (a), when the notebook first logs something to MLflow, it starts a run. But if the notebook is still active and attached to a clu...
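For path (b), one way to keep a run's start and end explicit is to use `mlflow.start_run()` as a context manager, which ends the run when the block exits. The sketch below is only an illustration under that assumption; the helper name `train_and_log` and the example parameter and metric names are hypothetical.

```python
def train_and_log(params: dict, metrics: dict) -> str:
    """Log params and metrics inside one explicitly scoped MLflow run."""
    # Assumption: the mlflow package is installed. Imported lazily so this
    # module can be loaded in environments without MLflow.
    import mlflow

    # The context manager starts a run and ends it on exit, even on errors,
    # so the run does not stay active after this function returns.
    with mlflow.start_run() as run:
        mlflow.log_params(params)
        mlflow.log_metrics(metrics)
        run_id = run.info.run_id
    return run_id
```

A call such as `train_and_log({"max_depth": 5}, {"rmse": 0.42})` returns the ID of the finished run, which can then be looked up in the experiment UI.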
Hi Folks,
I'm evaluating Delta Lake for storing images with data version control, to be used for training models. I looked at a session explaining how to do this and also using MLflow to manage training (https://databricks.com/session_na21/image-processing-on-d...
I can think of 3 ways of doing this:
- using the web UI (the create table option or upload data into DBFS)
- using databricks-connect, which bridges your local machine with the remote Databricks clusters
- using the databricks-cli to copy local files to dbfs...
I have run a few MLflow experiments and I can see them in the experiment history, but none of the metrics have been logged along with them. I thought this was supposed to be automatically included. Any idea why they wouldn't be showing up?
Hi @trevor.bishop! My name is Kaniz, and I'm the technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers on the Forum have an answer to your question first. Otherwise, I will follow up shortly with a response.
Without knowing all that you are trying to do, the answer is yes, with the Instance Profile API. https://docs.databricks.com/dev-tools/api/latest/instance-profiles.html. You might also check out the SCIM APIs to associate the Instance Profile to a g...
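As a minimal sketch of what a call to the Instance Profile API could look like, the snippet below builds (but does not send) a request to register an instance profile, using only the Python standard library. It assumes the `POST /api/2.0/instance-profiles/add` endpoint with an `instance_profile_arn` field and a personal access token for authentication; the helper name, workspace URL, token, and ARN are placeholders, so check the linked documentation before relying on it.

```python
import json
import urllib.request


def build_add_instance_profile_request(host: str, token: str, arn: str) -> urllib.request.Request:
    """Build a request that registers an instance profile with a workspace."""
    # Assumed endpoint path; verify it against the Instance Profile API docs.
    url = f"{host.rstrip('/')}/api/2.0/instance-profiles/add"
    body = json.dumps({"instance_profile_arn": arn}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only.
request = build_add_instance_profile_request(
    "https://example.cloud.databricks.com",
    "dapi-placeholder-token",
    "arn:aws:iam::123456789012:instance-profile/example-profile",
)
```

Passing `request` to `urllib.request.urlopen` would send it. Building and sending are kept separate here so the payload can be inspected first.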