
Unexpected sparse_logs and tensorboard logfiles in the Databricks workspace

steve2
New Contributor

Hi, to our surprise we have found two new folders with some short log files in our Databricks workspace:
ls -lFr sparse_logs/ tensorboard/
tensorboard/:
-rwxrwxrwx 1 root root 88 Sep  2 11:26 events.out.tfevents.1725275744.0830-063833-n68nsxoq-10-139-64-10.2071.0*
-rwxrwxrwx 1 root root 88 Sep  2 07:26 events.out.tfevents.1725261966.0830-063833-n68nsxoq-10-139-64-10.2411.0*
-rwxrwxrwx 1 root root 88 Sep  2 06:52 events.out.tfevents.1725259952.0830-063833-n68nsxoq-10-139-64-10.4812.0*
-rwxrwxrwx 1 root root 88 Aug 30 07:34 events.out.tfevents.1725000798.0830-063833-n68nsxoq-10-139-64-12.2804.0*
-rwxrwxrwx 1 root root 88 Aug 30 06:30 events.out.tfevents.1724998741.0828-073605-8ilf5p15-10-139-64-11.121675.0*
-rwxrwxrwx 1 root root 88 Aug 30 06:09 events.out.tfevents.1724997141.0828-073605-8ilf5p15-10-139-64-11.117151.0*
-rwxrwxrwx 1 root root 88 Aug 30 05:41 events.out.tfevents.1724995015.0828-073605-8ilf5p15-10-139-64-11.112695.0*
-rwxrwxrwx 1 root root 88 Aug 30 05:08 events.out.tfevents.1724939336.0828-073605-8ilf5p15-10-139-64-11.4013.0*

sparse_logs/:
-rwxrwxrwx 1 root root 43 Aug 30 07:34 30-08-2024_06.53.17.log*
-rwxrwxrwx 1 root root 43 Aug 30 06:30 30-08-2024_06.19.01.log*
-rwxrwxrwx 1 root root 43 Aug 30 06:09 30-08-2024_05.52.20.log*
-rwxrwxrwx 1 root root 43 Sep  6 08:49 30-08-2024_05.16.55.log*
-rwxrwxrwx 1 root root 43 Sep 25 12:42 29-08-2024_13.48.55.log*
-rwxrwxrwx 1 root root 43 Sep 25 12:47 02-09-2024_11.15.43.log*
-rwxrwxrwx 1 root root 43 Sep 25 12:44 02-09-2024_07.26.06.log*

We do some RAG development, and as part of that we serve an embedding model (bge_large_en_v1_5-1) and LLMs (llama_2_7b_chat_hf-3, llama3_70b_endpoint), but we cannot reproduce what or which process created these files. After deleting them, they never appeared again. Does anybody have an idea what it was?
The content of the sparse_logs files was: manager stage: Model structure initialized
Thanks 😊

1 REPLY

Louis_Frolio
Databricks Employee

Hey @steve2, short answer: these look like TensorBoard event files, most likely created by a library that briefly initialized a TensorBoard logger or writer during one of your training/serving runs. The sparse_logs folder name and the “manager stage: Model structure initialized” message strongly suggest a SparseML/Neural Magic integration was present at that time, which also commonly wires up a TensorBoard logger. Once that component stopped initializing, the files stopped appearing.

What the files are

  • The files named events.out.tfevents.… are TensorBoard event logs written by TensorFlow, PyTorch’s SummaryWriter, PyTorch Lightning, Hugging Face Trainer, or frameworks that integrate a TensorBoard logger. TensorBoard recursively looks for “tfevents” files under a log directory, and the filenames include a timestamp, host, and process ID.
  • On Databricks Runtime ML, TensorBoard is preinstalled and commonly used to monitor deep learning runs, which makes it easy for frameworks to emit these event files when a writer/logger is initialized.
  • In PyTorch, simply creating a SummaryWriter is enough to create a tiny tfevents file even if no scalars are written; default logdir is “./runs”.
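
As a quick illustration (a minimal sketch, assuming PyTorch is available on the cluster; the "tensorboard" path here is just a stand-in for whatever your code or a library configured), merely constructing and closing a writer leaves a small tfevents file behind:

from torch.utils.tensorboard import SummaryWriter

# Opening a writer immediately creates an events.out.tfevents.* file,
# even if nothing is ever logged to it. The default log_dir is "./runs".
writer = SummaryWriter(log_dir="tensorboard")
writer.close()

# Afterwards the directory contains a tiny file such as:
# tensorboard/events.out.tfevents.<unix-timestamp>.<hostname>.<pid>.0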

Why you might see them even if you didn’t explicitly enable TensorBoard

  • Some frameworks auto-create a TensorBoard logger if TensorBoard is available. For example, SparseML integrations show code wiring a TensorBoardLogger into training (e.g., timm integration: TensorBoardLogger(log_path=output_dir)), which will generate “events.out.tfevents.*” in that directory even for short runs or initialization-only stages.
  • The message you saw in sparse_logs—“manager stage: Model structure initialized”—is consistent with SparseML’s notion of a “ScheduledModifierManager” that applies recipe-driven sparsification steps during model setup/training. SparseML docs and examples reference this Manager as the component that modifies and finalizes model training loops, and those integrations frequently attach loggers (including TensorBoard).
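
To make that concrete, here is a rough sketch of how such an integration might wire a manager and a TensorBoard logger together. The import paths, the recipe.yaml filename, and the modify() signature are assumptions based on SparseML's documented PyTorch integration and may differ in the version that was on your cluster:

import torch
# Assumed SparseML import paths; verify against the installed version.
from sparseml.pytorch.optim import ScheduledModifierManager
from sparseml.pytorch.utils import TensorBoardLogger

model = torch.nn.Linear(4, 2)                              # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)   # placeholder optimizer

# Instantiating the logger alone is enough to open a tfevents file under log_path.
logger = TensorBoardLogger(log_path="tensorboard")

# The Manager parses a sparsification recipe (recipe.yaml is hypothetical here)
# and hooks into the training loop; its stage transitions are the kind of thing
# that produce messages such as "manager stage: Model structure initialized".
manager = ScheduledModifierManager.from_yaml("recipe.yaml")
optimizer = manager.modify(model, optimizer, steps_per_epoch=100)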

Why they disappeared after deletion

  • If the component that previously initialized a TensorBoard writer/logger is no longer invoked (e.g., a package removed, a logger disabled, or code path changed), new tfevents files won’t be created. On Databricks, as soon as a run stops using a TB writer, no new files appear.

Likely origin in your setup

Based on your note:

  • The folder name tensorboard/ plus the tiny 88-byte tfevents files suggests a writer was opened and closed quickly with minimal or no logged data, perhaps during startup or probing of a training loop or serving component.
  • Folder name sparse_logs/ and the line “manager stage: Model structure initialized” are characteristic of Neural Magic’s SparseML/DeepSparse stack, which uses a “Manager” abstraction and commonly emits framework-stage messages and can include a TensorBoard logger alongside Python logging; this would explain both directories appearing around the same times.

How to confirm the source

Try the following quick checks:

  • Search your notebooks/jobs for any of these strings: “SummaryWriter”, “TensorBoardLogger”, “report_to='tensorboard'”, “SparseML”, “ScheduledModifierManager”, “deepsparse”. If any appear in code or dependencies used around Aug 30–Sep 2, that’s likely the origin.
  • Check the packages installed on the cluster image that was used at the time: pip list | grep -E 'sparseml|deepsparse|tensorboard|pytorch-lightning'. The presence of SparseML/DeepSparse would support the sparse_logs link.
  • If you used PyTorch Lightning or Hugging Face Trainer, verify whether TensorBoard logging was implicitly enabled (PL’s TensorBoardLogger, or HF TrainingArguments(report_to=["tensorboard"])). Either can generate tfevents files even with brief runs.
  • Verify that the tensorboard/ directory was set explicitly by your code/config. By default PyTorch writes to “./runs”; a custom path (like “tensorboard/”) is often set by a logger or training script.
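
Regarding the Lightning/Trainer check, a short sketch of what implicit versus explicitly disabled TensorBoard logging looks like (a sketch only; "outputs" is a hypothetical output directory, and the defaults vary by transformers/Lightning version):

from transformers import TrainingArguments
import pytorch_lightning as pl

# Older transformers releases default report_to to "all", which quietly enables
# the TensorBoard integration whenever the tensorboard package is importable.
args_implicit = TrainingArguments(output_dir="outputs")

# Disabling it explicitly rules TensorBoard out as the source of tfevents files.
args_explicit = TrainingArguments(output_dir="outputs", report_to="none")

# Lightning likewise falls back to a TensorBoardLogger when no logger is given
# (version dependent); logger=False switches that off.
trainer = pl.Trainer(logger=False)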

Is this anything to worry about?

  • Not typically. These are benign artifacts of a logger being initialized. If you don't want them, disable or remove the TensorBoard logger/writer in the relevant code path, and make sure frameworks don't auto-wire TB loggers (e.g., adjust the SparseML or Lightning logger configs).
  • If you want to use TensorBoard, you can point TensorBoard to that directory and visualize metrics directly in Databricks notebooks or a separate tab.
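
For that, the usual pattern in a Databricks notebook is the TensorBoard notebook magics (assuming a Databricks Runtime ML cluster where tensorboard is preinstalled; replace the logdir below with the actual path to your tensorboard/ folder):

%load_ext tensorboard
%tensorboard --logdir tensorboard/
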
Hope this helps, Louis.
