Hi support, we’ve been unable to train a new model for a few days now as it always fails with this response: "This training job did not complete successfully. This can happen for a few reasons but often means that the chosen model dimension (which corresponds to image size) was too large to fit into GPU memory."
We’ve tried adjusting the settings as suggested, as well as disabling all augmentations, but have hit the same issue every time. We have not changed the original dataset, and we also tried the exact settings from a version that trained successfully two weeks ago, but that failed as well. At this point we believe the issue may be specific to semantic segmentation.
Our project is stalled due to this, so any help would be really appreciated!
Project Type: Semantic Segmentation
Operating System & Browser: Windows 10 - Chrome
Project Universe Link or Workspace/Project ID: park-path