Some images in the dataset vary more than others: some have many annotations, some have few. When training on the dataset, will all images be weighted equally? Or is it especially important to take care annotating these variant images?
The answer likely depends on the specific model you’re using; different models may use different strategies.
I’ll take “weighting” here to mean “how much a given image moves the weights of the neural network”. The answer is a little complicated: an image’s influence is proportional to its contribution to the loss computed for a batch of images (which is used to calculate the gradient for back-propagation), and that loss can be defined in many different ways.
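To make that concrete, here’s a toy sketch (a generic squared-error model, not YOLOv5 itself) showing that with a mean loss over the batch, the gradient that updates the weights is the average of the per-image gradients, so each image’s pull is proportional to how wrong its own prediction is:

```python
# Toy sketch: the batch gradient is the mean of per-image gradients.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=3)          # toy "weights"
x = rng.normal(size=(4, 3))     # a batch of 4 images (as feature vectors)
y = rng.normal(size=4)          # targets

residual = x @ w - y                        # per-image prediction error
per_image_grad = 2 * residual[:, None] * x  # gradient of each image's squared error
batch_grad = per_image_grad.mean(axis=0)    # gradient of the mean batch loss

# An image with a larger residual contributes a proportionally larger
# gradient; nothing here depends on how many annotations an image has.
```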
For example, YOLOv5’s loss is a combination of a box-regression term (based on the intersection over union between predicted and ground-truth boxes, averaged over boxes) and the binary cross entropy of the objectness and box-classification predictions. I believe this means each image’s contribution to the loss (and therefore to the weight updates) is proportional only to “how right or wrong” the prediction is on that image, not scaled by the number of boxes in the ground truth or prediction.
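A hypothetical simplification of a YOLO-style per-image loss illustrates why box count shouldn’t matter much (the function names and structure here are illustrative assumptions, not YOLOv5’s real API): because the box term is a *mean* over boxes, an image with 50 well-predicted boxes and an image with 2 well-predicted boxes land on the same scale.

```python
# Hypothetical simplification of a YOLO-style loss (illustrative only).
import numpy as np

def bce(pred, target, eps=1e-7):
    """Binary cross entropy, averaged over elements."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(-(target * np.log(pred) + (1 - target) * np.log(1 - pred)).mean())

def image_loss(box_ious, obj_pred, obj_target, cls_pred, cls_target):
    """Per-image loss: mean (1 - IoU) over boxes + objectness BCE + class BCE."""
    box_loss = float((1.0 - np.asarray(box_ious)).mean())  # mean, not sum
    return box_loss + bce(obj_pred, obj_target) + bce(cls_pred, cls_target)

# Two boxes, predicted well:
few = image_loss([0.9, 0.85], np.array([0.9]), np.array([1.0]),
                 np.array([0.95]), np.array([1.0]))
# Fifty boxes, predicted equally well -- same loss despite 25x the boxes:
many = image_loss([0.9, 0.85] * 25, np.array([0.9]), np.array([1.0]),
                  np.array([0.95]), np.array([1.0]))
```

If the box term were a sum instead of a mean, heavily annotated images would dominate the gradient; averaging is what makes each image’s contribution depend only on prediction quality.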