There could be several different reasons, but mainly it's because grouping arbitrary data into some target file size is, well... arbitrary.
Imagine I gave you a large container of sand and some empty buckets, and asked you to move the sand from the container to the buckets, aiming for half-full buckets. As you fill them, you realise you have 3 half-full buckets and only a small portion of sand left. Do you redistribute everything to get maybe 2/5 in four buckets? Or 3/5 in three? Or do you leave a fourth bucket with only a cup in it? Or keep 2 half-full and a third a bit over target?
Now this is a very simplistic example, but imagine there were also types of sand, colours, grain sizes, etc., and I asked you to redistribute the sand in a very specific way depending on its properties. This is no longer a simple task, even for three buckets.
This is basically what file-size optimization (if you include partitioning, optimize, etc.) does. It redistributes everything neatly into buckets, but it's impossible to get the same size in each bucket, because the contents don't divide neatly the way you ask them to. That's why it's a target - it's what the engine aims for, but there will always be outliers.
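If it helps to see the idea in code, here's a minimal sketch of greedy packing toward a target size. This is plain Python, not any engine's actual compaction logic, and `pack_to_target` is just a made-up name for illustration - but it shows why the output files land near the target rather than exactly on it:

```python
def pack_to_target(sizes_mb, target_mb):
    """Greedily group chunk sizes into files aiming for roughly target_mb each."""
    files, current_size = [], 0
    for size in sizes_mb:
        # If adding this chunk would push past the target, close the current file
        if current_size and current_size + size > target_mb:
            files.append(current_size)
            current_size = 0
        current_size += size
    if current_size:
        files.append(current_size)  # leftover file, often well under target
    return files

# Arbitrary chunk sizes that don't divide evenly into 128 MB files
chunks = [40, 70, 25, 90, 60, 15, 80, 30]
print(pack_to_target(chunks, target_mb=128))
# [110, 115, 75, 110] - close to 128 MB, but never exact, with a smaller outlier
```

Real compaction does the same kind of bin-packing at a much larger scale, with partitioning and ordering constraints on top, so the "leftover bucket" problem never fully goes away.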
I hope this analogy helped 🙂