> Open Source isn't even within 50% of what the SOTA models are.
When was the last time you used any of them? A lot of people are actively using them for 9-5 work today, and I count myself in that group. That opinion feels outdated, like it was formed a year or more ago and held onto, or based on highly quantized versions and/or small non-Thinking models.
Do you really think Qwen3.6, to take a specific example, is "50%" as good as Opus4.7? Opus4.7 is clearly and objectively better, no debate on that, but the gap isn't anywhere near that wide. I'd call even "20%" hyperbole; the true difference is difficult to measure exactly, but sub-10% for their top-tier Thinking models is likely.
Their opinion is behind on LibreOffice, too. I won't defend GIMP's monstrosity, but with LibreOffice I finished a whole dissertation, do all my regular spreadsheet work (that isn't done via R), and have created plenty of visual mockups. Plus, I don't have to deal with a spammy Windows environment.
Sure, we use Google Drive, too, but that's just for sharing documents across offices, not for everyday use. For that, the open source model is a clear winner in my book.
Qwen3.6 at which model size and quantization? I already think Opus 4.6 is usable but still dumb as bricks. A 20% cut off that feels like it would still be unusable. And that's not even getting to the annoyance of setting everything up to run locally and getting HW that can run it, which basically means a MacBook M4 these days, as the x86 side is ridiculously pricey to get decent performance out of models.
The largest Qwen model is similar, so I’m not sure what point you’re trying to make. The only ones available are the open-weight ones, which are the smaller variants and nowhere near within 20% of the closed frontier models.
The largest open models are within 20%; they're likely within 10%. Go actually try them and stop making outdated assumptions. You don't need to invest a lot of money either: just pick your favorite vendor and send out a few prompts.