Why would I buy two switches when I can solve the problem in general with a consensus algorithm that handles hosts going away for any reason?
Especially since a flaky switch isn't the only issue. Power loss could bring down even redundant backbone switches, but leave communication within the rack going. You could accidentally push a bad routing config and blackhole traffic to a bunch of hosts. You could send out a bad push that intermittently breaks connectivity to some hosts. And so on.