Add a few branch hints to vp10_optimize_b.

vp10_optimize_b now takes 40% to 60% of the total runtime of the
encoder, depending on bit-rate. It also accounts for 2/3 to 3/4 of the
mispredicted branch instructions in the whole program.

Adding a few branch hints makes vp10_optimize_b around 2-5% faster
(depending on bit-rate) when compiled with gcc/clang.

Change-Id: I1572733e18b4166bc10591b958c5018a9561fa2b
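
For readers unfamiliar with branch hints, here is a minimal sketch of how they are usually written for gcc/clang with __builtin_expect. The LIKELY/UNLIKELY macro names and the count_nonzero function are hypothetical and only illustrate the technique; they are not taken from this change or from vp10_optimize_b.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro names, chosen for illustration only. */
#if defined(__GNUC__) || defined(__clang__)
/* Tell the compiler which outcome of the condition to expect. */
#define LIKELY(x) __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
/* Other compilers: the hints expand to the plain condition. */
#define LIKELY(x) (x)
#define UNLIKELY(x) (x)
#endif

/* Illustrative loop (an assumption, not encoder code): counts nonzero
 * coefficients, hinting that a zero coefficient is the common case. */
int count_nonzero(const int *coeffs, size_t n) {
  int count = 0;
  assert(coeffs != NULL || n == 0);
  for (size_t i = 0; i < n; ++i) {
    if (UNLIKELY(coeffs[i] != 0)) ++count;
  }
  return count;
}
```

A hint only affects how the compiler lays out and schedules the code; the result is the same either way, so a wrong hint costs speed, not correctness.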