    Fix collapse mask tracking for recombine steps.
    The recombine loop for cm was correct only when starting from 1
     block and wrong otherwise (for a test case, convert 2 recombined
     blocks back to 4 with an initial cm of 0x3; the result should be
     0xF, but the old loop gives 0x7).
    The recombine loop for fill was always wrong (for a test case,
     combine 8 blocks down to 1 with an initial fill=0xFE; the low bit
     should end up set, but the old loop leaves it unset).
    This commit now properly interleaves and deinterleaves the mask
     bits for these steps, which avoids declaring collapses (and
     skipping folding) where none actually occurred.
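
    The two test cases above can be reproduced with a minimal sketch.
     It assumes that bit i of a mask describes block i and that one
     recombine step merges (or splits) adjacent pairs of bits. The
     helper names are hypothetical and are not taken from the patch,
     which may implement the same mapping differently.

```c
#include <assert.h>

/* Hypothetical helper: one recombine step halves the block count, so
   bit i of the result is set if either of the two source bits
   (2*i and 2*i+1) was set. */
static unsigned interleave_mask(unsigned mask, int nblocks)
{
   unsigned out = 0;
   int i;
   for (i = 0; i < nblocks/2; i++)
      if ((mask >> (2*i)) & 3)
         out |= 1U << i;
   return out;
}

/* Hypothetical helper: undoing one recombine step doubles the block
   count, so each set bit i is copied to bits 2*i and 2*i+1. */
static unsigned deinterleave_mask(unsigned mask, int nblocks)
{
   unsigned out = 0;
   int i;
   for (i = 0; i < nblocks; i++)
      if ((mask >> i) & 1)
         out |= 3U << (2*i);
   return out;
}

/* Replays the two test cases described in the commit message. */
static void check_commit_examples(void)
{
   unsigned fill = 0xFE;
   int blocks;
   /* cm: 2 recombined blocks back to 4, starting from 0x3. */
   assert(deinterleave_mask(0x3, 2) == 0xF);
   /* fill: 8 blocks down to 1, starting from 0xFE. */
   for (blocks = 8; blocks > 1; blocks >>= 1)
      fill = interleave_mask(fill, blocks);
   assert(fill & 1);
}
```

    Calling check_commit_examples() exercises both cases: under these
     assumptions the cm mask expands to 0xF and the low bit of fill
     ends up set.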