1. 06 Nov, 2015 17 commits
  2. 05 Nov, 2015 3 commits
    • James Zern's avatar
      vp9_spatial_svc_encoder.sh: fix command line param · 892130f7
      James Zern authored
      -l -> -sl, renamed in:
      be3b08da [svc] Temporal svc with two pass rate control
      Change-Id: I5a7b179b33d94e20e54825090659156dece928c0
    • Geza Lore's avatar
      Add AVX vectorized vp9_diamond_search_sad · f1342a7b
      Geza Lore authored
      This function now has an AVX intrinsics version which is about 80%
      faster compared to the C implementation. This provides a 2-4% total
      speed-up for encode, depending on encoding parameters. The function
      utilizes 3 properties of the cost function lookup table, constructed
      in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'.
      For the joint cost:
        - mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3]
      For the component costs:
        - For all i: mvsadcost[0][i] == mvsadcost[1][i]
              (equal per component cost)
        - For all i: mvsadcost[0][i] == mvsadcost[0][-i]
              (Cost function is even)
      These must hold, otherwise the AVX version of the function cannot be used.
      Change-Id: I184055b864c5a2dc37b2d8c5c9012eb801e9daf6
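The three table properties listed in this commit message can be checked mechanically. Below is a minimal sketch in C, assuming the joint table has four entries and each component table is addressed through a pointer to its centre, so that negative indices are valid. The function name `sad_cost_tables_allow_avx` and the `max_mv` parameter are hypothetical and not part of libvpx.

```c
#include <stdbool.h>

/* Hypothetical helper (not a libvpx function): returns true when the cost
 * tables satisfy the three properties named in the commit message.
 *   joint_cost   - 4 entries, one per motion-vector joint type.
 *   comp_cost[c] - points at the centre of a table covering
 *                  [-max_mv, max_mv], so negative indices are valid. */
bool sad_cost_tables_allow_avx(const int joint_cost[4],
                               const int *const comp_cost[2], int max_mv) {
  /* Joint cost: entries 1, 2 and 3 must be identical. */
  if (joint_cost[1] != joint_cost[2] || joint_cost[2] != joint_cost[3])
    return false;

  for (int i = -max_mv; i <= max_mv; ++i) {
    /* Equal per-component cost. */
    if (comp_cost[0][i] != comp_cost[1][i]) return false;
    /* The cost function is even. */
    if (comp_cost[0][i] != comp_cost[0][-i]) return false;
  }
  return true;
}
```

A caller could run such a check once after the tables are built and fall back to the C implementation when it returns false.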
  3. 04 Nov, 2015 20 commits
    • Angie Chiang's avatar
      Add vp10_inv_txfm1d_test · 444acd77
      Angie Chiang authored
      Change-Id: I3b76c0146af7f191cdae31d2b53ab6d51ac791a4
    • Angie Chiang's avatar
      Add iadst32 · b0df5e0f
      Angie Chiang authored
      Change-Id: I3a53ee51146d0bd4b0fe4b27c286e8c921f9823b
    • Angie Chiang's avatar
      Add iadst16 · 35486a6b
      Angie Chiang authored
      Change-Id: I093881aacaf9a070f78cc4eea2e8a6ede8a71792
    • Angie Chiang's avatar
      Add iadst8 · 0ca0cc24
      Angie Chiang authored
      Change-Id: Ia58e4735d7d7bfd2ac55259c32705118c6745c6d
    • Angie Chiang's avatar
      Add iadst4 · ba69089e
      Angie Chiang authored
      Change-Id: Ie419b2b1e939a41c30ed609e1ba46f5f6609b2a5
    • Angie Chiang's avatar
      Add idct32 · 74678334
      Angie Chiang authored
      Change-Id: I75412bdc4bd0d9c90e8b56e02e0e467a2d9957f9
    • Angie Chiang's avatar
      Add idct16 · d3cee565
      Angie Chiang authored
      Change-Id: I8e5ba3a3f9b64ccbf038e371525e897774729b06
    • Angie Chiang's avatar
      Add idct8 · bd9db2f5
      Angie Chiang authored
      Change-Id: I8092a6f229b196c5c8b7dcd2dff8aaf68253e422
    • Angie Chiang's avatar
      Add idct4 · 7d2b7b69
      Angie Chiang authored
      Change-Id: I1d1b6822452772cec95160491c7bc6d3bba1f5c2
    • Angie Chiang's avatar
      Add vp10_fwd_txfm1d_test · b934148f
      Angie Chiang authored
      Change-Id: If3bef2be355227cfc2932e4471b84c21c7cd2b90
    • Angie Chiang's avatar
      Add fadst32 · a9253a20
      Angie Chiang authored
      Change-Id: I77299f0e39fc7cef91e7e420513dbd05194f320a
    • Angie Chiang's avatar
      Add fadst16 · a7d26f4e
      Angie Chiang authored
      Change-Id: I5175e39b5df73646488f74b2a9e4a463ae79d91a
    • Angie Chiang's avatar
      Merge "Add fadst8" into nextgenv2 · 3813c2bc
      Angie Chiang authored
    • Angie Chiang's avatar
      Merge "Add fadst4" into nextgenv2 · 498866b6
      Angie Chiang authored
    • Jingning Han's avatar
      Simplify txfm rate-distortion optimization · 493d0234
      Jingning Han authored
      This commit refactors the rate-distortion optimization scheme for
      transform block coding. When both ext-tx and var-tx experiments
      are turned on, the encoding time for bus_cif at 1000 kbps goes down
      from 706377 ms to 666503 ms (5.6% speed-up). The coding statistics
      remain unchanged.
      Change-Id: I20835db573725580aad79c16220f799ce01f2093
    • Geza Lore's avatar
      Flip the result of the inverse transform for FLIPADST. · 4f510809
      Geza Lore authored
      When using FLIPADST, the vp10_inv_txfm_add functions used to flip
      the destination array, add the result of the inverse transform to it,
      and then flip the destination back. This has been replaced by
      flipping the result of the inverse transform before adding it to the
      destination. Up-down flipping is done by negating the destination
      stride and starting from the bottom, so it should now be free.
      Left-right flipping is done with the usual SSE2 instructions in the
      optimized code.
      The C functions match the SSE2 functions as expected, so the C functions
      now do the flipping as well when required. Adding this cleanly required
      some refactoring of the C functions, but there is no measurable
      performance impact when ext-tx is not enabled.
      Encode speedup with ext-tx enabled is about 3%.
      Change-Id: I5b04e5d720f0b9f0d54fd8607a8764f2314c7234
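The stride trick described in this commit message can be sketched in plain C. This is an illustration under stated assumptions, not the libvpx code: the names `add_residual_flipped` and `clamp_u8` are hypothetical, the residual is assumed to be a row-major n x n block, and the left-right flip is done with index arithmetic rather than SSE2 shuffles.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an intermediate sum to the 8-bit pixel range. */
static uint8_t clamp_u8(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical helper: adds an n x n residual (row-major) to dest,
 * flipping the residual instead of the destination. */
void add_residual_flipped(const int16_t *residual, int n, uint8_t *dest,
                          ptrdiff_t stride, int flip_ud, int flip_lr) {
  if (flip_ud) {
    dest += (n - 1) * stride; /* start at the bottom row ...            */
    stride = -stride;         /* ... and walk upwards with no extra cost */
  }
  for (int r = 0; r < n; ++r) {
    for (int c = 0; c < n; ++c) {
      /* Left-right flip: mirror the destination column index. */
      const int dc = flip_lr ? n - 1 - c : c;
      dest[dc] = clamp_u8(dest[dc] + residual[r * n + c]);
    }
    dest += stride;
  }
}
```

Because the up-down flip only changes the starting pointer and the sign of the stride, the inner loop is identical in the flipped and unflipped cases, which is why the message expects it to be free.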
    • Yaowu Xu's avatar
      Merge branch 'master' into nextgenv2 · 4aafd018
      Yaowu Xu authored
    • hui su's avatar
      ext-intra experiment · be3559ba
      hui su authored
      Currently there are two parts in this experiment: extra directional intra
      prediction modes and the filter intra modes migrated from the nextgen branch.
      Several macros are defined in "blockd.h" to provide controls of the experiment
      settings. Setting "DR_ONLY" as 1 (default is 0) means we only use directional
      modes, and skip the filter-intra modes; "EXT_INTRA_ANGLES" (default is 128)
      defines the number of different angles we want to support; setting
      "ANGLE_FAST_SEARCH" as 1 (default is 1) means we use fast sub-optimal search
      for the best prediction angle, instead of exhaustive search. The fast search
      is about 6 times faster than the exhaustive search, while preserving about
      60% of the coding gains.
      With extra directional prediction modes (fast search), we observe the following
      coding gains (numbers in parentheses are for the all-key-frame setting):
      derflr +0.42%  (+1.79%)
      hevclr +0.78%  (+2.19%)
      hevcmr +1.20%  (+3.49%)
      stdhd  +0.56%
      Speed-wise, encoding is about 110% slower for key frames and 30% slower overall.
      The gains of filter intra modes mostly add up with the gains of directional
      modes. The overall coding gain of this experiment:
      derflr +0.94%
      hevclr +1.46%
      hevcmr +1.94%
      stdhd  +1.58%
      Change-Id: Ida9ad00cdb33aff422d06eb42b4f4e5f25df8a2a