  1. Mar 14, 2016
  2. Mar 08, 2016
    • Yi Luo's avatar
      Implemented DST 16x16 SSE2 intrinsics optimization · 50a164a1
      Yi Luo authored
      - Implemented fdst16_sse2(), fdst16_8col() against C version: fdst16().
      - Turned on 7 DST related hybrid txfm types in vp10_fht16x16_sse2().
      - Replaced vp10_fht16x16_c() with vp10_fht16x16_sse2() in
        fwd_txfm_16x16().
      - Added vp10_fht16x16_sse2() unit test against C version:
        vp10_fht16x16_c() (--gtest_filter=*VP10Trans16x16*).
      - Unit test passed.
      - Speed improvement: 2.4%, 3.2%, 3.2%, for city_cif.y4m, garden_sif.y4m,
        and mobile_cif.y4m.
      
      Change-Id: Ib30a67ce5d5964bef143d588d0f8fa438be8901f
      50a164a1
  3. Mar 07, 2016
    • Yi Luo's avatar
      Added vp10_fht8x8_sse2() unit test · 6ab06212
      Yi Luo authored
      - Inherited base class TransformTestBase to derived class VP10Trans8x8HT.
      - Employed RunCoeffCheck() to test vp10_fht8x8_sse2() against C reference
        function vp10_fht8x8_c().
      - fdst8_sse2() related seven hybrid transform cases are covered in this
        test.
      - Test passed (4 test cases w/o EXT_TX; 16 test cases with EXT_TX).
      
      Change-Id: Id9a9b308c707164a120d9ceb2c30e572026fb1d0
      6ab06212
    • Geza Lore's avatar
      Extend convolution functions to 128x128 for ext-partition. · 938b8dfc
      Geza Lore authored
      Change-Id: I7f7e26cd1d58eb38417200550c6fbf4108c9f942
      938b8dfc
  4. Mar 04, 2016
    • Yi Luo's avatar
      Added vp10_fht4x4_sse2() unit test · 267f73a1
      Yi Luo authored
      Inherited class TransformTestBase to derived class VP10Trans4x4HT.
      Employed RunCoeffCheck() to test vp10_fht4x4_sse2() against
      C reference vp10_fht4x4_c().
      fdst4_sse2() related seven hybrid transform cases are covered
      in this test.
      Wrote a header file for test base class. Some modification to
      make sure the base class can be used for 8x8, 16x16, 32x32 cases.
      All related tests passed.
      
      Change-Id: I6b19a39d3ea30b657847781e78e73b829998a57a
      267f73a1
  5. Mar 03, 2016
    • Geza Lore's avatar
      Add 128 pixel variance and SAD functions · 697bf5be
      Geza Lore authored
      Change-Id: I8fde245b32c9e586683a28aa6925da0b83850b39
      697bf5be
    • Aℓex Converse's avatar
      ANS: Switch from PDFs to CDFs. · 6bbbe316
      Aℓex Converse authored
      Make the RANS implementation operate on cumulative distribution
      functions rather than individual probability distribution functions.
      CDFs have shown themselves more flexible to work with.
      
      Reduces decoding memory usage from scaling O(num_distributions *
      symbol_resolution) to O(num_distributions).
      
      No bitstream change. This is purely an implementation change.
      
      Change-Id: I4e18d3a0a3d37a36a61487c3d778f9d088b0b374
      6bbbe316
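The PDF-to-CDF switch described in this commit can be illustrated with a minimal sketch in C. This is not the libvpx implementation: the function names `pdf_to_cdf` and `cdf_find_symbol` and the example distribution are hypothetical. It only shows the general idea of storing cumulative frequencies and locating a symbol by searching the cumulative table, so that storage per distribution depends on the number of symbols rather than on the probability resolution.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build a cumulative table from per-symbol frequencies.
 * cdf[i] is the sum of pdf[0..i-1], so cdf[0] == 0 and cdf[n] is the total. */
static void pdf_to_cdf(const uint16_t *pdf, int n, uint16_t *cdf) {
  int i;
  cdf[0] = 0;
  for (i = 0; i < n; ++i) cdf[i + 1] = (uint16_t)(cdf[i] + pdf[i]);
}

/* Hypothetical helper: find the symbol s with cdf[s] <= value < cdf[s + 1].
 * A linear search is used here for clarity. */
static int cdf_find_symbol(const uint16_t *cdf, int n, uint16_t value) {
  int s = 0;
  assert(value < cdf[n]);
  while (s + 1 < n && value >= cdf[s + 1]) ++s;
  return s;
}
```

With frequencies {128, 64, 32, 32}, the cumulative table is {0, 128, 192, 224, 256}, and any value in [128, 192) maps to symbol 1.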
  6. Mar 02, 2016
  7. Feb 26, 2016
  8. Feb 25, 2016
    • Angie Chiang's avatar
      convolve8 sse2 test · 8878fa4f
      Angie Chiang authored
      This experiment shows that when the frame size is 64x64, the speeds of
      vpx_highbd_convolve8_sse2 and vpx_convolve8_sse2 are similar.
      However, when the frame size becomes 1024x1024,
      vpx_highbd_convolve8_sse2 is roughly 40% to 50% slower than
      vpx_convolve8_sse2. We think the bottleneck is memory I/O.
      
      VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_64
      VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_64 (17 ms)
      VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_64
      VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_64 (42 ms)
      VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_64
      VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_64 (139 ms)
      VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_64
      VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_64 (499 ms)
      
      VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_64
      VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_64 (16 ms)
      VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_64
      VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_64 (40 ms)
      VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_64
      VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_64 (130 ms)
      VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_64
      VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_64 (485 ms)
      
      VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_1024
      VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_1024 (32 ms)
      VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_1024
      VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_1024 (61 ms)
      VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_1024
      VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_1024 (196 ms)
      VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_1024
      VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_1024 (694 ms)
      
      VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_1024
      VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_1024 (21 ms)
      VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_1024
      VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_1024 (44 ms)
      VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_1024
      VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_1024 (138 ms)
      VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_1024
      VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_1024 (491 ms)
      
      Change-Id: I3131a031e0380e8eae748cfcccc6cbb961d05943
      8878fa4f
  9. Feb 24, 2016
  10. Feb 23, 2016
  11. Feb 22, 2016
    • Yaowu Xu's avatar
      Cleanup psnr.h · 38cfc45e
      Yaowu Xu authored
      Change-Id: Id026e72ee655ee5bd645a89e378da0d462be367d
      38cfc45e
    • Yaowu Xu's avatar
      Add shift stage in FASTSSIM computation · d1c5cd4a
      Yaowu Xu authored
      This commit adds a shift stage for FASTSSIM computation when the source
      bit depth is different from the working bit depth, to make sure metric
      results are calculated at a bit depth consistent with the source.
      
      Change-Id: I997799634076ef7b00fd051710544681ed536185
      d1c5cd4a
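A shift stage of the kind this commit describes might look like the following sketch. It assumes samples are stored at the working bit depth and are shifted right by the difference between working and source bit depth before the metric is computed. The function name `shift_to_source_depth` and its parameters are hypothetical and are not taken from the FASTSSIM code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert n samples from the working bit depth to the
 * source bit depth by a right shift. Assumes working_bd >= source_bd. */
static void shift_to_source_depth(const uint16_t *in, uint16_t *out, int n,
                                  int working_bd, int source_bd) {
  const int shift = working_bd - source_bd;
  int i;
  assert(shift >= 0);
  for (i = 0; i < n; ++i) out[i] = (uint16_t)(in[i] >> shift);
}
```

For example, with a 12-bit working depth and an 8-bit source, a stored value of 4080 becomes 255, the 8-bit maximum.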
    • Yaowu Xu's avatar
      Move psnrhvs function declaration to psnr.h · 6e695da2
      Yaowu Xu authored
      From "ssim.h"
      
      Change-Id: Ie53378794149ef8a844b4eb47ad4f08579de4b60
      6e695da2
  12. Feb 21, 2016
    • Yaowu Xu's avatar
      Extend HBDMetricTest · f6a7b17a
      Yaowu Xu authored
      This commit extends the HBDMetricTests to handle testing for metric
      computation where input source depth is different from working bit
      depth.
      
      Change-Id: I5d11101cc9603a3fd09e8439816bb982a0f1b654
      f6a7b17a
  13. Feb 20, 2016
    • Angie Chiang's avatar
      Fix 12 TAP convolution bug · 1e403064
      Angie Chiang authored
      Previously, we did 12-tap interpolation even when there was no sub-pixel
      offset. This could cause a bug because the decoder doesn't extend the
      border when there is no sub-pixel offset. In this situation, if we still
      interpolate, we access a border extension that doesn't exist and
      cause a memory error.
      
      Change-Id: I55b879722f0a10c5d13261bd9617a75c826a2418
      1e403064
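The failure mode described in this commit can be sketched with a simplified 1-D predictor. This is not the VP10 code: `predict_row`, its parameters, and the 7-bit kernel precision are assumptions made for illustration. The point is that the filter path reads samples outside the row (5 before and 6 after for 12 taps), so it must be skipped when there is no sub-pixel offset and no border has been extended.

```c
#include <stdint.h>
#include <string.h>

#define TAPS 12

/* Hypothetical 1-D predictor.
 * subpel == 0: plain copy, reads only src[0..w-1].
 * subpel != 0: 12-tap filter, reads src[-5..w+5], so the caller must have
 * extended the border. kernel has TAPS entries that sum to 128. */
static void predict_row(const uint8_t *src, uint8_t *dst, int w, int subpel,
                        const int16_t *kernel) {
  int x, k;
  if (subpel == 0) {
    memcpy(dst, src, (size_t)w); /* no interpolation, no border access */
    return;
  }
  for (x = 0; x < w; ++x) {
    int sum = 64; /* rounding offset for the 7-bit shift below */
    for (k = 0; k < TAPS; ++k) sum += src[x + k - (TAPS / 2 - 1)] * kernel[k];
    if (sum < 0) sum = 0; /* clamp before shifting to avoid negative shifts */
    sum >>= 7;
    dst[x] = (uint8_t)(sum > 255 ? 255 : sum);
  }
}
```

Calling this with `subpel == 0` on a buffer that has no padding is safe, whereas the filter path on the same buffer would read out of bounds.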
  14. Feb 17, 2016
  15. Feb 16, 2016
  16. Feb 15, 2016
    • Geza Lore's avatar
      Add optimized vpx_sum_squares_2d_i16 for vp10. · abd00505
      Geza Lore authored
      Using this, we can eliminate a large number of calls to predict intra,
      and it is also faster than most of the variance functions it replaces.
      This is an equivalence transform, so coding performance is unaffected.
      
      Encoder speedup is approx 7% when var_tx, super_tx and ext_tx are all
      enabled.
      
      Change-Id: I0d4c83afc4a97a1826f3abd864bd68e41bb504fb
      abd00505
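As a rough reference for what a 2-D sum of squares over 16-bit samples computes, here is a plain C sketch. It assumes, from the function name only, that the operation sums the squared values of a square block of `int16_t` samples. The name `sum_squares_2d_i16_ref` and its parameter list are hypothetical and may differ from the actual vpx_sum_squares_2d_i16 signature.

```c
#include <stdint.h>

/* Hypothetical reference: sum of squared samples over a size x size block.
 * stride is the distance, in samples, between the starts of two rows. */
static uint64_t sum_squares_2d_i16_ref(const int16_t *src, int stride,
                                       int size) {
  uint64_t ss = 0;
  int r, c;
  for (r = 0; r < size; ++r) {
    for (c = 0; c < size; ++c) {
      const int32_t v = src[r * stride + c];
      ss += (uint64_t)(v * v); /* v * v is at most 2^30, fits in int32_t */
    }
  }
  return ss;
}
```

For a 2x2 block holding 1, -2, 3, 4, the result is 1 + 4 + 9 + 16 = 30, regardless of any padding samples between rows.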
  17. Feb 12, 2016
  18. Feb 11, 2016
    • Yaowu Xu's avatar
      Enable computing PSNRHVS for hbd build · bb8ca088
      Yaowu Xu authored
      This commit adds computation of PSNRHVS for the high bit depth build. It
      also adds tests to make sure the calculation of the PSNRHVS metric for
      10 and 12 bit is correct.
      
      Change-Id: Iac8a8073d2b3e3ba5d368829d770793212fa63b6
      bb8ca088
    • Marco Paniconi's avatar
      vp9-resize: Force reference masking off for external dynamic-resizing. · 34d12d11
      Marco Paniconi authored
      An issue exists with reference_masking in non-rd pickmode for spatial
      scaling. It was kept off for internal dynamic resizing and svc; this
      change keeps it off for external dynamic resizing as well.
      
      Update to external resize test, and update TODO to re-enable this
      at frame level when references have same scale as source.
      
      Change-Id: If880a643572127def703ee5b2d16fd41bdbf256c
      34d12d11