1. 12 May, 2014 1 commit
  2. 06 May, 2014 2 commits
    • Revert "VP8 for ARMv8 by using NEON intrinsics 10" · 677fb512
      Johann authored
      This reverts commit c500fc22
      
      There is an issue with gcc 4.6 in the Android NDK:
      loopfiltersimpleverticaledge_neon.c: In function 'vp8_loop_filter_bvs_neon':
      loopfiltersimpleverticaledge_neon.c:176:1: error: insn does not satisfy its constraints:
      
      Change-Id: I95b6509d12f075890308914cc691b813d2e5cd9f
    • Revert "VP8 for ARMv8 by using NEON intrinsics 08" · 928ff038
      Johann authored
      This reverts commit a5d79f43
      
      There is an issue with gcc 4.6 in the Android NDK:
      loopfilter_neon.c: In function 'vp8_loop_filter_vertical_edge_y_neon':
      loopfilter_neon.c:394:1: error: insn does not satisfy its constraints:
      
      Change-Id: I2b8c6ee3fa595c152ac3a5c08dd79bd9770c7b52
  3. 04 May, 2014 4 commits
  4. 03 May, 2014 4 commits
  5. 02 May, 2014 3 commits
    • VP8 for ARMv8 by using NEON intrinsics 08 · a5d79f43
      James Yu authored
      Add loopfilter_neon.c
      - vp8_loop_filter_horizontal_edge_y_neon
      - vp8_loop_filter_horizontal_edge_uv_neon
      - vp8_loop_filter_vertical_edge_y_neon
      - vp8_loop_filter_vertical_edge_uv_neon
      
      Change-Id: I50b57dedabd42d2a3c183c1738cc5346f0e71ed8
      Signed-off-by: James Yu <james.yu@linaro.org>
    • VP8 for ARMv8 by using NEON intrinsics 07 · 930557be
      James Yu authored
      Add iwalsh_neon.c
      - vp8_short_inv_walsh4x4_neon
      
      Change-Id: I8beda6ce11ad8ce9e80cc0a38d40161938359162
      Signed-off-by: James Yu <james.yu@linaro.org>
    • VP8 for ARMv8 by using NEON intrinsics 06 · 81ad047e
      James Yu authored
      Add idct_dequant_full_2x_neon.c
      - idct_dequant_full_2x_neon
      
      ==== Summary of applying the VP8 decode patch series ====
      Benchmark on Samsung Chromebook, Cortex-A15, 1.7GHz, Dual core
      Toolchain: linaro-1.13.1-4.8-2014.01
      Compile argument: CROSS=arm-linux-gnueabihf- ../libvpx/configure
                           --target=armv7-linux-gcc --prefix=$HOME/out
                           --enable-shared --cpu=cortex-a7
      Test argument: vpxdec --summary --noblit ./tears_of_steel_1080p.webm
      
      NEON assembly   46.68 (fps)
      Apply patch 06  46.65, -0.03
      Apply patch 07  46.86, +0.21
      Apply patch 08  46.58, -0.28
      Apply patch 09  46.57, -0.01
      Apply patch 10  46.51, -0.06
      Apply patch 11  46.13, -0.38
      Apply patch 12  45.42, -0.71
      Apply patch 13  46.06, +0.64
      Apply patch 14  45.19, -0.87
      Apply patch 15  45.93, +0.74
      Apply patch 16  45.48, -0.45
      Apply patch 17  45.84, +0.36
      Apply patch 18  45.91, +0.07  <= With all NEON intrinsics patches
                       Total -0.77 fps, 1.65% performance regression
      
      Change-Id: I77bfc9eaccfb97b8d401e949ceff8795e26ca6b7
      Signed-off-by: James Yu <james.yu@linaro.org>
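
The totals in the benchmark summary of commit 81ad047e can be re-derived from the per-patch figures. The following is only an arithmetic sketch: the array holds the fps values quoted in that commit message, and the names `kFps`, `total_delta_fps`, and `regression_percent` are hypothetical helpers, not part of libvpx.

```c
#include <assert.h>
#include <stddef.h>

/* Frames per second quoted in the commit message: the NEON assembly
 * baseline first, then the result after applying each of patches 06..18. */
static const double kFps[] = {
  46.68, 46.65, 46.86, 46.58, 46.57, 46.51, 46.13,
  45.42, 46.06, 45.19, 45.93, 45.48, 45.84, 45.91
};
static const size_t kNumFps = sizeof(kFps) / sizeof(kFps[0]);

/* Net change in fps between the first (baseline) and last measurement.
 * A negative value means the final build is slower. */
static double total_delta_fps(const double *fps, size_t n) {
  assert(n >= 2);
  return fps[n - 1] - fps[0];
}

/* Slowdown expressed as a percentage of the baseline (positive = slower). */
static double regression_percent(const double *fps, size_t n) {
  assert(n >= 2 && fps[0] > 0.0);
  return (fps[0] - fps[n - 1]) / fps[0] * 100.0;
}
```

With the quoted figures, the net change is about -0.77 fps and the slowdown is about 1.65% of the 46.68 fps baseline, which matches the "Total" line in the commit message.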
  6. 29 Apr, 2014 1 commit
    • Remove VP8 save_reg_neon function · 096eaba7
      Yunqing Wang authored
      This patch is a cleanup following the commit "Save NEON registers
      in VP8 NEON functions". The pushing/popping of callee-saved NEON
      registers was moved into the individual NEON functions, so those
      registers no longer need to be saved at the beginning of the codec.
      The related code was removed.
      
      Change-Id: I5648166514fc9beffb780aa138495597731f49ea
  7. 03 Mar, 2014 1 commit
    • build: convert rtcd.sh to perl · 805078a1
      James Zern authored
      significantly speeds up file generation.
      
      the goal of this change is to convert rtcd.sh to perl as directly as
      possible to allow for simple comparison. future changes can make it more
      perl-like.
      
      ---
      Linux
          [CREATE] vpx_scale_rtcd.h
      real    0m0.485s ->    0m0.022s
          [CREATE] vp8_rtcd.h
      real    0m4.619s ->    0m0.060s
          [CREATE] vp9_rtcd.h
      real    0m10.102s ->    0m0.087s
      
      Windows
          [CREATE] vpx_scale_rtcd.h
      real    0m8.360s ->    0m0.080s
          [CREATE] vp8_rtcd.h
      real    1m8.083s ->    0m0.160s
          [CREATE] vp9_rtcd.h
      real    2m6.489s ->    0m0.233s
      
      Change-Id: Idfb71188206c91237d6a3c3a81dfe00d103f11ee
  8. 26 Feb, 2014 3 commits
  9. 23 Feb, 2014 1 commit
    • VP8 for ARMv8 by using NEON intrinsics 02 · 300a3bfc
      James Yu authored
      Add copymem_neon.c
      - vp8_copy_mem16x16_neon
      - vp8_copy_mem8x8_neon
      - vp8_copy_mem8x4_neon
      
      vpxdec  --summary --noblit ../videos/tears_of_steel_1080p.webm
      Before => After, 13.25 => 13.25 (fps)
      
      Change-Id: Ib956b5a20522ff57dc8a580bf0aef7b252bddba6
      Signed-off-by: James Yu <james.yu@linaro.org>
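
To illustrate what a block copy written with NEON intrinsics can look like, here is a minimal sketch. It is not the code added in copymem_neon.c: the function name `copy_mem16x16_sketch`, its parameter list, and the `SKETCH_HAVE_NEON` macro are assumptions made for this example. It uses the standard `vld1q_u8`/`vst1q_u8` intrinsics from `<arm_neon.h>` when NEON is available and falls back to `memcpy` otherwise, so it also builds on non-ARM hosts.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define SKETCH_HAVE_NEON 1
#else
#define SKETCH_HAVE_NEON 0
#endif

/* Copies a 16x16 block of bytes from src to dst. Each buffer has its own
 * stride, i.e. the distance in bytes between the starts of two rows. */
static void copy_mem16x16_sketch(const uint8_t *src, int src_stride,
                                 uint8_t *dst, int dst_stride) {
  assert(src != NULL && dst != NULL);
  for (int row = 0; row < 16; ++row) {
#if SKETCH_HAVE_NEON
    /* One 128-bit load and one 128-bit store move a full 16-byte row. */
    vst1q_u8(dst, vld1q_u8(src));
#else
    /* Portable fallback with the same behavior. */
    memcpy(dst, src, 16);
#endif
    src += src_stride;
    dst += dst_stride;
  }
}
```

A copy like this is typically limited by memory bandwidth rather than by instruction selection, which would be consistent with the unchanged 13.25 fps reported in the commit message.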
  10. 10 Jan, 2014 1 commit
    • Apply neon flags to intrinsic files · dadf3505
      Johann authored
      Filter out files ending in _neon.c and append .neon so the Android build
      system knows to apply -mfpu=neon
      
      Change-Id: Ib67277e5920bfcaeda7c4aa16cd1001b11d59305
  11. 09 Jan, 2014 1 commit
  12. 09 Jul, 2013 1 commit
  13. 02 Mar, 2013 1 commit
  14. 15 Nov, 2012 1 commit
  15. 01 Nov, 2012 2 commits
  16. 30 Oct, 2012 1 commit
  17. 26 Oct, 2012 1 commit
    • Faster 8t filtering · ce811f87
      Scott LaVarnway authored
      Quickly modified the ssse3 sixtap filters to support eight taps.  For the test
      clip used, a 23+% boost in decoder performance was seen.  We can
      revisit later and improve further.
      
      Change-Id: I5f59860459e80d6fa23e6cc0fd91296a969f5240
  18. 25 Oct, 2012 1 commit
  19. 22 Oct, 2012 1 commit
  20. 19 Oct, 2012 1 commit
    • sse2 intrinsic version of vp8_mbloop_filter_vertical_edge() · 085433c2
      Scott LaVarnway authored
      First sse2 version of vp8_mbloop_filter_vertical_edge().  For now,
      intrinsics are being used until the bitstream is finalized.  This function
      will be revisited later for further performance improvements.
      
      For the test clip used, a 34+% decoder performance improvement
      was seen.  This will vary depending on material.
      
      Change-Id: I455b438bc8d8af76cf7533ac42eda5f689b21f7c
  21. 18 Oct, 2012 1 commit
    • sse2 intrinsic version of vp8_mbloop_filter_horizontal_edge() · 992b5e2d
      Scott LaVarnway authored
      First sse2 version of vp8_mbloop_filter_horizontal_edge().  For now,
      intrinsics are being used until the bitstream is finalized.  This function
      will be revisited later for further performance improvements.
      For the test clip used, a 31+% decoder performance improvement
      was seen.  This will vary depending on material.
      
      Change-Id: I03ed3a7182478bdd1f094644ff3e0442625600e7
  22. 17 Oct, 2012 1 commit
  23. 24 Aug, 2012 1 commit
    • New Motion Reference Search · 2d60bee1
      Paul Wilkins authored
      Alternative strategy for finding a list of candidate motion
      vectors to use as reference values in mv coding and as
      nearest and near.
      
      Sort by sad in vp8_find_best_ref_mvs() rather than just
      pick the best. Allow 0,0 as a best ref option but not a
      nearest or near unless there are no alternatives.
      
      Encode/Decode verified on at least some clips.
      
      Some commented out experimental and stats code still in place.
      
      Gain over existing code averages about 1% on derf (all metrics)
      with improvement on all clips. Other test results pending.
      
      The entropy coding of the mode (nearest/near etc) still
      depends upon and requires the old "findnear" code so
      this needs looking at and may provide room for further gains.
      
      Change-Id: I871d7cba1d1c379c4bad9bcccce1fb19c46b8247
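
The selection rule described in commit 2d60bee1 (sort candidates by SAD, allow 0,0 as the best reference, but skip it for nearest/near unless nothing else exists) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in vp8_find_best_ref_mvs(): the `MvCandidate` type and the helpers `compare_by_sad`, `sort_candidates_by_sad`, and `pick_nearest_index` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical candidate record: a motion vector plus the sum of absolute
 * differences (SAD) measured when it is used as a reference. */
typedef struct {
  int row;
  int col;
  unsigned int sad;
} MvCandidate;

/* qsort comparator: ascending SAD, so the best match comes first. */
static int compare_by_sad(const void *a, const void *b) {
  const MvCandidate *x = (const MvCandidate *)a;
  const MvCandidate *y = (const MvCandidate *)b;
  return (x->sad > y->sad) - (x->sad < y->sad);
}

/* Orders the whole candidate list by SAD instead of keeping only the best. */
static void sort_candidates_by_sad(MvCandidate *c, size_t n) {
  assert(c != NULL || n == 0);
  if (n > 1) qsort(c, n, sizeof(*c), compare_by_sad);
}

/* Index 0 of the sorted list is the best reference and may be 0,0.
 * For "nearest", return the first non-zero vector; fall back to index 0
 * only when every candidate is 0,0. */
static size_t pick_nearest_index(const MvCandidate *c, size_t n) {
  assert(c != NULL && n > 0);
  for (size_t i = 0; i < n; ++i) {
    if (c[i].row != 0 || c[i].col != 0) return i;
  }
  return 0;
}
```

For example, with candidates (0,0) at SAD 10, (2,-1) at SAD 30, and (4,4) at SAD 20, the sorted order is (0,0), (4,4), (2,-1). The best reference is (0,0), while the nearest pick skips it and selects (4,4).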
  24. 21 Aug, 2012 2 commits
  25. 16 Aug, 2012 1 commit
  26. 08 Aug, 2012 2 commits