- 27 Feb, 2014 1 commit
-
-
hkuang authored
Change-Id: Ie21b5ae89100389b80f919710839084f935a8545
-
- 15 Feb, 2014 1 commit
-
-
James Yu authored
Change-Id: Ifabb8c7ec0c327fea9d6739cab10addb060ff435 Signed-off-by: James Yu <james.yu@linaro.org>
-
- 13 Feb, 2014 2 commits
-
-
Frank Galligan authored
The current code removed the check to only perform the filter8. Change-Id: Ie54e19a77745042a5660eab986d9ef1c42e82410
-
Andrew Russell authored
Change-Id: Ia91c6c406273345b08505097ffe1af3896980f06
-
- 12 Feb, 2014 1 commit
-
-
James Yu authored
Change-Id: I1fabad59747eb5f68c64275a36c3a1d94daf32a3 Signed-off-by: James Yu <james.yu@linaro.org>
-
- 05 Feb, 2014 2 commits
-
-
Martin Storsjö authored
This isn't strictly necessary, but makes the file more consistent with the other arm assembly source files. Change-Id: I245c9677d89e0ab3f31991e473764858af35b180
-
Martin Storsjö authored
This fixes building for iOS. Change-Id: Ice082648c02a3faf93891f7ddc122875e2bdc9cb
-
- 01 Feb, 2014 1 commit
-
-
Dmitry Kovalev authored
Change-Id: Iefe118f61a335e88821a21a9f50fb919212c1507
-
- 28 Jan, 2014 1 commit
-
-
hkuang authored
which is 7.8 times faster than C. Change-Id: I858ef4ec09202a07d445da8db702783d6d9d7321
-
- 27 Jan, 2014 1 commit
-
-
hkuang authored
Change-Id: I832cf83871044bfee7b7e57dbd31bae05cbd53e9
-
- 24 Jan, 2014 2 commits
-
-
Frank Galligan authored
Change-Id: Ia12aae491202098ff66366145aa0c3da38dc97e5
-
hkuang authored
which is 3.5 times faster than C. Change-Id: I24439ba7a2971829c11620f34848facf2c916678
-
- 22 Jan, 2014 1 commit
-
-
hkuang authored
Change-Id: I76c2720546b737cb63018a8ab6a3ff62a291786d
-
- 15 Jan, 2014 1 commit
-
-
hkuang authored
Change-Id: I10c423bde7ea5a3bac9f14f35c73b6bc31c8f3e3
-
- 08 Jan, 2014 1 commit
-
-
hkuang authored
More intra optimizations will be added. Change-Id: I33ae8d93f6002bf7b64cc2669602d9e6bfa5a6e8
-
- 18 Dec, 2013 1 commit
-
-
Jim Bankoski authored
This renames all the loop filter functions so that they no longer refer to mb. Change-Id: I8a58a8c7fd253d835cb619bde13913e896ece90b
-
- 26 Nov, 2013 1 commit
-
-
Frank Galligan authored
Multiply by 3 was on 8bit vectors when it should have been on 16bit vectors. Change-Id: I248c1429b3134dfd171dfab0ebb109fd2437e1fc
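The bug class described above can be illustrated with a minimal scalar sketch in C. This is not the libvpx NEON code: the function names `times3_in_8bit` and `times3_in_16bit` are hypothetical, and the sketch assumes the 8-bit multiply wraps modulo 256, as a plain (non-saturating) 8-bit lane multiply would.

```c
#include <stdint.h>

/* Hypothetical scalar model of a single vector lane; names are
 * illustrative only and do not appear in libvpx. */

/* Multiply by 3 while keeping the result in 8 bits. The product is
 * reduced modulo 256 and reinterpreted as signed, so any |diff| > 42
 * gives a wrong value (e.g. 3 * 100 = 300 wraps to 44). */
int8_t times3_in_8bit(int8_t diff) {
  uint8_t low = (uint8_t)(3 * diff);             /* keep the low 8 bits */
  return (int8_t)(low >= 128 ? low - 256 : low); /* reinterpret as signed */
}

/* Multiply by 3 after widening to 16 bits. The full range of
 * 3 * int8_t is [-384, 381], which fits in int16_t without loss. */
int16_t times3_in_16bit(int8_t diff) {
  return (int16_t)(3 * diff);
}
```

In a vector implementation the same idea would mean widening the operands before the multiply and narrowing (typically with saturation) only after the remaining arithmetic is done.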
-
- 22 Nov, 2013 1 commit
-
-
Yunqing Wang authored
This patch followed "Add filter_selectively_vert_row2 to enable parallel loopfiltering" commit, and added x86 SSE2 optimization to do 16-pixel filtering in parallel. For other optimizations (neon and dspr2), current 16-pixel functions were done by calling 8-pixel functions twice, and real 16-pixel functions could be added later. Decoder speedup: tulip clip: 2% speed gain; old_town_cross: 1.2% speed gain; bus: 2% speed gain. Change-Id: I4818a0c72f84b34f5fe678e496cf4a10238574b7
-
- 21 Nov, 2013 2 commits
-
-
Frank Galligan authored
The change caused mismatches with some test vectors on neon. Original CL: https://gerrit.chromium.org/gerrit/#/c/67863/ Change-Id: I913891636d53783e93cb1865ca78ded1821dc4b0
-
Frank Galligan authored
Add support to do 16 pixel horizontal filtering in Neon. Nexus devices saw about 0.5% decode speed increase. Change-Id: I2993f6c2d49f31fa74976879eeaa289fd3f4e15d
-
- 16 Nov, 2013 1 commit
-
-
Yunqing Wang authored
This patch followed "Rewrite filter_selectively_horiz for parallel loopfiltering" commit, and added x86 SSE2 optimization to do 16-pixel filtering in parallel. Also, corrected the declaration of aligned arrays. For 8-pixel-in-parallel case, improved the calculation of the masks and filters. Updated the threshold loading since the thresholds were already duplicated. Updated neon C functions to call neon loopfilters twice. Using tulip clip, tests showed it gave a ~1.5% decoder speed gain. Change-Id: Id02638626ac27a4b0e0b09d71792a24c0499bd35
-
- 12 Nov, 2013 1 commit
-
-
Johann authored
iOS doesn't recognize B: bad instruction `B idct32_pass_loop' Change-Id: I3cf6aede4639f1d9efa97f7962fa287ba6feaaef
-
- 11 Nov, 2013 1 commit
-
-
hkuang authored
Change-Id: Ic416e3f8a11e82ee298e6f709b2119a9ddf1e2f8
-
- 05 Nov, 2013 1 commit
-
-
hkuang authored
cleanup I63df79a13cf62aa2c9360a7a26933c100f9ebda3. Change-Id: I034848cf05031618818f7df2e7f9c35102686948
-
- 12 Oct, 2013 1 commit
-
-
Dmitry Kovalev authored
Also renaming dest_stride to stride in some places. Change-Id: I75f602b623a5a7071d4922b747c45fa0b7d7a940
-
- 11 Oct, 2013 1 commit
-
-
Dmitry Kovalev authored
Renames: vp9_short_iht4x4_add -> vp9_iht4x4_16_add; vp9_short_iht8x8_add -> vp9_iht8x8_64_add; vp9_short_iht16x16_add_c -> vp9_iht16x16_256_add. Change-Id: Ibca7a188fd062b196787ac5efc1ea545e7f166c0
-
- 10 Oct, 2013 1 commit
-
-
Dmitry Kovalev authored
Renames: vp9_short_idct32x32_add -> vp9_idct32x32_1024_add; vp9_short_idct32x32_1_add -> vp9_idct32x32_1_add; vp9_idct_add_32x32 -> vp9_idct32x32_add. Change-Id: Id85306f5814bac6c47463a6b5901a93082510666
-
- 07 Oct, 2013 1 commit
-
-
Dmitry Kovalev authored
Renames: vp9_short_idct16x16_add -> vp9_idct16x16_256_add; vp9_short_idct16x16_10_add -> vp9_idct16x16_10_add; vp9_short_idct16x16_1_add -> vp9_idct16x16_1_add; vp9_idct_add_16x16 -> vp9_idct16x16_add. Change-Id: Ief8a3904de78deab0f4ede944c4d0339c228cfc3
-
- 06 Oct, 2013 1 commit
-
-
Dmitry Kovalev authored
Renames: vp9_short_idct8x8_add -> vp9_idct8x8_64_add; vp9_short_idct8x8_1_add -> vp9_idct8x8_1_add; vp9_short_idct8x8_10_add -> vp9_idct8x8_10_add; vp9_idct_add_8x8 -> vp9_idct8x8_add. Change-Id: Ifb8d3a45b4c0397aa805b30463f3d14581bf72c1
-
- 04 Oct, 2013 1 commit
-
-
Dmitry Kovalev authored
The idea is to have the following names for each transform size: vp9_idct4x4_add, vp9_idct4x4_1_add, vp9_idct4x4_10_add, vp9_idct4x4_16_add, vp9_idct8x8_add, vp9_idct8x8_1_add, vp9_idct8x8_10_add, vp9_idct8x8_64_add, etc. for 16x16, 32x32. The actual list of renames in this patch: vp9_idct_add_lossless -> vp9_iwht4x4_add; vp9_short_iwalsh4x4_add -> vp9_iwht4x4_16_add; vp9_short_iwalsh4x4_1_add -> vp9_iwht4x4_1_add; vp9_idct_add -> vp9_idct4x4_add; vp9_short_idct4x4_add -> vp9_idct4x4_16_add; vp9_short_idct4x4_1_add -> vp9_idct4x4_1_add. Change-Id: I6f43f7437c68dd30cdd05d72e213765578ed30b1
-
- 27 Sep, 2013 2 commits
-
-
Dmitry Kovalev authored
Making name consistent with vp9_short_idct8x8 and vp9_short_idct8x8_1. Change-Id: I99e0be040ec893f9571dcf090e18f98dc58339f5
-
Christian Duvivier authored
Replace the current code, which corrupts the stack, with a duplicate of the vp8 code to save and restore neon registers. Change-Id: Ibb0220b9aa985d10533befa0a455ebce57a2891a
-
- 26 Sep, 2013 2 commits
-
-
Dmitry Kovalev authored
Making function name consistent with vp9_short_idct16x16 and vp9_short_idct16x16_1. Change-Id: I70e54be9e6b9a1dddab0de470686591e96d05517
-
Christian Duvivier authored
- full ASM version, no more C gateway file. - integrate combine-add with last step of 2nd pass. - remove a few push/pop pairs. - some instruction reordering to hide latency. Change-Id: Ic9d9933c908b65d1bf7ba8fd47b524cda808c9c6
-
- 20 Sep, 2013 1 commit
-
-
Johann authored
The iOS compiler does not recognize BLE: bad instruction `BLE idct32_transpose_pair_loop' Change-Id: I7426694c66bc31caf939a2d5000968da1222c15b
-
- 16 Sep, 2013 1 commit
-
-
hkuang authored
Speed improves from 282% to 302% faster based on assembly-perf. Change-Id: I08c5c1a542d43361611198f750b725e4303d19e2
-
- 12 Sep, 2013 1 commit
-
-
hkuang authored
Change-Id: I963dd4a6e8671957403ccbb9a16ea7de703e3530
-
- 11 Sep, 2013 1 commit
-
-
Christian Duvivier authored
Lots of TODOs, which will be taken care of in upcoming changes. As is, about 6x faster than the C version. Change-Id: Ie2557b72fd2d8edca376dbf400a4d173aa5e63e0
-
- 10 Sep, 2013 1 commit
-
-
hkuang authored
Speed improves from 376% to 400% faster based on assembly-perf. Change-Id: If0b2eccc39d5793dc101ce9feb7fcadf88396ea2
-
- 04 Sep, 2013 1 commit
-
-
hkuang authored
Speed improves from 264% ~ 270% to 280% ~ 300% based on assembly-perf. Change-Id: I3e2cc818ec14b432204ff43732f39b6438db685d
-