- 31 Jan, 2014 5 commits
-
-
Yunqing Wang authored
Implemented parallel loopfiltering, which uses the existing tile-decoding threads. Each thread works on one row, and when that row is loopfiltered, it moves to the next unattended row. To ensure the correct filtering order, threads are synchronized and a superblock is filtered only if the superblocks it depends on have already been filtered. To reduce synchronization overhead and speed up the decoder, we use nsync > 1 for high resolution.

Performance tests:
1. On desktop:
   8-tile 4k video using 8 threads, speedup: 70% - 80%
   4-tile HD video using 4 threads, speedup: ~35%
2. On mobile device (Nexus 7):
   4-tile 1080p video using 4 threads, speedup: 18% - 25%
   4-tile 1080p video using 2 threads, speedup: 10% - 15%

Change-Id: If54b4a11960dd706c22d5ad145ad94156031f36a
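The row dependency described above can be sketched as a small progress table. This is only an illustration under stated assumptions, not the code from the patch: the names `sb_ready` and `sb_publish` are hypothetical, the assumed dependency is that the row above must be at least one superblock column ahead, and the mutex/condition-variable wait that real threads would need is left out so that the logic can be checked single-threaded.

```c
#include <assert.h>

/* Hypothetical sketch, not libvpx code.
 * published[r] = number of superblock columns of row r that have been
 * filtered AND announced to other threads. */

/* Returns 1 if superblock (r, c) may be filtered now.
 * Assumed dependency: row r-1 must have finished columns 0..c+1
 * (clamped to the row width). Row 0 has no dependency. */
static int sb_ready(const int *published, int sb_cols, int r, int c) {
  int needed;
  if (r == 0) return 1;
  needed = (c + 2 < sb_cols) ? c + 2 : sb_cols;
  return published[r - 1] >= needed;
}

/* Records that columns 0..c of row r are filtered. Progress is announced
 * only every nsync columns, or at the end of the row, which is how a
 * larger nsync reduces synchronization traffic. Returns 1 when progress
 * was announced (the point where a waiting thread would be signalled). */
static int sb_publish(int *published, int sb_cols, int nsync, int r, int c) {
  const int done = c + 1;
  if (done == sb_cols || done % nsync == 0) {
    published[r] = done;
    return 1;
  }
  return 0;
}
```

In a threaded decoder, a worker would block while `sb_ready` is false and be woken whenever `sb_publish` announces progress. A larger `nsync` means fewer wake-ups, at the cost of the row below starting a little later.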
-
Alex Converse authored
* Avoid unnecessary type erasure
* Prune unused/duplicate fields from struct rdcost_block_args
* Make struct rdcost_block_args a local

Change-Id: I4f1fd4837ccd028bbfe727191ee8d69f0463b7e5
-
Adrian Grange authored
When showing a previously decoded frame, i.e. when show_existing_frame=1, the update of the last_show_frame flag must be disabled. This is to ensure that the last_show_frame flag reflects the state of the flag for the immediately previously decoded frame rather than the value that was forced to ensure that a previously decoded frame would be displayed. This patch also adds a test vector to verify that the display_existing_frame flag works as expected. Code for generating the test vector can be found in this patch: https://gerrit.chromium.org/gerrit/#/c/68581/ (Bug originally reported by Alexander Voronov <ru.xalba@gmail.com>). Change-Id: I731d288fba02088959f7fcc87707137fffc6acf5
-
Jim Bankoski authored
use mode instead Change-Id: I419d7a2dc4b0714ca6ff723c5e824521c150c460
-
Adrian Grange authored
Added a constant to represent the minimum KF boost rather than using the magic number 2000 in the code. Change-Id: I9428b61f47d26312caff81c6f9ae8587df004791
-
- 30 Jan, 2014 4 commits
-
-
Yaowu Xu authored
So x86_64-win64-vs11 can build successfully. Change-Id: If354c2ea3921fac8c9b413ed39223e70bc20c535
-
Yaowu Xu authored
Change-Id: Iea7c9fa0726dbf9792eea79e6a05eb8a3c718d45
-
Dmitry Kovalev authored
Change-Id: I08f45573e0b2195c09fb6aecacb4c57431a711ea
-
- 29 Jan, 2014 7 commits
-
-
Yaowu Xu authored
In this new mode, the size range is strictly determined by the min and max partition size in neighborhood blocks. Niklas720 encoding time at cpu-used -5 goes from 56250ms to 50676ms, a 10% reduction. Change-Id: I316b0e2ac967ff3fad57b28d69c0ec80b7d8b34e
-
Dmitry Kovalev authored
Change-Id: If446225afbb49f6033c2a4516a37c377de6f70f7
-
Dmitry Kovalev authored
Change-Id: Ic2ff6405f01fd43d07c5ee3b5e374909401115cc
-
Deb Mukherjee authored
Includes a few fixes and clean-ups that add the ability to use alt-ref frames in one-pass mode. Whether alt-refs are actually used or not is controlled by a macro USE_ALTREF_FOR_ONE_PASS in vp9_firstpass.c. This first cut seems to improve derf by 15+% in 1-pass mode, but further experiments with parameters are underway. Change-Id: I78254421435478003367c788c7930d2dc4ee2816
-
Jim Bankoski authored
This patch only works if the video width and height are both multiples of 32. It sets every partition to 16x16, and does INTRADC only on the first frame and ZEROMV on every other frame. It always does the largest possible transform, and the loop filter level is set to 4. It was ~20% faster than speed -5 of vp8; it is now 20% slower but adds motion search (every block), nearest, near and zeromv. The SVC test was changed because, while this realtime mode produces bad quality albeit quickly, it isn't obeying all the rules it should about which frames are available. Change-Id: I235c0b22573957986d41497dfb84568ec1dec8c7
-
Paul Wilkins authored
Trap divide by 0 that could occur with a 0 rate target in aq mode COMPLEXITY_AQ. Change-Id: I034514f512b2a0db470ae8d37ea395278bf473cf
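A guard of the kind this message describes might look like the sketch below. It is a minimal illustration, not the patched code: the helper name `rate_ratio_pct` and the choice of 100% as the fallback for a zero target are assumptions made for the example.

```c
#include <assert.h>

/* Hypothetical helper, not libvpx code: projected bits as a percentage of
 * the per-frame target. A zero (or negative) target would divide by zero,
 * so it is trapped first and a neutral 100% is returned instead
 * (assumed fallback value). */
static int rate_ratio_pct(int projected_bits, int target_bits) {
  if (target_bits <= 0) return 100;
  /* Widen before multiplying so projected_bits * 100 cannot overflow. */
  return (int)(((long long)projected_bits * 100) / target_bits);
}
```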
-
Yunqing Wang authored
Added macros to reduce the code duplication. Change-Id: I1916aa5a386ea07d961d4ec439ab09bb8c45487d
-
- 28 Jan, 2014 6 commits
-
-
Jim Bankoski authored
Change-Id: Ia8fa3961eec34545465018281dc022bc6f73869a
-
Dmitry Kovalev authored
Change-Id: I0c286e3d68a4a4ecf6df02e6fd9990327b0ceb22
-
Dmitry Kovalev authored
It seems we don't use it and are not going to. Change-Id: Ie76cd04dafc79b0a5911f8957d4253ca2d787f0c
-
Dmitry Kovalev authored
It is enough to specify (e.g.) idct16, it is obviously different from idct16x16. Change-Id: I6b408a37a945de3162429380b59a775b03b95db0
-
hkuang authored
which is 7.8 times faster than C. Change-Id: I858ef4ec09202a07d445da8db702783d6d9d7321
-
Dmitry Kovalev authored
Change-Id: I8d17867a4772554cbba2bd113cc5b4c99d50146d
-
- 27 Jan, 2014 2 commits
-
-
Alex Converse authored
Change-Id: I297954b16bce9e23931331520eadfb47540ff660
-
hkuang authored
Change-Id: I832cf83871044bfee7b7e57dbd31bae05cbd53e9
-
- 25 Jan, 2014 6 commits
-
-
Deb Mukherjee authored
Adds multiple filters in the 0.5-1.0 range in the last stage of the resize functions to prevent over-smoothing/aliasing Change-Id: I1a615adb16f0df5095790945c94b28b4d6a6fc48
-
Dmitry Kovalev authored
Change-Id: I31373ad860eb554eb3b03e877e8fba580dc3de07
-
Alex Converse authored
This avoids filtering a frame multiple times at the same level. Change-Id: I1fd54dd7ea257d16da8569f48036b8fad3a3ed61
-
Alex Converse authored
Factor out the code that tries filtering a frame at a given level. Change-Id: Ia04507e3ce6b1ad6ae7d05a9d88222fd319f44b7
-
Dmitry Kovalev authored
We don't use different filter kernels for x and y, it is always one kernel for both directions. Change-Id: Iefcbb02ec74bf46ea20d9dca672a3efd5d631517
-
Yaowu Xu authored
That force the stop of subpel search possibly at full/half/quarter pel stages. Change-Id: Ie50c500417bd78e1a53e6620bd4c2b85f63d9c67
-
- 24 Jan, 2014 10 commits
-
-
Dmitry Kovalev authored
Corresponding renames:
  subpel_kernel => interp_kernel
  vp9_get_filter_kernel() => vp9_get_interp_kernel()
  pred_filter_type => pred_interp_filter
  adaptive_pred_filter_type => adaptive_pred_interp_filter
  mcomp_filter_type => interp_filter
  read_interp_filter_type() => read_interp_filter()
  write_interp_filter_type() => write_interp_filter()
  fix_mcomp_filter_type() => fix_interp_filter()

Change-Id: I1fa61fa1dc81ebbf043457c3ee2d8d4515bee6d3
-
Alex Converse authored
Also change its wrongly named dest parameter to reference. Change-Id: Ide142dead31c9ccda1f09a48b221284369783fb7
-
Dmitry Kovalev authored
Change-Id: I24ff8ab3d2c807906aa86974bcb4c540256206de
-
Yaowu Xu authored
SSE for a 64x64 block with 3 planes can go as high as 3*2^28. So left shift by 4 may overflow 32 bit int. Change-Id: I63c84aa56894788bb987299badabbd7cc6fd0be6
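The arithmetic behind this overflow can be checked with a short sketch. It assumes 8-bit samples and three full-size 64x64 planes (as in 4:4:4), and the helper names `max_sse_64x64` and `scale_sse` are hypothetical. Widening to 64 bits before the shift is one possible remedy and is not necessarily what the commit does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: largest possible SSE for n_planes planes of 64x64
 * 8-bit samples, i.e. every sample differs by 255.
 * One plane: 64 * 64 * 255^2 = 266,342,400, just under 2^28. */
static uint32_t max_sse_64x64(int n_planes) {
  return (uint32_t)n_planes * 64u * 64u * 255u * 255u;
}

/* Hypothetical helper: scale the SSE by 16 (left shift by 4). The value is
 * widened to 64 bits first, so the shift cannot overflow. */
static int64_t scale_sse(uint32_t sse) {
  return (int64_t)sse << 4;
}
```

With three planes the maximum SSE is 799,027,200, and shifting it left by 4 gives 12,784,435,200, which is well above the 32-bit signed limit of 2,147,483,647.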
-
Alex Converse authored
Use this method with rt at speed -5. Change-Id: If3bd6fad4c05ddde72131442dad191e4145047e7
-
Yaowu Xu authored
The sum of squared mv components can go beyond int range for large input resolution. This commit changed the type to int64 to avoid overflow. Change-Id: Ib21ea2817845cea1435f893064e6417c79c5bc64
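A minimal sketch of accumulating squared motion vector components in a 64-bit type follows. The `MvSketch` struct and `mv_sq_sum` function are hypothetical names for illustration and do not come from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical motion vector type for this sketch only. */
typedef struct {
  int row;
  int col;
} MvSketch;

/* Sums row^2 + col^2 over n motion vectors. Each product is computed in
 * 64 bits and accumulated in an int64_t, so a large frame with many large
 * vectors cannot overflow the way a 32-bit int accumulator would. */
static int64_t mv_sq_sum(const MvSketch *mvs, int n) {
  int64_t sum = 0;
  int i;
  for (i = 0; i < n; ++i) {
    sum += (int64_t)mvs[i].row * mvs[i].row;
    sum += (int64_t)mvs[i].col * mvs[i].col;
  }
  return sum;
}
```

For instance, eight vectors with components of magnitude 16000 already sum to 4,096,000,000, which does not fit in a signed 32-bit int.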
-
Dmitry Kovalev authored
Change-Id: I8cfa5d5eb2c1bbacd9b604cc5dc0a2cd2e5cebb8
-
Dmitry Kovalev authored
Change-Id: I5173f996612e410d9cd495df9414d194b1ab18f3
-
Frank Galligan authored
Change-Id: Ia12aae491202098ff66366145aa0c3da38dc97e5
-
hkuang authored
which is 3.5 times faster than C. Change-Id: I24439ba7a2971829c11620f34848facf2c916678
-