- 08 Jul, 2013 1 commit
-
-
John Koleszar authored
In the rare case where 4x4 interior filtering was called for but no 8x8 or larger filtering took place, the previous code skipped the filtering. This patch fixes the issue by including the interior mask in the overall mask for the filter application loops. Change-Id: I4a0b65056c64f97478827c2ff41e0914fc7779d0
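The effect of folding the interior mask into the loop mask can be sketched as follows. This is a simplified illustration, not the libvpx code: the function name, the `include_interior` switch, and the returned bitmask are hypothetical, and the actual edge filters are replaced by a record of which interior edges would have been reached.

```c
#include <assert.h>

/* Hypothetical sketch: walk one row of column masks, one bit per 8-pixel
 * column, and return a bitmask of the columns whose interior 4x4 edge
 * was reached by the loop. */
unsigned interior_edges_filtered(unsigned mask_16x16, unsigned mask_8x8,
                                 unsigned mask_4x4, unsigned mask_4x4_int,
                                 int include_interior) {
  unsigned mask = mask_16x16 | mask_8x8 | mask_4x4;
  unsigned done = 0;
  unsigned bit = 1;

  /* The fix described above: let interior bits keep the loop alive. */
  if (include_interior) mask |= mask_4x4_int;

  for (; mask; mask >>= 1) {
    /* Block-boundary filters (16x16, 8x8, 4x4) would run here. */
    if (mask_4x4_int & 1) done |= bit; /* interior edge filter would run */
    mask_4x4_int >>= 1;
    bit <<= 1;
  }
  return done;
}

/* Usage: with only interior bits set, the old loop never runs. */
void interior_mask_demo(void) {
  assert(interior_edges_filtered(0, 0, 0, 0x5, 0) == 0x0);
  assert(interior_edges_filtered(0, 0, 0, 0x5, 1) == 0x5);
}
```

Without the interior bits, the loop condition goes false as soon as the larger-transform masks run out, so any remaining interior edges are never visited.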
-
- 24 Jun, 2013 2 commits
-
-
Yaowu Xu authored
-
John Koleszar authored
When no transform mask has bit 0 set (the left edge of the SB) but bit 0 of mask_4x4_int is set (the edge 4 pixels from the left edge needs filtering), the interior edge was incorrectly skipped. This situation only happens on the leftmost edge of the image, where the edge at column 0 is intentionally skipped because there are no pixels to its left to read. Change-Id: Ib2fbbcb40166e90af31b1a0e13b85b68c226cbd3
-
- 22 Jun, 2013 2 commits
-
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
Fixes crashes of test_libvpx on 32-bit Linux. Change-Id: If94e7628a86b788ca26c004861dee2f162e47ed6
-
- 21 Jun, 2013 14 commits
-
-
John Koleszar authored
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
Change-Id: I8fcab81e390f93dc17e9666bbf8f77883b5aa897
-
James Zern authored
Change-Id: Id54ad9a781634f075e990d5bade5be8490959975
-
Ronald S. Bultje authored
Fixes a crash on Windows when building with MSVC. Change-Id: I124ac756a1be55d190fadda5fcc46d23b1445dbf
-
Ronald S. Bultje authored
Change vp9_block_error() to return a 64-bit error variable, and change all callers to expect a 64-bit return value (this will prevent overflows, which we basically don't check for at all right now). Remove the duplicate block_error() function, which fixed that through truncation. Remove the old (incompatible) mmx/sse2 block_error SIMD versions and replace them with a new one that returns a 64-bit value. Encoding time of the first 50 frames of bus @ 1500kbps goes from 3min29 to 3min23, i.e. a 3% overall speedup. Change-Id: Ib71ac5508b5ee8a80f1753cd85d72df1629abe68
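A minimal sketch of why a 64-bit accumulator matters here, assuming 16-bit coefficients. The name `block_error_64` and its signature are illustrative only and are not taken from the libvpx source.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch: sum of squared differences between original and
 * dequantized coefficients, accumulated in 64 bits. */
int64_t block_error_64(const int16_t *coeff, const int16_t *dqcoeff, int n) {
  int64_t error = 0;
  int i;

  assert(n >= 0);
  for (i = 0; i < n; i++) {
    const int diff = coeff[i] - dqcoeff[i];
    /* Widen before multiplying: diff * diff alone can exceed INT32_MAX. */
    error += (int64_t)diff * diff;
  }
  return error;
}
```

With two coefficients at opposite extremes of the 16-bit range, each squared difference is 4294836225, which already exceeds what a signed 32-bit value can hold, so the total only survives in a 64-bit return type.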
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
-
Ronald S. Bultje authored
3% faster overall (3min35.0 to 3min28.5). Change-Id: I5ff8a5c2c91586b6632ca5009ad1ea51ce94af5e
-
Yaowu Xu authored
-
Yaowu Xu authored
and remove unused code. Change-Id: If380440c4450294b5450b7a9eeb94a376846ec01
-
Yaowu Xu authored
-
Yaowu Xu authored
Change-Id: I7960178c95c54d5c4497e44cfc8c493566294b34
-
Yaowu Xu authored
-
- 20 Jun, 2013 21 commits
-
-
Ronald S. Bultje authored
Encoding of bus @ 1500kbps (first 50 frames) goes from 3min57 to 3min35, i.e. approximately a 10.5% speedup. Note that the SIMD versions which use a bilinear filter (x_offset & 7 || y_offset & 7) aren't perfectly interleaved, and can probably be improved further in the future. I've marked this with a few TODOs/FIXMEs in the code. Change-Id: I5c9e900c0f0d32e431a50fecae213b510b2549f9
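The condition quoted above can be transcribed directly. This is only a sketch: the function name is hypothetical, and the reading that offsets which are multiples of 8 take a path without the bilinear filter is inferred from the expression itself, not from the code.

```c
#include <assert.h>

/* Direct transcription of (x_offset & 7 || y_offset & 7): nonzero when
 * either offset has any of its low three bits set. */
int uses_bilinear_filter(int x_offset, int y_offset) {
  return (x_offset & 7) || (y_offset & 7);
}

/* Usage: multiples of 8 in both directions do not match the condition. */
void uses_bilinear_filter_demo(void) {
  assert(!uses_bilinear_filter(0, 0));
  assert(!uses_bilinear_filter(8, 0));
  assert(uses_bilinear_filter(3, 0));
}
```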
-
Jim Bankoski authored
-
Jim Bankoski authored
Change-Id: Idfd69e66e8982275eb00d8007a55efd1a4f86a98
-
James Zern authored
-
Frank Galligan authored
- size_t vs int. Change-Id: Ib47ebd932a4b69db9f52a43000bb69d0a96b9134
-
James Zern authored
This reverts commit 90a9900a.
Seems to break the Mac build:
src/include/gtest/internal/gtest-port.h:1208:: pthread_mutex_lock(&mutex_)failed with error 22
Abort trap: 6
Change-Id: Icbe31161d7c27f1b0a28d33409e7712430bbf0ae
-
Jingning Han authored
-
Johann authored
-
Dmitry Kovalev authored
-
Dmitry Kovalev authored
-
Deb Mukherjee authored
Improves the rd modeling function and implements it using interpolation from a table, which is a little faster. Also uses sse rather than var as input to the modeling function: since no dc prediction is used, sse works a little better. derfraw300: +0.05%. Speedup: ~1%. Change-Id: I151353c6451e0e8fe3ae18ab9842f8f67e5151ff
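Table-based evaluation of a model function generally looks like the sketch below. The commit does not give the table contents, its spacing, or how the input is transformed, so the uniform spacing, the `double` type, and the name `interp_table` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: evaluate y = f(x) by linear interpolation between
 * entries of a table sampled at x = 0, step, 2*step, ...
 * Inputs outside the table range are clamped to the end entries. */
double interp_table(const double *table, int n, double step, double x) {
  double pos;
  int i;

  assert(table != NULL && n >= 2 && step > 0.0);
  if (x <= 0.0) return table[0];
  pos = x / step;
  if (pos >= (double)(n - 1)) return table[n - 1];
  i = (int)pos; /* index of the entry at or below x */
  return table[i] + (pos - i) * (table[i + 1] - table[i]);
}
```

A lookup plus one multiply-add replaces whatever closed-form evaluation the model would otherwise need, which is the usual source of the speedup in this kind of change.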
-
Johann authored
dboolhuff.c(50) : warning C4267: 'initializing' : conversion from 'size_t' to 'int' Change-Id: I6b85759efb2fa19f362f406623d8a7583a55c036
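The commit does not show how the warning was resolved. One common way to handle a `size_t` to `int` narrowing is an explicit, checked conversion, sketched here with a hypothetical helper name.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: make the narrowing explicit and verify in debug
 * builds that the value actually fits in an int. */
int size_to_int(size_t n) {
  assert(n <= (size_t)INT_MAX);
  return (int)n;
}
```

The explicit cast tells the compiler the narrowing is intended, and the assertion documents the range the caller is relying on.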
-
Jim Bankoski authored
Adds a new speed feature that forces partition sizes to be greater than or less than a certain size. Change-Id: I8c048eeeef93700ae822eccf98f8751a45b2e7d0
-
Jim Bankoski authored
This feature lets you set a partition size to be used by the entire frame. Change-Id: I208a4c8c701375cbb054418266f677768b6f8f06
-
Jim Bankoski authored
Uses variance to decide where to split partitions. Variance is calculated using the nearest mv, always from the last ref frame. Change-Id: Idd015b4a9aa3bc82591759eac239680c07496896
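A variance-driven split decision can be sketched roughly as below. The function names, the flat pixel arrays, and the threshold are hypothetical; the commit does not state how the variance is thresholded or how the reference block is fetched, so `ref` simply stands in for the prediction from the last reference frame.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: scaled variance (sse - sum^2 / n) of the
 * difference between a source block and its reference prediction. */
unsigned int residual_variance(const uint8_t *src, const uint8_t *ref, int n) {
  int64_t sum = 0, sse = 0;
  int i;

  assert(n > 0);
  for (i = 0; i < n; i++) {
    const int diff = src[i] - ref[i];
    sum += diff;
    sse += diff * diff;
  }
  return (unsigned int)(sse - sum * sum / n);
}

/* Split a block when its residual variance exceeds an assumed threshold. */
int should_split(unsigned int variance, unsigned int threshold) {
  return variance > threshold;
}
```

A uniform offset between source and reference gives zero variance and no split, while an uneven residual pushes the block toward smaller partitions.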
-
Jim Bankoski authored
Change-Id: Ie24489a4d39f3e53e816eeebf75a1c9c7d94515a
-
Jim Bankoski authored
Change-Id: Ideee45cad8b38087c509cd404484728e85d0c427
-
Jim Bankoski authored
This uses the speed feature functionality for code. Change-Id: I9cd16c0c5f98520ae27ebba81aa2c178546587f8
-
Jim Bankoski authored
Force slow partitioning for keyframes, altref and overlays. Change-Id: I1a286361bf74083e71973575a7296be46eb98742
-
Ronald S. Bultje authored
Overall speedup around 5% (bus @ 1500kbps first 50 frames 4min10 -> 3min58). Specific changes to timings for each function compared to original assembly-optimized versions (or just new version timings if no previous assembly-optimized version was available):
sse2 4x4: 99 -> 82 cycles
sse2 4x8: 128 cycles
sse2 8x4: 121 cycles
sse2 8x8: 149 -> 129 cycles
sse2 8x16: 235 -> 245 cycles (?)
sse2 16x8: 269 -> 203 cycles
sse2 16x16: 441 -> 349 cycles
sse2 16x32: 641 cycles
sse2 32x16: 643 cycles
sse2 32x32: 1733 -> 1154 cycles
sse2 32x64: 2247 cycles
sse2 64x32: 2323 cycles
sse2 64x64: 6984 -> 4442 cycles
ssse3 4x4: 100 cycles (?)
ssse3 4x8: 103 cycles
ssse3 8x4: 71 cycles
ssse3 8x8: 147 cycles
ssse3 8x16: 158 cycles
ssse3 16x8: 188 -> 162 cycles
ssse3 16x16: 316 -> 273 cycles
ssse3 16x32: 535 cycles
ssse3 32x16: 564 cycles
ssse3 32x32: 973 cycles
ssse3 32x64: 1930 cycles
ssse3 64x32: 1922 cycles
ssse3 64x64: 3760 cycles
Change-Id: I81ff6fe51daf35a40d19785167004664d7e0c59d
-
Jim Bankoski authored
need to rework these Change-Id: I17dc2c88d2faadd2f8fb117c52c25f04ea2e9856
-