- 10 Oct, 2017 21 commits
-
-
Lester Lu authored
In this experiment, sharp image discontinuities in the predicted block are detected. Based on the detected discontinuity, we choose particular LGTs as the row and column transforms. Bitstream syntax, entropy coding, and RD search for LGT are added. One binary symbol signals whether LGT is used. This experiment can work independently of the lgt experiment. lowres: -0.414% for key frames, -0.151% overall; midres: -0.413% for key frames, -0.161% overall. Change-Id: Iaa2f2c2839c34ca4134fa55e77870dc3f1fa879f
-
Angie Chiang authored
Change-Id: Idb1a4bf4dd655bde22862d76f6fa70457381a770
-
Angie Chiang authored
This flag will allow us to calculate the context indexes of any two consecutive non-zero binaries in parallel. Moreover, we can set MIN_SCAN_IDX_REDUCE_CONTEXT_DEPENDENCY to X, which lets the first X coefficients be exempt from the context dependency reduction. Change-Id: I75b71452996161ba06ec449021c7dea8e3899800
-
Angie Chiang authored
This aims to facilitate the experiment on reducing context dependency. Change-Id: I3d026bda1118cf613001efa32deed62997d5e3bb
-
Angie Chiang authored
Change-Id: I7a73dbe72b618e795191cc31bc32e31ad99d8587
-
Yushin Cho authored
When early skipped in var-tx, distortion is set the same as sse. If so, use pixel domain sse (i.e. skip error) since it is more accurate than sse from the transform domain. Change-Id: Id3cbc66ea6318108c031413646f3d06250e75e7e
-
Hui Su authored
The "txb_split_count" counter should be properly updated. BUG=aomedia:864 Change-Id: I3fb34a818c3f474085c4a2980a2d3b68bd33fb12
-
Angie Chiang authored
Change-Id: I6d68f03e3f9b1e40b05503f6bb4055e2fd870893
-
Yue Chen authored
Make the codec account for the 64x64 processing unit constraint when generating secondary predictions and applying the overlapped filter. This issue was addressed in commits 440d4254 and 501294ce, but some features were not fully retained in a later obmc refactoring commit. Change-Id: I6f16e6fccb966d45034d5b55447c9d9cb70e02cb
-
Yi Luo authored
Function speedup on i7-6700 (sse2 / ssse3):
D117: 4x4 ~1.8x, 8x8 ~3.4x, 16x16 ~5.5x, 32x32 ~2.9x
D135: 4x4 ~1.9x, 8x8 ~3.3x, 16x16 ~5.3x, 32x32 ~3.6x
D153: 4x4 ~1.9x, 8x8 ~2.8x, 16x16 ~5.5x, 32x32 ~3.6x
Change-Id: I43ab5fa8dcbcfa51acbde554abf3e5d7d336f391
-
Debargha Mukherjee authored
Most of the fixes are related to replacing BLOCK_64X64 with cm->sb_size. Fixes the AV1/AqSegmentTest.TestNoMisMatchExtDeltaQ/* tests that were previously breaking with ext-partition. Change-Id: I19d6045b422a93891b8cf4f8a929def97a595058
-
Rupert Swarbrick authored
If you have a structure, foo_t, with an alignment request then Visual Studio won't allow you to declare a function void use_foo(foo_t x); The reasoning is that x might be passed on the stack, and their ABI doesn't allow them to guarantee that x is aligned appropriately. More strangely, this isn't allowed either: void use_some_foos(foo_t x[10]); This is functionally equivalent to: void use_windows_foos(foo_t *x); (except that you can't tell how long the array should be from the function signature). Since Visual Studio is supposed to allow the latter form, use that instead. Change-Id: Icd449fc1058606fa7e48a6f791091bbb42a73b2c
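The pointer form described above can be sketched as follows. This is only an illustration: the layout of foo_t, the explicit length parameter n, and the body of use_some_foos are assumptions, and C11 `_Alignas` stands in for whatever alignment macro the real code uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure with an alignment request (layout is illustrative). */
typedef struct {
  _Alignas(16) int32_t v[4];
} foo_t;

/* Takes the array by pointer, so no aligned object is passed by value.
 * The length travels in a separate parameter because the signature no
 * longer hints at it. */
int32_t use_some_foos(const foo_t *x, size_t n) {
  assert(x != NULL);
  int32_t sum = 0;
  for (size_t i = 0; i < n; ++i) sum += x[i].v[0];
  return sum;
}
```

Passing the length explicitly makes up for the information lost when `foo_t x[10]` becomes `foo_t *x`.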
-
Rupert Swarbrick authored
This was triggered by a Visual Studio compile warning: cdef_test.cc(128): warning C4804: '>>': unsafe use of type 'bool' in operation. However, the code is rather hard to parse for humans too: when I first looked, I thought this was something to do with C++ templating... The new version is equivalent but defines max_pos in an outer loop (and a smaller indent). Change-Id: I0c5cabeee44d0839a7956a4ab1cf4ec5abfcc9ee
-
Yushin Cho authored
The rd_stats->sse is already updated by "rd_stats->sse += tmp << 4;", which is measured by pixel_diff_dist(), i.e. in pixel domain and w/o quantization(). Change-Id: I4dc20a7e80af9dd846aa5de4298cb56e7f0d8f7e
-
Debargha Mukherjee authored
Change-Id: Ie4382b8a1c0f87ce50e9afefd1cef8ca55435c61
-
Hui Su authored
Change-Id: Ibea4c2c732b16851ad16b475ea40f021d5b5d5b3
-
Sarah Parker authored
When a neighboring block uses global motion, use the mv computed at the center of the current block as the candidate vector rather than the mv computed at the center of the neighboring block. 0.15% improvement on cam_lowres Change-Id: I79eff8bf27a7aa84ae4a6d56e4a10c41a4438fb9
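A rough sketch of the idea, under simplifying assumptions: the floating-point affine model, the type names, and both functions below are hypothetical and do not reflect libaom's fixed-point warp parameters. The neighbor's global model is evaluated at the center of the current block rather than at the neighbor's own center.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical affine model: x' = a*x + b*y + tx, y' = c*x + d*y + ty. */
typedef struct { double a, b, c, d, tx, ty; } affine_model_t;
typedef struct { double col, row; } mv_t;

/* Motion vector implied by the model at position (x, y). */
mv_t global_mv_at(const affine_model_t *m, double x, double y) {
  assert(m != NULL);
  mv_t mv;
  mv.col = m->a * x + m->b * y + m->tx - x;
  mv.row = m->c * x + m->d * y + m->ty - y;
  return mv;
}

/* Candidate for the current block: evaluate the neighbor's global model
 * at the center of the current block. */
mv_t candidate_from_global_neighbor(const affine_model_t *m, int cur_x,
                                    int cur_y, int cur_w, int cur_h) {
  return global_mv_at(m, cur_x + cur_w / 2.0, cur_y + cur_h / 2.0);
}
```

For a purely translational model the two centers give the same vector; the choice only matters when the model includes zoom, rotation, or shear.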
-
Rupert Swarbrick authored
This patch fixes a bug in select_tx_type_yrd. The function works by looping over possible transform types to find the best option (calling select_tx_size_fix_type for each). Whenever there's a new best candidate, the code copies information about the transform from the mbmi structure into stack-allocated "best candidate" structures. At the end, it copies the "best candidate" data back to mbmi. Before the patch, if ref_best_rd was small, each call to select_tx_size_fix_type might return INT64_MAX (because they don't find anything better than ref_best_rd) and so we'd never actually copy anything to the "best candidate" structures. Then, at the end of the function, we'd merrily overwrite mbmi with whatever happened to be on the stack, causing general mayhem when something tried to read the data from mbmi later. This patch exits early if no candidates were found. It also adds an assertion saying that if no candidates were found, ref_best_rd must have been less than INT64_MAX. This should hopefully catch any bugs where the continue keywords in the loop stop us ever actually calling select_tx_size_fix_type. Change-Id: I54b998148281dd80f98d1570f736964593dc753f
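The shape of the fix can be sketched as below. This is not the actual libaom code: tx_info_t, eval_candidate (standing in for select_tx_size_fix_type), and its made-up cost model are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-block transform info (stands in for fields of mbmi). */
typedef struct { int tx_type; int tx_size; } tx_info_t;

/* Hypothetical stand-in for select_tx_size_fix_type: returns the cost, or
 * INT64_MAX when the candidate does not beat ref_best_rd. */
static int64_t eval_candidate(int tx_type, int64_t ref_best_rd,
                              tx_info_t *out) {
  const int64_t rd = 1000 - 100 * (int64_t)tx_type; /* made-up cost model */
  if (rd >= ref_best_rd) return INT64_MAX;
  out->tx_type = tx_type;
  out->tx_size = 0;
  return rd;
}

/* Returns the best cost, or INT64_MAX if no candidate was found. In the
 * latter case *info is left untouched instead of being overwritten with
 * a "best candidate" that was never filled in. */
int64_t pick_best(tx_info_t *info, int num_types, int64_t ref_best_rd) {
  tx_info_t best = { 0, 0 }; /* only meaningful once a candidate wins */
  int64_t best_rd = INT64_MAX;
  for (int t = 0; t < num_types; ++t) {
    tx_info_t cur = { 0, 0 };
    const int64_t rd = eval_candidate(t, ref_best_rd, &cur);
    if (rd < best_rd) {
      best_rd = rd;
      best = cur;
    }
  }
  if (best_rd == INT64_MAX) {
    /* No winner: this should only happen when ref_best_rd pruned them all. */
    assert(ref_best_rd < INT64_MAX);
    return INT64_MAX;
  }
  *info = best;
  return best_rd;
}
```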
-
Rupert Swarbrick authored
For large blocks this is about 8x the speed of the C version. The code needs SSE 4.1 for the PMULLD instruction that we use to do SIMD 32-bit multiplies. The patch uses av1_convolve_scale_test (written already to test the low bit depth path) to make sure the optimised code matches the C version. Change-Id: I9304d6bb3d2cb31390de93ed08ff1a852e3ace86
-
Rupert Swarbrick authored
For large blocks this is almost 8x the speed of the C version. The code needs SSE 4.1 for the PMULLD instruction that we use to do SIMD 32-bit multiplies. This patch also makes av1_convolve_scale_test actually test something, making sure the optimised code matches the C version. The slightly excessive generality in the test (all the templating) is because of a following patch, which is for the high bit depth path and can then use most of the same test code. Change-Id: I6732bc6b2378ffaadae5aa6441100cf660f7ee11
-
- 09 Oct, 2017 11 commits
-
-
Angie Chiang authored
Since the 32x32 transform uses DCT only, we can avoid updating the other transform types. Change-Id: I51dd8ec71975187d249d7e25130e994a48cac5c1
-
Sarah Parker authored
0.15% improvement on lowres set Change-Id: If16a8e07797c64508f9e2d9b26ae874ac53c57a4
-
Rupert Swarbrick authored
There's a bitstream conformance requirement that says that any block must subsample to a valid block size with the current subsampling mode. For example, this means that BLOCK_4X8 is illegal if there is subsampling in only the horizontal direction (since there is no BLOCK_2X8). This patch checks that the bitstream is conformant as it reads partition information in decodeframe.c. BUG=aomedia:875 Change-Id: I18139aa76d6f965282402edbb0b68959478a46c3
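A minimal sketch of such a check, assuming a small illustrative table of valid block sizes. The table, the types, and the function names are hypothetical; the real decoder works with its own block-size enums and lookup tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { int w, h; } block_dims_t;

/* Illustrative list of valid block sizes (an assumption, not libaom's table).
 * Note that 2x8 and 8x2 are deliberately absent. */
static const block_dims_t kValidSizes[] = {
  { 2, 2 },   { 2, 4 },   { 4, 2 },   { 4, 4 },   { 4, 8 },   { 8, 4 },
  { 8, 8 },   { 8, 16 },  { 16, 8 },  { 16, 16 }, { 16, 32 }, { 32, 16 },
  { 32, 32 }, { 32, 64 }, { 64, 32 }, { 64, 64 }
};

static bool is_valid_size(int w, int h) {
  const size_t n = sizeof(kValidSizes) / sizeof(kValidSizes[0]);
  for (size_t i = 0; i < n; ++i) {
    if (kValidSizes[i].w == w && kValidSizes[i].h == h) return true;
  }
  return false;
}

/* A block is acceptable only if it is valid itself and its subsampled
 * (chroma) counterpart is valid too. ss_x / ss_y are 0 or 1. */
bool block_is_conformant(int w, int h, int ss_x, int ss_y) {
  assert((ss_x == 0 || ss_x == 1) && (ss_y == 0 || ss_y == 1));
  return is_valid_size(w, h) && is_valid_size(w >> ss_x, h >> ss_y);
}
```

With this table, a 4x8 block under horizontal-only subsampling maps to 2x8 and is rejected, matching the example in the message.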
-
Urvang Joshi authored
Introduced by: https://aomedia-review.googlesource.com/c/aom/+/25181 Change-Id: I1f25178d6b273fbeade4c33f153b5f2bac4a8b99
-
Rupert Swarbrick authored
This unit test doesn't actually provide any test coverage and merely exists to benchmark the C function, av1_convolve_2d_scale_c. The following patch will add an SSE version of that function and extend this test to check that the SSE code matches the C code. Change-Id: Ic942ad8f9fd57d2659fc60f92c5a0b6c9a9f8cac
-
Debargha Mukherjee authored
Change-Id: I73e9d2d327b062828a75bc99fb348441dd32174a
-
Debargha Mukherjee authored
Change-Id: Iaff923f34100ecdce76d2319fab67cde59d485ae
-
Cheng Chen authored
Change-Id: I23344af711d9a31b819fca35ae3ad3b7edf4852e
-
Rupert Swarbrick authored
This returns true if a block signals tx_size in the stream and uses it in the bitstream writing code and the decoder. Note that we can't quite use it in pack_inter_mode_mvs when CONFIG_VAR_TX && !CONFIG_RECT_TX but I've switched the code to using it the rest of the time since rect-tx is adopted and eventually the other code path should be deleted. Also use the helper function in tx_size_cost in rdopt.c, where the test was wrong and caused underestimates of block costs. (Specifically, the code that subtracts tx_size_cost from this_rate_tokenonly in rd_pick_intra_sby_mode ended up subtracting zero for a 4x8 block). The behaviour of the decoder should be unchanged. The only change in the encoder's behaviour should be in tx_size_cost where it should now match the rest of the code. Change-Id: I97236c9ce444993afe01ac5c6f4a0bb9e5049217
-
Zoe Liu authored
This coding tool introduces a new prediction mode for bi-predictive frames that have a forward reference within 2 frames (distance denoted as 'fwd_delta') and a backward reference within (3-fwd_delta) frames. If this prediction mode, namely 'ext_skip', is set, the block is coded using compound prediction with the most recent forward and backward reference frames as its reference pair and NEARESTMV as its motion mode, and the skip flag is set for the residue. Change-Id: I826034ccf1a956f4b350f0bc2e2dca8ea71b5197
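One possible reading of the reference-distance condition, as a sketch. The function name is hypothetical, and the assumption that both distances must be at least 1 is mine rather than something the message states.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when the nearest forward and backward references are close
 * enough for the ext_skip mode described above:
 *   1 <= fwd_delta <= 2  and  1 <= bwd_delta <= 3 - fwd_delta.
 * Distances are in frames and assumed to be non-negative. */
bool ext_skip_refs_eligible(int fwd_delta, int bwd_delta) {
  assert(fwd_delta >= 0 && bwd_delta >= 0);
  if (fwd_delta < 1 || fwd_delta > 2) return false;
  return bwd_delta >= 1 && bwd_delta <= 3 - fwd_delta;
}
```

Under this reading the allowed (fwd_delta, bwd_delta) pairs are (1, 1), (1, 2), and (2, 1).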
-
Zoe Liu authored
The frame sign bias value is no longer signaled in the frame header. Instead, the sign bias of each reference frame is derived from its corresponding frame offset at both the encoder and the decoder. The 'frame_sign_bias' tool depends on 'frame_marker'. Compared against the baseline, enabling both tools obtains a small coding gain of -0.08 ~ -0.11% in BDRate over the Google lowres/midres tests. Change-Id: I8d85dc427ced0b2152712ccf61be4be6068075b9
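The derivation can be sketched roughly as below. The function name is hypothetical, and the sketch assumes that a sign bias of 1 marks a reference that comes after the current frame and that frame offsets do not wrap around.

```c
#include <assert.h>

/* Derive the sign bias of a reference frame from frame offsets alone,
 * so nothing needs to be signaled in the frame header.
 * Assumption: 1 means the reference lies after the current frame
 * (a backward reference), 0 otherwise. Offset wraparound is ignored. */
int derive_sign_bias(int cur_frame_offset, int ref_frame_offset) {
  assert(cur_frame_offset >= 0 && ref_frame_offset >= 0);
  return ref_frame_offset > cur_frame_offset ? 1 : 0;
}
```

Because both encoder and decoder know the frame offsets, running the same rule on each side yields identical sign bias values.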
-
- 08 Oct, 2017 8 commits
-
-
Cheng Chen authored
Change-Id: I5446327378938128f27186015619a079c2845d53
-
Debargha Mukherjee authored
Change-Id: I71c07652565c0e1ca44d73f3731459949271fe45
-
Debargha Mukherjee authored
Solves some Windows build issues Change-Id: Ia903ed05285362449829a2777999cf73058f7733
-
Zoe Liu authored
This coding tool is dependent on the tool of frame_marker. This tool derives the frame sign bias directly from the frame offset. No sign bias signaling is needed. Change-Id: I3a8c77904d73caeeb1b6777fb026279fd2bbc6fb
-
Yunqing Wang authored
Add an experiment "tmp", which includes:
1. Always use a larger block size when storing frame MVs, and make it consistent for the CB4X4 and non-CB4X4 cases. Namely, use 8x8 for 4x4 mi size and 16x16 for 8x8 mi size.
2. Allocate a smaller buffer for frame MVs to save memory.
3. Use the previous frame MVs at the nearby 8x8 or 16x16 location, which keeps the logic simple.
4. Reduce the number of frame MV copies, which are very costly in the decoder.
The baseline decoder got a 5+% speedup. Borg test on the lowres set showed a +0.009% PSNR difference before/after the patch. Change-Id: I61e14e95fd35bea88f338931b4f43c44f4e4cf1f
-
Debargha Mukherjee authored
Change-Id: I16cee2064ddc80f80a21560e9d192a39033949ca
-
Debargha Mukherjee authored
Change-Id: I599f8fbdd3c19ec67d9a2118a41d735e11dd3f07
-
Zoe Liu authored
Change-Id: Ibdcb1530b9f81a2a5222e95cf5c0b7b2938509a8
-