- 23 Nov, 2017 13 commits
-
-
Rupert Swarbrick authored
This doesn't have a big performance impact, and it's rather simpler just having one version of everything. Change-Id: I5fa5e7640a63d0ccb0c371f266c6eee99d9520f9
-
Rupert Swarbrick authored
Change-Id: Ifac3a3bf620061865b82b986d6b16bcabd96a187
-
Rupert Swarbrick authored
This fixes some Valgrind errors caused by reads from x_by_xplus1 that used tainted data as an address (see the comments in selfguided_sse4.c for what's going on). It also rewrites the algorithm to use an integral image approach instead of the handwritten filters that the code was using. The end result is roughly the same efficiency (I think that there's one more memory load per group of pixels, but this seems not to be measurable) and I've done some performance optimisation with perf too. Several 32-bit multiplications have been replaced by madd instructions which do 16-bit multiplications and add adjacent lanes. This is equivalent to a 32-bit multiplication when the 32-bit lanes contain numbers below 2^15, but runs significantly faster. Change-Id: I3d0f3043c7861707a56e2fd1849574dc73897d6c
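The claim that a madd is interchangeable with a 32-bit multiply for small inputs can be checked with a scalar model. The sketch below is not taken from selfguided_sse4.c; `sext16`, `madd_lane` and `mul_small` are hypothetical names, and the model assumes the instruction behaves like SSE2 `pmaddwd` on a single 32-bit lane.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret the low 16 bits of v as a signed 16-bit value. */
static int32_t sext16(uint32_t v) {
  v &= 0xffffu;
  return v >= 0x8000u ? (int32_t)v - 0x10000 : (int32_t)v;
}

/* Scalar model of one 32-bit output lane of a packed multiply-add:
   split each input into two signed 16-bit halves, multiply matching
   halves and add the two products. Returned as int64_t so that the
   model itself cannot overflow. */
static int64_t madd_lane(uint32_t a, uint32_t b) {
  return (int64_t)sext16(a) * sext16(b) +
         (int64_t)sext16(a >> 16) * sext16(b >> 16);
}

/* For inputs below 2^15 the high halves are zero and the low halves are
   non-negative, so the madd result is exactly the ordinary product. */
static int32_t mul_small(uint32_t a, uint32_t b) {
  assert(a < (1u << 15) && b < (1u << 15));
  return (int32_t)madd_lane(a, b);
}
```

Once an input reaches 2^15 its low half is read as negative, so `madd_lane(32768, 2)` yields -65536 rather than 65536. That is why the bound on the lane contents matters.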
-
Hui Su authored
BUG=b/69488541 Change-Id: I2113bba4589f61a09d0dd07c64a522f4d0ae304b (cherry picked from commit cccda0db727c2282375b174104294b40911d1447)
-
Hui Su authored
BUG=b/69445855,b/69441422 Change-Id: Iaf5aba78dc39f01c87fb726611e674d34af6bffe (cherry picked from commit 75ff22f309de2e25477d336a6a8e9e58d3bb2272)
-
James Zern authored
under visual studio c4334: result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?) Change-Id: If06793116ddfbe3265a17a17a2bcaa6ee8cf9e2d (cherry picked from commit 535ecf6b31fe97f704f6725989cffad88ad960d8)
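For context, C4334 is typically raised when a shift is evaluated in 32-bit `int` and the result is then stored in a 64-bit variable. The sketch below shows two usual ways to make the intent explicit. It is illustrative only, the helper names are hypothetical, and the log does not say which form this commit actually used.

```c
#include <assert.h>
#include <stdint.h>

/* Pattern that triggers the warning on a 64-bit target:
     uint64_t mask = 1 << n;   // shift done in 32-bit int, then widened
   Widening the operand first makes the shift itself 64 bits wide. */
static uint64_t bit64(unsigned n) {
  assert(n < 64);
  return (uint64_t)1 << n;
}

/* If a 32-bit shift really was intended, casting the result says so. */
static uint64_t bit32_widened(unsigned n) {
  assert(n < 32);
  return (uint64_t)(1u << n);
}
```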
-
Hui Su authored
BUG=b/69238080,b/69288165 Change-Id: Ia761d4b77049a55bd8040b5ed76063b2fac750ee (cherry picked from commit c9762668a3f25c2dfe31c426871450fbfd44b9e0)
-
Hui Su authored
BUG=b/69205191 Change-Id: I79e404dc2cd6db06e71a64338b74eb4b575ba431 (cherry picked from commit 85f2a5ae4c15de5dd530766eb3933b9de976d9cf)
-
Hui Su authored
BUG=69073461 Change-Id: Ib28b41adfa2738681357903a81a89bcab01c87b3 (cherry picked from commit 08b26a8a257e54210d8bbdba799980bc291f368e)
-
Jingning Han authored
Change-Id: Ief1bedd68de55c29de15f56d805e242d932ff359
-
Jingning Han authored
Change-Id: Ifb295cbcde5474d33c4eca008d89c9dda68d327e
-
Yaowu Xu authored
Reason for revert: nightly test failures due to incompatibility with lv-map. BUG=aomedia:1052 BUG=aomedia:1058 BUG=aomedia:1059 Change-Id: Ifbe9cf4542b1b023b8b9e0a2f780e0075914bee0
-
Hui Su authored
To silence asan failures in fuzzing tests. BUG=:68825590,68825594,68825599 Change-Id: Ib2c713dc19af223da5e5fc5cec4652d71856f830 (cherry picked from commit e43ea91055133baaf3b691170a097a456c032e23)
-
- 22 Nov, 2017 13 commits
-
-
Cheng Chen authored
Unit tests for: aom_jnt_comp_avg_pred_c, aom_jnt_sadmxn_avg_c, aom_sadmxh_sse2, aom_jnt_comp_avg_pred_ssse3, aom_jnt_sadmxn_avg_ssse3. Change-Id: I22463f143fd6513fd68f4bb82203e1b4fe3bbab7
-
Frederic Barbier authored
Previous assumption on reduced_tx_set_used=0 led to many assertion failures and prevented signalling reduced_tx_set_used equal to 1. BUG=aomedia:1053 Change-Id: If9a9dff8d01ba3ec942e06559c153f06d34555f9
-
Cheng Chen authored
Add ssse3 implementations for the sad_avg c function at low bit-depth. With this, aom_jnt_sad c functions can all have simd implementations. This CL follows existing MACRO definitions for multiple combinations of block sizes. Change-Id: I882343684026525f5589a239337cfac2dd411e11
-
Rupert Swarbrick authored
The code was assuming that an mi was always 4 samples wide and high. For chroma planes with subsampling, this is wrong and the size and position of the sb in the plane was over-estimated by a factor of two. This meant that we sent all the coefficients in the top-left hand quarter of the tile. Since the encoder and decoder made the same mistake, this worked fine, but it's clearly not what we're supposed to do! Change-Id: I0da8ada1d76639ad476ad84491658bc25ef3a43f
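The size correction described above can be sketched as follows. This is a minimal illustration and not the patched code: `mi_to_plane_samples` and `MI_SIZE_LOG2` are hypothetical names, and the only fact taken from the commit message is that one mi covers 4 luma samples.

```c
#include <assert.h>

#define MI_SIZE_LOG2 2 /* one mi unit is 4 luma samples wide and high */

/* Hypothetical helper: extent in samples, within one plane, of a span
   that is `mi_units` mode-info units long. `ss` is the plane's
   subsampling shift along that axis (0 for luma, 1 for subsampled
   chroma). */
static int mi_to_plane_samples(int mi_units, int ss) {
  assert(mi_units >= 0 && (ss == 0 || ss == 1));
  return (mi_units << MI_SIZE_LOG2) >> ss;
}
```

A span of 16 mi units is 64 samples in luma but only 32 samples in a chroma plane with `ss = 1`. Ignoring `ss` overstates both the size and the position by a factor of two, which is the mistake the message describes.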
-
Cheng Chen authored
Both c function and ssse3 have passed unit tests. Change-Id: I48cff97ebf2735b43256b83f3b41ce7ccdf27393
-
Cheng Chen authored
Change function names and add SIMD implementations for two C functions: (1) var_filter_block2d_bil_first_pass (2) var_filter_block2d_bil_second_pass. With this CL, aom_jnt_sub_pixel_avg_variance now has a SIMD implementation. Change-Id: Ib41ef13d62ae91a0ca481bcebb24568dcd4722c4
-
Yaowu Xu authored
Change-Id: I91dd5d3351d5dcc70ffcdb883d1e7cbd054d1a27
-
Luc Trudeau authored
In order to address hardware concerns regarding luma averaging, this patch caps the maximum length of the sides of the luma averaging area to 32. This proposal was accepted by the hardware working group on November 20th 2017.
Regression on Subset1:
PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
0.0078 | 0.0572 | -0.0705 | -0.0272 | -0.0202 | -0.0391 | 0.0375
Change-Id: I875a6f2114df4d857ed66c4690ee08da2df426e4
-
Jonathan Matthews authored
BUG=aomedia:1026 Change-Id: If1b62c728bb77ca733aaa5b9100608f6b9d33db1
-
Debargha Mukherjee authored
Change-Id: I9854d5ec193dadaa455c209b1482ead895982cb7
-
Urvang Joshi authored
  - Reuse scan order
  - Truncate to max eob of 32*32
  - Quantization and entropy coding same as done for TX_32X32
  - Reuse quantization matrices of TX_32X32
Compression performance is roughly neutral: https://arewecompressedyet.com/?job=tx64x64_oldscans%402017-11-06T03%3A11%3A53.868Z&job=tx64x64_reusescans%402017-11-06T03%3A12%3A55.738Z Change-Id: Ie9182c1c69a42a3c1ab4fc980abbd6000c64f179
-
Hui Su authored
Change-Id: I6c08f4aaba10f89fd17f3100dc5f14bb2087e76b
-
Hui Su authored
Very little impact on compression quality. Change-Id: I3b0fbebe7c6e53f299a764aba49a22b931bb8bd0
-
- 21 Nov, 2017 14 commits
-
-
Urvang Joshi authored
Provisionally adopted on 11/17. Also, some related tweaks to fix build errors. Change-Id: I7d5592450e9284d489b46adc274cd0cfccd04b3c
-
Sarah Parker authored
This bug was causing a 14% drop in performance on the lowres set; this fix reduces the drop to 0.2%. Change-Id: I166efc9cc5735c3366a0989e924d2c9fae9e706b
-
Linfeng Zhang authored
Change-Id: Ia07e909c89eda3742682531dca068ecf63a6281e
-
Imdad Sardharwalla authored
A few bits of data were not zeroed because aom_wb_overwrite_literal was being used instead of aom_wb_write_literal. BUG=aomedia:1041 Change-Id: If9196551e99b6e5eeaefc3fe022088088f1dcd51
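The difference between the two helpers can be illustrated with a simplified model. The sketch below is not libaom's code: the struct and function names are stand-ins, and it assumes the "write" variant clears each byte the first time it touches it, while the "overwrite" variant changes only the addressed bits.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a write bit buffer (hypothetical layout). */
struct bit_writer {
  uint8_t *buf;
  size_t bit_offset;
};

/* Set or clear one bit in place, leaving every other bit untouched. */
static void overwrite_bit(struct bit_writer *w, int bit) {
  const size_t p = w->bit_offset / CHAR_BIT;
  const int q = CHAR_BIT - 1 - (int)(w->bit_offset % CHAR_BIT);
  w->buf[p] = (uint8_t)((w->buf[p] & ~(1 << q)) | (bit << q));
  w->bit_offset++;
}

/* Same, but the first bit written into a byte zeroes the whole byte. */
static void write_bit(struct bit_writer *w, int bit) {
  if (w->bit_offset % CHAR_BIT == 0) w->buf[w->bit_offset / CHAR_BIT] = 0;
  overwrite_bit(w, bit);
}

/* Emit the low `bits` bits of `data`, most significant bit first. */
static void write_literal(struct bit_writer *w, int data, int bits) {
  assert(bits >= 0 && bits <= 31);
  for (int i = bits - 1; i >= 0; i--) write_bit(w, (data >> i) & 1);
}

static void overwrite_literal(struct bit_writer *w, int data, int bits) {
  assert(bits >= 0 && bits <= 31);
  for (int i = bits - 1; i >= 0; i--) overwrite_bit(w, (data >> i) & 1);
}
```

Under this model, writing a 3-bit literal into a byte that holds stale data leaves the other five bits zero with `write_literal`, but leaves them unchanged with `overwrite_literal`. That matches the symptom of "a few bits of data were not zeroed".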
-
Debargha Mukherjee authored
The fixes in rdopt.c improve the coding performance of 4:1 transforms significantly. Change-Id: I0e8db93e3f6d9bf0b2de01f2ce83c305d78d2262
-
Yaowu Xu authored
Change-Id: Ia06a6f2ac6473f10f09df3c0cc9cd45c2b781146
-
Yaowu Xu authored
Change-Id: If6a6aad09d5773f8858d7c163c15e9fcefccc9cb
-
RogerZhou authored
BUG=aomedia:1031 Change-Id: I44007d418dba65cda9bd5fe44f8bfa66c080c7bc
-
Frederic Barbier authored
Change-Id: I2960f6d6d6bbdf747a32ed1cdd2b5a3c4e51ba8b
-
Jingning Han authored
Abstract the inter block transform coefficient writing unit. Change-Id: I8e7a83d2d92941258f7250fee4c96f5ddfc4572e
-
Ola Hugosson authored
Previously lv_map bins (cdf2) were treated specially in that the probability was rounded to 7 bits before being passed to the arithmetic engine. Change-Id: I75d8437a6185e529e42e9867e3df18384447f2fd
-
Dominic Symes authored
The horizontal delay is specified in pixels to work for SB64 and SB128. The wavefront gradient is changed so the above block is available. Change-Id: I24cc426bded6904925930f6d431f5737070f9e17
-
Yaowu Xu authored
BUG=aomedia:1036 Change-Id: I8444658957c6e19f0525e21c3918000a4097a729
-
Yunqing Wang authored
The bug appears to have been fixed in the current tip of tree: the unit tests that previously failed now pass. Re-enable LOOPFILTERING_ACROSS_TILES. BUG=aomedia:1023 Change-Id: I11efe82d6c9232b702409e69750490fd3456c320
-