- 07 Dec, 2017 8 commits
-
-
Jingning Han authored
Check the bottom neighbor availability with tile boundary for the intra prediction condition. BUG=aomedia:1088 Change-Id: I9baa98f8f18da84f95fd83ceca5556cfe9d9d844
-
Zoe Liu authored
Change-Id: Ia790b7f12cf78f628e99bb7fe2e272771c1a68c7
-
Zoe Liu authored
This experiment allows two choices for the frame-level reference mode flag: SINGLE_REFERENCE and REFERENCE_MODE_SELECT. It removes the choice of COMPOUND_REFERENCE, as it has been barely used. Change-Id: I8af18acd2fe3c0d4928f2b05f35aad0ebcb1556a
-
Yunqing Wang authored
Changed convolve 1d function names to be more precise. Change-Id: I9bc1e99f5032ed8a84c61616036735af23d9a587
-
Debargha Mukherjee authored
BUG=aomedia:1065 Change-Id: I35eda84e249fc42e319604079c122df8ab101f90
-
Debargha Mukherjee authored
BUG=aomedia:1065 Change-Id: I0951a276865a5d810eb04bbb5251ed5c1b417ca4
-
Angie Chiang authored
BUG=aomedia:1096 Change-Id: Ibbb448e217ae1dd9a096c23d01b08c9804583003
-
Urvang Joshi authored
This is to keep the high range within 32 bits. AV1FwdTxfm2d.CfgTest passes after this fix. Change-Id: I2df463c4ec9260c544d68aad445b60cabe2b531b
-
- 06 Dec, 2017 9 commits
-
-
Yunqing Wang authored
The purpose of this change is to reduce the cycles needed for warped motion parameter estimation.
Method 1: If we remove the 2-bit bit-depth reduction (as in patch set 2), the downshifting of A, Bx, By is also removed. The borg test result (over the baseline) is:
             avg_psnr  ovr_psnr  ssim
lowres:        0.023     0.020    0.071
cam_lowres:   -0.009    -0.017   -0.031
Method 2: In theory, the above change uses 2 more bits for elements of A, Bx, By. In patch set 3, we modified LS_STEP to be 8 (1 full pixel), so the lowest 2 bits of the A, Bx, By elements are always 0. That is, the 2-bit bit-depth reduction is achieved without extra operations. The borg test result (over the baseline) is:
             avg_psnr  ovr_psnr  ssim
lowres:       -0.004    -0.007   -0.023
cam_lowres:   -0.031    -0.033   -0.045
This is a little better than the patch set 2 result. Method 2 is the final choice.
Change-Id: I945aaba412e2ea86b7d67e8a90741fdf395b94cd
-
Zoe Liu authored
Change-Id: Ia8321afd087f99371cdf07f3a03249580e09964d
-
Zoe Liu authored
This patch simplifies the checking criteria for the two groups of compound modes. It also makes the encoder-side cdf update inside the RD loop consistent with that in the bitstream. Experimental results on Google test sets (30 frames of lowres and midres) confirm this patch obtains identical coding performance. Change-Id: I170eea91f7d2be2170df544cfc2c692b09aa82d6
-
Yushin Cho authored
Fix the comments on the precision of quantizers and tx coefficients during quantization for different input bit depths and tx sizes. I think the author really meant "de-quantized/de-coded coefficients" by "quantized/coded coefficients", so this change states that explicitly to avoid any misunderstanding. Change-Id: Ib92ac7dcfddcbe58cf3adfb9448497512381c1f5
-
Cheng Chen authored
Load four 8-bit inputs and process them. Change-Id: I9b3ba58ea3a03c6a8129379afa37c54a57e04501
-
Sebastien Alaiwan authored
Also, make them const when appropriate. Change-Id: I96d544e2cc9a0bce4d52fd33e44a4eaa40edda3c
-
Maxym Dmytrychenko authored
Can be more than 10% faster, with bit-exact results. Change-Id: I5f169673fd2d5af96f425f00d862f3c989228d2e
-
Urvang Joshi authored
16x64 reuses the scan order of 16x32.
64x16 reuses the scan order of 32x16.
Max eob is curtailed to 512 (instead of 1024) for both.
Change-Id: Iac2145aa5e3d090009e2a2f5715caa8d84dfb2ee
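The scan-order reuse described in this commit can be pictured with a small lookup. This is a minimal sketch, not the libaom implementation: the names `SCAN_REUSE`, `scan_shape`, and `max_eob` are hypothetical, and it assumes that the maximum end-of-block position equals the number of coefficients covered by the reused scan.

```python
# Hypothetical illustration; these names are not taken from libaom.
# Transform sizes are written as (width, height) pairs.
SCAN_REUSE = {
    (16, 64): (16, 32),  # 16x64 reuses the 16x32 scan order
    (64, 16): (32, 16),  # 64x16 reuses the 32x16 scan order
}


def scan_shape(width, height):
    """Return the (width, height) of the scan order used for a transform size."""
    return SCAN_REUSE.get((width, height), (width, height))


def max_eob(width, height):
    """Assumption: max eob is the coefficient count of the scan actually used."""
    scan_w, scan_h = scan_shape(width, height)
    return scan_w * scan_h
```

Under these assumptions, `max_eob(16, 64)` and `max_eob(64, 16)` both give 512 rather than the 1024 coefficients of a full block.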
-
Zoe Liu authored
Change-Id: I17a82393f1b7230119f499e2f9ed8d0b8fe5ba25
-
- 05 Dec, 2017 23 commits
-
-
Luc Trudeau authored
Moving CfL to use partition-unit DC_PRED requires 4:1 and 1:4 DC_PRED, which are not currently implemented. A simple solution is to disable CfL for 4:1 and 1:4 partitions. CfL is also disabled for luma intra partitions < 4x4. This is inherent to luma intra prediction partition sizes. We add an assert to enforce this.
This results in the following regression for Subset1:
   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
-0.0093 |  0.1803 |  0.1519 |  -0.0180 | 0.0256 |  0.0226 |     0.0352
https://two.arewecompressedyet.com/?job=CfL%402017-11-30T19%3A05%3A05.639Z&job=CfL-Disable-4to1%402017-11-30T19%3A04%3A00.761Z
Change-Id: Ie2c8b4d9cb6b6f33a103b540209e1a2fb6df74a7
-
Angie Chiang authored
This is for speeding up the testing process Change-Id: I90866fa239794f14e4801675d471dbf50b779d18
-
Angie Chiang authored
When lv_map_multi is on, od_ec_encode_q15 is able to handle the situation of fh == fl Change-Id: I7c837dda561f1d25b0203c018763dadd0cbbc75a
-
Cheng Chen authored
Added a copy function (c version and sse2 version) for full-pixel motion vectors for jnt_comp experiment following existing av1_convolve_2d_copy function. Change-Id: I20fd2219799f9c1451f591574fbe97364f40e0f0
-
Sarah Parker authored
This allows for the following options:
Set 0:
  Inter: All 16 txfms
  Intra: Discrete Trig transforms w/o flip (4) + Identity (1) + 1D Hor/Ver DCT (2)
Set 1:
  Inter: Discrete Trig transforms w/ flip (9) + Identity (1) + 1D Hor/Ver DCT (2)
  Intra: Discrete Trig transforms w/o flip (4) + Identity (1)
Set 2:
  Inter: Discrete Trig transforms w/ flip (9) + Identity (1)
  Intra: Discrete Trig transforms w/o flip (4) + Identity (1)
Results on lowres 40 frames with disable-ext-partition disable-ext-partition-types:
  Set 0: 0.03%
  Set 1: No change
  Set 2: 0.06%
Change-Id: Iec57d8c8fcfa0891528de4ca88f54753dfcb5284
-
Cyril Concolato authored
Change-Id: Ieed4ecce63a2a3b2a74c40ccddabe91cb9386632
-
Debargha Mukherjee authored
Constrain the 16x64 transform so that the bottom 16x32 is zero; constrain the 64x16 transform so that the right 32x16 is zero. Also implement the 32x64 transform better to reduce the intermediate coefficient range. Change-Id: Ia9050ee741ed1d5b02a42616635b496d637d932f
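The zeroing constraint in this commit can be illustrated on a plain coefficient block. This is a rough sketch under stated assumptions, not the libaom code: `zero_high_region` is a hypothetical helper, the block is stored as a list of rows, and the limit of 32 is taken from the 16x32 and 32x16 regions named in the message.

```python
def zero_high_region(coeffs, limit=32):
    """Return a copy of a row-major coefficient block (a list of rows) in which
    every coefficient at row >= limit or column >= limit is set to zero.

    Hypothetical helper for illustration only.
    """
    return [
        [value if (r < limit and c < limit) else 0 for c, value in enumerate(row)]
        for r, row in enumerate(coeffs)
    ]
```

For a 16x64 block (64 rows of 16 coefficients) this clears the bottom 32 rows; for a 64x16 block (16 rows of 64 coefficients) it clears the right 32 columns.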
-
Cheng Chen authored
Extend context model for comp_group_idx. Save sending comp_group_idx when masked_compound is not allowed. Change-Id: Ia7ae53958c9e1c8fe07be4b14a425d9b8648082d
-
Cheng Chen authored
Remove COMPOUND_AVERAGE from compound_type_cdfs since it is now grouped to compound_idx. However, COMPOUND_AVERAGE is still used elsewhere. Change-Id: Ie0d460aabf9252e80eb4130cfef9aaf0efc3969d
-
Cheng Chen authored
Divide compound inter prediction modes into two groups: Group A: jnt_comp, compound_average Group B: interintra, compound_segment, wedge Change-Id: I1142da2e3dfadf382d6b8183a87bde95119cf1b7
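The grouping in this commit amounts to a two-level choice: first a group, then a mode within it. The following is a minimal sketch of that idea only. The mode names come from the commit message, while `COMPOUND_GROUPS`, `group_of`, and the 0/1 numbering of the groups are assumptions made for illustration.

```python
# Hypothetical illustration; the 0/1 numbering of the groups is an assumption.
COMPOUND_GROUPS = {
    0: ("jnt_comp", "compound_average"),             # Group A
    1: ("interintra", "compound_segment", "wedge"),  # Group B
}


def group_of(mode):
    """Return the group index of a compound inter prediction mode."""
    for idx, modes in COMPOUND_GROUPS.items():
        if mode in modes:
            return idx
    raise ValueError(f"unknown compound mode: {mode}")
```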
-
Timothy B. Terriberry authored
Change-Id: Ie3e599def556a90c474680567c4537508de2e30a
-
Nathan E. Egge authored
This 4-point Type-IV asymmetric DST uses the same computation graph as the 4-point Type-IV DST. This change improves the accuracy of the 8-point Type-II DCT:
Old MSE: 1.8927096972341813413041010372151e-06
New MSE: 1.7946367518072710517065436117146e-06
subset-1: new_dst4@2017-12-04T06:31:41.096Z -> new_dst4a@2017-12-04T06:32:22.698Z
   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
-0.0143 |  0.0410 | -0.2166 |  -0.0556 | -0.0379 | -0.0461 |    -0.0002
Change-Id: Ifde11fca987220130c1657306b0df34ec2f3fe25
-
Nathan E. Egge authored
This change slightly improves the 16-point DCT round trip accuracy due to changes in the rounding.
subset-1: new_dst2@2017-12-04T01:59:57.412Z -> new_dst4@2017-12-04T06:31:41.096Z
  PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
0.0078 | -0.0001 |  0.0198 |   0.0432 | 0.0408 |  0.0502 |    -0.0057
Change-Id: I75783ace97834af89e70c9ce3002c6f09176e343
-
Nathan E. Egge authored
This 2-point Type-IV DST uses the same computation graph as the asymmetric 2-point Type-IV DST. Because this transform is embedded, it may be possible to remove the initial averaging step by splitting the 2-point Type-IV DST into separate forward and inverse transforms. This change also reduces two multiplication constants (forward and inverse transform) so they are less than 1.
subset-1: new_dst2a@2017-12-04T01:59:12.884Z -> new_dst2@2017-12-04T01:59:57.412Z
   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
-0.0126 |  0.0387 |  0.0441 |   0.0554 | -0.0301 |  0.0034 |    -0.0342
Change-Id: I98568e0c5b97e3a6af27653ddab845ce97d2a53d
-
Nathan E. Egge authored
This change improves the accuracy of the 4-point Type-II DCT:
Old MSE: 6.2711279572488185887270981198199e-08
New MSE: 6.0281623825882593130347914239103e-08
It also reduces a multiplication constant so it is less than 1.
subset-1: daala_tx@2017-12-04T01:58:11.321Z -> new_dst2a@2017-12-04T01:59:12.884Z
  PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
0.0274 |  0.0255 |  0.0969 |  -0.0024 | 0.0274 |  0.0027 |     0.0110
Change-Id: I7c9d389af8e98cb39f3bc5923134b5dfe174ba0a
-
Timothy B. Terriberry authored
We use the former variant for the 8-point row transforms when the number of columns exceeds 8, since the scaling can exceed 16 bits. We use the latter variant for the 8-point column transforms when the number of rows exceeds 8, since it allows us to perform twice as many transforms in parallel. Change-Id: Ia2595ad827636342f70c3d5b99cf05c278bd1389
-
Timothy B. Terriberry authored
On x86 there is no PMULHRSD for use in the 32-bit transform versions, so the fastest approach is to just do a normal 32-bit multiply and manually shift and round. This requires keeping the constants in their reduced precision instead of always promoting them to Q15. Change-Id: I76339b5567da3f08f34882a707e0c93122991946
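The "normal multiply, then manually shift and round" step can be written out in scalar form. This is a minimal sketch, not the SIMD code from the commit: `mul_shift_round` is a hypothetical name, the rounding convention (add half, then arithmetic shift) is an assumption, and Python integers do not overflow, so the 32-bit range limits of the real code are not modeled.

```python
def mul_shift_round(x, constant, q_bits):
    """Multiply x by a fixed-point constant that has q_bits fractional bits,
    then shift right by q_bits, rounding to nearest (ties toward +infinity).

    Hypothetical scalar helper for illustration only.
    """
    product = x * constant        # a plain integer multiply
    rounding = 1 << (q_bits - 1)  # half of one unit in the last place
    return (product + rounding) >> q_bits
```

Because the shift amount is a parameter, a constant can stay in its reduced precision: 0.5 stored as 2048 in Q12 and as 16384 in Q15 give the same result here, for example `mul_shift_round(3, 2048, 12)` and `mul_shift_round(3, 16384, 15)` both return 2.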
-
Timothy B. Terriberry authored
This creates the mechanism by which we can define multiple versions for different instruction sets and word sizes. This commit makes no functional changes. Change-Id: If49ebfc989247692df9c501bea05eb811944d52a
-
Timothy B. Terriberry authored
This will aid us in defining multiple versions for different instruction sets and word sizes without duplicating all of the code. This commit makes no functional changes. Change-Id: I7f240281b0e9edba19c2ee17b9ff3ae36400dcc2
-
Timothy B. Terriberry authored
Change-Id: I49e64d3e062c32925abe30118a64a714073fd4d0
-
Timothy B. Terriberry authored
This is using the Type IV version with flattened multiplies for now, since we've identified some potential 16-bit overflows in the Type VII inverse. Change-Id: Ib79413ea27efac8b0207602001595ae3ac294eae
-
Timothy B. Terriberry authored
Change-Id: Ieb20d64e6531960188feb65296acfa952c858043
-