- 15 Aug, 2017 8 commits
-
-
Monty Montgomery authored
Rather than disabling MMX (well, all of SIMD) for daala transforms, selectively disable the AV1 TX SIMD through av1/common/av1_rtcd_defs.pl. This also requires quite a few testing build fixups. Change-Id: I689eaafbdd3a87e3a8eeef97412a1846ef886055
-
Monty Montgomery authored
CONFIG_DAALA_DCT4 currently force-enables CONFIG_DCT_ONLY due to a missing 4-point DST. The DST had not been included because it was a significant coding performance loss; this turned out to be a bug that has since been corrected. This patch adds a 4-point type IV DST to the DAALA_DCT4 experiment. There is a small coding performance loss in using the type IV over AV1's current type VII.

subset-1:
monty-newdst4test-baseline-s1-F@2017-07-29T04:58:43.976Z -> monty-newdst4test-daala-s1-F@2017-07-29T04:59:56.094Z
PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
-0.0336 | 0.1393 | 0.0491 | 0.4118 | -0.0439 | 0.2084 | 0.0476

objective-1-fast:
monty-newdst4test-baseline-o1f-F@2017-07-29T04:58:10.439Z -> monty-newdst4test-daala-o1f-F@2017-07-29T04:59:04.678Z
PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
0.0064 | 0.1071 | -0.0108 | 0.1133 | -0.0035 | 0.0765 | 0.0502

Change-Id: Ie29835edbe0e41bc86f4b09457e88d924cc9bf7e
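For readers unfamiliar with the type IV DST, here is a minimal floating-point sketch of the textbook orthonormal definition. It is not the integer implementation added by this patch; the function name `dst_type_iv` and the sample residual are made up for illustration.

```python
import math

def dst_type_iv(x):
    """Textbook orthonormal N-point type-IV DST (floating point, N = len(x)).

    Hypothetical reference helper, not code from the patch.
    """
    n = len(x)
    scale = math.sqrt(2.0 / n)
    return [
        scale * sum(x[j] * math.sin(math.pi / n * (j + 0.5) * (k + 0.5))
                    for j in range(n))
        for k in range(n)
    ]

# Illustrative 4-point residual (values are arbitrary).
residual = [3.0, -1.0, 4.0, 2.0]
coeffs = dst_type_iv(residual)

# The orthonormal DST-IV matrix is symmetric and orthogonal, so applying
# the same transform a second time recovers the input.
restored = dst_type_iv(coeffs)
```

Because the orthonormal type IV matrix is its own inverse, the forward and inverse transforms can share one routine in this floating-point model.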
-
Zoe Liu authored
This will remove the compilation failure for the weekly run on speed checking. Change-Id: Idf688c7e4c6fcb4c5aabef68b0e9f68996cd9a12
-
Monty Montgomery authored
This experiment replaces the 64-point Type-II DCT and related scaling vp9 transforms with the 64-point orthonormal Daala transforms.

subset-1:
monty-square-baseline-s1-F2@2017-07-28T03:35:45.962Z -> monty-square-dct64-s1-F2@2017-07-29T04:50:58.412Z
PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
-0.1930 | -0.2037 | -0.0643 | -0.1917 | -0.2331 | -0.3510 | -0.1810

objective-1-fast:
monty-square-baseline-o1f-F2@2017-07-28T03:35:35.533Z -> monty-square-dct64-o1f-F2@2017-07-29T04:50:28.542Z
PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
-0.2557 | -0.1743 | -0.4900 | -0.3028 | -0.4147 | -0.5764 | -0.2864

Change-Id: I1f944df29e44d2e350c42555af274f2d75a62a92
-
Urvang Joshi authored
This experiment has been adopted as it has been cleared by Tapas. Change-Id: I0682face60f62dd43091efa0a92d09d846396850
-
Urvang Joshi authored
Change-Id: I6b529f8aac561c746bf2805e601931f982bdbb88
-
Thomas Davies authored
No change to metrics, as quantization matrices are not used unless --enable-qm=1 is set on the command line. Fix no highbitdepth compilation, and fix compile errors and warnings for PVQ and NEW_QUANT experiments. Change-Id: I49aceb5acf6ca6790c81e760e5b208788f87086d
-
Monty Montgomery authored
This experiment replaces the 32-point Type-II DCT and 32-point Type-IV DST scaling vp9 transforms with the 32-point orthonormal Daala transforms.

subset-1:
monty-square-baseline-s1-F3@2017-08-02T11:50:51.375Z -> monty-square-dct32-s1-F3@2017-08-02T11:50:18.859Z
PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
0.0000 | 0.0115 | -0.1044 | -0.0185 | -0.0069 | -0.0603 | 0.0555

objective-1-fast (4 frames):
monty-square-baseline-o1f-F3-l4-fine@2017-08-12T02:18:05.560Z -> monty-square-dct32-o1f-F3-l4-fine@2017-08-12T02:19:44.461Z
PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
-0.0269 | -0.0715 | N/A | -0.0547 | -0.0268 | -0.0590 | N/A

Change-Id: Ib1bad991d82eb67956e94a6216298a84e908b169
-
- 11 Aug, 2017 1 commit
-
-
Steinar Midtskogen authored
Low latency, cpu-used=0:
PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
-0.3162 | -0.6719 | -0.6535 | 0.0089 | -0.3890 | -0.1515 | -0.6682

High latency, cpu-used=0:
PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
-0.0293 | -0.3556 | -0.5505 | 0.0684 | -0.0862 | 0.0513 | -0.2765

Low latency, cpu-used=4:
PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
-0.2248 | -0.7764 | -0.6630 | -0.2109 | -0.3240 | -0.2532 | -0.6980

High latency, cpu-used=4:
PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
-0.1118 | -0.5841 | -0.7406 | -0.0463 | -0.2442 | -0.1064 | -0.4187

Change-Id: I9ca8399c8f45489541a66f535fb3d771eb1d59ab
-
- 10 Aug, 2017 1 commit
-
-
Urvang Joshi authored
This experiment is now adopted as it was cleared by Tapas. Note: Palette use can still be controlled by command-line option "--tune-content=..." in 'aomenc'. Change-Id: I832f49f20f60c34bdef5b424755849c496687e87
-
- 09 Aug, 2017 1 commit
-
-
Yushin Cho authored
Because the palette is not supported by PVQ yet. Change-Id: If432f5c43d24726ad99161f1f76fa6c28267ca8b
-
- 08 Aug, 2017 1 commit
-
-
RogerZhou authored
Change-Id: Iec7969ffd8f53ca2f4eefd1d757cfec7b3bde131
-
- 04 Aug, 2017 3 commits
-
-
Rupert Swarbrick authored
Change-Id: I0c3772110e9fa62ac687bd99e290b5006bf3bd6c
-
Tom Finegan authored
Change-Id: I5d8615b585f3c4da6af1c1bfd073bdea94ac9df0
-
Yushin Cho authored
The distortion metric currently used for CDEF is also used for the luma-channel distortion during RDO-based mode decision. This experiment builds on top of the 'dist-8x8' experiment.

The BD-Rate change from this experiment for three frames of objective-1-fast in AWCY is:
PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
1.1589 | -2.0036 | -1.9620 | -0.0076 | -1.4145 | -1.4561 | -0.6410

Change-Id: I1142fe2f186f4ed86e4d33468e00b84e30b20233
-
- 03 Aug, 2017 1 commit
-
-
Sarah Parker authored
When ext-tx is disabled, the mrc-tx implementation is not complete, so they must be enabled together for now. Change-Id: Ib049f0e15023272c44a905581842db0626cdf14d
-
- 02 Aug, 2017 3 commits
-
-
Angie Chiang authored
This experiment aims at merging the lbd/hbd txfms. So far this experiment uses the hbd transform on the lbd path. The performance I observed is lowres -0.089%, midres 0.065% (negative means performance drop). From here, two main things need to be done:
1) Fix overflow due to quantizer noise
2) Generate a 16-bit version from the hbd txfm
Change-Id: I35bb1fc0cbb78decad2570ff5826ed665f739752
-
Tom Finegan authored
CONFIG_ONTHEFLY_BITPACKING no longer guards any code. Remove the flag from the configure and CMake builds. Change-Id: Id5605155bdedbf540fe5b9cea3899e8de5ee1062
-
Zoe Liu authored
Compared against the baseline with default enabled tools (except for ext-tx and global-motion, for speed reasons):

altref2 -> altref2 + flex-refs
lowres: avg_psnr -0.395% -> -0.460%
midres: avg_psnr -0.418% -> -0.478%

In particular, flex-refs improves the coding performance for the following 3 clips while having no impact on the other clips:
bowing_cif.y4m: avg_psnr 0.023% -> -1.022%
pamphlet_cif.y4m: avg_psnr 0.454% -> -1.111%
snow_mnt_480p.y4m: avg_psnr -0.162% -> -1.948%

Change-Id: I612c1ae5feb1f07d8bd5aaf67e21a076445e10b9
-
- 01 Aug, 2017 1 commit
-
-
Thomas Daede authored
This stores frame contexts alongside a reference frame, and always uses the frame in reference slot 0 (LAST_FRAME) as the source of the frame context. The encoder could then reorder reference frames so as to control which frame context is used; however, it currently does not.

Low Latency AWCY result:
PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
-0.1438 | 0.4161 | N/A | 0.0386 | -0.0281 | 0.0453 | 0.2514

https://arewecompressedyet.com/?job=before-frame-context-signaling%402017-06-07T23%3A20%3A49.473Z&job=after-frame-context-signaling%402017-06-07T23%3A21%3A36.117Z

Change-Id: I4f6f9b12cb403573efbf9e5c3077d62f5dedc467
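The slot-0 rule can be pictured with a small model. This is only a sketch under assumed names: `RefSlot`, `context_source`, and the dictionary standing in for a frame context are hypothetical and do not mirror libaom's data structures.

```python
from dataclasses import dataclass

@dataclass
class RefSlot:
    """Hypothetical reference slot: a frame plus the context saved with it."""
    frame_id: int
    frame_context: dict  # stand-in for the stored entropy-coding state

def context_source(ref_slots):
    """Return the context stored in reference slot 0 (LAST_FRAME)."""
    return ref_slots[0].frame_context

slots = [
    RefSlot(frame_id=10, frame_context={"name": "ctx_from_frame_10"}),
    RefSlot(frame_id=7, frame_context={"name": "ctx_from_frame_7"}),
]
default_choice = context_source(slots)["name"]

# Reordering the slots is the only lever in this model: whichever frame
# ends up in slot 0 supplies the context.
reordered = [slots[1], slots[0]]
reordered_choice = context_source(reordered)["name"]
```

In this model an encoder that wanted a different context would reorder the slots, which the message notes the encoder does not yet do.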
-
- 31 Jul, 2017 1 commit
-
-
Angie Chiang authored
The performance on the default experiment is lowres: 0.812%. The midres/hdres and AWCY tests are still running. Change-Id: Id2209c79df6517732dd06c2712a7bdefde118ead
-
- 29 Jul, 2017 1 commit
-
-
Monty Montgomery authored
This experiment replaces the 16-point Type-II DCT and 16-point Type-IV DST scaling vp9 transforms with the 16-point orthonormal Daala transforms. These have reduced complexity and are perfect reconstruction. There is currently no net coding performance impact.

subset-1:
monty-square-baseline-s1-F@2017-07-23T03:43:45.042Z -> monty-square-dct16-s1-F@2017-07-23T03:42:29.805Z
PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
-0.0152 | -0.0028 | -0.0929 | -0.0432 | -0.0457 | -0.0425 | -0.0237

objective-1-fast:
monty-square-baseline-o1f-F@2017-07-23T03:44:19.973Z -> monty-square-dct16-o1f-F@2017-07-23T03:43:22.549Z
PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
0.0305 | 0.0926 | -0.1600 | 0.0471 | 0.0219 | -0.0075 | 0.0135

Change-Id: I54fed26d65fd8450693334bb400b1fafd7e0dacb
-
- 28 Jul, 2017 1 commit
-
-
Luc Trudeau authored
CfL is now an independent mode.

Results on Subset1 (Compared to 4266a7ed with CFL enabled)
PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
-0.1645 | -0.4017 | 0.2475 | -0.1851 | -0.2179 | -0.2338 | -0.2897

Change-Id: I2e86e7ea7bfc12bb1d763e70a136ca992d57a3c5
-
- 27 Jul, 2017 1 commit
-
-
Cheng Chen authored
Previously, the U and V planes shared the same filter level as Y. Here, we search for and pick the best filter level for the U and V planes. Selected filter levels are transmitted per frame. This works with parallel_deblocking.

Coding gain on Google test set:
Avg_psnr ovr_psnr ssim
lowres: -0.116 -0.120 -0.339
midres: -0.218 -0.228 -0.338
hdres: -0.260 -0.264 -0.365

Change-Id: I03d2ac47539f3eea9f3c4b08007bd6d3f4b73572
-
- 26 Jul, 2017 4 commits
-
-
Yue Chen authored
Change-Id: Ie2c34490dc50cb242bcd701308e6b55243883b15
-
Sarah Parker authored
MRC_DCT uses a mask based on the prediction signal to modify the residual before applying DCT_DCT. This adds all necessary functions to perform this transform and makes the prediction signal available to the 32x32 txfm functions so the mask can be created. I am still experimenting with different types of mask generation functions and so this patch contains a placeholder. This patch has no impact on performance. Change-Id: Ie3772f528e82103187a85c91cf00bb291dba328a
-
Di Chen authored
Use three metrics to identify the still gf group.

Performance:
lowres: pamphlet_cif -1.395; bowing_cif -0.989; others remain the same. Overall -0.064
midres: snow_mnt_480p -0.827; others remain the same. Overall -0.028

Change-Id: I22a6429c7ebdad2c36ec73c7a69cabc07e8208b7
-
Monty Montgomery authored
This experiment replaces the 8-point Type-II DCT and 8-point Type-IV DST scaling vp9 transforms with the 8-point orthonormal Daala transforms. These have reduced complexity and are perfect reconstruction at the cost of a slightly worse coding performance. This is because the Daala transforms expect the input to be shifted by 4 bits but the output scale of the vp9 transforms is only 3 bits.

subset-1:
monty-square-baseline-subset1 -> monty-square-dct8-subset1@2017-07-17T21:37:44.281Z
PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
0.0019 | -0.0011 | -0.0585 | -0.0111 | 0.0305 | 0.0317 | 0.0187

objective-1-fast:
monty-square-baseline-o1f -> monty-square-dct8-o1f@2017-07-17T21:37:15.735Z
PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
0.0285 | 0.0129 | -0.5080 | 0.0529 | 0.0345 | 0.0441 | 0.0054

Change-Id: I2b775495398fb717204a295397c3c5e3ca938183
-
- 24 Jul, 2017 2 commits
-
-
Urvang Joshi authored
- Use 'tx_size' in function signatures.
- filter_intra_taps_3 and filter_intra_taps_4 updated to support TX_SIZES_ALL (thanks to yuec@)

With these changes, filter-intra works correctly with rect-intra-pred. So, we remove the temporary workaround for this.

Change-Id: Ide0f593419c21a74c08c61859f8dad918ca169fa
-
Urvang Joshi authored
This workaround is temporary, until filter-intra can work with rectangular blocks.

Tested OK:
make clean; ../../configure --disable-install-docs --enable-unit-tests --enable-debug --enable-aom-highbitdepth --enable-experimental --enable-adapt-scan --enable-dual-filter --enable-ext-inter --enable-ext-intra --enable-ext-refs --enable-ext-tx --enable-filter-intra --enable-loop-restoration --enable-rect-tx --enable-compound-segment --enable-interintra --enable-wedge
make -j
./test_libaom

Change-Id: I4554d1f25de9448b22465e93a7616df0c206e298
-
- 21 Jul, 2017 2 commits
-
-
Thomas Davies authored
Tile groups are now an integral part of the codec. Change-Id: I620a88ec7a44b057d5cce0bf6cf602822a3339a9
-
Urvang Joshi authored
This experiment was provisionally adopted on 2017-06-27. Change-Id: I5ebce1df7cec42804df553a26848ddfe8a449a59
-
- 20 Jul, 2017 4 commits
-
-
Cheng Chen authored
New deblocking filter that smooths block boundaries in an estimated direction of object orientation.

1. Select the proper direction for deblocking filtering. Compute abs gradient line by line for the block. Select the direction with least sum of abs gradient.
2. Apply deblocking filtering for a block along this direction. Apply directional filtering for Y, U, V planes.

Coding gain on Google test set:
% avg_psnr ovr_psnr ssim
lowres -0.129 -0.136 -0.277
midres -0.103 -0.127 -0.188
hdres -0.159 -0.158 -0.173
screen_content -0.408 -0.397 -0.695

Change-Id: Ie8646dcc163ace5d8faf5e502b38342d885efc30
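The direction-selection step can be sketched roughly as follows. The candidate direction set, the per-pair normalization, and the names `DIRECTIONS`, `mean_abs_gradient`, and `pick_direction` are assumptions made for illustration; the commit message does not list the directions actually searched or the exact accumulation used.

```python
# Hypothetical candidate directions as (dy, dx) steps; the real set of
# directions searched by the patch is not stated in the commit message.
DIRECTIONS = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "diagonal_down": (1, 1),
    "diagonal_up": (-1, 1),
}

def mean_abs_gradient(block, dy, dx):
    """Mean absolute difference between pixels one (dy, dx) step apart.

    The sum is divided by the pair count (an assumption of this sketch) so
    that diagonals, which have fewer in-block pairs, are not favoured.
    """
    rows, cols = len(block), len(block[0])
    total, count = 0, 0
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                total += abs(block[ny][nx] - block[y][x])
                count += 1
    return total / count if count else float("inf")

def pick_direction(block):
    """Pick the direction along which the block varies least."""
    return min(DIRECTIONS,
               key=lambda name: mean_abs_gradient(block, *DIRECTIONS[name]))

# Rows of constant value: the content runs horizontally.
horizontal_stripes = [[10] * 4, [50] * 4, [10] * 4, [50] * 4]
# Columns of constant value: the content runs vertically.
vertical_stripes = [[10, 50, 10, 50] for _ in range(4)]
# Values constant along the down-right diagonal.
diagonal_ramp = [[10 * (x - y) for x in range(4)] for y in range(4)]
```

Filtering along the chosen direction smooths the boundary without blurring across an edge that runs in that direction, which is the motivation given in the message.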
-
Yunqing Wang authored
In ext_tile experiment, when cm->large_scale_tile is 1, prev_frame_id can be the same as current_frame_id, which is prohibited in reference_buffer experiment and causes "CORRUPT_FRAME" error to be reported. In this patch, enable/disable reference_buffer according to large_scale_tile value, and thus make these 2 experiments compatible. Change-Id: If64943acb91e7a7b859db4e2ac62581e9b53ef85
-
Yushin Cho authored
A framework for computing a distortion at 8x8 luma block level during RDO-based mode decision search. A new 8x8 distortion metric can be plugged in by way of this tool. The existing daala_dist now uses this experiment as well. Other possible applications of this experiment would be any distortion metric that should be applied to 8x8 pixels, such as PSNR-HVS or SSIM.

The rd_cost for the final coding mode decision for a super block is computed for a partition size of 8x8 or larger. For a block larger than 8x8, the distortion of each 8x8 block is independently computed and then summed up. The rd_cost for an 8x8 block with the new 8x8 distortion metric is computed only when the mode decision of its sub8x8 blocks is completed. However, the MSE distortion metric is used for the sub8x8 mode decision. Thus, early termination is also determined with the MSE-based rd_cost. Because the best rd_cost (i.e. the reference rd_cost) during sub8x8 prediction or sub8x8 tx is based on the new 8x8 distortion while each sub8x8 block uses MSE, the existing early termination cannot be used (and this can be one possible reason for the BD-Rate change with this revision).

For a sub8x8 prediction, the prediction mode for each sub8x8 block of an 8x8 block is decided with the existing MSE and then av1_dist_8x8() is applied to the 8x8 pixels. (There is also av1_dist_8x8_diff, which can take the diff signal directly as input.) For a sub8x8 tx in a block larger than 8x8, instead of computing the MSE distortion for each sub8x8 tx block, we wait until all sub8x8 tx blocks are encoded before av1_dist_8x8() is applied to the 8x8 pixels.

Sub8x8 prediction and transforms were the trickiest parts of this change. Two kinds of distortion, for a) predicted pixels and b) decoded pixels (i.e. predicted + possible reconstructed residue), are always computed during RDO. In order to access those two signals a) and b) for an 8x8 block after its sub8x8 mode decision is finished, a) and b) need to be properly stored for later retrieval. CB4X4 makes the task of accessing the a) and b) signals for a sub8x8 block even more difficult, since the intermediate data (i.e. a and/or b) for a sub8x8 block are not easily accessible outside of the current partition unless reconstructed with the decided coding modes.

Change-Id: If60301a890c0674a3de1d8206965bbd6a6495bb7
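The "compute per 8x8 block, then sum" rule for larger blocks can be sketched as below. This is an illustrative model only: `sse_8x8` is a plain squared-error stand-in for whatever 8x8 metric is plugged in, and `block_distortion` is a hypothetical name, not the signature of av1_dist_8x8().

```python
def sse_8x8(src, rec, y0, x0):
    """Sum of squared errors over one 8x8 tile (stand-in 8x8 metric)."""
    return sum((src[y][x] - rec[y][x]) ** 2
               for y in range(y0, y0 + 8)
               for x in range(x0, x0 + 8))

def block_distortion(src, rec, metric=sse_8x8):
    """Distortion of a block whose sides are multiples of 8.

    Each 8x8 tile is measured independently with the pluggable metric and
    the per-tile values are summed.
    """
    rows, cols = len(src), len(src[0])
    if rows % 8 or cols % 8:
        raise ValueError("block dimensions must be multiples of 8")
    return sum(metric(src, rec, y0, x0)
               for y0 in range(0, rows, 8)
               for x0 in range(0, cols, 8))

# A 16x16 block whose reconstruction is off by one everywhere.
source = [[0] * 16 for _ in range(16)]
reconstruction = [[1] * 16 for _ in range(16)]
total_sse = block_distortion(source, reconstruction)

# Swapping in another metric only requires passing a different callable.
tile_count = block_distortion(source, reconstruction,
                              metric=lambda s, r, y0, x0: 1)
```

Passing the metric as a callable mirrors the stated goal that a new 8x8 distortion metric can be plugged in without changing the summation.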
-
Zoe Liu authored
This experiment is to add ALTREF2_FRAME to allow 2 altref backward predictions. Each video frame will then have up to 7 reference frames to choose from: (1) 4 forward predictive references, namely LAST_FRAME, LAST2_FRAME, LAST3_FRAME, and GOLDEN_FRAME; and (2) 3 backward predictive references, namely BWDREF_FRAME, ALTREF2_FRAME, and ALTREF_FRAME. The tool of "altref2" is built on top of the "ext_refs" experiment. Change-Id: Idbb0bb53b43c5c2c7baf4959331fc5a31c77a118
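The resulting reference set can be written out as a small table. The grouping follows the message above; the Python names `FORWARD_REFS`, `BACKWARD_REFS`, and `is_backward_ref` are illustrative and are not libaom identifiers.

```python
# Reference names as listed in the commit message, grouped by direction.
FORWARD_REFS = ("LAST_FRAME", "LAST2_FRAME", "LAST3_FRAME", "GOLDEN_FRAME")
BACKWARD_REFS = ("BWDREF_FRAME", "ALTREF2_FRAME", "ALTREF_FRAME")
ALL_REFS = FORWARD_REFS + BACKWARD_REFS

def is_backward_ref(name):
    """True if the named reference is one of the backward predictive refs."""
    if name not in ALL_REFS:
        raise ValueError(f"unknown reference frame: {name}")
    return name in BACKWARD_REFS
```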
-
- 18 Jul, 2017 2 commits
-
-
Ryan Lei authored
This change enables parallel_deblocking by default after it has been officially adopted. The parallel_deblocking_15taps experiment is merged into the parallel_deblocking experiment, so it is removed to clean up the code. Internal compile flags are added to disable 15 tap for both luma and chroma planes for future experiment purposes. The internal compile flags are disabled by default. Change-Id: I1668fd2cb7676d756c52263d6993241618d33ee6
-
Angie Chiang authored
This flag will allow us to skip the key frame's stats. Therefore, we can test inter frame performance when the frame number is small. The inter frames' stats won't be swamped by the key frame's stats. Change-Id: I9eaa8e5775fb2e740406cfa4b4f64f96f180d9db
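A toy illustration of why skipping key frames matters for short clips. It assumes, purely for the example, that the stats are per-frame quality numbers; the field names, the `mean_psnr` helper, and the values are invented and do not come from the patch.

```python
def mean_psnr(frames, skip_key_frames=False):
    """Average per-frame PSNR, optionally leaving key frames out."""
    selected = [f["psnr"] for f in frames
                if not (skip_key_frames and f["is_key"])]
    if not selected:
        raise ValueError("no frames left to average")
    return sum(selected) / len(selected)

# Invented three-frame clip: one key frame followed by two inter frames.
clip = [
    {"is_key": True, "psnr": 45.0},
    {"is_key": False, "psnr": 36.0},
    {"is_key": False, "psnr": 38.0},
]
with_key = mean_psnr(clip)
inter_only = mean_psnr(clip, skip_key_frames=True)
```

With only three frames, the single key frame shifts the average noticeably, which is the effect the flag is meant to remove.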
-
- 14 Jul, 2017 1 commit
-
-
Yunqing Wang authored
Added a 1-bit flag 'large_scale_tile'. If it is 0, which is the default value, use normal tile coding in TILE_GROUPS. If it is 1, use large-scale tile coding in EXT_TILE. In the large_scale_tile=1 case, if single-tile-decoding is required, then the loopfilter is disabled. Related API and unit tests were modified. Change-Id: I3ba12dc3d80ccf1ab21543ab3b16c02282c34e3b
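The decision described above amounts to a small truth table, sketched here. The function `tile_config` and its dictionary keys are hypothetical; only the flag name, the two tile coding modes, and the loopfilter rule come from the message.

```python
def tile_config(large_scale_tile=0, single_tile_decoding=False):
    """Model of the tile coding choice driven by the 1-bit flag."""
    if large_scale_tile not in (0, 1):
        raise ValueError("large_scale_tile is a 1-bit flag")
    if large_scale_tile == 0:
        # Default: normal tile coding; this flag does not touch the loopfilter.
        return {"tile_coding": "TILE_GROUPS", "loopfilter_forced_off": False}
    # Large-scale tile coding: the loopfilter is disabled only when
    # single-tile decoding is required.
    return {"tile_coding": "EXT_TILE",
            "loopfilter_forced_off": bool(single_tile_decoding)}
```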
-
- 13 Jul, 2017 1 commit
-
-
Sarah Parker authored
Change-Id: I2d688d21de4a1d180f8ac36bb0363b57be55c0af
-