- 15 Sep, 2017 1 commit
-
-
Nathan E. Egge authored
This patch fixes a regression introduced in 1d190950 where the encoder was using the 16x16 VP9/AV1 transforms for RDO, but then used the Daala transforms for encoding.

subset1:

master-daala_dct16@2017-09-13T12:05:18.013Z -> master_daala_dct16_use_c@2017-09-13T13:05:02.252Z

   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
-0.3654 | -0.7634 | -0.7407 |  -0.4884 | -0.4699 | -0.4945 |    -0.5104

master-no_rect_tx-no_var_tx@2017-09-12T00:23:18.153Z -> master_daala_dct16_use_c@2017-09-13T13:05:02.252Z

   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
-0.0133 |  0.1040 | -0.0440 |  -0.0492 | -0.0151 | -0.0120 |     0.0699

Change-Id: Id1830d0975db4bd0320a47fdf45b4bca20881cfb
-
- 11 Sep, 2017 1 commit
-
-
Sarah Parker authored
This allows a mask for mrc-tx to be sent in the bitstream for inter or intra 32x32 transform blocks. The option to send the mask vs build it from the prediction signal is currently controlled with a macro. In the future, it is likely the macro will be removed and it will be possible for a block to select either method. The mask building functions are still placeholders and will be filled in in a followup. Change-Id: Ie27643ff172cc2b1a9b389fd503fe6bf7c9e21e3
-
- 25 Aug, 2017 2 commits
-
-
Nathan E. Egge authored
This patch fixes a regression introduced in 1d190950 where the encoder was using the 4x4 VP9/AV1 transforms for RDO, but then used the Daala transforms for encoding. The ~2% improvement below comes from forcing the C implementation of the 4x4 and 8x8 transforms to be used when CONFIG_DAALA_DCT4 and CONFIG_DAALA_DCT8 are enabled respectively.

subset-1 (--enable-experimental --enable-daala_dct4):

master@2017-08-21T21:41:18.302Z -> master_daala_dct4_use_c@2017-08-22T02:39:14.457Z

   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
-2.1953 | -1.2044 | -1.1865 |  -1.6173 | -1.7029 | -1.6784 |    -1.7235

Change-Id: I44d2b24094e89b2857ae03d743180e706cef45eb
-
Yue Chen authored
Make it 0 to run at higher precision Change-Id: I51decbf9179efa18a1a06dcc3f0e939d9895a5cd
-
- 22 Aug, 2017 2 commits
-
-
Sebastien Alaiwan authored
This is undefined behaviour in C and might confuse the optimizer, leading to incorrect code. Change-Id: Ia4bb60478068da678f013bdd6ab6a49814d89ebe
-
Lester Lu authored
Change get_lgt in order to integrate a later experiment lgt_from_pred with lgt. There are two main changes. The main purpose for this change is to unify get_fwd_lgt and get_inv_lgt functions into a get_lgt function so the lgt basis functions can always be selected through the same function in both forward and inverse transform paths. The structure of those functions will also be consistent with the get_lgt_from_pred functions that will be added in the lgt-from-pred experiment. These changes have no impact on the bitstream. Change-Id: Ifd3dfc1a9e1a250495830ddbf42c201e80aa913e
-
- 18 Aug, 2017 2 commits
-
-
Hui Su authored
Coding gain becomes tiny on top of other experiments. Change-Id: Ia89b1c2a2653f3833dff8ac8bb612eaa3ba18446
-
Rupert Swarbrick authored
The AVX2 code only supports DCT_DCT. For other transform types, use the C fallback. Change-Id: I6b472ebd7d963c02aae80ff5846b7f2dcaf092ea
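The fallback rule above can be sketched as a small dispatch function. This is only an illustration: `TxType`, `Kernel` and `select_kernel` are hypothetical names, not the libaom symbols, and the `have_avx2` flag stands in for whatever CPU-feature check the build actually uses.

```c
/* Hypothetical transform types; only DCT_DCT is named in the commit. */
typedef enum { DCT_DCT = 0, ADST_DCT, DCT_ADST, ADST_ADST } TxType;

/* Hypothetical labels for the two available implementations. */
typedef enum { KERNEL_C, KERNEL_AVX2 } Kernel;

/* Pick the implementation for one transform call.
 * The SIMD path is taken only when AVX2 is available AND the
 * transform type is DCT_DCT; everything else uses the C code. */
static Kernel select_kernel(TxType tx_type, int have_avx2) {
  if (have_avx2 && tx_type == DCT_DCT) return KERNEL_AVX2;
  return KERNEL_C;
}
```

For example, `select_kernel(ADST_DCT, 1)` selects the C path even on an AVX2 machine.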
-
- 15 Aug, 2017 2 commits
-
-
Monty Montgomery authored
This experiment replaces the 64-point Type-II DCT and related scaling vp9 transforms with the 64-point orthonormal Daala transforms.

subset-1:

monty-square-baseline-s1-F2@2017-07-28T03:35:45.962Z -> monty-square-dct64-s1-F2@2017-07-29T04:50:58.412Z

   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
-0.1930 | -0.2037 | -0.0643 |  -0.1917 | -0.2331 | -0.3510 |    -0.1810

objective-1-fast:

monty-square-baseline-o1f-F2@2017-07-28T03:35:35.533Z -> monty-square-dct64-o1f-F2@2017-07-29T04:50:28.542Z

   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
-0.2557 | -0.1743 | -0.4900 |  -0.3028 | -0.4147 | -0.5764 |    -0.2864

Change-Id: I1f944df29e44d2e350c42555af274f2d75a62a92
-
Monty Montgomery authored
This experiment replaces the 32-point Type-II DCT and 32-point Type-IV DST scaling vp9 transforms with the 32-point orthonormal Daala transforms.

subset-1:

monty-square-baseline-s1-F3@2017-08-02T11:50:51.375Z -> monty-square-dct32-s1-F3@2017-08-02T11:50:18.859Z

   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
 0.0000 |  0.0115 | -0.1044 |  -0.0185 | -0.0069 | -0.0603 |     0.0555

objective-1-fast (4 frames):

monty-square-baseline-o1f-F3-l4-fine@2017-08-12T02:18:05.560Z -> monty-square-dct32-o1f-F3-l4-fine@2017-08-12T02:19:44.461Z

   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
-0.0269 | -0.0715 |     N/A |  -0.0547 | -0.0268 | -0.0590 |        N/A

Change-Id: Ib1bad991d82eb67956e94a6216298a84e908b169
-
- 08 Aug, 2017 3 commits
-
-
Sarah Parker authored
This allows inter and intra modes to use different mask functions. The mask functions checked in are still placeholders to allow for easy experimentation. Change-Id: Ic20d88200676df81dffee8c43555d0ff0c7bfc28
-
Sebastien Alaiwan authored
Change-Id: I51e4a61bfca6d662330476e2d90680f1a9a80304
-
Angie Chiang authored
Change-Id: I5558e18c8c706474df28e51e6ac9f598e0e2ab48
-
- 04 Aug, 2017 1 commit
-
-
Sarah Parker authored
If the mask is invalid, do not allow the encoder to select MRC_DCT. Currently the mask is invalid if it is all 1 or all 0, but these criteria will likely expand in a future patch. Change-Id: I77230ea8357bfdb2bf1e6338903d44bbf1db22d1
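A minimal sketch of the validity rule stated above (a mask is invalid when it is all 1 or all 0). The function name `mrc_mask_is_valid` and the use of an `int` array for the mask are assumptions made for illustration; the commit does not show the real signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 when the mask contains both zero and
 * non-zero entries, and 0 when it is all 0, all 1, or empty.
 * An encoder would skip MRC_DCT in the transform search when this
 * returns 0. */
static int mrc_mask_is_valid(const int *mask, size_t n) {
  size_t ones = 0;
  assert(mask != NULL);
  for (size_t i = 0; i < n; ++i) ones += (mask[i] != 0);
  return ones != 0 && ones != n;
}
```

Counting the set entries once keeps the check easy to extend if the validity criteria expand, as the commit anticipates.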
-
- 03 Aug, 2017 1 commit
-
-
Sarah Parker authored
This will aid in testing different masking methods for inter and intra blocks. Change-Id: Ic038da77e55405e3303177e6cd260bd5e19311c1
-
- 02 Aug, 2017 1 commit
-
-
Angie Chiang authored
This experiment aims at merging the lbd/hbd txfms. So far this experiment uses the hbd transform on the lbd path.

The performance I observed (negative means performance drop):
lowres -0.089%
midres 0.065%

From here, two main things need to be done:
1) Fix overflow due to quantizer noise
2) Generate a 16-bit version from the hbd txfm

Change-Id: I35bb1fc0cbb78decad2570ff5826ed665f739752
-
- 29 Jul, 2017 1 commit
-
-
Monty Montgomery authored
This experiment replaces the 16-point Type-II DCT and 16-point Type-IV DST scaling vp9 transforms with the 16-point orthonormal Daala transforms. These have reduced complexity and are perfect reconstruction. There is currently no net coding performance impact.

subset-1:

monty-square-baseline-s1-F@2017-07-23T03:43:45.042Z -> monty-square-dct16-s1-F@2017-07-23T03:42:29.805Z

   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
-0.0152 | -0.0028 | -0.0929 |  -0.0432 | -0.0457 | -0.0425 |    -0.0237

objective-1-fast:

monty-square-baseline-o1f-F@2017-07-23T03:44:19.973Z -> monty-square-dct16-o1f-F@2017-07-23T03:43:22.549Z

   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
 0.0305 |  0.0926 | -0.1600 |   0.0471 |  0.0219 | -0.0075 |     0.0135

Change-Id: I54fed26d65fd8450693334bb400b1fafd7e0dacb
-
- 28 Jul, 2017 1 commit
-
-
Urvang Joshi authored
- Wrong function argument fix: this was not caught by compile test because DCT_DCT has a value of 0, which was converted to a NULL pointer.
- Wrong prob array size.

Change-Id: Iaf1747dc7fb40db1d1ab35f965fb60994d8dec95
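The first item is worth illustrating, because the C rule behind it is easy to forget: an enumerator whose value is 0 is an integer constant expression equal to 0, and so it is a valid null pointer constant. The sketch below uses hypothetical functions (`has_prob_table`, `buggy_call`); only `DCT_DCT` having the value 0 comes from the commit message. Some compilers emit a warning for this conversion, but it is not an error.

```c
#include <stddef.h>

/* DCT_DCT is 0, as stated in the commit message. */
typedef enum { DCT_DCT = 0, ADST_DCT = 1 } TxType;

/* Hypothetical callee that expects a probability table pointer. */
static int has_prob_table(const unsigned char *probs) {
  return probs != NULL;
}

/* Hypothetical buggy caller: a transform type is passed where the
 * pointer belongs. Because DCT_DCT is a constant expression with
 * value 0, C accepts it as a null pointer constant. Passing ADST_DCT
 * (value 1) here would have been diagnosed instead. */
static int buggy_call(void) {
  return has_prob_table(DCT_DCT);
}
```

Here `buggy_call()` returns 0: the callee silently receives NULL rather than a table.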
-
- 26 Jul, 2017 3 commits
-
-
Yue Chen authored
Change-Id: Ie2c34490dc50cb242bcd701308e6b55243883b15
-
Sarah Parker authored
MRC_DCT uses a mask based on the prediction signal to modify the residual before applying DCT_DCT. This adds all necessary functions to perform this transform and makes the prediction signal available to the 32x32 txfm functions so the mask can be created. I am still experimenting with different types of mask generation functions and so this patch contains a placeholder. This patch has no impact on performance. Change-Id: Ie3772f528e82103187a85c91cf00bb291dba328a
-
Monty Montgomery authored
This experiment replaces the 8-point Type-II DCT and 8-point Type-IV DST scaling vp9 transforms with the 8-point orthonormal Daala transforms. These have reduced complexity and are perfect reconstruction at the cost of a slightly worse coding performance. This is because the Daala transforms expect the input to be shifted by 4 bits but the output scale of the vp9 transforms is only 3 bits.

subset-1:

monty-square-baseline-subset1 -> monty-square-dct8-subset1@2017-07-17T21:37:44.281Z

   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
 0.0019 | -0.0011 | -0.0585 |  -0.0111 |  0.0305 |  0.0317 |     0.0187

objective-1-fast:

monty-square-baseline-o1f -> monty-square-dct8-o1f@2017-07-17T21:37:15.735Z

   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
 0.0285 |  0.0129 | -0.5080 |   0.0529 |  0.0345 |  0.0441 |     0.0054

Change-Id: I2b775495398fb717204a295397c3c5e3ca938183
-
- 20 Jul, 2017 1 commit
-
-
Sarah Parker authored
This adds the new transform to the list of possible transforms. The impact on performance is in the noise range because the transform implementation currently performs DCT as a placeholder. This transform will initially only have an implementation for TX_32X32 and it is skipped in the tx search for smaller transform sizes. Change-Id: Iab2faddc525b478ca06972a753428a4f4ef53ac6
-
- 19 Jul, 2017 1 commit
-
-
Jingning Han authored
Use the row and column indexes to fetch txk_type, which allows the chroma components to derive the tx type from the corresponding luma components. It improves the coding performance of txk-sel by 0.18%. Change-Id: I3f4bca5839e13ae95e51053e76cd86fe58202ac9
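One way to picture the lookup described above is a per-block map of transform types indexed by row and column in luma units, with chroma positions scaled up by the subsampling shifts before indexing. The sketch below makes several assumptions that are not in the commit: the map layout, its 16-entry stride, and the names `BlockTxTypes` and `get_txk_type` are all hypothetical.

```c
#include <assert.h>

#define MAP_STRIDE 16 /* hypothetical: map entries per row, in luma units */

/* Hypothetical per-block storage of the transform type chosen for
 * each luma transform position. */
typedef struct {
  unsigned char txk_type[MAP_STRIDE * MAP_STRIDE];
} BlockTxTypes;

/* Fetch the transform type at (blk_row, blk_col) of the given plane.
 * For chroma (plane > 0) the position is first mapped to the
 * co-located luma position using the subsampling shifts ss_x/ss_y,
 * so chroma inherits the type stored for luma. */
static int get_txk_type(const BlockTxTypes *b, int plane, int blk_row,
                        int blk_col, int ss_x, int ss_y) {
  if (plane > 0) {
    blk_row <<= ss_y;
    blk_col <<= ss_x;
  }
  assert(blk_row >= 0 && blk_row < MAP_STRIDE);
  assert(blk_col >= 0 && blk_col < MAP_STRIDE);
  return b->txk_type[blk_row * MAP_STRIDE + blk_col];
}
```

With 4:2:0 subsampling (`ss_x = ss_y = 1`), chroma position (1, 2) reads the same entry as luma position (2, 4).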
-
- 17 Jul, 2017 2 commits
-
-
Lester Lu authored
Change two similar structs, FWD_TXFM_PARAM and INV_TXFM_PARAM, into a common struct: TxfmParam. Its definition is moved to aom_dsp/txfm_common.h to simplify dependency. This change is made so that, in later changes of the LGT experiment, functions requiring FWD_TXFM_PARAM and INV_TXFM_PARAM, such as get_fwd_lgt4 and get_inv_lgt4, can also be unified. Change-Id: I756b0176a02314005060adbf8e62386f10eeb344
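The benefit of a single struct can be sketched as follows. The fields `tx_type` and `bd` are the ones the earlier fwd_txfm_param/inv_txfm_param change says the struct carries; the `Basis` enum and the functions `get_basis`, `fwd_basis` and `inv_basis` are hypothetical and only show how one selector can serve both directions once the parameter type is shared.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { DCT_DCT = 0, ADST_DCT, DCT_ADST, ADST_ADST } TxType;

/* One parameter struct shared by the forward and inverse paths. */
typedef struct TxfmParam {
  TxType tx_type; /* transform type */
  int bd;         /* bit depth */
} TxfmParam;

/* Hypothetical basis labels, for illustration only. */
typedef enum { BASIS_DCT, BASIS_OTHER } Basis;

/* A single selector: no separate forward and inverse variants are
 * needed because both directions pass the same struct type. */
static Basis get_basis(const TxfmParam *p) {
  assert(p != NULL);
  return p->tx_type == DCT_DCT ? BASIS_DCT : BASIS_OTHER;
}

static Basis fwd_basis(const TxfmParam *p) { return get_basis(p); }
static Basis inv_basis(const TxfmParam *p) { return get_basis(p); }
```

Because both wrappers call the same selector, the forward and inverse transforms cannot disagree about the basis for a given parameter set.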
-
hui su authored
Change-Id: I802c9e41ebfed090b5ad8300917aad5e16ad026a
-
- 14 Jul, 2017 1 commit
-
-
hui su authored
Change-Id: I2888bd8905253e02e3ac74597275cf56e5142d29
-
- 12 Jul, 2017 2 commits
-
-
Monty Montgomery authored
This experiment replaces the 4-point Type-II scaled-output vp9 DCT transform with the 4-point Type-II orthonormal Daala DCT transform. Right now the CONFIG_DAALA_DCT4 experiment depends on CONFIG_DCT_ONLY as it does not add an orthonormal 4-point DST.

subset-1:

monty-baseline-dctonly-squaretx-subset1 -> monty-dct4-dctonly-squaretx-subset1-rerun

   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
 0.0055 | -0.0132 | -0.0405 |   0.0261 |  0.0005 |  0.0246 |     0.0226

objective-1-fast:

monty-baseline-dctonly-squaretx-o1f -> monty-dct4-dctonly-squaretx-o1f

   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
-0.0215 | -0.1573 |     N/A |  -0.0131 | -0.0347 | -0.0390 |    -0.1121

Change-Id: Idef8f6e5525037d5bbb2d0927675c21d1922d69a
-
Monty Montgomery authored
Change-Id: Ib5337dfa78b73059ad169ca98a07119aa991864b
-
- 11 Jul, 2017 1 commit
-
-
Monty Montgomery authored
Building with --enable-dct_only will force the encoder to use only tx_type == DCT_DCT. This experiment gives a loss and is only added for testing.

subset-1:

master@2017-02-21T01:23:58.825Z -> master-dct_only@2017-02-21T02:57:28.585Z

   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
 2.5467 |  1.0524 |  0.9171 |   1.8849 |  2.6626 |  2.4995 |     1.8402

objective-1-fast:

master@2017-02-21T01:47:43.790Z -> master-dct_only@2017-02-20T16:54:03.578Z

   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
 1.6625 |  0.3948 |  0.3368 |   1.5268 |  1.7142 |  1.7097 |     1.0743

Change-Id: I19b738f3d1a450bc50422149ac42bc184bfae08a
-
- 10 Jul, 2017 1 commit
-
-
Lester Lu authored
Here we have an LGT to replace ADST for intra residual blocks, and another LGT to replace ADST for inter residual blocks. The changes are only applied to transform length 4 and 8, and only for the lowbitdepth path. lowres: -0.18% Change-Id: Iadc1e02b53e3756b44f74ca648cfa8b0e8ca7af4
-
- 07 Jul, 2017 1 commit
-
-
Lester Lu authored
The input arguments of av1_fht* and av1_iht* functions (and their HBD versions) are slightly changed. Input arguments tx_type and bd are carried by a struct fwd_txfm_param/inv_txfm_param. This struct is meant to later on carry other prediction information, such as intra top/left boundaries to the transform level, so that the choice of transforms can be more adaptive to the prediction mode and local video content. Change-Id: Ia42544248a51845be64b72855b642ef1fe5910a9
-
- 29 Jun, 2017 1 commit
-
-
Frederic Barbier authored
Cleanup related unit-tests. Change-Id: Ic756e6bbad80f5b9947ca1cdd55cdef77b985f81
-
- 27 Jun, 2017 1 commit
-
-
Yi Luo authored
We are going to have several commits to set up new low/high bitdepth data path selection logic. This patch is for the inverse transform. The ideas are summarized as follows.

- For low/high bitdepth selection, the encoder depends on the input configuration, e.g., video sequence bitdepth and profile. The decoder depends on the input bitstream. This has nothing to do with the compiler/build configuration.

- Typical encoder usage for sampling format 4:2:0:
  1) 8-bit video sequence:
     a) --profile=0
        Fastest encoding/decoding pipeline.
     b) --profile=2 --bit-depth=10
        Image pixels are left shifted by 2 bits. It employs a 16-bit reference frame buffer and has high calculation precision. It usually enjoys higher compression performance.
  2) 10/12-bit video sequence (HDR):
     --profile=2 --bit-depth=10/12

- Transform coefficient type:
  Lowbitdepth: int16_t
  Highbitdepth: int32_t

- The type tran_low_t is still used in the codebase. It is int32_t, which defines the data path capacity. Naturally, it is high bitdepth.

Eventually we shall remove the configuration flags CONFIG_HIGHBITDEPTH/CONFIG_LOWBITDEPTH and separate the low and high bitdepth data paths. The two data paths co-exist in the same build environment.

Change-Id: I35c06d4d4f19ebf80d909168fdddbae57c3cc884
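The 8-bit input with --bit-depth=10 case can be sketched as below: each sample is widened to 16 bits and shifted left by the difference between the coding bit depth and 8, which is 2 bits for a 10-bit pipeline. The function name `upshift_8bit` and its buffer layout are assumptions for illustration, not the codec's actual conversion routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy n 8-bit samples into a 16-bit buffer,
 * scaling them to the requested coding bit depth.
 * For bit_depth == 10 the shift is 2, so 255 becomes 1020. */
static void upshift_8bit(const uint8_t *src, uint16_t *dst, size_t n,
                         int bit_depth) {
  assert(bit_depth >= 8 && bit_depth <= 16);
  const int shift = bit_depth - 8;
  for (size_t i = 0; i < n; ++i) {
    dst[i] = (uint16_t)((unsigned)src[i] << shift);
  }
}
```

The widened samples then travel through the 16-bit reference frame buffer mentioned above, which is where the extra calculation precision comes from.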
-
- 26 Jun, 2017 1 commit
-
-
Lester Lu authored
In previous ADSTs, DST-7 and DST-4 are used for length 4 and length 8/16/32, respectively. In this LGT experiment we explore transforms between DST-4 and DST-7. When CONFIG_LGT flag is on, adst4 and adst8 are replaced by lgt4 and lgt8, the intermediate transforms with pre-chosen parameters. The LGTs applied here are lgt4_160 and lgt8_170, where the numbers mean the self-loop weights times 100. The associated values for DST-7 and DST-4 are 100 and 200. ovr_psnr: lowres: -0.140 midres: -0.131 hdres: -0.078 These changes are not applied to the highbd scenario in the current version. Change-Id: I20600456da8766528b2b6b11aa28801e70af498e
-
- 12 Jun, 2017 1 commit
-
-
Sarah Parker authored
Responding to some leftover cosmetic comments from 2b5cdb1cf87c933331a16cc0221455d0a8c255e1 Change-Id: I42e126593526cedd6675adf35b9c1df78e1ddf54
-
- 08 Jun, 2017 1 commit
-
-
Sarah Parker authored
This unifies the codepath for high-bitdepth transforms and deletes all calls to the old deprecated versions. This required reworking the way 1d configurations are combined in order to support rectangular transforms. There is one remaining codepath that calls the deprecated 4x4 hbd transform from encoder/encodemb.c. I need to take a closer look at what is happening there and will leave that for a followup since this change has already gotten so large. lowres 10 bit: -0.035% lowres 12 bit: 0.021% BUG=aomedia:524 Change-Id: I34cdeaed2461ed7942364147cef10d7d21e3779c
-
- 01 Jun, 2017 1 commit
-
-
Timothy B. Terriberry authored
cb4x4 itself should not require these sizes. This simplifies compatibility with other experiments, since we can first make them work with cb4x4 (which is now on by default), and then worry about chroma_2x2 (which is not) in separate steps. Encoder and decoder output should remain unchanged. Change-Id: I4e9fcdae49f238b5099a3c74a398fe993c2545f8
-
- 20 May, 2017 1 commit
-
-
hui su authored
Encode a block line by line, horizontally or vertically. In the vertical mode, each row is predicted by the reconstructed row above; in the horizontal mode, each column is predicted by the reconstructed column to the left. The DPCM modes are enabled automatically for blocks with horizontal or vertical prediction mode, and 1D transform types (ext-tx). Change-Id: I133ab6b537fa24a6e314ee1ef1d2fe9bd9d56c13
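The vertical mode can be sketched as a loop in which every row is predicted from the reconstruction of the row above it, not from the original pixels, so the encoder and decoder stay in sync. This is a simplified illustration: the 1D transform is omitted and replaced by a plain scalar quantizer, and the names `quantize`, `dpcm_vertical`, `levels` and `step` are hypothetical.

```c
#include <assert.h>

/* Round-to-nearest scalar quantizer (stand-in for transform + quant). */
static int quantize(int v, int step) {
  return v >= 0 ? (v + step / 2) / step : -((-v + step / 2) / step);
}

/* Vertical DPCM over a w x h block stored row-major.
 * top:    reconstructed row directly above the block (w entries)
 * levels: output quantized residuals
 * recon:  output reconstruction, also used as the predictor source */
static void dpcm_vertical(const int *src, const int *top, int w, int h,
                          int step, int *levels, int *recon) {
  assert(step > 0);
  const int *pred = top;
  for (int r = 0; r < h; ++r) {
    for (int c = 0; c < w; ++c) {
      const int q = quantize(src[r * w + c] - pred[c], step);
      levels[r * w + c] = q;
      recon[r * w + c] = pred[c] + q * step;
    }
    pred = &recon[r * w]; /* next row predicts from this reconstruction */
  }
}
```

The horizontal mode follows the same pattern with the roles of rows and columns swapped, predicting each column from the reconstructed column to its left.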
-
- 19 May, 2017 2 commits
-
-
Jonathan Matthews authored
Exposed by Change-Id: I048c6e9cc790520247cc21ae9b92a9c8d84d00a7 BUG=aomedia:525 Change-Id: Ia83f8a8efcf0eac4912f247f38887c0dd533da85
-
Sarah Parker authored
This adds the proper cfgs to av1_{inv/fwd}_txfm1d_cfg for the identity transform so all hbd transforms can use the same codepath. This has no impact on performance since the new identity transforms that correspond with the cfgs are not yet being called. Once this is checked in, we should be able to delete all deprecated transform functions and have a single code flow for all hbd transforms. BUG=aomedia:524 Change-Id: I3d1bfbc8bc29b367e8ddf7dcd27525af0bd31067
-