- 03 Feb, 2018 8 commits
-
-
Jingning Han authored
Make the rate distortion structure fed into the search function independent. Change-Id: Id3997fea87e8aa6d0b42e64b11aa79a8c3e15af7
-
Jingning Han authored
Change-Id: Ib95dba539a3677421d4c7ee5e2f3faaf2ebc8773
-
Yushin Cho authored
With aq_mode=VARIANCE_AQ, av1_init_plane_quantizers() is called in set_segment_rdmult(). Change-Id: Id2584a0544ee633832b844ba06c137236068c4b9
-
Peng Bin authored
There are 3 valid input widths for aom_comp_mask_pred_ssse3. Processing each width (8, 16, 32) separately achieves a 1.2x~1.5x speedup compared to the original SSSE3 version. Change-Id: Ida3699e2e6ca98d1f9c7662d48806b299af26f10
-
Yaowu Xu authored
Change-Id: Ic51231510fc8bb897f8ca771dd4e750d0e1cd693
-
James Zern authored
Change-Id: I35a6ac83d8a94c803148e7ad9366053599f747a0
-
James Zern authored
Change-Id: I972d0304c6ff495f5f484fe77270c420a0dfe376
-
James Zern authored
unused var plane_bsize Change-Id: I02d75ec5ceab2f9d61a1a4ff5b5f1bc2d1b0a7a4
-
- 02 Feb, 2018 16 commits
-
-
Angie Chiang authored
Change-Id: I9a42b75de3e623f6af325edbe91e299c0662f19c
-
Thomas Daede authored
BUG=aomedia:1293 Change-Id: Iabdeb4ef7a98b034a4777527f727231f7b8815ee
-
Sebastien Alaiwan authored
Change-Id: Id7ea17a5124215907d076e0e3500b9aeea1146fc
-
Debargha Mukherjee authored
Change-Id: I01cecc829e2d57517427a1de6387e91ba3c64312
-
Imdad Sardharwalla authored
The SSE4.1 and AVX2 implementations of the self-guided filter have been updated to match the updated FAST_SGR C implementation in restoration.c.

The self-guided filter speed tests have been altered to compare the speeds of the SIMD and C implementations of the relevant functions.

Speed Tests (code compiled with CLANG)
===========
For LowBD:
  - The SSE4.1 implementation is ~220% faster (~69% less time) than the C code
  - The AVX2 implementation is ~314% faster (~76% less time) than the C code
For HighBD:
  - The SSE4.1 implementation is ~240% faster (~71% less time) than the C code
  - The AVX2 implementation is ~343% faster (~77% less time) than the C code

Change-Id: Ic2734bb89ccd3f66667c68647e5f677a5a496233
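The "% faster" and "% less time" figures in the message above describe the same measurement in two ways. A minimal sketch of the conversion follows, assuming "% faster" means the speed gain relative to the C code. The helper name `percent_faster` is hypothetical and is not part of libaom.

```python
def percent_faster(percent_less_time: float) -> float:
    """Convert a reduction in run time (percent) into a speed gain (percent)."""
    if not 0 <= percent_less_time < 100:
        raise ValueError("time reduction must be in [0, 100)")
    # Fraction of the original run time that remains after the speedup.
    remaining = 1.0 - percent_less_time / 100.0
    # Speed is the inverse of time; subtract 1 to get the gain over baseline.
    return (1.0 / remaining - 1.0) * 100.0


# Time reductions quoted in the commit message (already rounded to whole percent).
quoted = [
    ("LowBD SSE4.1", 69),
    ("LowBD AVX2", 76),
    ("HighBD SSE4.1", 71),
    ("HighBD AVX2", 77),
]
for label, less_time in quoted:
    gain = percent_faster(less_time)
    print(f"{label}: {less_time}% less time is about {gain:.0f}% faster")
```

Because the quoted time reductions are rounded, the converted values land near the quoted "% faster" figures rather than exactly on them.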
-
Angie Chiang authored
Change-Id: I8dcaa6882d47a097498c8f8af515b1185df4fdf3
-
Hui Su authored
In preparation for supporting q_adapt_probs. Change-Id: I4a39b81b0d2c4ceb1586ae411a1216c6c20d896d
-
Hui Su authored
Reduce the length of inter_tx_size[] from 1024 to 16. On a cif test sequence, encoder memory consumption decreases by 18% (380MB -> 312MB); decoder memory consumption decreases by 56% (21.4MB -> 9.4MB). Change-Id: I42928eb9312748f96f4393c8d8040791f38f98b6
-
Frederic Barbier authored
Change-Id: I91f18c498c694829b933bb73812ad94d66962994
-
Imdad Sardharwalla authored
Added an AVX2 version of the Wiener filter, along with associated tests. Speed tests have been added for all implementations of the Wiener filter.

Speed Test results
==================

GCC
---
Low bit-depth filter:
  - SSE2 vs C: SSE2 takes ~92% less time
  - AVX2 vs C: AVX2 takes ~96% less time
  - SSE2 vs AVX2: AVX2 takes ~43% less time (~74% faster)
High bit-depth filter:
  - SSSE3 vs C: SSSE3 takes ~92% less time
  - AVX2 vs C: AVX2 takes ~96% less time
  - SSSE3 vs AVX2: AVX2 takes ~46% less time (~84% faster)

CLANG
-----
Low bit-depth filter:
  - SSE2 vs C: SSE2 takes ~84% less time
  - AVX2 vs C: AVX2 takes ~88% less time
  - SSE2 vs AVX2: AVX2 takes ~27% less time (~36% faster)
High bit-depth filter:
  - SSSE3 vs C: SSSE3 takes ~85% less time
  - AVX2 vs C: AVX2 takes ~89% less time
  - SSSE3 vs AVX2: AVX2 takes ~24% less time (~31% faster)

Change-Id: Ide22d7c09c0be61483e9682caf17a39438e4a208
-
Debargha Mukherjee authored
Changes the CONFIG_FAST_SGR=1 strategy so that the r=1 filter uses no subsampling, while the r=2 filter subsamples vertically and, for odd rows, combines only by filtering horizontally in the last stage. Coding efficiency loss seems quite minimal. Change-Id: I5644ac400b387c37a2d278db7f6ad3ac0a6b5e93
-
Debargha Mukherjee authored
Change-Id: I6138519456b2ad3ffc8bced803ddc4418b246e74
-
Debargha Mukherjee authored
Some parameter tuning included.
lowres (q, 30 frames, speed 1): -1.243% av PSNR, -2.337% ov PSNR, +0.577% SSIM
lowres (vbr, 30 frames, speed 1): -0.327% av PSNR, -1.007% ov PSNR, +0.182% SSIM
A few videos become a lot worse in SSIM, which needs to be investigated. But PSNR-wise the patch seems pretty good.
Change-Id: I17c8d812c96ee49ddae7d3959a459aa3ffcea208
-
Peng Bin authored
Since aom_comp_mask_upsampled_pred just calls aom_upsampled_pred and aom_comp_mask_pred, there is no need to separate the C version from the SIMD version any more. Change-Id: I1ff8bcae87d501c68a80708fd2dc6b74c6952f88
-
Yaowu Xu authored
BUG=aomedia:1306 Change-Id: I5a8bdbd472213ded2de706c5b044a1bf24823670
-
Jingning Han authored
The current aq mode encoder setting would alter the segment_id between the rate-distortion optimization and the block encoding stages. Disable the corresponding consistency check in this case. BUG=aomedia:1251 Change-Id: Ic910a23fd64a9b4554567d3c8c9a9ae5f6062c7b
-
- 01 Feb, 2018 14 commits
-
-
James Zern authored
Until av1_idct*_new are optimized, this function sits high in the decode perf profile. Change-Id: Ic55c9a92b9926fc09eaee211a45fde00333b7c15
-
Debargha Mukherjee authored
Change-Id: I1d7f33546053615a334b67b75147bd5e027a545b
-
Debargha Mukherjee authored
Change-Id: Ia34909cc6edc20f17a777e0b7bff97a62e0ac0c2
-
Jonathan Matthews authored
This reverts Change-Id: Ie11dd055255d200954b704b8c2ad8ca3dff7bf5c BUG=aomedia:1305 Change-Id: I6894928dcadc99a79417034a7096a215693a46f2
-
Debargha Mukherjee authored
Change-Id: I1e6a8a74d0ca1e6aa01d2da12bd9b19c8307154e
-
Cheng Chen authored
Change-Id: Ia448b44ca734fe111422de9afdad97ac48e78b66
-
Hui Su authored
When cb_partition_scan is true, only DCT_DCT is considered. Therefore there's no need to prune transform types; and if DCT_DCT is pruned, we end up with no transform type to use. Change-Id: I1d65fe94e72de66fde18e271a598f9e67ade9cfb
-
Yaowu Xu authored
Change-Id: Ibe4f7bb61837b6bae6717f0c683fa23f78de5b80
-
Jingning Han authored
Obtain the most likely partition range from a first-pass, square-block-based partition search. Use the constrained partition search region for the full rate-distortion optimization search in the second pass. Tested on pedestrian 1080p at 2000 kbps, this makes encoding 40% faster for speed 0 and 30% faster for speed 1. The average coding performance loss is around 0.15%. Change-Id: Ifc83d48e6413d1b887e68cd1962084e018a2258f
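The two-pass idea in the message above can be illustrated with a toy sketch. This is not the encoder's code: the block-size list, the `margin` parameter, the function names, and both cost functions are assumptions made for illustration, and only square block sizes are modeled.

```python
from typing import Callable, List

# Square block widths, largest first (illustrative list, not taken from the encoder).
BLOCK_SIZES: List[int] = [128, 64, 32, 16, 8, 4]


def first_pass_range(cheap_cost: Callable[[int], float], margin: int = 1) -> List[int]:
    """Find the best square size under a cheap cost, then widen by `margin` steps."""
    best = min(range(len(BLOCK_SIZES)), key=lambda i: cheap_cost(BLOCK_SIZES[i]))
    lo = max(0, best - margin)
    hi = min(len(BLOCK_SIZES) - 1, best + margin)
    return BLOCK_SIZES[lo:hi + 1]


def second_pass_best(full_cost: Callable[[int], float], candidates: List[int]) -> int:
    """Evaluate the expensive cost only on the constrained candidate list."""
    return min(candidates, key=full_cost)


def cheap(size: int) -> float:
    """Toy stand-in for a simplified first-pass rate-distortion estimate."""
    return abs(size - 32)


def full(size: int) -> float:
    """Toy stand-in for the full rate-distortion cost used in the second pass."""
    return abs(size - 20)


candidates = first_pass_range(cheap)
choice = second_pass_best(full, candidates)
print(candidates, choice)
```

The second pass evaluates the expensive cost on three sizes instead of six. That reduction is the source of the speedup in this sketch, and a small loss is possible whenever the true optimum falls outside the constrained range.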
-
Jingning Han authored
Use a simple rate-distortion search route for the first-pass coding block partition. Change-Id: Iaaec3e1af83f46f625d3de8361eddd79a2bc6cef
-
Jingning Han authored
Add square block partition to serve as the first pass partition search. Change-Id: Ib637bba205d2cd0f6b0a5e2e91b270e22dce5580
-
Yaowu Xu authored
BUG=aomedia:1274 Change-Id: Ib1d814db4ef1bcb075444e4da855fd840e945a7d
-
Peng Bin authored
Same as https://aomedia-review.googlesource.com/c/aom/+/42901. Apply the same code refactoring to aom_comp_mask_pred_c (should be bitwise identical). Change-Id: Ieea71d370f5df48d216f40515842ad62499432c8
-
Maxym Dmytrychenko authored
SSE2 version already extended to support 13 TAPs Change-Id: I58e04527b297256b6ca63b12097d9345196a12bd
-
- 31 Jan, 2018 2 commits
-
-
Hui Su authored
Change-Id: If0b1d2fe31569104f2d8eef3cfd42cab30162c7e