- 29 Sep, 2014 1 commit
-
-
Deb Mukherjee authored
One is a more aggressive version of the pruned subpel tree search where only a single halfpel candidate is searched. The search candidate is based on a surface fit result. The other is a method to obtain the subpel position at one shot based on the same surface fit. The methods have not been deployed in any speed setting yet. Change-Id: I34fef3f2e34f11396c9d1ba97f4be8c4ffca62d3
-
- 26 Sep, 2014 1 commit
-
-
Yunqing Wang authored
This patch re-enabled the feature in Pengchong's patch (commit 12861260). Originally, it was turned on while use_lastframe_partitioning > 0(not used anymore). Now it was added as a feature, and turned on while speed >= 2. As described in the original patch, this feature helps speed up the slideshows in YouTube. Change-Id: I1b0f18d65da1ee1c8d1e117dabba910c5207c471
-
- 22 Sep, 2014 1 commit
-
-
Jingning Han authored
This commit enables an adaptive mode search order scheduling scheme in the rate-distortion optimization. It changes the compression performance by -0.433% and -0.420% for derf and stdhd respectively. It provides speed improvement for speed 3: bus CIF 1000 kbps 24590 b/f, 35.513 dB, 7864 ms -> 24696 b/f, 35.491 dB, 7408 ms (6% speed-up) stockholm 720p 1000 kbps 8983 b/f, 35.078 dB, 65698 ms -> 8962 b/f, 35.054 dB, 60298 ms (8%) old_town_cross 720p 1000 kbps 11804 b/f, 35.666 dB, 62492 ms -> 11778 b/f, 35.609 dB, 56040 ms (10%) blue_sky 1080p 1500 kbps 57173 b/f, 36.179 dB, 77879 ms -> 57199 b/f, 36.131 dB, 69821 ms (10%) pedestrian_area 1080p 2000 kbps 74241 b/f, 41.105 dB, 144031 ms -> 74271 b/f, 41.091 dB, 133614 ms (8%) Change-Id: Iaad28cbc99399030fc5f9951eb5aa7fa633f320e
-
- 11 Sep, 2014 2 commits
-
-
Jingning Han authored
The speed feature that skips compound inter prediction modes was subsumed by other speed features and effectively was not in use. This commit removes it. Change-Id: I22b0c71a8ddd15d93b25d86fa63a1dce2ba6a1a9
-
Jingning Han authored
This commit refactor the rate-distortion optimization search for regular block sizes to remove the speed feature dependency on mode search order. Change-Id: Ied033ee484c2957e17baa7b6450b720fe7dd0e7d
-
- 29 Aug, 2014 1 commit
-
-
Jingning Han authored
This commit allows encoder to skip intra coding mode test, when the known inter residual is less than the source variance. It reduces the runtime of speed 3 for test clips: bus cif 1000 kbps: 8587 ms -> 8260 ms, 3.8% speed-up pedestrian 1080p 2000 kbps: 161381 ms -> 155241 ms, 3.7% speed-up. The compression performance is down by derf -0.36% stdhd -0.25% Change-Id: I75ce1e035b4da2153cb1ac14111d1a07c05a735d
-
- 28 Aug, 2014 2 commits
-
-
Yunqing Wang authored
In the partition search, the encoder checks all possible partitionings in the superblock's partition search tree. This patch proposed a set of criteria for partition search early termination, which effectively decided whether or not to terminate the search in current branch based on the "skippable" result of the quantized transform coefficients. The "skippable" information was gathered during the partition mode search, and no overhead calculations were introduced. This patch gives significant encoding speed gains without sacrificing the quality. Borg test results: 1. At speed 1, stdhd set: psnr: +0.074%, ssim: +0.093%; derf set: psnr: -0.024%, ssim: +0.011%; 2. At speed 2, stdhd set: psnr: +0.033%, ssim: +0.100%; derf set: psnr: -0.062%, ssim: +0.003%; 3. At speed 3, stdhd set: psnr: +0.060%, ssim: +0.190%; derf set: psnr: -0.064%, ssim: -0.002%; 4. At speed 4, stdhd set: psnr: +0.070%, ssim: +0.143%; derf set: psnr: -0.104%, ssim: +0.039%; The speedup ranges from several percent to 60+%. speed1 speed2 speed3 speed4 (1080p, 100f): old_town_cross: 48.2% 23.9% 20.8% 16.5% park_joy: 11.4% 17.8% 29.4% 18.2% pedestrian_area: 10.7% 4.0% 4.2% 2.4% (720p, 200f): mobcal: 68.1% 36.3% 34.4% 17.7% parkrun: 15.8% 24.2% 37.1% 16.8% shields: 45.1% 32.8% 30.1% 9.6% (cif, 300f) bus: 3.7% 10.4% 14.0% 7.9% deadline: 13.6% 14.8% 12.6% 10.9% mobile: 5.3% 11.5% 14.7% 10.7% Change-Id: I246c38fb952ad762ce5e365711235b605f470a66
-
Deb Mukherjee authored
Updates the vp9_pattern_search function to return integer one-away neighbors' sad values, for subsequent use in speeding up the sub-pel search. Also, removes code for the do_refine option which is not being used currently. Updates the integer and subpel functions to pass in a 5-element sad list for output or input. A new pruned sub-pel search algorithm is implemented that uses the sad returned from the integer pel search. But it is not deployed yet. Change-Id: Ifa9f5ad024b5b660570366d2bd900343e1891520
-
- 26 Aug, 2014 1 commit
-
-
Yaowu Xu authored
This commit addes a new strategy to reduce the search for optimal interpolation filter type. The encoder counts and store how many each filter type is selected and used for each of the reference frames. A filter type that is rarely used for all three reference frames is masked out to avoid computation. The impact on compression is neglectible: -0.02% on derf +0.02% on stdhd Encoding time is seen to reduce by 2~3%. Change-Id: Ibafa92291b51185de40da513716222db4b230383
-
- 18 Aug, 2014 2 commits
-
-
Yunqing Wang authored
In the full-rd transform size search, we go through all transform sizes to choose the one with best rd score. In this patch, an early termination is added to stop the search once we see that the smaller size won't give better rd score than the larger size. Also, the search starts from largest transform size, then goes down to smallest size. A speed feature tx_size_search_breakout is added, which is turned off at speed 0, and on for other speeds. The transform size search is turned on at speed 1. Borg test results: 1. At speed 1, derf set: psnr gain: 0.618%, ssim gain: 0.377%; stdhd set: psnr gain: 0.594%, ssim gain: 0.162%; No noticeable speed change. 3. At speed 2, derf set: psnr loss: 0.157%, ssim loss: 0.175%; stdhd set: psnr loss: 0.090%, ssim loss: 0.101%; speed gain: ~4%. Change-Id: I22535cd2017b5e54f2a62bb6a38231aea4268b3f
-
Jingning Han authored
This commit enables the encoder to record the location of the center frame to generate alter reference frame. It then allows to skip checking prediction modes of other reference frame types when it comes to encode this frame. The speed 3 runtime is reduced for the test sequences: bus at CIF 1000 kbps, 9791 ms -> 9446 ms, i.e., 3.5% speed-up, pedestrian at 1080p 2000 kbps, 184043 ms -> 175730 ms, i.e., 4.5% speed-up. No compression performance change observed. Change-Id: Iacfde3bcc1445964e7a241f239bd6ea11cb94bd1
-
- 15 Aug, 2014 2 commits
-
-
Pengchong Jin authored
Add a speed feature to give the tighter partition search range. Before partition search, calculate the histogram of the partition sizes of the left, above and previous co-located blocks of the current block. If the variance of observed partition sizes is small enough, adjust the search range around the mean partition size, which will be tigher. The feature is currently turned on at speed 2. Experiments on sample youtube clips show on average the runtime is reduced by 3-7%. For hard stdhd clips: park_joy_1080p @ 15000kbps: 509251 ms -> 491953 ms (3.3%) pedestrian_area_1080p @ 2000kbps: 223941 ms -> 214226 ms (4.3%) The PSNR performance is changed: derf: -0.112% yt: -0.099% hd: -0.090% stdhd:-0.102% Change-Id: Ie205ec5325bf92ec5676c243e30ba9d0adca10f2
-
Yunqing Wang authored
Removed disable_split_var_thresh, which is not used anymore. Change-Id: I50119b150442e1571157433b5effc6aae0dbe0fd
-
- 13 Aug, 2014 1 commit
-
-
Jingning Han authored
This commit allows the encoder to check the above and left neighbor blocks' reference frames and motion vectors. If they are all consistent, skip checking the NEARMV and ZEROMV modes. This is enabled in speed 3. The coding performance is improved: pedestrian area 1080p at 2000 kbps, from 74773 b/f, 41.101 dB, 198064 ms to 74795 b/f, 41.099 dB, 193078 ms park joy 1080p at 15000 kbps, from 290727 b/f, 30.640 dB, 609113 ms to 290558 b/f, 30.630 dB, 592815 ms Overall compression performance of speed 3 is changed derf -0.171% stdhd -0.168% Change-Id: I8d47dd543a5f90d7a1c583f74035b926b6704b95
-
- 30 Jul, 2014 1 commit
-
-
Jingning Han authored
This commit enables a chessboard pattern constrained partition search for 720p and above resolutions. The scheme applies stricter partition search to alternative blocks based on its above/left neighboring blocks' partition range, as well as that of the collocated blocks in the previous frame. It is currently turned on at 16x16 block size level. The chessboard pattern is flipped per coding frame. The speed 3 runtime is reduced: park_joy_1080p, 652832 ms -> 607738 ms (7% speed-up) pedestrian_area_1080p, 215998 ms -> 200589 ms (8% speed-up) The compression performance is changed: hd -0.223% stdhd -0.295% Change-Id: I2d4d123ae89f7171562f618febb4d81789575b19
-
- 22 Jul, 2014 1 commit
-
-
Jingning Han authored
This commit enables a chessboard pattern prediction filter type search scheme for rate-distortion optimization speed-up. For the inferred motion vector modes, the encoder can re-use its above/left neighbor blocks' prediction filter type and skip a full test on all possible filter types. Such operation is turned on/off alternatively in a chessboard manner. It is turned on in speed 3. For test clip pedestrian 1080p, the runtime is reduced from 231500 ms -> 221700 ms. The compression performance is changed: derf: -0.147% yt: -0.134% hd: -0.079% stdhd: -0.220% Change-Id: I1912f278e7576c2dc632688e3ad7a257410c605a
-
- 16 Jul, 2014 1 commit
-
-
Yaowu Xu authored
This commit changed the hard-coded DEFAULT_INTERP_FILTER to a speed feature with the same default value: SWITCHABLE. Change-Id: I7f54f40f1bd3f5277841d04b85db7a84e47313f1
-
- 07 Jul, 2014 1 commit
-
-
Alex Converse authored
* Replace max_step_search_steps with constant MAX_MVSEARCH_STEPS * Fold (reduce_first_step_size + speed > 5) into reduce_first_step_size replacing uses of reduce_first_step_size that don't add the speed check with zero. Change-Id: Iae46395dbf3eaca138bf4d18b838a9e364b5a198
-
- 02 Jul, 2014 2 commits
-
-
Yaowu Xu authored
This commit added a speed feature to control the step_param used in full pixel motion search. The intention is to reduced the search steps for high speed real time coding. Change-Id: I21d2f0105c2b647783a6688615da7fcf2b6d670b
-
Jingning Han authored
This commit re-designs the quantization process for transform coefficient blocks of size 4x4 to 16x16. It improves compression performance for speed 7 by 3.85%. The SSSE3 version for the new quantization process is included. The average runtime of the 8x8 block quantization is reduced from 285 cycles -> 255 cycles, i.e., over 10% faster. Change-Id: I61278aa02efc70599b962d3314671db5b0446a50
-
- 01 Jul, 2014 1 commit
-
-
Yunqing Wang authored
The current threshold is knid of low, and in many cases NEWMV mode is checked but not picked as the best mode. This patch added a speed feature to increase NEWMV threshold, so that less partition mode checking goes to check NEWMV. This feature is enabled for speed 6 and 7. Rtc set borg tests showed: 1. Speed 6, overall psnr: -0.088%, ssim: -1.339%; Average speedup on rtc set is 11.1%. 2. Speed 7, overall psnr: -0.505%, ssim: -2.320% Average speedup on rtc set is 12.9%. Change-Id: I953b849eeb6e0d5a1f13eacba30c14204472c5be
-
- 30 Jun, 2014 2 commits
-
-
Yunqing Wang authored
For real time speed 7, once encode breakout is on(i.e. encoding setting --static-thresh=1), a proper encode breakout threshold is set to speed up the encoder. Set --static-thresh=1, RTC set borg test showed a slight overall psnr loss of 0.162%, but ssim gain of 0.287%. The average speedup on RTC set is 6%, and for some clips, the speedup can be 10+%. Change-Id: Id522d9ce779ff7c699936d13d0c47083de4afb85
-
Yunqing Wang authored
Before encoding a frame, calculate and store each 16x16 block's variance of source difference between last and current frame. Find partitioning threshold T for the frame from its variance histogram, and then use T to make partition decisions. Comparing with fixed 16x16 partitioning, rtc set test showed an overall psnr gain of 3.242%, and ssim gain of 3.751%. The best psnr gain is 8.653%. The overall encoding speed didn't change much. It got faster for some clips(for example, 12% speedup for vidyo1), and a little slower for others. Also, a minor modification was made in datarate unit test. Change-Id: Ie290743aa3814e83607b93831b667a2a49d0932c
-
- 27 Jun, 2014 1 commit
-
-
Yaowu Xu authored
As a way to speed-up rtc encoding at speed 7. Change-Id: Ie36a010392cf7b741dc130df21a4e733622a75b7
-
- 24 Jun, 2014 2 commits
-
-
Yunqing Wang authored
In real-time speed 6, no partition search is done. The inter prediction results got from picking mode can be reused in the following encoding process. A speed feature reuse_inter_pred_sby is added to only enable the resue in speed 6. This patch doesn't change encoding result. RTC set tests showed that the encoding speed gain is 2% - 5%. Change-Id: I3884780f64ef95dd8be10562926542528713b92c
-
Paul Wilkins authored
Fix some bugs relating to the use of buffers in the overlay frames. Fix bug where a mid sequence overlay was propagating large partition and transform sizes into the subsequent frame because of :- sf->last_partitioning_redo_frequency > 1 and sf->tx_size_search_method == USE_LARGESTALL Change-Id: Ibf9ef39a5a5150f8cbdd2c9275abb0316c67873a
-
- 12 Jun, 2014 1 commit
-
-
Dmitry Kovalev authored
Moving all motion vector related speed parameters from SPEED_FEATURES to MV_SPEED_FEATURES. Change-Id: I3e9af0039c7162f8671878c5920bce3cb256a84e
-
- 10 Jun, 2014 1 commit
-
-
Dmitry Kovalev authored
Change-Id: I33a38bb9f46e7ef509bbbf0cfd7bc3ea5072d022
-
- 09 Jun, 2014 1 commit
-
-
Yunqing Wang authored
In non-rd real-time mode, choosing smaller transform size in encoding gives better video quality and good speed gain than choosing larger transform size. This patch set tx size search method to ALLOW_8X8, which is better than using 4x4 or other larger sizes. Borg tests on rtc set at speed 6 showed significant gain on quality. PSNR gain: 11.034% and SSIM gain: 15.466%. The speed gain is 5% - 12% for <720p clips, and 2% - 7% for 720p clips. Change-Id: If4dc74ed2df359346b059f47fb73b4a0193ec548
-
- 06 Jun, 2014 1 commit
-
-
Dmitry Kovalev authored
This is not a speed feature, adding inline function instead. Change-Id: Ia48c41802eec9e92cf990339d724097279695c9a
-
- 29 May, 2014 1 commit
-
-
Dmitry Kovalev authored
Making this consistent with intra mode masks: you need to specify allowed inter/intra modes to use. Change-Id: Iaecd28bf79047259707d8e7a59a57bb7b856383e
-
- 21 May, 2014 1 commit
-
-
Yaowu Xu authored
This commit changed to enable the encoder to adjust motion dection speed threshold based on picture size. In addition, cpu-used 1 now does a partition search every other frame instead of every third frame for low resolution inputs. The change has no quality/speed impact for 720p and above. Test showed the change increase encoding time by between 3% to 6% for cpu-used 2 encodiong of 360p sequences. It also has a compression gain about .3%. For cpu-used 2, the change resolved some very disturbing visual artifacts in certain sequences when large block partitionings and transforms are used as a result of copying the partition from a previous frame. Change-Id: Ic7fd22508cdb811d4ca935655adbf20109286cfa
-
- 19 May, 2014 1 commit
-
-
Yunqing Wang authored
Added a skipping test in non-rd inter-mode. After interpolation prediction step, the residuals are tested to see if they will be quantized to 0 based on modeling between spatial domain and frequency domain. Set static-thresh to 800 for >=720p and 300 for <720p, rtc set tests showed 1. Speed 5, psnr: -0.514%; ssim: -1.748%; speedup on related clips: 5% -11% 2. Speed 6, psbr: -0.628%; ssim: -1.637%; speedup on related clips: 4% - 9% Change-Id: I62fbf26bc043ecd2b584f255f1a4ee5ab52bfcf3
-
- 12 May, 2014 1 commit
-
-
Yaowu Xu authored
Change-Id: If1afb9f3eaec88079d1d97907870409bce691c2a
-
- 01 May, 2014 1 commit
-
-
Dmitry Kovalev authored
Change-Id: I961d50d6fafdd37ef7f23f0a871d28e28d2084ca
-
- 23 Apr, 2014 1 commit
-
-
Jingning Han authored
This commit introduces a chessboard pattern search for the prediction filter type search. It runs extensive search in alternate blocks and allows the rest blocks to refer coding decisions of their nearby neighbors. For pedestrian 1080p at 4000 kbps, the runtime of speed -5 goes down from 43990 ms to 42200 ms. The overall compression performance for RTC set is changed by -1.37%. Change-Id: Icfe220c49451cda796f0ca91d935c9ed01e56c9d
-
- 10 Apr, 2014 1 commit
-
-
Yunqing Wang authored
Minor change to use matching type in comparison. Change-Id: I670cae2d584918c67c1af791a797629f392f599e
-
- 09 Apr, 2014 1 commit
-
-
Yunqing Wang authored
Calculate the difference variance between last source frame and current source frame. The variance is calculated at 16x16 block level. The variances are compared to several thresholds to decide final partition sizes. An adaptive strategy is implemented to decide using SOURCE_VAR_BASED_PARTITION or FIXED_PARTITION based on motions in the video. The switching test is done once every search_type_check_frequency frames. The selection of source_var_thresh needs to be investigated further later. RTC set Borg test showed 0.424% overall psnr gain, and 0.357% ssim gain. For clips with large enough static area, the encoding speedup is around 2% to 15%. Change-Id: Id7d268f1d8cbca7fb8026aa4a53b3c77459dc156
-
- 07 Apr, 2014 1 commit
-
-
Marco Paniconi authored
Copy up to a certain bsize, otherwise set to a fixed bsize. This helsp to reduce artifact near moving boundary caused by full partition copy without checking motion of super-block. This artifact can occur at speeds 3,4 in real-time mode. Issue: https://code.google.com/p/webm/issues/detail?id=738. Change-Id: I05812521fd38816a467f72eb6a951cae4c227931
-
- 04 Apr, 2014 1 commit
-
-
Dmitry Kovalev authored
Change-Id: I75ad328c6d719df81cc24f3ae21c152af4ebdacc
-