- 03 Sep, 2014 3 commits
-
-
Yaowu Xu authored
Change-Id: I453b167f03811a3cd3592089593b3f2823f62ab3
-
Yaowu Xu authored
This commit removes the special case for key frames, as the transform size decision is controlled by the appropriate speed feature for all lossy coding modes: tx_size_search_method. Change-Id: I9677171e3f2432ec23705f7c5ea8170dd4562fae
-
Jingning Han authored
This commit allows the encoder to skip checking compound inter modes in the rate-distortion optimization loop if the reference frame bias signs are the same. Change-Id: Ib753e6bb11cbdd338aee69dbe2b649671f75a6b0
-
- 02 Sep, 2014 3 commits
-
-
Dmitry Kovalev authored
Removed functions:
* vp9_sad_16x16_mmx
* vp9_sad_8x16_mmx
* vp9_sad_16x8_mmx
* vp9_sad_8x8_mmx
* vp9_sad_4x4_mmx
Change-Id: Ic5174b93b64d65d846f0c11e72cab149e9472bc3
-
Deb Mukherjee authored
Adds config parameter vp9_highbitdepth, to support highbitdepth profiles. Also includes most vpx level high bit-depth functions. However encode/decode in the highbitdepth profiles will not work until the rest of the code is in place. Change-Id: I34c53b253c38873611057a6cbc89a1361b8985a6
-
Jingning Han authored
This commit skips the compound inter mode prediction check in the rate-distortion optimization loop for ARF coding. It reduces the runtime for certain test clips at speed 3, at no compression performance change:
bus CIF 1000 kbps, 8260 ms -> 8090 ms, 1.8% speed-up
stockholm 720p 1000 kbps, 74453 ms -> 71826 ms, 2.9% speed-up
No visible speed-up for pedestrian area 1080p at 2000 kbps.
Change-Id: Ic68aa56837159b726563b784e2e3729e846465ad
-
- 30 Aug, 2014 2 commits
-
-
Dmitry Kovalev authored
Change-Id: I571ce84c97087f8a1a36a10058393bfdcefbf72a
-
Dmitry Kovalev authored
New code is 10% faster for 64-bit and 25% faster for 32-bit. Compiled using clang. Change-Id: I8ba1544c30dd6f3ca479db806384317549650dfc
-
- 29 Aug, 2014 7 commits
-
-
Jingning Han authored
Use unsigned int type to store the sse in the pixel domain. The precision is sufficient to handle sse of block size up to 64x64. The transform domain version however needs int64_t, since there is a transfer gain applied in the forward transformation that might cause unsigned int overflow. Change-Id: Ifef97c38597e426262290f35341fbb093cf0a079
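As a rough check of the precision claim above, the worst-case pixel-domain SSE can be bounded by assuming every sample differs by the maximum possible amount. This is a minimal sketch: `max_pixel_sse` and `sse_fits_in_uint32` are hypothetical helpers, not libvpx functions, and 8-bit samples are an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of libvpx): upper bound on the sum of
 * squared errors for a width x height block whose samples have
 * `bit_depth` bits, assuming every sample is off by the maximum. */
static uint64_t max_pixel_sse(int width, int height, int bit_depth) {
  assert(width > 0 && height > 0 && bit_depth > 0 && bit_depth <= 16);
  const uint64_t max_diff = ((uint64_t)1 << bit_depth) - 1;
  return (uint64_t)width * (uint64_t)height * max_diff * max_diff;
}

/* Returns 1 if the worst-case SSE fits in a 32-bit unsigned int. */
static int sse_fits_in_uint32(int width, int height, int bit_depth) {
  return max_pixel_sse(width, height, bit_depth) <= UINT32_MAX;
}
```

For a 64x64 block of 8-bit samples the bound is 64 * 64 * 255 * 255 = 266342400, which is well below 2^32.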
-
Yunqing Wang authored
Added the missing "int". Change-Id: I7c8af3dee700837b40f010d53e1431a59370ae3a
-
James Zern authored
Store the number of allocated rows in VP9LfSync; the calculated values cannot be relied on when dealing with corrupt material. Change-Id: I13b8bcec9738c299a71df726772ab7ac05511e5b
-
Dmitry Kovalev authored
Removed functions:
* vp9_mse16x16_mmx
* vp9_get_mb_ss_mmx
* vp9_get4x4var_mmx
* vp9_get8x8var_mmx
* vp9_variance4x4_mmx
* vp9_variance8x8_mmx
* vp9_variance16x16_mmx
* vp9_variance16x8_mmx
* vp9_variance8x16_mmx
They all have SSE2 equivalents.
Change-Id: I3796f2477c4f59b35b4828f46a300c16e62a2615
-
Jingning Han authored
This commit allows the encoder to skip the intra coding mode test when the known inter residual is less than the source variance. It reduces the runtime of speed 3 for test clips:
bus cif 1000 kbps: 8587 ms -> 8260 ms, 3.8% speed-up
pedestrian 1080p 2000 kbps: 161381 ms -> 155241 ms, 3.7% speed-up
Compression performance change: derf -0.36%, stdhd -0.25%.
Change-Id: I75ce1e035b4da2153cb1ac14111d1a07c05a735d
-
Jingning Han authored
This commit extends the sse and forward transform computation flag to support the case of 64x64 blocks, where there are four 32x32 2D-DCT blocks. Change-Id: I86a3e805dfaa0f3abd812f590520c71aa0e40473
-
James Zern authored
prevents any problems resuming decode after decoding a corrupt frame Change-Id: Ib7eb1b5c062aebe71074fef1ece32a32822c16be
-
- 28 Aug, 2014 4 commits
-
-
Dmitry Kovalev authored
New SSE2 function is three times faster than MMX one. Change-Id: I4f387ce9f75b88379176ec7bdc62d86eb5f70fbe
-
Dmitry Kovalev authored
In order to understand memory layout consider the declaration of the following structs.

The first one is a part of our API:
  struct vpx_codec_ctx {
    // ...
    struct vpx_codec_priv *priv;
  };

The second one is defined in vpx_codec_internal.h:
  struct vpx_codec_priv {
    // ...
  };

The following struct is defined 4 times for encoder/decoder VP8/VP9:
  struct vpx_codec_alg_priv {
    struct vpx_codec_priv base;
    // ...
  };

Private data allocation for the given ctx:
  struct vpx_codec_ctx *ctx = <get>
  struct vpx_codec_alg_priv *alg_priv = <allocate>
  ctx->priv = (struct vpx_codec_priv *)alg_priv;

The cast works because vpx_codec_alg_priv has a vpx_codec_priv instance as a first member 'base'.
Change-Id: I10d1afc8c9a7dfda50baade8c7b0296678bdb0d0
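The layout described in this commit message can be sketched as a small standalone program fragment. The structs below are simplified stand-ins that reuse the names from the message; the fields `sz` and `alg_specific_state` and the helpers `attach_alg_priv` and `get_alg_priv` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins for the structs named in the commit message.
 * The members shown are illustrative assumptions, not the real ones. */
struct vpx_codec_priv {
  unsigned int sz; /* hypothetical field */
};

struct vpx_codec_ctx {
  struct vpx_codec_priv *priv;
};

struct vpx_codec_alg_priv {
  struct vpx_codec_priv base; /* must stay the first member */
  int alg_specific_state;     /* hypothetical field */
};

/* Hypothetical helper: allocates algorithm-private data and attaches it
 * to ctx through the base pointer. Returns 0 on success, -1 on failure. */
static int attach_alg_priv(struct vpx_codec_ctx *ctx) {
  struct vpx_codec_alg_priv *alg_priv = calloc(1, sizeof(*alg_priv));
  if (alg_priv == NULL) return -1;
  alg_priv->base.sz = (unsigned int)sizeof(*alg_priv);
  alg_priv->alg_specific_state = 42;
  /* Valid because a pointer to a struct, suitably converted, points to
   * its first member. */
  ctx->priv = (struct vpx_codec_priv *)alg_priv;
  return 0;
}

/* Hypothetical helper: recovers the algorithm-specific view of the
 * private data by casting the base pointer back. */
static struct vpx_codec_alg_priv *get_alg_priv(struct vpx_codec_ctx *ctx) {
  return (struct vpx_codec_alg_priv *)ctx->priv;
}
```

Generic code can keep working with `ctx->priv` as a `vpx_codec_priv`, while each codec casts it back to its own `vpx_codec_alg_priv` definition; the caller frees the allocation through either pointer.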
-
Yunqing Wang authored
In the partition search, the encoder checks all possible partitionings in the superblock's partition search tree. This patch proposes a set of criteria for partition search early termination, which decide whether or not to terminate the search in the current branch based on the "skippable" result of the quantized transform coefficients. The "skippable" information is gathered during the partition mode search, and no overhead calculations are introduced. This patch gives significant encoding speed gains without sacrificing quality.

Borg test results:
1. At speed 1, stdhd set: psnr: +0.074%, ssim: +0.093%; derf set: psnr: -0.024%, ssim: +0.011%;
2. At speed 2, stdhd set: psnr: +0.033%, ssim: +0.100%; derf set: psnr: -0.062%, ssim: +0.003%;
3. At speed 3, stdhd set: psnr: +0.060%, ssim: +0.190%; derf set: psnr: -0.064%, ssim: -0.002%;
4. At speed 4, stdhd set: psnr: +0.070%, ssim: +0.143%; derf set: psnr: -0.104%, ssim: +0.039%;

The speedup ranges from several percent to 60+%.
                    speed1  speed2  speed3  speed4
(1080p, 100f):
  old_town_cross:   48.2%   23.9%   20.8%   16.5%
  park_joy:         11.4%   17.8%   29.4%   18.2%
  pedestrian_area:  10.7%    4.0%    4.2%    2.4%
(720p, 200f):
  mobcal:           68.1%   36.3%   34.4%   17.7%
  parkrun:          15.8%   24.2%   37.1%   16.8%
  shields:          45.1%   32.8%   30.1%    9.6%
(cif, 300f):
  bus:               3.7%   10.4%   14.0%    7.9%
  deadline:         13.6%   14.8%   12.6%   10.9%
  mobile:            5.3%   11.5%   14.7%   10.7%
Change-Id: I246c38fb952ad762ce5e365711235b605f470a66
-
Deb Mukherjee authored
Updates the vp9_pattern_search function to return integer one-away neighbors' sad values, for subsequent use in speeding up the sub-pel search. Also, removes code for the do_refine option which is not being used currently. Updates the integer and subpel functions to pass in a 5-element sad list for output or input. A new pruned sub-pel search algorithm is implemented that uses the sad returned from the integer pel search. But it is not deployed yet. Change-Id: Ifa9f5ad024b5b660570366d2bd900343e1891520
-
- 27 Aug, 2014 6 commits
-
-
James Zern authored
attempting to decode a frame after the previous frame failed has the potential of interrupting an earlier loop filter task Change-Id: I6f2b1ddcdf5b89c3e2ee8caf5289dada2a087d66
-
Jingning Han authored
This commit reworks the operation flow related to prediction residual generation and rate-distortion modeling. It saves one call to model_rd_for_sb. Change-Id: Icaf96c0ff09c903637ed5283448afe01d798195f
-
Jingning Han authored
The value of switchable rate has been stored in a local variable. This change skips the second call to vp9_get_switchable_rate() by reusing the local variable. Change-Id: Ib7d3fef7621cc4bde94c6d6e6b3a71f1fd4559f2
-
Jingning Han authored
Check the mode and motion vector cost. If it is already above the existing best rate-distortion cost, skip the remaining checks for this mode. Change-Id: Ie065cebdfda2a3be3be18b8e8b43dc29aaa8c179
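The early-exit check described above can be sketched as follows. This is an illustration under stated assumptions, not the encoder's code: `rd_cost` and `can_skip_mode` are hypothetical helpers, and the cost model (lambda times rate plus distortion, without fixed-point scaling) is a simplification.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified rate-distortion cost: lambda-weighted rate
 * plus distortion. Real encoders typically use fixed-point scaling. */
static int64_t rd_cost(int64_t lambda, int rate, int64_t distortion) {
  return lambda * rate + distortion;
}

/* Returns 1 when the mode can be skipped: even with zero distortion,
 * the cost of signalling the mode and motion vector alone already
 * exceeds the best rate-distortion cost found so far. */
static int can_skip_mode(int64_t lambda, int mode_rate, int mv_rate,
                         int64_t best_rd) {
  const int64_t lower_bound = rd_cost(lambda, mode_rate + mv_rate, 0);
  return lower_bound > best_rd;
}
```

Because distortion is never negative, the signalling cost is a lower bound on the full cost, so skipping on this test cannot discard a mode that would have beaten the current best.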
-
Jingning Han authored
This commit makes the rate-distortion modeling run in units of the maximum transform block size. No compression/speed change observed. It is intended for later use by the fast forward transform. Change-Id: Ibaaedb69c765e8d0c5d5012f0ec07f36fd9f68fd
-
James Zern authored
if the first frame was corrupt and loop filter not called, the next call would assume the necessary allocations had been done and segfault when accessing a NULL pointer Change-Id: Ib6ef505e5c594e6f0fe65ab0700172bcf06b92a6
-
- 26 Aug, 2014 6 commits
-
-
Dmitry Kovalev authored
Change-Id: Icfacc695a711ec325b1d8f2b5d927a720e2bd6b4
-
Dmitry Kovalev authored
Change-Id: I483a2fefc5f9ea4533dfd64448f3b6b426dd9eed
-
Yaowu Xu authored
This commit adds a new strategy to reduce the search for the optimal interpolation filter type. The encoder counts and stores how many times each filter type is selected and used for each of the reference frames. A filter type that is rarely used for all three reference frames is masked out to avoid computation. The impact on compression is negligible: -0.02% on derf, +0.02% on stdhd. Encoding time is seen to reduce by 2~3%. Change-Id: Ibafa92291b51185de40da513716222db4b230383
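The masking idea described above can be sketched as follows. This is a minimal sketch, not the encoder's implementation: the function name `rarely_used_filter_mask`, the array layout, and the percentage threshold used to define "rarely used" are all assumptions, since the commit message does not state the actual criterion.

```c
#include <assert.h>

#define NUM_REF_FRAMES 3   /* assumed: three reference frames */
#define NUM_FILTER_TYPES 3 /* assumed: three switchable filter types */

/* counts[ref][filter] holds how many times each filter type was chosen
 * for each reference frame. Returns a bitmask with bit f set when
 * filter f is rarely used for ALL reference frames. "Rarely" is a
 * hypothetical criterion here: fewer than threshold_percent of that
 * reference frame's selections. */
static unsigned int rarely_used_filter_mask(
    int counts[NUM_REF_FRAMES][NUM_FILTER_TYPES], int threshold_percent) {
  unsigned int mask = 0;
  for (int f = 0; f < NUM_FILTER_TYPES; ++f) {
    int rare_for_all_refs = 1;
    for (int ref = 0; ref < NUM_REF_FRAMES; ++ref) {
      int total = 0;
      for (int k = 0; k < NUM_FILTER_TYPES; ++k) total += counts[ref][k];
      /* With no statistics, stay conservative and keep the filter.
       * Otherwise compare count/total against threshold/100 using
       * integer arithmetic only. */
      if (total == 0 || counts[ref][f] * 100 >= total * threshold_percent) {
        rare_for_all_refs = 0;
        break;
      }
    }
    if (rare_for_all_refs) mask |= 1u << f;
  }
  return mask;
}
```

A filter search loop would then skip any filter type whose bit is set in the returned mask.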
-
Dmitry Kovalev authored
Change-Id: Icab9a4399c5687453f4bec14b8cb5000464335e5
-
Dmitry Kovalev authored
Using local variable instead. Change-Id: If592d73ba2b04972cdae938751155c183a6db25a
-
Dmitry Kovalev authored
We don't output invisible frames with VP9. Change-Id: I7b874d3ac454c1b2966d5d7d72e12a864b49afae
-
- 25 Aug, 2014 3 commits
-
-
Dmitry Kovalev authored
The variable is never read. Change-Id: I94141c1667fa5d10604cd6f83c5f64df107dee94
-
Dmitry Kovalev authored
Change-Id: I2b9609dd22bacbf26e669f70bf155613b0316eb3
-
Minghai Shang authored
We can use one frame context for each layer so that we don't have to reset the probs every frame. But we can't use prev_mi since we may drop enhancement layers. So we have to generate a non-VP9-compatible bitstream and modify it in the player.
1. We need to code all frames as invisible frames so that prev_mi is not used. But in the bitstream we need to code the show_frame flag as 1 so that the publisher will know it's supposed to be a visible frame.
2. In the player we need to change the show_frame flag to 0 for all frames, then add a one-byte frame to the super frame to tell the decoder which layer we want to show.
Change-Id: I75b7304cf31f0ab952f043e33c034495e88f01f3
-
- 22 Aug, 2014 6 commits
-
-
Dmitry Kovalev authored
Using local variables instead. Change-Id: I68737f7e392b81492ffd3ef2c2ff9afbf55fb097
-
Dmitry Kovalev authored
Change-Id: Ia3be6b5a18e1ff6cc5c5f4d37e4a5d0972388308
-
Dmitry Kovalev authored
This patch fixes the slow first pass problem. The mode could only be determined from the deadline value during the frame encode call. Unfortunately, we use the mode value before any encode calls during first pass encoding (see set_speed_features() logic). The mode for the first pass must be different from BEST to make the first pass fast. Change-Id: I562a7d32004ff631695d91c09a44d8a9076fd6b5
-
Jim Bankoski authored
Change-Id: I2b4f4e929495837817010eae12aa6225899afaff
-
Jim Bankoski authored
Change-Id: Ic39cc0deafb3ed509434d3d9953b99713de7394a
-
Jim Bankoski authored
Change-Id: I6d77a7c775c0482fd1f9bb03ea6f336dd2973fa0
-