1. 03 Sep, 2014 3 commits
  2. 02 Sep, 2014 3 commits
    • Removing MMX SAD calculation code. · 318fc0c3
      Dmitry Kovalev authored
      Removed functions:
      * vp9_sad_16x16_mmx
      * vp9_sad_8x16_mmx
      * vp9_sad_16x8_mmx
      * vp9_sad_8x8_mmx
      * vp9_sad_4x4_mmx
      Change-Id: Ic5174b93b64d65d846f0c11e72cab149e9472bc3
    • Adds config opt for highbitdepth + misc. vpx · 5acfafb1
      Deb Mukherjee authored
      Adds config parameter vp9_highbitdepth, to support highbitdepth profiles.
      Also includes most vpx level high bit-depth functions. However
      encode/decode in the highbitdepth profiles will not work until
      the rest of the code is in place.
      Change-Id: I34c53b253c38873611057a6cbc89a1361b8985a6
    • Skip comp inter mode tests for arf coding · 33176fef
      Jingning Han authored
      This commit skips the compound inter mode prediction check in the
      rate-distortion optimization loop for ARF coding. It reduces the
      runtime for certain test clips at speed 3, with no change in
      compression performance:
      bus CIF 1000 kbps, 8260 ms -> 8090 ms, 1.8% speed-up
      stockholm 720p 1000 kbps, 74453 ms -> 71826 ms, 2.9% speed-up
      No visible speed-up for pedestrian area 1080p at 2000 kbps.
      Change-Id: Ic68aa56837159b726563b784e2e3729e846465ad
  3. 30 Aug, 2014 2 commits
  4. 29 Aug, 2014 7 commits
    • Fix int64_t to unsigned int conversion warnings · 6ddf1e15
      Jingning Han authored
      Use unsigned int type to store the sse in the pixel domain. The
      precision is sufficient to handle sse of block size up to 64x64.
      The transform domain version however needs int64_t, since there is
      a transfer gain applied in the forward transformation that might
      cause unsigned int overflow.
      Change-Id: Ifef97c38597e426262290f35341fbb093cf0a079
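The precision argument in the commit above can be checked with a little arithmetic. The following is a minimal sketch, assuming 8-bit samples and an `unsigned int` of at least 32 bits; `max_pixel_sse` and `block_sse` are hypothetical helpers written for illustration, not libvpx functions.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Worst-case pixel-domain SSE for a w x h block of 8-bit samples. */
static uint64_t max_pixel_sse(unsigned int w, unsigned int h) {
  const uint64_t max_diff = 255; /* largest |src - pred| for 8-bit samples */
  return (uint64_t)w * h * max_diff * max_diff;
}

/* Hypothetical helper (not a libvpx function): pixel-domain SSE of a block,
 * accumulated in unsigned int. */
static unsigned int block_sse(const uint8_t *src, int src_stride,
                              const uint8_t *pred, int pred_stride,
                              int w, int h) {
  unsigned int sse = 0;
  int r, c;
  /* The accumulator cannot overflow as long as the worst case fits. */
  assert(max_pixel_sse((unsigned int)w, (unsigned int)h) <= UINT_MAX);
  for (r = 0; r < h; ++r) {
    for (c = 0; c < w; ++c) {
      const int diff = src[r * src_stride + c] - pred[r * pred_stride + c];
      sse += (unsigned int)(diff * diff);
    }
  }
  return sse;
}
```

For a 64x64 block the bound is 64 * 64 * 255 * 255, which is well below the 32-bit limit. The transform-domain case is not covered by this bound, because the forward transform scales the coefficients.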
    • Minor fix in vp9_encoder.h · 96c43e8a
      Yunqing Wang authored
      Added the missing "int".
      Change-Id: I7c8af3dee700837b40f010d53e1431a59370ae3a
    • vp9: fix m/t loop filter invalid free · fec40f92
      James Zern authored
      Store the number of allocated rows in VP9LfSync; the calculated values
      cannot be relied on when dealing with corrupt material.
      Change-Id: I13b8bcec9738c299a71df726772ab7ac05511e5b
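The idea in the commit above, remembering how many rows were allocated instead of recomputing the count at free time, can be sketched as follows. `DemoLfSync` and its functions are hypothetical stand-ins; the real VP9LfSync layout is not shown in this log.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-row sync structure. */
typedef struct {
  int **row_state; /* one small allocation per superblock row */
  int rows;        /* number of rows actually allocated */
} DemoLfSync;

static int demo_lf_sync_alloc(DemoLfSync *sync, int rows) {
  int i;
  assert(rows > 0);
  sync->row_state = calloc((size_t)rows, sizeof(*sync->row_state));
  if (!sync->row_state) {
    sync->rows = 0;
    return -1;
  }
  sync->rows = rows; /* record the count at allocation time */
  for (i = 0; i < rows; ++i) {
    sync->row_state[i] = calloc(1, sizeof(**sync->row_state));
    if (!sync->row_state[i]) return -1; /* caller cleans up via dealloc */
  }
  return 0;
}

static void demo_lf_sync_dealloc(DemoLfSync *sync) {
  int i;
  /* Use the stored count, never a value re-derived from a frame header
   * that may be corrupt. */
  for (i = 0; i < sync->rows; ++i) free(sync->row_state[i]);
  free(sync->row_state);
  sync->row_state = NULL;
  sync->rows = 0;
}
```

If the row count were recomputed from a corrupt frame size, the free loop could walk past the end of the array; keeping the count next to the allocation avoids that.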
    • Removing variance MMX code. · 12cd6f42
      Dmitry Kovalev authored
      Removed functions:
      * vp9_mse16x16_mmx
      * vp9_get_mb_ss_mmx
      * vp9_get4x4var_mmx
      * vp9_get8x8var_mmx
      * vp9_variance4x4_mmx
      * vp9_variance8x8_mmx
      * vp9_variance16x16_mmx
      * vp9_variance16x8_mmx
      * vp9_variance8x16_mmx
      They all have SSE2 equivalents.
      Change-Id: I3796f2477c4f59b35b4828f46a300c16e62a2615
    • Skip intra mode tests depending on inter residuals · 4282955e
      Jingning Han authored
      This commit allows the encoder to skip the intra coding mode test when
      the known inter residual is less than the source variance. It
      reduces the runtime at speed 3 for test clips:
      bus cif 1000 kbps: 8587 ms -> 8260 ms, 3.8% speed-up
      pedestrian 1080p 2000 kbps: 161381 ms -> 155241 ms, 3.7% speed-up.
      The compression performance is down by
      derf   -0.36%
      stdhd  -0.25%
      Change-Id: I75ce1e035b4da2153cb1ac14111d1a07c05a735d
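The skip condition in the commit above might look like the sketch below. It assumes the "inter residual" is a sum of squared errors and the "source variance" is the sum of squared deviations from the block mean; both helper names are hypothetical, and the encoder's actual measures and thresholds are not given in this log.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sum of squared deviations from the block mean. */
static unsigned int demo_source_variance(const uint8_t *src, int stride,
                                         int w, int h) {
  uint64_t sum = 0, sum_sq = 0;
  int r, c;
  assert(w > 0 && h > 0);
  for (r = 0; r < h; ++r) {
    for (c = 0; c < w; ++c) {
      const unsigned int v = src[r * stride + c];
      sum += v;
      sum_sq += (uint64_t)v * v;
    }
  }
  return (unsigned int)(sum_sq - sum * sum / (uint64_t)(w * h));
}

/* Returns 1 when the intra mode test can be skipped: the inter prediction
 * already leaves less residual energy than the source block itself has. */
static int demo_skip_intra_test(unsigned int inter_sse,
                                unsigned int source_variance) {
  return inter_sse < source_variance;
}
```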
    • Extend block level sse to support multiple txfm blocks · 02e6ecdc
      Jingning Han authored
      This commit extends the sse and forward transform computation flag
      to support the case of 64x64 blocks, where there are four 32x32 2D-DCTs.
      Change-Id: I86a3e805dfaa0f3abd812f590520c71aa0e40473
    • vp9: sync workers at the start of decode_tiles_mt() · dbdff12b
      James Zern authored
      prevents any problems resuming decode after decoding a corrupt frame
      Change-Id: Ib7eb1b5c062aebe71074fef1ece32a32822c16be
  5. 28 Aug, 2014 4 commits
    • Implementing 4x4 variance calculation with SSE2. · dcac083c
      Dmitry Kovalev authored
      The new SSE2 function is three times faster than the MMX one.
      Change-Id: I4f387ce9f75b88379176ec7bdc62d86eb5f70fbe
    • Removing alg_priv from vpx_codec_priv struct. · 73edeb03
      Dmitry Kovalev authored
      In order to understand the memory layout, consider the declarations of
      the following structs. The first one is a part of our API:
      struct vpx_codec_ctx {
        // ...
        struct vpx_codec_priv *priv;
      };
      The second one is defined in vpx_codec_internal.h:
      struct vpx_codec_priv {
        // ...
      };
      The following struct is defined 4 times for encoder/decoder VP8/VP9:
      struct vpx_codec_alg_priv {
        struct vpx_codec_priv base;
        // ...
      };
      Private data allocation for the given ctx:
      struct vpx_codec_ctx *ctx = <get>
      struct vpx_codec_alg_priv *alg_priv = <allocate>
      ctx->priv = (struct vpx_codec_priv *)alg_priv;
      The cast works because vpx_codec_alg_priv has a
      vpx_codec_priv instance as a first member 'base'.
      Change-Id: I10d1afc8c9a7dfda50baade8c7b0296678bdb0d0
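The first-member cast described in the commit above can be tried in isolation. This is a minimal sketch using simplified, hypothetical `demo_*` structs that mirror the shape of the ones named in the message; the field names `sz` and `alg_state` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_priv {
  unsigned int sz; /* assumed field, for illustration only */
};

struct demo_ctx {
  struct demo_priv *priv;
};

struct demo_alg_priv {
  struct demo_priv base; /* must stay the first member */
  int alg_state;         /* assumed algorithm-specific field */
};

static int demo_ctx_init(struct demo_ctx *ctx) {
  struct demo_alg_priv *alg_priv = calloc(1, sizeof(*alg_priv));
  if (!alg_priv) return -1;
  /* C guarantees a pointer to a struct also points to its first member,
   * which is what makes the casts below valid. */
  assert(offsetof(struct demo_alg_priv, base) == 0);
  alg_priv->base.sz = (unsigned int)sizeof(*alg_priv);
  alg_priv->alg_state = 42;
  ctx->priv = (struct demo_priv *)alg_priv;
  return 0;
}

/* Recover the algorithm-specific view from the generic pointer. */
static struct demo_alg_priv *demo_ctx_alg_priv(struct demo_ctx *ctx) {
  return (struct demo_alg_priv *)ctx->priv;
}

static void demo_ctx_destroy(struct demo_ctx *ctx) {
  free(ctx->priv);
  ctx->priv = NULL;
}
```

Because the generic pointer and the algorithm-specific pointer refer to the same address, a separate `alg_priv` pointer inside the base struct carries no extra information.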
    • Early termination in encoding partition search · 4d2c3769
      Yunqing Wang authored
      In the partition search, the encoder checks all possible
      partitionings in the superblock's partition search tree.
      This patch proposes a set of criteria for partition search
      early termination, which decide whether or not to terminate
      the search in the current branch based on the "skippable"
      result of the quantized transform coefficients.
      The "skippable" information is gathered during the
      partition mode search, and no overhead calculations are
      introduced.
      This patch gives significant encoding speed gains without
      sacrificing quality.
      Borg test results:
      1. At speed 1,
         stdhd set: psnr: +0.074%, ssim: +0.093%;
         derf set:  psnr: -0.024%, ssim: +0.011%;
      2. At speed 2,
         stdhd set: psnr: +0.033%, ssim: +0.100%;
         derf set:  psnr: -0.062%, ssim: +0.003%;
      3. At speed 3,
         stdhd set: psnr: +0.060%, ssim: +0.190%;
         derf set:  psnr: -0.064%, ssim: -0.002%;
      4. At speed 4,
         stdhd set: psnr: +0.070%, ssim: +0.143%;
         derf set:  psnr: -0.104%, ssim: +0.039%;
      The speedup ranges from several percent to 60+%.
                       speed1    speed2    speed3    speed4
      (1080p, 100f):
      old_town_cross:  48.2%     23.9%     20.8%     16.5%
      park_joy:        11.4%     17.8%     29.4%     18.2%
      pedestrian_area: 10.7%      4.0%      4.2%      2.4%
      (720p, 200f):
      mobcal:          68.1%     36.3%     34.4%     17.7%
      parkrun:         15.8%     24.2%     37.1%     16.8%
      shields:         45.1%     32.8%     30.1%      9.6%
      (cif, 300f):
      bus:              3.7%     10.4%     14.0%      7.9%
      deadline:        13.6%     14.8%     12.6%     10.9%
      mobile:           5.3%     11.5%     14.7%     10.7%
      Change-Id: I246c38fb952ad762ce5e365711235b605f470a66
    • Updates vp9_pattern search to return integer sads · 04b100b2
      Deb Mukherjee authored
      Updates the vp9_pattern_search function to return integer one-away
      neighbors' sad values, for subsequent use in speeding up the
      sub-pel search. Also, removes code for the do_refine option
      which is not being used currently.
      Updates the integer and subpel functions to pass in a 5-element
      sad list for output or input.
      A new pruned sub-pel search algorithm is implemented that uses
      the sad returned from the integer pel search. But it is not
      deployed yet.
      Change-Id: Ifa9f5ad024b5b660570366d2bd900343e1891520
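A 5-element SAD list of the kind mentioned in the commit above could be filled as in the sketch below. The ordering (center, left, up, right, down) and both function names are assumptions made for illustration; the log does not specify the layout used by vp9_pattern_search.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences between two w x h blocks. */
static unsigned int demo_sad(const uint8_t *a, int a_stride,
                             const uint8_t *b, int b_stride, int w, int h) {
  unsigned int sad = 0;
  int r, c;
  for (r = 0; r < h; ++r)
    for (c = 0; c < w; ++c)
      sad += (unsigned int)abs(a[r * a_stride + c] - b[r * b_stride + c]);
  return sad;
}

/* Fills sad_list with the SAD at the best integer position and at its four
 * one-away neighbors. Assumed order: center, left, up, right, down.
 * ref must have at least one pixel of margin on every side. */
static void demo_fill_sad_list(const uint8_t *src, int src_stride,
                               const uint8_t *ref, int ref_stride,
                               int w, int h, unsigned int sad_list[5]) {
  static const int d_row[5] = { 0, 0, -1, 0, 1 };
  static const int d_col[5] = { 0, -1, 0, 1, 0 };
  int i;
  assert(sad_list != NULL);
  for (i = 0; i < 5; ++i) {
    const uint8_t *cand = ref + d_row[i] * ref_stride + d_col[i];
    sad_list[i] = demo_sad(src, src_stride, cand, ref_stride, w, h);
  }
}
```

A sub-pel search can then use the relative sizes of the neighbor SADs to decide which fractional directions are worth testing, instead of evaluating all of them.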
  6. 27 Aug, 2014 6 commits
    • vp9: fix crash in inline loopfilter w/corrupt file · cde790c3
      James Zern authored
      attempting to decode a frame after the previous frame failed has the
      potential of interrupting an earlier loop filter task
      Change-Id: I6f2b1ddcdf5b89c3e2ee8caf5289dada2a087d66
    • Re-work RD modeling based on inter frame prediction residual · 993ef8bd
      Jingning Han authored
      This commit reworks the operation flow related to prediction
      residual generation and rate-distortion modeling. It saves one
      call to model_rd_for_sb.
      Change-Id: Icaf96c0ff09c903637ed5283448afe01d798195f
    • Re-use switchable rate value in handle_inter_mode · 4db022c3
      Jingning Han authored
      The value of switchable rate has been stored in a local variable.
      This change skips the second call to vp9_get_switchable_rate() by
      reusing the local variable.
      Change-Id: Ib7d3fef7621cc4bde94c6d6e6b3a71f1fd4559f2
    • Add an early termination check in handle_inter_mode · cd228fcd
      Jingning Han authored
      Check the mode and motion vector cost. If it is already above
      the existing best rate-distortion cost, skip the remaining checks
      for this mode.
      Change-Id: Ie065cebdfda2a3be3be18b8e8b43dc29aaa8c179
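The early termination check in the commit above relies on the rate-only cost being a lower bound on the full rate-distortion cost. A minimal sketch, assuming a simple cost of the form lambda * rate + distortion (this is not libvpx's RDCOST macro, and both function names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cost model: lambda-weighted rate plus distortion. */
static int64_t demo_rd_cost(int64_t lambda, int rate, int64_t dist) {
  return lambda * rate + dist;
}

/* Returns 1 when the mode can be skipped. Distortion is never negative, so
 * the cost of the mode and motion vector bits alone is a lower bound on the
 * full cost of the mode. */
static int demo_can_skip_mode(int64_t lambda, int mode_rate, int mv_rate,
                              int64_t best_rd) {
  assert(mode_rate >= 0 && mv_rate >= 0);
  return demo_rd_cost(lambda, mode_rate + mv_rate, 0) > best_rd;
}
```

When the check fires, the prediction, transform, and quantization work for that mode is never started, which is where the time saving comes from.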
    • Use max txfm size unit in rate-distortion cost modeling · ec7ce316
      Jingning Han authored
      This commit makes the rate-distortion modeling run in units of the
      maximum transform block size. No compression/speed change observed.
      It is intended for later use by a fast forward transform.
      Change-Id: Ibaaedb69c765e8d0c5d5012f0ec07f36fd9f68fd
    • vp9: fix crash in mt loopfilter w/corrupt file · 4f27202d
      James Zern authored
      if the first frame was corrupt and loop filter not called, the next call
      would assume the necessary allocations had been done and segfault when
      accessing a NULL pointer
      Change-Id: Ib6ef505e5c594e6f0fe65ab0700172bcf06b92a6
  7. 26 Aug, 2014 6 commits
  8. 25 Aug, 2014 3 commits
    • Removing tx_stepdown_count from VP9_COMP. · 4478553e
      Dmitry Kovalev authored
      The variable is never read.
      Change-Id: I94141c1667fa5d10604cd6f83c5f64df107dee94
    • Cleaning up is_background(). · e576c42f
      Dmitry Kovalev authored
      Change-Id: I2b9609dd22bacbf26e669f70bf155613b0316eb3
    • [spatial svc]Multiple frame context feature · d4a407c0
      Minghai Shang authored
      We can use one frame context for each layer so that we don't have
      to reset the probs every frame. But we can't use prev_mi since we
      may drop enhancement layers. So we have to generate a bitstream that
      is not VP9 compatible and modify it in the player.
      1. We need to code all frames as invisible frames so that prev_mi
         is not used. But in the bitstream we need to code the
         show_frame flag as 1 so that the publisher will know it's
         supposed to be a visible frame.
      2. In the player we need to change the show_frame flag to 0 for
         all frames, then add a one-byte frame to the superframe
         to tell the decoder which layer we want to show.
      Change-Id: I75b7304cf31f0ab952f043e33c034495e88f01f3
  9. 22 Aug, 2014 6 commits