1. 19 Apr, 2013 2 commits
    • John Koleszar's avatar
      Removing rounding from UV MV calculation · 2987fa1d
      John Koleszar authored
      Consider the previous behavior for the MV 1 3/8 (11/8 pel). In the
      existing code, the fractional part of the MV is considered separately,
      and rounded is applied, giving a result of 6/8. Rounding is not required
      in this case, as we're increasing the precision from a q3 to a q4, and
      the correct value 11/16 can be represented exactly.
      
      Slight gain observed (+.033 average on derf)
      
      Change-Id: I320e160e8b12f1dd66aa0ce7966b5088870fe9f8
      2987fa1d
    • John Koleszar's avatar
      make buid_inter_predictors block size agnostic (luma) · 4924934d
      John Koleszar authored
      This commit converts the luma versions of vp9_build_inter_predictors_sb
      to use a common function. Update the convolution functions to support
      block sizes larger than 16x16, and add a foreach_predicted_block walker.
      
      Next step will be to calculate the UV motion vector and implement SBUV,
      then fold in vp9_build_inter16x16_predictors_mb and SPLITMV.
      
      At the 16x16, 32x32, and 64x64 levels implemented in this commit, each
      plane is predicted with only a single call to vp9_build_inter_predictor.
      This is not yet called for SPLITMV. If the notion of SPLITMV/I8X8/I4X4
      goes away, then the prediction block walker can go away, since we'll
      always predict the whole bsize in a single step. Implemented using a
      block walker at this stage for SPLITMV, as a 4x4 "prediction block size"
      within the BLOCK_SIZE_MB16X16 macroblock. It would also support other
      rectangular sizes too, if the blocks smaller than 16x16 remain
      implemented as a SPLITMV-like thing. Just using 4x4 for now.
      
      There's also a potential to combine with the foreach_transformed_block
      walker if the logic for calculating the size of the subsampled
      transform is made more straightforward, perhaps as a consequence of
      supporing smaller macroblocks than 16x16. Will watch what happens there.
      
      Change-Id: Iddd9973398542216601b630c628b9b7fdee33fe2
      4924934d
  2. 18 Apr, 2013 1 commit
    • Jingning Han's avatar
      Make the use of pred buffers consistent in MB/SB · 6f43ff58
      Jingning Han authored
      Use in-place buffers (dst of MACROBLOCKD) for  macroblock prediction.
      This makes the macroblock buffer handling consistent with those of
      superblock. Remove predictor buffer MACROBLOCKD.
      
      Change-Id: Id1bcd898961097b1e6230c10f0130753a59fc6df
      6f43ff58
  3. 16 Apr, 2013 1 commit
    • Yunqing Wang's avatar
      Optimize the scaling calculation · 148eb803
      Yunqing Wang authored
      In decoder, the scaling calculation, such as (mv * x_num / x_den),
      is fairly time-consuming. In this patch, we check if the scaling
      happens or not at frame level, and then decide which function to
      call to skip scaling calculation when no scaling is needed. Tests
      showed a 3% decoder performance gain.
      
      Change-Id: I270901dd0331048e50368cfd51ce273dd82b8733
      148eb803
  4. 12 Apr, 2013 1 commit
    • Jingning Han's avatar
      Enable inter predictor for rectangular block size · 3ba9dd41
      Jingning Han authored
      Combine superblock inter predictors into a unified function that
      allows configurable block width and height. The inter predictions
      of block sizes smaller than 16x16 are handled differently. To be
      continued on merging them later.
      
      Change-Id: I14075959dd5e221f00c205c99ca35c1c31ef728e
      3ba9dd41
  5. 11 Apr, 2013 1 commit
    • Scott LaVarnway's avatar
      WIP: removing predictor buffer usage from decoder · 6189f2bc
      Scott LaVarnway authored
      This patch will use the dest buffer instead of the
      predictor buffer.  This will allow us in future commits
      to remove the extra mem copy that occurs in the dequant
      functions when eob == 0.  We should also be able to remove
      extra params that are passed into the dequant functions.
      
      Change-Id: I7241bc1ab797a430418b1f3a95b5476db7455f6a
      6189f2bc
  6. 27 Mar, 2013 1 commit
    • Dmitry Kovalev's avatar
      Removing redundant function arguments. · 17cddb4e
      Dmitry Kovalev authored
      Almost all arguments for vp9_build_inter32x32_predictors_sb and
      vp9_build_inter64x64_predictors_sb can be deduced from the first macroblock
      argument.
      
      Change-Id: I5d477a607586d05698d5b3b9b9bc03891dd3fe83
      17cddb4e
  7. 27 Feb, 2013 1 commit
    • John Koleszar's avatar
      Spatial resamping of ZEROMV predictors · eb939f45
      John Koleszar authored
      This patch allows coding frames using references of different
      resolution, in ZEROMV mode. For compound prediction, either
      reference may be scaled.
      
      To test, I use the resize_test and enable WRITE_RECON_BUFFER
      in vp9_onyxd_if.c. It's also useful to apply this patch to
      test/i420_video_source.h:
      
        --- a/test/i420_video_source.h
        +++ b/test/i420_video_source.h
        @@ -93,6 +93,7 @@ class I420VideoSource : public VideoSource {
      
           virtual void FillFrame() {
             // Read a frame from input_file.
        +    if (frame_ != 3)
             if (fread(img_->img_data, raw_sz_, 1, input_file_) == 0) {
               limit_ = frame_;
             }
      
      This forces the frame that the resolution changes on to be coded
      with no motion, only scaling, and improves the quality of the
      result.
      
      Change-Id: I1ee75d19a437ff801192f767fd02a36bcbd1d496
      eb939f45
  8. 26 Feb, 2013 2 commits
  9. 05 Feb, 2013 1 commit
    • John Koleszar's avatar
      Convert subpixel filters to use convolve framework · 7a07eea1
      John Koleszar authored
      Update the code to call the new convolution functions to do subpixel
      prediction rather than the existing functions. Remove the old C and
      assembly code, since it is unused. This causes a 50% performance
      reduction on the decoder, but that will be resolved when the asm for
      the new functions is available.
      
      There is no consensus for whether 6-tap or 2-tap predictors will be
      supported in the final codec, so these filters are implemented in
      terms of the 8-tap code, so that quality testing of these modes
      can continue. Implementing the lower complexity algorithms is a
      simple exercise, should it be necessary.
      
      This code produces slightly better results in the EIGHTTAP_SMOOTH
      case, since the filter is now applied in only one direction when
      the subpel motion is only in one direction. Like the previous code,
      the filtering is skipped entirely on full-pel MVs. This combination
      seems to give the best quality gains, but this may be indicative of a
      bug in the encoder's filter selection, since the encoder could
      achieve the result of skipping the filtering on full-pel by selecting
      one of the other filters. This should be revisited.
      
      Quality gains on derf positive on almost all clips. The only clip
      that seemed to be hurt at all datarates was football
      (-0.115% PSNR average, -0.587% min). Overall averages 0.375% PSNR,
      0.347% SSIM.
      
      Change-Id: I7d469716091b1d89b4b08adde5863999319d69ff
      7a07eea1
  10. 10 Jan, 2013 1 commit
  11. 08 Jan, 2013 1 commit
  12. 06 Jan, 2013 1 commit
  13. 18 Dec, 2012 1 commit
  14. 30 Nov, 2012 1 commit
  15. 28 Nov, 2012 1 commit
  16. 27 Nov, 2012 1 commit
    • John Koleszar's avatar
      Add vp9_ prefix to all vp9 files · fcccbcbb
      John Koleszar authored
      Support for gyp which doesn't support multiple objects in the same
      static library having the same basename.
      
      Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc
      fcccbcbb
  17. 01 Nov, 2012 3 commits
  18. 31 Oct, 2012 2 commits
  19. 17 Oct, 2012 1 commit
  20. 16 Oct, 2012 1 commit
  21. 14 Oct, 2012 2 commits
  22. 21 Aug, 2012 1 commit
  23. 16 Aug, 2012 1 commit
  24. 15 Aug, 2012 1 commit
    • Paul Wilkins's avatar
      Code clean up. · 77dc5c65
      Paul Wilkins authored
      Further cases of inconsistent naming convention.
      
      Change-Id: Id3411ecec6f01a4c889268a00f0c9fd5a92ea143
      77dc5c65
  25. 10 Aug, 2012 1 commit
  26. 09 Aug, 2012 1 commit
  27. 30 Jul, 2012 1 commit
    • Deb Mukherjee's avatar
      Adds support for switchable interpolation filters. · 52597441
      Deb Mukherjee authored
      Allows for swtiching/setting interpolation filters at the MB
      level. A frame level flag indicates whether to use a specifc
      filter for the entire frame or to signal the interpolation
      filter for each MB. When switchable filters are used, the
      encoder chooses between 8-tap and 8-tap sharp filters. The
      code currently has options to explore other variations as well,
      which will be cleaned up subsequently.
      
      One issue with the framework is that encoding is slow. I
      tried to do some tricks to speed things up but it is still slow.
      Decoding speed should not be affected since the number of
      filter taps remain unchanged.
      
      With the current version, we are up 0.5% on derf on average but
      some videos city/mobile improve by close to 4 and 2% respectively.
      If we did a full-search by turning the SEARCH_BEST_FILTER flag
      on, the results are somewhat better.
      
      The framework can be combined with filtered prediction, and I
      seek feedback regarding that.
      
      Rebased.
      
      Change-Id: I8f632cb2c111e76284140a2bd480945d6d42b77a
      52597441
  28. 18 Apr, 2012 1 commit
  29. 15 Mar, 2012 1 commit
    • Yaowu Xu's avatar
      WebM Experimental Codec Branch Snapshot · 6035da54
      Yaowu Xu authored
      This is a code snapshot of experimental work currently ongoing for a
      next-generation codec.
      
      The codebase has been cut down considerably from the libvpx baseline.
      For example, we are currently only supporting VBR 2-pass rate control
      and have removed most of the code relating to coding speed, threading,
      error resilience, partitions and various other features.  This is in
      part to make the codebase easier to work on and experiment with, but
      also because we want to have an open discussion about how the bitstream
      will be structured and partitioned and not have that conversation
      constrained by past work.
      
      Our basic working pattern has been to initially encapsulate experiments
      using configure options linked to #IF CONFIG_XXX statements in the
      code. Once experiments have matured and we are reasonably happy that
      they give benefit and can be merged without breaking other experiments,
      we remove the conditional compile statements and merge them in.
      
      Current changes include:
      * Temporal coding experiment for segments (though still only 4 max, it
        will likely be increased).
      * Segment feature experiment - to allow various bits of information to
        be coded at the segment level. Features tested so far include mode
        and reference frame information, limiting end of block offset and
        transform size, alongside Q and loop filter parameters, but this set
        is very fluid.
      * Support for 8x8 transform - 8x8 dct with 2nd order 2x2 haar is used
        in MBs using 16x16 prediction modes within inter frames.
      * Compound prediction (combination of signals from existing predictors
        to create a new predictor).
      * 8 tap interpolation filters and 1/8th pel motion vectors.
      * Loop filter modifications.
      * Various entropy modifications and changes to how entropy contexts and
        updates are handled.
      * Extended quantizer range matched to transform precision improvements.
      
      There are also ongoing further experiments that we hope to merge in the
      near future: For example, coding of motion and other aspects of the
      prediction signal to better support larger image formats, use of larger
      block sizes (e.g. 32x32 and up) and lossless non-transform based coding
      options (especially for key frames). It is our hope that we will be
      able to make regular updates and we will warmly welcome community
      contributions.
      
      Please be warned that, at this stage, the codebase is currently slower
      than VP8 stable branch as most new code has not been optimized, and
      even the 'C' has been deliberately written to be simple and obvious,
      not fast.
      
      The following graphs have the initial test results, numbers in the
      tables measure the compression improvement in terms of percentage. The
      build has  the following optional experiments configured:
      --enable-experimental --enable-enhanced_interp --enable-uvintra
      --enable-high_precision_mv --enable-sixteenth_subpel_uv
      
      CIF Size clips:
      http://getwebm.org/tmp/cif/
      HD size clips:
      http://getwebm.org/tmp/hd/
      (stable_20120309 represents encoding results of WebM master branch
      build as of commit#7a159071)
      
      They were encoded using the following encode parameters:
      --good --cpu-used=0 -t 0 --lag-in-frames=25 --min-q=0 --max-q=63
      --end-usage=0 --auto-alt-ref=1 -p 2 --pass=2 --kf-max-dist=9999
      --kf-min-dist=0 --drop-frame=0 --static-thresh=0 --bias-pct=50
      --minsection-pct=0 --maxsection-pct=800 --sharpness=0
      --arnr-maxframes=7 --arnr-strength=3(for HD,6 for CIF)
      --arnr-type=3
      
      Change-Id: I5c62ed09cfff5815a2bb34e7820d6a810c23183c
      6035da54
  30. 31 Jan, 2012 1 commit
    • Scott LaVarnway's avatar
      BLOCKD structure cleanup · 749bc986
      Scott LaVarnway authored
      Removed redundancies.  All of the information can be
      found in the MACROBLOCKD structure.
      
      Change-Id: I7556392c6f67b43bef2a5e9932180a737466ef93
      749bc986
  31. 15 Dec, 2011 1 commit
    • Scott LaVarnway's avatar
      Moved dequant idct into common · a53d5a4c
      Scott LaVarnway authored
      These functions are now used by the encoder.
      This is WIP with the goal of creating a common idct/add for
      the encoder and decoder.  A boost of 1.8% was seen for
      the HD rt test clip used.
      
      [Tero] Added needed changes to ARM side.
      
      Change-Id: Ibbb8000be09034203d7adffc457d3c3f8b06a5bf
      a53d5a4c
  32. 06 Dec, 2011 1 commit
    • Ronald S. Bultje's avatar
      Dual 16x16 inter prediction. · 60cb39da
      Ronald S. Bultje authored
      This patch introduces the concept of dual inter16x16 prediction. A
      16x16 inter-predicted macroblock can use 2 references instead of 1,
      where both references use the same mvmode (new, near/est, zero). In the
      case of newmv, this means that two MVs are coded instead of one. The
      frame can be encoded in 3 ways: all MBs single-prediction, all MBs dual
      prediction, or per-MB single/dual prediction selection ("hybrid"), in
      which case a single bit is coded per-MB to indicate whether the MB uses
      single or dual inter prediction.
      
      In the future, we can (maybe?) get further gains by mixing this with
      Adrian's 32x32 work, per-segment dual prediction settings, or adding
      support for dual splitmv/8x8mv inter prediction.
      
      Gain (on derf-set, CQ mode) is ~2.8% (SSIM) or ~3.6% (glb PSNR). Most
      gain is at medium/high bitrates, but there's minor gains at low bitrates
      also. Output was confirmed to match between encoder and decoder.
      
      Note for optimization people: this patch introduces a 2nd version of
      16x16/8x8 sixtap/bilin functions, which does an avg instead of a
      store. They may want to look and make sure this is implemented to
      their satisfaction so we can optimize it best in the future.
      
      Change-ID: I59dc84b07cbb3ccf073ac0f756d03d294cb19281
      60cb39da
  33. 18 Oct, 2011 1 commit
    • Scott LaVarnway's avatar
      Remove usage of predict buffer for decode · ed9c66f5
      Scott LaVarnway authored
      Instead of using the predict buffer, the decoder now writes
      the predictor into the recon buffer.  For blocks with eob=0,
      unnecessary idcts can be eliminated.  This gave a performance
      boost of ~1.8% for the HD clips used.
      
      Tero: Added needed changes to ARM side and scheduled some
            assembly code to prevent interlocks.
      
      Patch Set 6:  Merged (I1bcdca7a95aacc3a181b9faa6b10e3a71ee24df3)
      into this commit because of similarities in the idct
      functions.
      Patch Set 7: EC bug fix.
      
      Change-Id: Ie31d90b5d3522e1108163f2ac491e455e3f955e6
      ed9c66f5
  34. 24 Aug, 2011 1 commit
    • Scott LaVarnway's avatar
      Removed bmi copy to/from BLOCKD · b870947d
      Scott LaVarnway authored
      for SPLITMV and B_PRED modes.  Modified code to use the bmi
      found in mode_info_context instead of BLOCKD.  On the decode
      side, the uvmvs are calculated only when required, instead of
      every macroblock.  This is WIP. (bmi should eventually be
      removed from BLOCKD)
      Small performance gains noticed for RT encodes and decodes.(VGA)
      
      Change-Id: I2ed7f0fd5ca733655df684aa82da575c77a973e7
      b870947d