1. 22 Dec, 2011 1 commit
    • Use lookup tables for thresh_mult · efb4783d
      John Koleszar authored
      Mostly cosmetic. Trying for a more compact representation of speed
      selection thresholds.
      
      Change-Id: Icaebea632c7bb71ca8e07b4def04a046d4515e27
      efb4783d
  2. 16 Dec, 2011 1 commit
    • Avoid heap allocation of firstpass stats · 26c6a44c
      John Koleszar authored
      The total_stats, this_frame_stats, and total_left_stats structures
      were previously created by a heap allocation, despite being of fixed
      size. These structures were allocated and deallocated during
      {de,}allocate_compressor_data, which is reinvoked whenever the frame
      size changes. Unfortunately, this clobbers the total_stats and
      total_left_stats data.
      
      Historically, these were variable size at one time, due to the first
      pass motion map, which necessitated their being created by a unique
      heap allocation. However, this bug with the total_stats being
      clobbered has probably been present since that initial implementation.
      
      These structures are instead moved to be stored within the struct
      twopass_rc directly, rather than being heap allocated separately.
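
      The layout change can be sketched as follows. This is a minimal
      illustration, not the actual libvpx code: the stats record is reduced
      to two hypothetical fields, and compressor, compressor_init and the
      simplified reallocate_compressor_data are assumed stand-ins.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily reduced first-pass stats record. */
typedef struct {
    double frame_count;
    double coded_error;
} firstpass_stats;

/* Fixed-size stats stored by value inside the rate-control struct. */
typedef struct {
    firstpass_stats total_stats;
    firstpass_stats this_frame_stats;
    firstpass_stats total_left_stats;
} twopass_rc;

/* Hypothetical encoder context. */
typedef struct {
    twopass_rc twopass;       /* survives any reallocation below */
    unsigned char *frame_buf; /* size depends on the frame dimensions */
} compressor;

void compressor_init(compressor *c)
{
    assert(c != NULL);
    memset(c, 0, sizeof(*c));
    c->frame_buf = NULL;
}

/* Stand-in for the reallocation that runs when the frame size changes:
 * only the size-dependent buffer is freed and recreated, so the
 * accumulated stats are never clobbered. */
int reallocate_compressor_data(compressor *c, int width, int height)
{
    assert(c != NULL && width > 0 && height > 0);
    free(c->frame_buf);
    c->frame_buf = malloc((size_t)width * (size_t)height);
    return c->frame_buf != NULL ? 0 : -1;
}
```

      Because the three stats records are plain members, they have the same
      lifetime as the enclosing struct and need no separate cleanup.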
      
      Change-Id: I7f9e519e25c58b92969071f0e99fa80307e0682b
      26c6a44c
  3. 14 Dec, 2011 1 commit
  4. 10 Dec, 2011 1 commit
  5. 07 Dec, 2011 1 commit
    • Reduce mem copies in encoder loopfilter level picking · e570b040
      Attila Nagy authored
      Do the test filtering in the existing backup frame buffer instead of
      the original. Copy the original data into an extra buffer before doing
      the filtering. This way there is no need to restore the original
      unfiltered frame at the end of the level picking process.
      
      This came up in some discussions with Johann. Thanks!
      
      Change-Id: I495f4301d983854673276c34ec0ddf9a9d622122
      e570b040
  6. 05 Dec, 2011 1 commit
    • Multiple-resolution encoder · aa7335e6
      Yunqing Wang authored
      The example encoder down-samples the input video frames a number of
      times with a down-sampling factor, and then encodes and outputs
      bitstreams with different resolutions.
      
      Supports arbitrary down-sampling factors, and the down-sampling factor
      can be different for each encoding level.
      
      For example, the encoder can be tested as follows.
      1. Configure with multi-resolution encoding enabled:
      ../libvpx/configure --target=x86-linux-gcc --disable-codecs
      --enable-vp8 --enable-runtime_cpu_detect --enable-debug
      --disable-install-docs --enable-error-concealment
      --enable-multi-res-encoding
      2. Run make
      3. Encode:
      If input video is 1280x720, run:
      ./vp8_multi_resolution_encoder 1280 720 input.yuv 1.ivf 2.ivf 3.ivf 1
      (output: 1.ivf(1280x720); 2.ivf(640x360); 3.ivf(320x180).
      The last parameter is set to 1/0 to show/not show PSNR.)
      4. Decode:
      ./simple_decoder 1.ivf 1.yuv
      ./simple_decoder 2.ivf 2.yuv
      ./simple_decoder 3.ivf 3.yuv
      5. View video:
      mplayer 1.yuv -demuxer rawvideo -rawvideo w=1280:h=720 -loop 0 -fps 30
      mplayer 2.yuv -demuxer rawvideo -rawvideo w=640:h=360 -loop 0 -fps 30
      mplayer 3.yuv -demuxer rawvideo -rawvideo w=320:h=180 -loop 0 -fps 30
      
      The encoding parameters can be modified in vp8_multi_resolution_encoder.c,
      for example, target bitrate, frame rate...
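
      How the per-level resolutions follow from the down-sampling factors
      can be sketched as below. The dsf_t type, both function names, and the
      round-up behaviour are assumptions made for illustration; they are not
      taken from vp8_multi_resolution_encoder.c.

```c
#include <assert.h>

/* Hypothetical rational down-sampling factor num/den, with num >= den. */
typedef struct {
    int num;
    int den;
} dsf_t;

/* Scale one dimension down by num/den, rounding up (assumed rounding). */
int scale_dim(int dim, dsf_t f)
{
    assert(dim > 0 && f.den > 0 && f.num >= f.den);
    return (dim * f.den + f.num - 1) / f.num;
}

/* Fill width[] and height[] for `levels` encodings. Level 0 is the input
 * resolution; factors[i] maps level i-1 to level i (factors[0] is unused),
 * so every level may use a different factor. */
void compute_levels(int in_w, int in_h, const dsf_t *factors, int levels,
                    int *width, int *height)
{
    assert(levels > 0 && factors && width && height);
    width[0] = in_w;
    height[0] = in_h;
    for (int i = 1; i < levels; i++) {
        width[i] = scale_dim(width[i - 1], factors[i]);
        height[i] = scale_dim(height[i - 1], factors[i]);
    }
}
```

      With a 1280x720 input and a factor of 2/1 at both lower levels, this
      yields 640x360 and 320x180, matching the three outputs listed above.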
      
      Modified API. John helped a lot with that. Thanks!
      
      Change-Id: I03be9a51167eddf94399f92d269599fb3f3d54f5
      aa7335e6
  7. 23 Nov, 2011 1 commit
    • Fix encoder partitioned output on ARM · 97259b46
      Attila Nagy authored
      The API was not returning correct partition sizes on ARM targets.
      The armv5 token packing functions were not storing the information to the
      partition size table.
      As a fix, have one boolcoder instance allocated for each partition so
      that partition sizes are internally available after all partitions
      were encoded. This will also allow more flexibility in producing
      several partitions in parallel.
      
      Use buffer validation (overflow check) in all ARM bitpacking
      functions.
      
      Change-Id: I31c8a11d8a7613676f0ff50928cb2a2ab14fd169
      97259b46
  8. 18 Nov, 2011 2 commits
    • Speed selection support for disabled reference frames · e55974bf
      John Koleszar authored
      There was an implicit reference frame test order (typically LAST,
      GOLD, ARF) in the mode selection logic, but this doesn't provide the
      expected results when some reference frames are disabled. For
      instance, in real-time mode, the speed selection logic often disables
      the ARF modes. So if the user disables the LAST and GOLD frames, the
      encoder was always choosing INTRA, when in reality searching the ARF
      in this case has the same speed penalty as searching LAST would have
      had.
      
      Instead, introduce the notion of a reference frame search order. This
      patch preserves the former priorities, so if a frame is disabled, the
      other frames bump up a slot to take its place. This patch lays the
      groundwork for doing something smarter in the frame test order, for
      example considering temporal distance or looking at the frames used by
      nearby blocks.
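
      A reference frame search order of this kind can be sketched as
      follows. The enum values mirror the frame names used above, while the
      bitmask convention and the function build_ref_search_order are
      hypothetical and only illustrate the "bump up a slot" behaviour.

```c
#include <assert.h>
#include <stddef.h>

enum { INTRA_FRAME = 0, LAST_FRAME, GOLDEN_FRAME, ALTREF_FRAME };
#define NUM_INTER_REFS 3

/* Build the search order from a bitmask of enabled reference frames
 * (bit n set means frame n is enabled). Enabled frames keep the former
 * LAST, GOLDEN, ARF priority and move up to fill the slots of disabled
 * frames. Returns the number of entries written to order[]. */
int build_ref_search_order(unsigned enabled_mask, int *order)
{
    static const int priority[NUM_INTER_REFS] = {
        LAST_FRAME, GOLDEN_FRAME, ALTREF_FRAME
    };
    int n = 0;

    assert(order != NULL);
    for (int i = 0; i < NUM_INTER_REFS; i++) {
        if (enabled_mask & (1u << priority[i]))
            order[n++] = priority[i];
    }
    return n;
}
```

      If only the ARF is enabled, it lands in the first slot, so a speed
      setting that searches just the first slot still tests it instead of
      falling back to INTRA.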
      
      Change-Id: I1199149f8662a408537c653d2c021c7f1d29a700
      e55974bf
    • Validate encoder buffer writes for single token partition · c84d42f8
      Attila Nagy authored
      Extend buffer write validation (overflow check) to single token
      partition packing, both mb and row based functions.
      
      Change-Id: I36e19b7d37fc43712d05c70e3ad223d3eb5b973d
      c84d42f8
  9. 11 Nov, 2011 1 commit
    • avoid resetting framerate during vpx_codec_enc_config_set() · bdd35c13
      John Koleszar authored
      The calculated frame_rate is a state variable in the codec, and
      shouldn't be maintained in the configuration struct. Move it to the
      main part of cpi so that it isn't clobbered when the configuration
      struct is updated. The initial framerate estimate is moved from the
      vp8_cx_iface.c wrapper into the body of init_config() in onyx_if.c, so
      that it is only called once and not reset on every call to
      vp8_change_config().
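
      The separation of configuration from codec state can be sketched as
      below. The struct layouts are hypothetical and much smaller than the
      real ones; init_config and change_config here are simplified
      stand-ins, not the functions in onyx_if.c.

```c
#include <assert.h>

/* Hypothetical, reduced configuration struct supplied by the caller. */
typedef struct {
    int width;
    int height;
    int target_bitrate_kbps;
} enc_config;

/* Hypothetical encoder instance. */
typedef struct {
    enc_config cfg;    /* replaced wholesale on every config change */
    double frame_rate; /* codec state, refined as frames arrive */
} encoder;

/* Runs once: stores the config and seeds the frame rate estimate. */
void init_config(encoder *e, const enc_config *cfg, double initial_rate)
{
    assert(e && cfg && initial_rate > 0.0);
    e->cfg = *cfg;
    e->frame_rate = initial_rate;
}

/* Runs on every reconfiguration: the frame rate estimate is untouched
 * because it no longer lives inside the config struct. */
void change_config(encoder *e, const enc_config *cfg)
{
    assert(e && cfg);
    e->cfg = *cfg;
}
```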
      
      Change-Id: I8d9a3d1283330d1ee297d07e9d78d1f2875f2465
      bdd35c13
  10. 08 Nov, 2011 2 commits
    • Additional clipping of buffer level to maximum buffer size · fa25a31e
      Adrian Grange authored
      Added additional check of buffer level against maximum
      buffer size.
      
      Change-Id: Iaf1fbaf008601161e402b43ce82c3dbc129bf740
      fa25a31e
    • Added check to make sure maximum buffer size not exceeded · 9dc95b0a
      Adrian Grange authored
      Added code to clip the buffer level to the maximum buffer
      size. Without this the buffer level would increase
      unchecked.
      
      This bug was found when encoding an essentially static
      scene at 2Mb/s. The encoder is unable to generate frames
      consistent with the high data-rate because Q bottoms out
      at Qmin.
      
      As frames generated are consistently undersized the buffer
      level increases and does not get checked against the
      maximum size specified by the user (or default).
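
      The clipping described here amounts to the following sketch. The
      function name, parameter names, and the use of bits as the unit are
      assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Update the buffer level after one frame: add the per-frame bandwidth
 * budget, subtract the bits actually produced, then clip to the maximum
 * buffer size. All quantities are in bits. */
int64_t update_buffer_level(int64_t level, int64_t per_frame_bandwidth,
                            int64_t frame_size, int64_t max_size)
{
    assert(max_size >= 0);
    level += per_frame_bandwidth - frame_size;
    if (level > max_size)
        level = max_size; /* without this the level grows unchecked */
    return level;
}
```

      For a static scene where every frame is undersized, the level
      saturates at max_size instead of climbing indefinitely.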
      
      Change-Id: Id8a3c6323d3246da50f7cb53ddbf78b5528032c6
      9dc95b0a
  11. 20 Oct, 2011 1 commit
    • Fix: check cx_data buffer prior to write · bc715113
      James Berry authored
      Check that the cx_data buffer has enough room before writing to it.
      The prior behavior did not, which could result in a crash.
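
      A bounds check of this kind can be sketched as follows. The helper
      checked_write and its signature are hypothetical; the commit's actual
      check is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append len bytes to buf, which holds buf_size bytes of which *used are
 * already filled. Returns 0 on success, or -1 without writing anything
 * if the data would not fit. */
int checked_write(unsigned char *buf, size_t buf_size, size_t *used,
                  const unsigned char *data, size_t len)
{
    assert(buf && used && data && *used <= buf_size);
    if (len > buf_size - *used)
        return -1; /* not enough room: refuse instead of overflowing */
    memcpy(buf + *used, data, len);
    *used += len;
    return 0;
}
```

      Comparing len against the remaining space, rather than computing
      *used + len, keeps the check itself free of overflow.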
      
      Change-Id: I3fab6f2bc4a96d7c675ea81acd39ece121738b28
      bc715113
  12. 11 Oct, 2011 2 commits
    • Added rate-targeted temporal scalability · 217591fd
      Adrian Grange authored
      Added the ability to create rate-targeted, temporally
      scalable, VP8 compatible bitstreams.
      
      The application vp8_scalable_patterns.c demonstrates how
      to use this capability. Users can create output bitstreams
      containing up to 5 temporally separable streams encoded
      as a single VP8 bitstream.
      (previously abandoned as:
      I92d1483e887adb274d07ce9e567e4d0314881b0a)
      
      Change-Id: I156250a3fe930be57c069d508c41b6a7a4ea8d6a
      217591fd
    • Reset FPU state after calc_plane_error() · 07ba4119
      John Koleszar authored
      Fixes a MMX/SSE2 mismatch when building with --enable-internal-stats.
      
      Change-Id: I0c50a1f246f6916b7a5fc6f36864ceb362f25520
      07ba4119
  13. 30 Sep, 2011 1 commit
    • CQ and two pass rate control. · b6e27d5f
      Paul Wilkins authored
      Changes to the selection of Q limits for two pass
      and two pass CQ mode.
      
      Allowance made for Mode and motion vector costs.
      Some refactoring of common code.
      
      For Derf and YT sets CQ mode average improvement
      circa 1% (SSIM and Global PSNR).
      
      Some increased tendency to undershoot even when
      user CQ not reached.
      
      Patch2: Removed some test code accidentally merged.
      
      Change-Id: Icf74d13af77437c08602571dc7a97e747cce5066
      b6e27d5f
  14. 29 Sep, 2011 1 commit
    • Multithreaded encoder, late sync loopfilter · 380d64ec
      Attila Nagy authored
      Sync with the loopfilter thread only at the beginning of the next frame's encoding.
      This returns control to the application faster and allows better multicore scaling.
      When PSNR packets are generated, the final filtered frame is needed immediately,
      so we cannot delay the sync.
      
      Change-Id: I288d97b5e331d41d6f5bb49d97986fa12ac6f066
      380d64ec
  15. 25 Aug, 2011 1 commit
    • Minor modification on key frame decision · 1f20202e
      Yunqing Wang authored
      This change makes sure that no key frame recoding happens in real-time
      mode, even if CONFIG_REALTIME_ONLY is not configured.
      
      Change-Id: Ifc34141f3217a6bb63cc087d78b111fadb35eec2
      1f20202e
  16. 19 Aug, 2011 1 commit
    • Copy less when active map is in use · 4e8d35a4
      Alpha Lam authored
      When an active map is specified and the current frame is not a key frame,
      golden frame, or altref frame, copy only the active regions.
      
      This reduces encoding time by as much as 19% on the test
      system where realtime encoding is used. This is particularly useful
      when the frame size is large (e.g. 2560x1600) and there are only a few
      active macroblocks.
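
      Copying only the active regions can be sketched as below for a single
      luma plane. The function copy_active_mbs, the one-byte-per-macroblock
      map layout, and the assumption that the plane dimensions are multiples
      of 16 are all illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MB_SIZE 16

/* Copy only the macroblocks flagged in active_map (one byte per
 * macroblock, row-major, nonzero means active) from src to dst. Both
 * planes share the same stride. Returns the number of macroblocks copied. */
int copy_active_mbs(const unsigned char *src, unsigned char *dst, int stride,
                    int mb_rows, int mb_cols, const unsigned char *active_map)
{
    int copied = 0;

    assert(src && dst && active_map && stride >= mb_cols * MB_SIZE);
    for (int r = 0; r < mb_rows; r++) {
        for (int c = 0; c < mb_cols; c++) {
            if (!active_map[r * mb_cols + c])
                continue; /* inactive: leave dst untouched */
            size_t off = (size_t)r * MB_SIZE * (size_t)stride
                       + (size_t)c * MB_SIZE;
            for (int y = 0; y < MB_SIZE; y++)
                memcpy(dst + off + (size_t)y * (size_t)stride,
                       src + off + (size_t)y * (size_t)stride, MB_SIZE);
            copied++;
        }
    }
    return copied;
}
```

      The saving grows with frame size: the work is proportional to the
      number of active macroblocks rather than to the whole frame.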
      
      Change-Id: If394a813ec2df5a0201745d1348dbde4278f7ad4
      4e8d35a4
  17. 12 Aug, 2011 1 commit
    • Revert "Improved 1-pass CBR rate control" · e9613170
      John Koleszar authored
      This reverts commit b5ea2fbc. Further
      testing showed noticeable keyframe popping in some cases, so this is
      reverted for now to give time for a proper fix.
      
      Conflicts:
      
      	vp8/encoder/onyx_if.c
      	vp8/encoder/ratectrl.c
      
      Change-Id: I159f53d1bf0e24c035754ab3ded8ccfd58fd04af
      e9613170
  18. 03 Aug, 2011 1 commit
    • Fix source buffer selection · 238dae86
      John Koleszar authored
      This patch fixes a bug in the interaction between the recode loop and
      spatial resampling. If the codec was in a spatial resampling state,
      and a subsequent iteration of the recode loop disables resampling,
      then the source buffer must be reset to the unscaled source.
      
      Change-Id: I4e4cd47b943f6cd26a47449dc7f4255b38e27c77
      238dae86
  19. 01 Aug, 2011 1 commit
  20. 26 Jul, 2011 1 commit
  21. 22 Jul, 2011 2 commits
    • fix sharpness bug and clean up · a04ed0e8
      Johann authored
      sharpness was not recalculated in vp8cx_pick_filter_level_fast
      
      remove last_filter_type. all values are calculated, don't need to update
      the lfi data when it changes.
      
      always use cm->sharpness_level. the extra indirection was annoying.
      
      don't track last frame_type or sharpness_level manually. frame type
      only matters for motion search and sharpness_level is taken care of in
      frame_init
      
      move function declarations to their proper header
      
      Change-Id: I7ef037bd4bf8cf5e37d2d36bd03b5e22a2ad91db
      a04ed0e8
    • Preload reference area to an intermediate buffer in sub-pixel motion search · 20bd1446
      Yunqing Wang authored
      In sub-pixel motion search, the search range is small (+/- 3 pixels).
      Preload the whole search area from the reference buffer into a 32-byte
      aligned buffer. Then, during the search, load reference data from this
      buffer instead. This keeps data in cache and reduces the cache-line
      crossing penalty. For the tulip clip, tests on an Intel Core2 Quad
      machine (Linux) showed encoder speed improvement:
        3.4%   at --rt --cpu-used=-4
        2.8%   at --rt --cpu-used=-3
        2.3%   at --rt --cpu-used=-2
        2.2%   at --rt --cpu-used=-1
      
      A test on an Atom notebook showed only 1.1% speed improvement (speed=-4).
      A test on a Xeon machine also showed less improvement, since unaligned
      data access latency is greatly reduced in newer cores.
      
      Next, I will apply a similar idea to the other 2 sub-pixel search
      functions for encoding speed > 4.
      
      Make this change exclusively for x86 platforms.
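
      The preload step can be sketched in portable C as below. The 16x16
      block size, the 32-byte destination stride, and the function name
      preload_search_area are assumptions; the actual change is x86-specific
      and is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SEARCH_RANGE 3                   /* +/- 3 pixels */
#define BLOCK 16                         /* assumed block size */
#define AREA (BLOCK + 2 * SEARCH_RANGE)  /* 22 rows and 22 columns */
#define AREA_STRIDE 32                   /* each row starts 32-byte aligned */

/* Copy the search area around the block whose top-left pixel is `ref`
 * into dst, which must be 32-byte aligned and hold AREA * AREA_STRIDE
 * bytes. The caller guarantees at least SEARCH_RANGE valid pixels on
 * every side of the block in the reference frame. */
void preload_search_area(const unsigned char *ref, int ref_stride,
                         unsigned char *dst)
{
    assert(ref && dst && ((uintptr_t)dst & 31) == 0);
    const unsigned char *src = ref - SEARCH_RANGE * ref_stride - SEARCH_RANGE;
    for (int y = 0; y < AREA; y++)
        memcpy(dst + y * AREA_STRIDE, src + y * ref_stride, AREA);
}
```

      After the preload, every candidate position reads from the small
      aligned buffer, whose rows never straddle a 32-byte boundary.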
      
      Change-Id: Ia7bb9f56169eac0f01009fe2b2f2ab5b61d2eb2f
      20bd1446
  22. 20 Jul, 2011 1 commit
    • Increase chroma row alignment to 16 bytes. · 7d1b37cd
      Timothy B. Terriberry authored
      This is done by expanding luma row to 32-byte alignment, since
       there is currently a bunch of code that assumes that
       uv_stride == y_stride/2 (see, for example, vp8/common/postproc.c,
       common/reconinter.c, common/arm/neon/recon16x16mb_neon.asm,
       encoder/temporal_filter.c, and possibly others; I haven't done a
       full audit).
      It also replaces the hardcoded border of 16 in a number of
       encoder buffers with VP8BORDERINPIXELS (currently 32), as the
       chroma rows start at an offset of border/2.
      Together, these two changes have the nice advantage that simply
       dumping the frame memory as a contiguous blob produces a valid,
       if padded, image.
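
      The stride arithmetic behind this can be sketched as follows. The
      helper names are hypothetical, and computing the luma stride as the
      width plus a border on both sides is an assumption about the buffer
      layout.

```c
#include <assert.h>

/* Round x up to the next multiple of align, which must be a power of two. */
int align_up(int x, int align)
{
    assert(x >= 0 && align > 0 && (align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}

/* Luma rows: width plus a border on each side, padded to 32 bytes. */
int luma_stride(int width, int border)
{
    return align_up(width + 2 * border, 32);
}

/* Chroma rows keep uv_stride == y_stride / 2, so a 32-byte aligned luma
 * stride gives a 16-byte aligned chroma stride. */
int chroma_stride(int width, int border)
{
    return luma_stride(width, border) / 2;
}
```

      For a 176-pixel-wide frame with a 32-pixel border, the luma stride is
      256 bytes and the chroma stride is 128 bytes.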
      
      Change-Id: Iaf5ea722ae5c82d5daa50f6e2dade9de753f1003
      7d1b37cd
  23. 18 Jul, 2011 1 commit
    • Improved 1-pass CBR rate control · b5ea2fbc
      John Koleszar authored
      This patch attempts to improve the handling of CBR streams with
      respect to the short term buffering requirements. The "buffer level"
      is changed to be an average over the rc buffer, rather than a long
      running average. Overshoot is also tracked over the same interval
      and the golden frame targets suppressed accordingly to correct for
      overly aggressive boosting.
      
      Testing shows that this is fairly consistently positive in one
      metric or another -- some clips that show significant decreases
      in quality have better buffering characteristics, others show
      improvements in both.
      
      Change-Id: I924c89aa9bdb210271f2e03311e63de3f1f8f920
      b5ea2fbc
  24. 14 Jul, 2011 1 commit
    • Remove unused speed features · 04dce631
      John Koleszar authored
      min_fs_radius, max_fs_radius, full_freq were set but never read.
      
      Change-Id: I82657f4e7f2ba2acc3cbc3faa5ec0de5b9c6ec74
      04dce631
  25. 13 Jul, 2011 1 commit
    • Add improvements made in good-quality mode to real-time mode · 0e9a6ed7
      Yunqing Wang authored
      Several improvements we made in good-quality mode can be added
      into real-time mode to speed up encoding in speed 1, 2, and 3
      with small quality loss. Tests using tulip clip showed:
      
      --rt --cpu-used=-1
      (before change)
      PSNR: 38.028
      time: 1m33.195s
      (after change)
      PSNR: 38.014
      time: 1m20.851s
      
      --rt --cpu-used=-2
      (before change)
      PSNR: 37.773
      time: 0m57.650s
      (after change)
      PSNR: 37.759
      time: 0m54.594s
      
      --rt --cpu-used=-3
      (before change)
      PSNR: 37.392
      time: 0m42.865s
      (after change)
      PSNR: 37.375
      time: 0m41.949s
      
      Change-Id: I76ab2a38d72bc5efc91f6fe20d332c472f6510c9
      0e9a6ed7
  26. 08 Jul, 2011 1 commit
    • New loop filter interface · 62295844
      Attila Nagy authored
      Separate simple filter with a reduced number of parameters.
      MB filter level picking is based on a precalculated table. The level table is
      updated for each frame. Inside and edge limits are precalculated and updated
      only when sharpness changes. The HEV threshold is constant.
      ARM targets use scalars and others vectors.
      
      Change works only with --target=generic-gnu
      All other targets have to be updated!
      
      Change-Id: I6b73aca6b525075b20129a371699b2561bd4d51c
      62295844
  27. 07 Jul, 2011 1 commit
    • Set VPX_FRAME_IS_DROPPABLE · 37de0b8b
      John Koleszar authored
      Allow the encoder to inform the application that the encoded frame will not
      be used as a reference.
      
      Change-Id: I90e41962325ef73d44da03327deb340d6f7f4860
      37de0b8b
  28. 29 Jun, 2011 1 commit
    • Change to arf boost calculation. · 11694aab
      Paul Wilkins authored
      In this commit I have added an experimental function
      that tests prediction quality either side of a central position
      to calculate a suggested boost number for an ARF frame.
      
      The function is passed an offset from the current position and
      a number of frames to search forwards and backwards.
      It returns a forward, backward and compound boost number.
      
      The new code can be deactivated using #define NEW_BOOST 0
      
      In its current default state the code searches forwards and backwards
      from the proposed position of the next alt ref.
      
      The old code used a boost number calculated by scanning forward
      from the previous GF up to the proposed alt ref frame position.
      
      I have also added some code to try and prevent placement of a gf/arf
      where there is a brief flash.
      
      Change-Id: I98af789a5181148659f10dd5dd2ff2d4250cd51c
      11694aab
  29. 28 Jun, 2011 1 commit
    • Adding support for independent partitions · 4cb0ebe5
      Stefan Holmer authored
      Adding support in the encoder for generating
      independent residual partitions by forcing
      equal probabilities over the prev coef entropy
      contexts.
      
      Change-Id: I402f5c353255f3ca20eae2620af739f6a498cd21
      4cb0ebe5
  30. 23 Jun, 2011 1 commit
    • Revert "Reduce overshoot in 1 pass rate control" · db67dcba
      John Koleszar authored
      This reverts commit 212f6183.
      
      Further testing shows that the overshoot accumulation/damping is too
      aggressive on some clips. Allowing the accumulated overshoot to
      decay and limiting the damping to golden frames shows some promise.
      But some clips show significant overshoot in the buffer window, so
      I think this still needs work.
      
      Change-Id: Ic02a9ca34f55229f9cc04786f4fab54cdc1a3ef5
      db67dcba
  31. 03 Jun, 2011 1 commit
    • Reduce overshoot in 1 pass rate control · 212f6183
      John Koleszar authored
      This patch attempts to reduce the peak bitrate hit by the encoder
      when using small buffer windows.
      
      Tested on the CIF set over 200-500kbps using these settings:
      
        --buf-sz=500 --buf-initial-sz=250 --buf-optimal-sz=250 \
        --undershoot-pct=100
      
      Two pass encodes were tested at best quality. One pass encodes were
      tested only at realtime speed 4:
      
        --rt --cpu-used=-4
      
      The peak datarate (over the specified 500ms window) was measured
      for each encode, and averaged together to get a metric for
      "average peak," computed as SUM(peak)/SUM(target). This patch
      reduces the average peak datarate as follows:
      
        One pass:
          baseline:   1.29715
          this patch: 1.23664
      
        Two pass:
          baseline:   1.32702
          this patch: 1.37824
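
      The "average peak" metric, SUM(peak)/SUM(target), can be computed as
      in this sketch. The function name and the array-based interface are
      assumptions; the numbers reported above came from the author's own
      tooling, which is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

/* Average peak over n encodes: the sum of the measured peak datarates
 * divided by the sum of the target datarates. Both arrays use the same
 * unit (for example kbps), so the result is a unitless ratio. */
double average_peak(const double *peak, const double *target, size_t n)
{
    double sum_peak = 0.0;
    double sum_target = 0.0;

    assert(peak && target && n > 0);
    for (size_t i = 0; i < n; i++) {
        sum_peak += peak[i];
        sum_target += target[i];
    }
    assert(sum_target > 0.0);
    return sum_peak / sum_target;
}
```

      A value of 1.0 would mean the peaks never exceed the targets in
      aggregate; a value of 1.25 means the summed peaks run 25% above the
      summed targets.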
      
      This change had a positive effect on our quality metrics as well:
      
        One pass CBR:
                          Min  / Mean / Max (pct)
          Average PSNR    -0.42 / 2.86 / 27.32
          Overall PSNR    -0.90 / 2.00 / 17.27
          SSIM            -0.05 / 3.95 / 37.46
      
        Two pass CBR:
                          Min  / Mean / Max (pct)
          Average PSNR    -4.47 / 4.35 / 35.99
          Overall PSNR    -3.40 / 4.18 / 36.46
          SSIM            -4.56 / 6.98 / 53.67
      
        One pass VBR:
                          Min  / Mean / Max (pct)
          Average PSNR    -5.21 /  0.01 / 3.30
          Overall PSNR    -8.10 / -0.38 / 1.21
          SSIM            -7.38 / -0.11 / 3.17
          (note: most values here were close to the mean, there were a few
           outliers on files that were very sensitive to golden frame size)
      
        Two pass VBR:
                          Min  / Mean / Max (pct)
          Average PSNR    0.00 / 0.00 / 0.00
          Overall PSNR    0.00 / 0.00 / 0.00
          SSIM            0.00 / 0.00 / 0.00
      
      Neither one pass nor two pass CBR mode adheres particularly strictly
      to the short term buffer constraints, and two pass is less
      consistent, even in the baseline commit. This should be addressed
      in a later commit. This likely will hurt the quality numbers, as it
      will have to reduce the burstiness of golden frames.
      
      Aside: My work on this commit makes it clear that we need to make
      rate control modes "pluggable", where you can easily write a new
      one or work on one in isolation.
      
      Change-Id: I1ea9a48f2beedd59891f1288aabf7064956b4716
      212f6183
  32. 01 Jun, 2011 2 commits
    • Fix code under #if CONFIG_INTERNAL_STATS. · 34ba1876
      Ronald S. Bultje authored
      Change-Id: Iccbd78d91c3071b16fb3b2911523a22092652ecd
      34ba1876
    • neon fast quantize block pair · 61f0c090
      Tero Rintaluoma authored
      vp8_fast_quantize_b_pair_neon function added to quantize
      two adjacent blocks at the same time to improve performance.
       - Additional 3-6% speedup compared to neon optimized fast
         quantizer (Tanya VGA@30fps, 1Mbps stream, cpu-used=-5..-16)
      
      Change-Id: I3fcbf141e5d05e9118c38ca37310458afbabaa4e
      61f0c090
  33. 31 May, 2011 1 commit
    • Initialize first_time_stamp_ever · 0a72f568
      John Koleszar authored
      Misplaced #endif caused first_time_stamp_ever to only be initialized if
      CONFIG_INTERNAL_STATS was set.
      
      Change-Id: I2296a4ab00f7dfb767583edcc5d59b94f48c0621
      0a72f568
  34. 27 May, 2011 2 commits
    • bug fix check frame buffer index before copy · 8795b525
      James Berry authored
      In onyx_if.c update_reference_frames(), make
      sure that frame buffer indexes are not equal
      before performing a buffer copy. If two frames
      share the same buffer, the flags will already be
      set correctly.
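
      The guard can be sketched as follows. The frame_pool struct, the
      buffer sizes, and update_golden are hypothetical simplifications; the
      index names are used only for illustration.

```c
#include <assert.h>
#include <string.h>

#define NUM_BUFFERS 4
#define FRAME_BYTES 16

/* Hypothetical pool of frame buffers addressed by index. */
typedef struct {
    unsigned char frames[NUM_BUFFERS][FRAME_BYTES];
    int new_fb_idx; /* buffer holding the newly encoded frame */
    int gld_fb_idx; /* buffer holding the golden reference */
} frame_pool;

/* Refresh the golden buffer from the new frame. Returns 1 if a copy was
 * made, 0 if both indexes already name the same buffer. */
int update_golden(frame_pool *p)
{
    assert(p != NULL);
    assert(p->new_fb_idx >= 0 && p->new_fb_idx < NUM_BUFFERS);
    assert(p->gld_fb_idx >= 0 && p->gld_fb_idx < NUM_BUFFERS);
    if (p->gld_fb_idx == p->new_fb_idx)
        return 0; /* same buffer: nothing to copy */
    memcpy(p->frames[p->gld_fb_idx], p->frames[p->new_fb_idx], FRAME_BYTES);
    return 1;
}
```

      Skipping the copy also avoids calling memcpy with identical source
      and destination, which the C standard leaves undefined.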
      
      Change-Id: Ida9b5516d08e3435c90f131d2dc19d842cfb536e
      8795b525
    • Use hex search for realtime mode speed>4 · 4d052bdd
      Yunqing Wang authored
      Tests showed that using hex search in realtime mode largely speeds up
      the encoding process and still achieves quality similar to the
      diamond search we have. Therefore, removed the diamond search
      option.
      
      Change-Id: I975767d0ec0539f9f6ed7fdfc09506e39761b66c
      4d052bdd