1. 21 Dec, 2011 1 commit
      Remove opaque pointer VP8_PTR · b0056c3b
      John Koleszar authored
      Use an opaque struct rather than typecasting through VP8_PTR, an int*.
      
      Change-Id: I5ed4d9238ba2e8d51bfa07a8da87a2eb4c8fa43a
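To illustrate the opaque-struct idea described in this commit, here is a minimal sketch. All names (`demo_codec`, `demo_codec_create`, and so on) are hypothetical and are not libvpx identifiers. Callers see only a forward declaration, so no cast through `int *` is needed and the compiler can type-check every use.

```c
#include <assert.h>
#include <stdlib.h>

/* Public view: the struct is declared but not defined, so callers can
 * hold a pointer to it without knowing (or casting to) its layout. */
struct demo_codec;                      /* hypothetical name */
typedef struct demo_codec demo_codec_t;

demo_codec_t *demo_codec_create(int width, int height);
int demo_codec_width(const demo_codec_t *c);
void demo_codec_destroy(demo_codec_t *c);

/* Private view: only the implementation file defines the fields. */
struct demo_codec {
    int width;
    int height;
};

demo_codec_t *demo_codec_create(int width, int height)
{
    demo_codec_t *c = malloc(sizeof(*c));
    if (c != NULL) {
        c->width = width;
        c->height = height;
    }
    return c;
}

int demo_codec_width(const demo_codec_t *c)
{
    assert(c != NULL);
    return c->width;
}

void demo_codec_destroy(demo_codec_t *c)
{
    free(c);
}
```

Passing the wrong pointer type to `demo_codec_width()` is now a compile-time diagnostic rather than a silent cast.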
  2. 15 Dec, 2011 1 commit
      Moved dequant idct into common · a53d5a4c
      Scott LaVarnway authored
      These functions are now used by the encoder.
      This is WIP with the goal of creating a common idct/add for
      the encoder and decoder.  A boost of 1.8% was seen for
      the HD rt test clip used.
      
      [Tero] Added needed changes to ARM side.
      
      Change-Id: Ibbb8000be09034203d7adffc457d3c3f8b06a5bf
  3. 08 Dec, 2011 1 commit
  4. 05 Dec, 2011 1 commit
      Multiple-resolution encoder · aa7335e6
      Yunqing Wang authored
      The example encoder down-samples the input video frames a number of
      times with a down-sampling factor, and then encodes and outputs
      bitstreams with different resolutions.
      
      Supports arbitrary down-sampling factors; the factor can be
      different for each encoding level.
      
      For example, the encoder can be tested as follows.
      1. Configure with multi-resolution encoding enabled:
      ../libvpx/configure --target=x86-linux-gcc --disable-codecs
      --enable-vp8 --enable-runtime_cpu_detect --enable-debug
      --disable-install-docs --enable-error-concealment
      --enable-multi-res-encoding
      2. Run make
      3. Encode:
      If input video is 1280x720, run:
      ./vp8_multi_resolution_encoder 1280 720 input.yuv 1.ivf 2.ivf 3.ivf 1
      (output: 1.ivf(1280x720); 2.ivf(640x360); 3.ivf(320x180).
      The last parameter is set to 1/0 to show/not show PSNR.)
      4. Decode:
      ./simple_decoder 1.ivf 1.yuv
      ./simple_decoder 2.ivf 2.yuv
      ./simple_decoder 3.ivf 3.yuv
      5. View video:
      mplayer 1.yuv -demuxer rawvideo -rawvideo w=1280:h=720 -loop 0 -fps 30
      mplayer 2.yuv -demuxer rawvideo -rawvideo w=640:h=360 -loop 0 -fps 30
      mplayer 3.yuv -demuxer rawvideo -rawvideo w=320:h=180 -loop 0 -fps 30
      
      The encoding parameters can be modified in vp8_multi_resolution_encoder.c,
      for example, target bitrate, frame rate...
      
      Modified API. John helped a lot with that. Thanks!
      
      Change-Id: I03be9a51167eddf94399f92d269599fb3f3d54f5
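The per-level resolutions in the example above (1280x720, 640x360, 320x180) follow from applying a down-sampling factor at each level. The sketch below shows that arithmetic only. The type and function names are hypothetical, the factor is modelled as a rational `num/den`, and rounding up is an assumption of this sketch, not something the commit message states.

```c
#include <assert.h>

/* Hypothetical rational down-sampling factor: 2/1 halves each dimension. */
struct demo_factor { unsigned int num; unsigned int den; };
struct demo_size { unsigned int w; unsigned int h; };

/* out[0] is the input size; out[i] is out[i-1] divided by factors[i-1],
 * rounded up (assumed rounding rule). */
static void demo_level_sizes(unsigned int w, unsigned int h,
                             const struct demo_factor *factors, int levels,
                             struct demo_size *out)
{
    assert(levels >= 1);
    out[0].w = w;
    out[0].h = h;
    for (int i = 1; i < levels; i++) {
        const struct demo_factor f = factors[i - 1];
        assert(f.num > 0 && f.den > 0);
        out[i].w = (out[i - 1].w * f.den + f.num - 1) / f.num;
        out[i].h = (out[i - 1].h * f.den + f.num - 1) / f.num;
    }
}
```

Because each level carries its own factor, a 3/2 step followed by a 2/1 step is expressed the same way as two 2/1 steps.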
  5. 25 Nov, 2011 1 commit
      Modified the inverse walsh to output directly · 4a91541c
      Scott LaVarnway authored
      to the dqcoeff or qcoeff buffer.  The encoder would
      populate the dc coeffs of the y blocks as a separate
      stage (recon_dcblock) and the decoder would use a special
      version of the idct.  This change eliminates the extra copy
      and reduces the code footprint.
      
      [Tero] Added needed changes to armv6 and NEON assembly.
      
      Change-Id: I83202ffdbaf83f6e5dd69f4ba2519fcf0b13b3ba
  6. 19 Nov, 2011 1 commit
      Move shared data to shared location · f2cd4ded
      Johann authored
      Storing vp8_bilinear_filters_mmx in an mmx file and using it in an sse2
      file is bad
      
      Moving towards allowing --disable-mmx
      
      Change-Id: I20493b35bdedcdcfc0915e6f05fdbe6c81a4a742
  7. 11 Nov, 2011 1 commit
      avoid resetting framerate during vpx_codec_enc_config_set() · bdd35c13
      John Koleszar authored
      The calculated frame_rate is a state variable in the codec, and
      shouldn't be maintained in the configuration struct. Move it to the
      main part of cpi so that it isn't clobbered when the configuration
      struct is updated. The initial framerate estimate is moved from the
      vp8_cx_iface.c wrapper into the body of init_config() in onyx_if.c, so
      that it is only called once and not reset on every call to
      vp8_change_config().
      
      Change-Id: I8d9a3d1283330d1ee297d07e9d78d1f2875f2465
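The separation of state from configuration described in this commit can be sketched as follows. The struct and function names are hypothetical stand-ins, not the actual `cpi`, `init_config()`, or `vp8_change_config()` code: the frame-rate estimate lives beside the configuration struct, so replacing the configuration leaves it alone.

```c
#include <assert.h>

/* Hypothetical user-supplied configuration. */
struct demo_config {
    int timebase_num;
    int timebase_den;
    int target_bitrate;
};

/* Hypothetical encoder instance: frame_rate is codec state and is kept
 * outside cfg so that config updates cannot clobber it. */
struct demo_encoder {
    struct demo_config cfg;
    double frame_rate;
};

/* Called once: stores the config and makes the initial estimate. */
static void demo_init_config(struct demo_encoder *e,
                             const struct demo_config *cfg)
{
    assert(cfg->timebase_num > 0);
    e->cfg = *cfg;
    e->frame_rate = (double)cfg->timebase_den / cfg->timebase_num;
}

/* Called on every config change: frame_rate is deliberately untouched. */
static void demo_change_config(struct demo_encoder *e,
                               const struct demo_config *cfg)
{
    e->cfg = *cfg;
}
```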
  8. 09 Nov, 2011 3 commits
      Relocated idct/add calls for encoder · 861ed6a5
      Scott LaVarnway authored
      Call the idct/add after the tokenize.  This is WIP with
      the goal of creating a common idct/add for the encoder and
      decoder. This move is necessary because the decoder's version
      of the idct clobbers qcoeff, which is used by the tokenize.
      
      Change-Id: I6b08d8e8397cd873647fa4fb9469884e3c876756
      ARMv6 optimized Intra4x4 prediction · 5a2fd63a
      Tero Rintaluoma authored
      Added ARM optimized intra 4x4 prediction
       - 2x faster on Profiler compared to C-code compiled with -O3
       - Function interface changed a little to improve BLOCKD structure
         access
      
      Change-Id: I9bc2b723155943fe0cf03dd9ca5f1760f7a81f54
      threading: avoid defining _WIN32_WINNT · 9d605061
      James Zern authored
      The referenced function (SignalObjectAndWait) isn't used. Reduces the
      warnings with mingw32-w64 which defines this.
      
      Change-Id: I4ce592879ec9372bf196dac640204c4d370bd210
  9. 08 Nov, 2011 1 commit
      Remove unused file recon.c · f89e109f
      John Koleszar authored
      File not referenced from anywhere and no longer compiles.
      
      Change-Id: I38b11bd60db615c2c2c9d7ad35caba3a1adf1750
  10. 05 Nov, 2011 1 commit
      fix file permissions · f89ea343
      James Zern authored
      all of googletest import (0ab00a22) was marked executable
      
      Change-Id: Id7b7ee03efc21ab998bb03349bd91644e8af25da
  11. 03 Nov, 2011 1 commit
      Change use of eob in the encoder · e4f2ec7a
      Tero Rintaluoma authored
      Changed 'int eob' to 'char *eob' in BLOCKD so that both encoder and
      decoder will use eobs[25] array from MACROBLOCKD structure. In future,
      this will enable use of the decoder side IDCT in the encoder.
      
      Change-Id: I6e1c011628cb8864fd4a0b80f0279ce16a5ca978
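A minimal sketch of the `char *eob` arrangement this commit describes, using hypothetical simplified structs in place of the real BLOCKD and MACROBLOCKD: each block holds a pointer into one shared 25-entry array (16 Y, 4 U, 4 V and 1 Y2 block), so code that writes through the block and code that reads the array see the same value.

```c
#include <assert.h>

#define DEMO_NUM_BLOCKS 25   /* matches eobs[25] in the commit message */

/* Hypothetical stand-in for BLOCKD: a pointer, no longer an int copy. */
struct demo_blockd {
    char *eob;
};

/* Hypothetical stand-in for MACROBLOCKD: owns the shared eobs array. */
struct demo_macroblockd {
    char eobs[DEMO_NUM_BLOCKS];
    struct demo_blockd block[DEMO_NUM_BLOCKS];
};

/* Point every block at its slot in the shared array. */
static void demo_setup_eob_pointers(struct demo_macroblockd *x)
{
    assert(x != 0);
    for (int i = 0; i < DEMO_NUM_BLOCKS; i++)
        x->block[i].eob = &x->eobs[i];
}
```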
  12. 01 Nov, 2011 1 commit
      Changing decoder input partition API to input fragments. · 14272052
      Stefan Holmer authored
      Adding support for several partitions within one input fragment.
      This is necessary to fully support all possible packetization
      combinations in the VP8 RTP profile. Several partitions can
      be transmitted in the same packet, and they can only be split
      by reading the partition lengths from the bitstream.
      
      Change-Id: If7d7ea331cc78cb7efd74c4a976b720c9a655463
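Splitting a fragment by partition lengths might look roughly like the sketch below. It assumes the VP8 layout in which the sizes of all token partitions except the last are stored as 3-byte little-endian values ahead of the partition data; the function names and the flat `size_table`/`payload_len` interface are hypothetical and do not reflect the decoder API changed by this commit.

```c
#include <assert.h>
#include <stddef.h>

/* Read a 3-byte little-endian length. */
static unsigned long demo_read_le24(const unsigned char *p)
{
    return (unsigned long)p[0] |
           ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16);
}

/* size_table holds (num_partitions - 1) 3-byte sizes; the last partition
 * takes whatever remains of the payload. Fills offsets[] and lengths[]
 * relative to the start of the payload. Returns 0 on success, -1 if a
 * declared size runs past the end of the payload. */
static int demo_split_partitions(const unsigned char *size_table,
                                 int num_partitions, size_t payload_len,
                                 size_t *offsets, size_t *lengths)
{
    size_t off = 0;
    assert(num_partitions >= 1);
    for (int i = 0; i < num_partitions - 1; i++) {
        size_t len = (size_t)demo_read_le24(size_table + 3 * i);
        if (len > payload_len - off)
            return -1;              /* truncated or corrupt input */
        offsets[i] = off;
        lengths[i] = len;
        off += len;
    }
    offsets[num_partitions - 1] = off;
    lengths[num_partitions - 1] = payload_len - off;
    return 0;
}
```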
  13. 27 Oct, 2011 1 commit
      Improved decode_split_mv() · 6064384d
      Scott LaVarnway authored
      Tests showed ~1.2% performance boost on the HD clip used.
      Performance will vary based on material.
      
      Change-Id: Icbcf1a828750d5b4ae5252bf596b3ef594042e8a
  14. 26 Oct, 2011 2 commits
      Improved mv_bias · 21970d1d
      Scott LaVarnway authored
      Small performance gains.
      
      Change-Id: I709b9390a8a27a70f5f23574313b8db85ac7f23d
      Reduce partial frame copy in encoder's pick_filter_level_fast · de828094
      Attila Nagy authored
      The partial frame copy function used to copy an extra 8 lines above
      and below. The partial frame filtering can only modify 3 pixel rows
      above the partial frame. Reduce copy to bare minimum needed, which is
      4 lines, so that partial filtering on copied frame is possible.
      
      Define the "magic" fraction number for partial filtering in
      loopfilter.h.
      
      Change-Id: I4791ffc541b6884b12759a0d0714a8faf16147ec
  15. 20 Oct, 2011 1 commit
      Fix: check cx_data buffer prior to write · bc715113
      James Berry authored
      Check that the cx_data buffer has enough room before writing to it.
      The prior behavior did not check, which could result in a crash.
      
      Change-Id: I3fab6f2bc4a96d7c675ea81acd39ece121738b28
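The kind of check this fix describes can be sketched as a bounded append. The function name and signature are hypothetical and are not taken from the libvpx source: the remaining room is compared against the write size before any bytes are copied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append n bytes to buf, which holds buf_sz bytes of which *used are
 * already filled. Returns 0 on success, or -1 without writing anything
 * if the data would not fit. */
static int demo_append(unsigned char *buf, size_t buf_sz, size_t *used,
                       const unsigned char *src, size_t n)
{
    assert(*used <= buf_sz);
    if (n > buf_sz - *used)     /* subtraction form avoids overflow */
        return -1;
    memcpy(buf + *used, src, n);
    *used += n;
    return 0;
}
```

Comparing `n` against `buf_sz - *used` rather than testing `*used + n > buf_sz` keeps the check itself from wrapping around for very large `n`.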
  16. 18 Oct, 2011 1 commit
      Remove usage of predict buffer for decode · ed9c66f5
      Scott LaVarnway authored
      Instead of using the predict buffer, the decoder now writes
      the predictor into the recon buffer.  For blocks with eob=0,
      unnecessary idcts can be eliminated.  This gave a performance
      boost of ~1.8% for the HD clips used.
      
      Tero: Added needed changes to ARM side and scheduled some
            assembly code to prevent interlocks.
      
      Patch Set 6:  Merged (I1bcdca7a95aacc3a181b9faa6b10e3a71ee24df3)
      into this commit because of similarities in the idct
      functions.
      Patch Set 7: EC bug fix.
      
      Change-Id: Ie31d90b5d3522e1108163f2ac491e455e3f955e6
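Why writing the predictor straight into the recon buffer lets eob=0 blocks skip work can be sketched as follows. This is a simplified illustration with hypothetical names: the residual is assumed to be already inverse-transformed, and the real idct is omitted.

```c
#include <assert.h>

static unsigned char demo_clamp255(int v)
{
    return (unsigned char)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* dst already holds the 4x4 predictor inside the recon buffer.
 * residual is a 4x4 block of pixel differences. With eob == 0 there is
 * no residual, so the predictor already is the reconstruction and the
 * block is left alone. Returns 1 if the block was modified, else 0. */
static int demo_reconstruct_4x4(unsigned char *dst, int stride,
                                const short *residual, int eob)
{
    assert(stride >= 4);
    if (eob == 0)
        return 0;
    for (int r = 0; r < 4; r++)
        for (int c = 0; c < 4; c++)
            dst[r * stride + c] =
                demo_clamp255(dst[r * stride + c] + residual[r * 4 + c]);
    return 1;
}
```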
  17. 11 Oct, 2011 1 commit
      Added rate-targeted temporal scalability · 217591fd
      Adrian Grange authored
      Added the ability to create rate-targeted, temporally
      scalable, VP8 compatible bitstreams.
      
      The application vp8_scalable_patterns.c demonstrates how
      to use this capability. Users can create output bitstreams
      containing up to 5 temporally separable streams encoded
      as a single VP8 bitstream.
      (previously abandoned as:
      I92d1483e887adb274d07ce9e567e4d0314881b0a)
      
      Change-Id: I156250a3fe930be57c069d508c41b6a7a4ea8d6a
  18. 10 Oct, 2011 1 commit
  19. 30 Sep, 2011 2 commits
      Improved tokenize · ab00d209
      Scott LaVarnway authored
      For realtime HD encodes, gains of up to 1.6% were seen.
      
      Change-Id: If45028e23db95124da63f9d38ffe06e05596cc6e
      combine loopfilter data access · 3556deac
      Johann authored
      The data processed by the loopfilter overlaps. At the block level, this
      results in some redundant transforms. Grouping the filtering allows for
      a single 16x16 transpose (and inversion) instead of three 16x8 transposes
      (and three more inversions).
      
      This implementation is x86_64 only. We retain the previous
      implementation for x86.
      
      Improvements are obviously material dependent, but they seem to be
      ~1% in tests here.
      
      Change-Id: I467b7ec3655be98fb5f1a94b5d145e5e5a660007
  20. 29 Sep, 2011 1 commit
  21. 22 Sep, 2011 1 commit
  22. 16 Sep, 2011 1 commit
      clamp_mvs() using the wrong motion vector information · c0ee870b
      Scott LaVarnway authored
      In the "Removed bmi copy to/from BLOCKD" commit, the copy
      to the bmi in BLOCKD was eliminated.  The clamp_mvs() used
      the bmi in BLOCKD, which now contains incorrect values.  This
      patch fixes this problem.
      
      Change-Id: I8eca1eaf4015052b0b63e90876f7ad321aba7cff
  23. 24 Aug, 2011 3 commits
      Removed bmi copy to/from BLOCKD · b870947d
      Scott LaVarnway authored
      for SPLITMV and B_PRED modes.  Modified code to use the bmi
      found in mode_info_context instead of BLOCKD.  On the decode
      side, the uvmvs are calculated only when required, instead of
      every macroblock.  This is WIP. (bmi should eventually be
      removed from BLOCKD)
      Small performance gains noticed for RT encodes and decodes (VGA).
      
      Change-Id: I2ed7f0fd5ca733655df684aa82da575c77a973e7
      Fix naming of sse2 idct functions. · 112bd4e2
      Fritz Koenig authored
      Prepend idct function names with vp8_
      so that under profiling they show up
      associated with libvpx.
      
      Change-Id: I4fe357b50236cb7730a4cc00164c0a3487a1d8b4
      Fix data accesses for simple loopfilters · 85358d04
      Johann authored
      The data that the simple horizontal loopfilter reads is aligned, treat
      it accordingly.
      
      For the vertical, we only use the bottom 4 bytes, so don't read in 16
      (and incur the penalty for unaligned access).
      
      This shows a small improvement on older processors which have a
      significant penalty for unaligned reads.
      
      postproc_mmx.c is unused
      
      Change-Id: I87b29bbc0c3b19ee1ca1de3c4f47332a53087b3d
  24. 23 Aug, 2011 1 commit
      Use local labels for jumps/loops in x86 assembly. · c5f890af
      Fritz Koenig authored
      Prepend . to local labels in assembly code.  This
      allows non unique labels within a file.  Also
      makes profiling information more informative
      by keeping the function name with the loop name.
      
      Change-Id: I7a983cb3a5ba2413d5dafd0a37936b268fb9e37f
  25. 19 Aug, 2011 1 commit
      Copy less when active map is in use · 4e8d35a4
      Alpha Lam authored
      When an active map is specified and the current frame is not a key
      frame, golden frame, or altref frame, copy only the active regions.
      
      This significantly reduces encoding time by as much as 19% on the test
      system where realtime encoding is used. This is particularly useful
      when the frame size is large (e.g. 2560x1600) and there are only a few
      active macroblocks.
      
      Change-Id: If394a813ec2df5a0201745d1348dbde4278f7ad4
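Copying only the active regions could be sketched like this for a single luma plane. The function name, the one-byte-per-macroblock map layout, and the shared stride for source and destination are assumptions of this sketch and do not describe the actual encoder code.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MB 16   /* macroblock width and height in luma pixels */

/* Copy only the macroblocks flagged non-zero in active_map from src to
 * dst. Both planes share the same stride; active_map holds one byte per
 * macroblock in raster order. */
static void demo_copy_active_mbs(const unsigned char *src,
                                 unsigned char *dst, int stride,
                                 int mb_rows, int mb_cols,
                                 const unsigned char *active_map)
{
    assert(stride >= mb_cols * DEMO_MB);
    for (int mr = 0; mr < mb_rows; mr++) {
        for (int mc = 0; mc < mb_cols; mc++) {
            if (!active_map[mr * mb_cols + mc])
                continue;               /* inactive: skip the copy */
            for (int y = 0; y < DEMO_MB; y++) {
                const int off = (mr * DEMO_MB + y) * stride + mc * DEMO_MB;
                memcpy(dst + off, src + off, DEMO_MB);
            }
        }
    }
}
```

With few active macroblocks on a large frame, the bytes copied scale with the number of flagged macroblocks instead of the frame area.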
  26. 16 Aug, 2011 1 commit
      Faster vp8_default_coef_probs · 19987dcb
      Scott LaVarnway authored
      Copies from a generated table instead of building the
      default coeff probabilities during runtime.
      
      Change-Id: I4d9551ea3a2d7d4a4f7ce9eda006495221a8de50
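The table-copy approach can be sketched as below. The table size and its values are placeholders invented for illustration, not libvpx's default coefficient probabilities, and the names are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define DEMO_NUM_PROBS 4

/* Placeholder values; in practice this table would be generated
 * offline and compiled in as constant data. */
static const unsigned char demo_default_probs[DEMO_NUM_PROBS] = {
    128, 200, 64, 255
};

struct demo_ctx {
    unsigned char probs[DEMO_NUM_PROBS];
};

/* Initialise the context with one memcpy instead of recomputing the
 * defaults each time. */
static void demo_default_coef_probs(struct demo_ctx *c)
{
    assert(c != 0);
    memcpy(c->probs, demo_default_probs, sizeof(c->probs));
}
```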
  27. 12 Aug, 2011 1 commit
      Revert "Improved 1-pass CBR rate control" · e9613170
      John Koleszar authored
      This reverts commit b5ea2fbc. Further
      testing showed noticeable keyframe popping in some cases; reverting this
      for now to give time for a proper fix.
      
      Conflicts:
      
      	vp8/encoder/onyx_if.c
      	vp8/encoder/ratectrl.c
      
      Change-Id: I159f53d1bf0e24c035754ab3ded8ccfd58fd04af
  28. 02 Aug, 2011 1 commit
  29. 01 Aug, 2011 1 commit
  30. 26 Jul, 2011 1 commit
  31. 25 Jul, 2011 1 commit
  32. 22 Jul, 2011 2 commits
      fix sharpness bug and clean up · a04ed0e8
      Johann authored
      sharpness was not recalculated in vp8cx_pick_filter_level_fast
      
      remove last_filter_type. all values are calculated, don't need to update
      the lfi data when it changes.
      
      always use cm->sharpness_level. the extra indirection was annoying.
      
      don't track last frame_type or sharpness_level manually. frame type
      only matters for motion search and sharpness_level is taken care of in
      frame_init
      
      move function declarations to their proper header
      
      Change-Id: I7ef037bd4bf8cf5e37d2d36bd03b5e22a2ad91db
      Preload reference area to an intermediate buffer in sub-pixel motion search · 20bd1446
      Yunqing Wang authored
      In sub-pixel motion search, the search range is small (+/- 3 pixels).
      Preload the whole search area from the reference buffer into a 32-byte
      aligned buffer. Then, during the search, load reference data from this
      buffer instead. This keeps the data in cache and reduces the penalty
      for crossing cache lines. For the tulip clip, tests on an Intel Core2
      Quad machine (Linux) showed encoder speed improvement:
        3.4%   at --rt --cpu-used=-4
        2.8%   at --rt --cpu-used=-3
        2.3%   at --rt --cpu-used=-2
        2.2%   at --rt --cpu-used=-1
      
      A test on an Atom notebook showed only 1.1% speed improvement (speed=-4).
      A test on a Xeon machine also showed less improvement, since unaligned
      data access latency is greatly reduced in newer cores.
      
      Next, I will apply a similar idea to the other 2 sub-pixel search
      functions for encoding speed > 4.
      
      Make this change exclusively for x86 platforms.
      
      Change-Id: Ia7bb9f56169eac0f01009fe2b2f2ab5b61d2eb2f
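The preload step can be sketched as follows. The 16x16 block size, the 32-byte row pitch, and every name here are assumptions of this sketch, not the commit's code: the block plus a 3-pixel border on each side is copied once into a small buffer that the caller declares with 32-byte alignment.

```c
#include <assert.h>
#include <string.h>

#define DEMO_BLOCK 16                            /* assumed block size */
#define DEMO_RANGE 3                             /* +/- 3 pixel range */
#define DEMO_SPAN (DEMO_BLOCK + 2 * DEMO_RANGE)  /* 22 rows and columns */
#define DEMO_BUF_STRIDE 32                       /* buffer row pitch */

/* Copy the search area around the block at (x, y) in the reference
 * plane into buf, which must hold DEMO_SPAN * DEMO_BUF_STRIDE bytes.
 * Later loads then hit this compact buffer instead of the wide
 * reference frame. */
static void demo_preload(const unsigned char *ref, int ref_stride,
                         int x, int y, unsigned char *buf)
{
    assert(x >= DEMO_RANGE && y >= DEMO_RANGE);
    const unsigned char *src =
        ref + (y - DEMO_RANGE) * ref_stride + (x - DEMO_RANGE);
    for (int r = 0; r < DEMO_SPAN; r++)
        memcpy(buf + r * DEMO_BUF_STRIDE, src + r * ref_stride, DEMO_SPAN);
}
```

A caller would typically declare the buffer as `_Alignas(32) unsigned char buf[DEMO_SPAN * DEMO_BUF_STRIDE];` (C11) so that each row starts on a 32-byte boundary.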
  33. 19 Jul, 2011 1 commit
      Moved vp8_encode_bool into boolhuff.h · a25f6a9c
      Scott LaVarnway authored
      allowing the compiler to inline this function.  For real-time
      encodes, this gave a boost of 1% to 2.5%, depending on the
      speed setting.
      
      Change-Id: I3929d176cca086b4261267b848419d5bcff21c02