1. 26 Oct, 2012 1 commit
    • Faster 8t filtering · ce811f87
      Scott LaVarnway authored
      Quickly modified the ssse3 sixtap filters to support eight taps.  For the test
      clip used, a 23+% boost in decoder performance was seen.  We can
      revisit later and improve further.
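
One way to picture extending a six-tap routine to eight taps is the plain Python sketch below. It is not the SSSE3 code this commit changed. The kernel values, the 7-bit rounding shift, and the edge replication are assumptions chosen for illustration; the function name `filter_row` is hypothetical.

```python
def filter_row(pixels, taps, shift=7):
    """Apply a 1-D FIR filter with rounding and clamp results to 0..255.

    Edge pixels are replicated so the output has the same length as the input.
    """
    n = len(taps)
    left = n // 2 - 1            # number of taps left of the current pixel
    right = n - 1 - left         # number of taps right of the current pixel
    padded = [pixels[0]] * left + list(pixels) + [pixels[-1]] * right
    rounding = 1 << (shift - 1)
    out = []
    for i in range(len(pixels)):
        acc = sum(t * padded[i + k] for k, t in enumerate(taps))
        out.append(min(255, max(0, (acc + rounding) >> shift)))
    return out


# Illustrative half-pixel kernel (assumption); its taps sum to 128 == 1 << 7.
SIX_TAP = [3, -16, 77, 77, -16, 3]
# The same kernel zero-padded to eight taps gives identical output.
EIGHT_TAP = [0] + SIX_TAP + [0]
```

Because the zero-padded kernel produces the same result as the six-tap one, a single eight-tap code path can serve both filter lengths in this sketch.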
      
      Change-Id: I5f59860459e80d6fa23e6cc0fd91296a969f5240
  2. 25 Oct, 2012 1 commit
  3. 22 Oct, 2012 1 commit
  4. 19 Oct, 2012 1 commit
    • sse2 intrinsic version of vp8_mbloop_filter_vertical_edge() · 085433c2
      Scott LaVarnway authored
      First sse2 version of vp8_mbloop_filter_vertical_edge().  For now,
      intrinsics are being used until the bitstream is finalized.  This function
      will be revisited later for further performance improvements.
      
      For the test clip used, a 34+% decoder performance improvement
      was seen.  This will vary depending on material.
      
      Change-Id: I455b438bc8d8af76cf7533ac42eda5f689b21f7c
  5. 18 Oct, 2012 1 commit
    • sse2 intrinsic version of vp8_mbloop_filter_horizontal_edge() · 992b5e2d
      Scott LaVarnway authored
      First sse2 version of vp8_mbloop_filter_horizontal_edge().  For now,
      intrinsics are being used until the bitstream is finalized.  This function
      will be revisited later for further performance improvements.

      For the test clip used, a 31+% decoder performance improvement
      was seen.  This will vary depending on material.
      
      Change-Id: I03ed3a7182478bdd1f094644ff3e0442625600e7
  6. 17 Oct, 2012 1 commit
  7. 24 Aug, 2012 1 commit
    • New Motion Reference Search · 2d60bee1
      Paul Wilkins authored
      Alternative strategy for finding a list of candidate motion
      vectors to use as reference values in mv coding and as
      nearest and near.
      
      Sort by sad in vp8_find_best_ref_mvs() rather than just
      pick the best. Allow 0,0 as a best ref option but not a
      nearest or near unless there are no alternatives.
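
The selection rule described above can be sketched roughly as follows. This is a Python illustration of the idea only, not the code in vp8_find_best_ref_mvs(); the function name `rank_ref_mvs`, the tuple representation of motion vectors, and the `sad_of` callback are all hypothetical.

```python
def rank_ref_mvs(candidates, sad_of):
    """Return (best, nearest, near) from candidate motion vectors.

    candidates: iterable of (row, col) tuples.
    sad_of: function mapping a motion vector to its SAD cost.
    """
    zero = (0, 0)
    unique = list(dict.fromkeys(candidates))     # drop duplicates, keep order
    ranked = sorted(unique, key=sad_of)          # lowest SAD first
    # (0, 0) may win as the best reference if it has the lowest SAD.
    best = ranked[0] if ranked else zero
    # (0, 0) is used for nearest/near only when no alternative exists.
    non_zero = [mv for mv in ranked if mv != zero]
    nearest = non_zero[0] if non_zero else zero
    near = non_zero[1] if len(non_zero) > 1 else zero
    return best, nearest, near
```

For example, with SAD costs of 5 for (0, 0), 7 for (3, 4) and 10 for (1, 2), the sketch picks (0, 0) as best but (3, 4) and (1, 2) as nearest and near.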
      
      Encode/Decode verified on at least some clips.
      
      Some commented out experimental and stats code still in place.
      
      Gain over existing code averages about 1% on derf (all metrics)
      with improvement on all clips. Other test results pending.
      
      The entropy coding of the mode (nearest/near etc) still
      depends upon and requires the old "findnear" code so
      this needs looking at and may provide room for further gains.
      
      Change-Id: I871d7cba1d1c379c4bad9bcccce1fb19c46b8247
  8. 21 Aug, 2012 2 commits
  9. 16 Aug, 2012 1 commit
  10. 08 Aug, 2012 1 commit
  11. 23 May, 2012 1 commit
    • changed the way that default probs for 8x8 are set. · e9818bb6
      Yaowu Xu authored
      The commit changed how baseline 8x8 coefficient probabilities are
      initialized, to be consistent with the initialization of baseline
      4x4 coefficient probabilities.
      
      The commit does not have any effect on compression.
      
      Change-Id: Ifb3902b5dc0b0c2e6dc3aa5d4a6589d528e58355
  12. 15 Mar, 2012 1 commit
    • WebM Experimental Codec Branch Snapshot · 6035da54
      Yaowu Xu authored
      This is a code snapshot of experimental work currently ongoing for a
      next-generation codec.
      
      The codebase has been cut down considerably from the libvpx baseline.
      For example, we are currently only supporting VBR 2-pass rate control
      and have removed most of the code relating to coding speed, threading,
      error resilience, partitions and various other features.  This is in
      part to make the codebase easier to work on and experiment with, but
      also because we want to have an open discussion about how the bitstream
      will be structured and partitioned and not have that conversation
      constrained by past work.
      
      Our basic working pattern has been to initially encapsulate experiments
      using configure options linked to #if CONFIG_XXX statements in the
      code. Once experiments have matured and we are reasonably happy that
      they give benefit and can be merged without breaking other experiments,
      we remove the conditional compile statements and merge them in.
      
      Current changes include:
      * Temporal coding experiment for segments (though still only 4 max, it
        will likely be increased).
      * Segment feature experiment - to allow various bits of information to
        be coded at the segment level. Features tested so far include mode
        and reference frame information, limiting end of block offset and
        transform size, alongside Q and loop filter parameters, but this set
        is very fluid.
      * Support for 8x8 transform - 8x8 dct with 2nd order 2x2 haar is used
        in MBs using 16x16 prediction modes within inter frames.
      * Compound prediction (combination of signals from existing predictors
        to create a new predictor).
      * 8 tap interpolation filters and 1/8th pel motion vectors.
      * Loop filter modifications.
      * Various entropy modifications and changes to how entropy contexts and
        updates are handled.
      * Extended quantizer range matched to transform precision improvements.
      
      There are also ongoing further experiments that we hope to merge in the
      near future: For example, coding of motion and other aspects of the
      prediction signal to better support larger image formats, use of larger
      block sizes (e.g. 32x32 and up) and lossless non-transform based coding
      options (especially for key frames). It is our hope that we will be
      able to make regular updates and we will warmly welcome community
      contributions.
      
      Please be warned that, at this stage, the codebase is currently slower
      than VP8 stable branch as most new code has not been optimized, and
      even the 'C' has been deliberately written to be simple and obvious,
      not fast.
      
      The following graphs have the initial test results, numbers in the
      tables measure the compression improvement in terms of percentage. The
      build has the following optional experiments configured:
      --enable-experimental --enable-enhanced_interp --enable-uvintra
      --enable-high_precision_mv --enable-sixteenth_subpel_uv
      
      CIF Size clips:
      http://getwebm.org/tmp/cif/
      HD size clips:
      http://getwebm.org/tmp/hd/
      (stable_20120309 represents encoding results of WebM master branch
      build as of commit#7a159071)
      
      They were encoded using the following encode parameters:
      --good --cpu-used=0 -t 0 --lag-in-frames=25 --min-q=0 --max-q=63
      --end-usage=0 --auto-alt-ref=1 -p 2 --pass=2 --kf-max-dist=9999
      --kf-min-dist=0 --drop-frame=0 --static-thresh=0 --bias-pct=50
      --minsection-pct=0 --maxsection-pct=800 --sharpness=0
      --arnr-maxframes=7 --arnr-strength=3 (for HD, 6 for CIF)
      --arnr-type=3
      
      Change-Id: I5c62ed09cfff5815a2bb34e7820d6a810c23183c
  13. 12 Mar, 2012 1 commit
    • fixed .mk files to reflect add/remove of a header file · 3f5feb7d
      Yaowu Xu authored
      In a previous commit, a duplicate of the header file defaultcoefcounts.h
      was identified. This commit updates the .mk file to ensure configure
      and make work properly on all platforms.
      
      Change-Id: I31a39c809a734ba438ee53db700f252e9a03eddd
  14. 10 Feb, 2012 1 commit
    • Removal of threading code. · 2615ca5d
      Paul Wilkins authored
      For the experimental branch we are trying to slim the codebase
      down, removing features such as threading for now, as these complicate
      the process of development and testing.
      
      Change-Id: I657c0246aef4d1fa8c8ffc6a1adfeee45bce8e24
  15. 31 Jan, 2012 1 commit
    • Added common prediction modules. · b2f64dff
      Paul Wilkins authored
      This commit adds the common prediction modules, some data structures
      and a config option but does not use them.

      It also corrects a bug in clearing down the MODE_INFO border and introduces
      a new element that indicates whether an entry corresponds to an "in image"
      macroblock or is part of the border.
      
      Change-Id: Ib69eec0876173ebe9d1de9df9537d0b2447702e0
  16. 24 Jan, 2012 1 commit
    • vpn common -> implicit segmentation · 91325b8f
      Jim Bankoski authored
      This adds base functions for implicit segmentation.
      The code that actually stores the results to the segment map isn't
      here yet. This just prints out the segmentation map results
      if you call it.
      
      Uses connected component labeling technique on mbmi info so that only
      if 2 mbs are horizontally or vertically touching do they get the same
      segment.
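
A minimal sketch of connected component labeling with horizontal and vertical (4-way) connectivity is shown below. It is a generic Python illustration, not the code added by this commit: the grid of plain values stands in for per-macroblock mbmi info, and `label_segments` is a hypothetical name. A non-empty rectangular grid is assumed.

```python
from collections import deque


def label_segments(grid):
    """Label 4-connected regions of equal values; returns a grid of labels."""
    rows, cols = len(grid), len(grid[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue                      # already part of a region
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:                      # breadth-first flood fill
                y, x = queue.popleft()
                # Only horizontal and vertical neighbours; diagonals do not join.
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] == -1
                            and grid[ny][nx] == grid[y][x]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels
```

Two blocks with the same value that touch only diagonally end up with different labels, which matches the "horizontally or vertically touching" rule.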
      
      vp8next - plumbing for rotation
      
      Code to produce taps for rotation (tapify.py), code
      for predicting using rotation (predict_rotated.c), and code
      for finding the best rotation (find_rotation.c).

      Didn't check in the code that uses this in the codec; still a work
      in progress.
      
      Fixed copyright notice
      
      Change-Id: I450c13cfa41ab2fcb699f3897760370b4935fdf8
  17. 24 Oct, 2011 1 commit
    • Further segment feature extensions. · 01ce04bc
      Paul Wilkins authored
      This quite large check in includes the following:
      
      Merge in some code from Ronald (mbgraph.c) that scans a Gf/arf group.
      This is used as a basis for a simple segmentation for the normal frames
      in a gf/arf group. This code also uses satd functions from Yaowu.
      
      Adds functionality for coding the latest possible position of an EOB for
      blocks in the segment. (Currently 0-15 only, hence just for 4x4 dct).
      Where the EOB position is 0 this acts like "skip" and the normal coding
      of skip at the per mb level is disabled.
      
      Added functions (seg_common.c) for setting and reading segment feature
      elements. These may want to be optimized away at some point but while the
      mechanism is in a state of flux they provide a single location for making
      changes and keep things a bit cleaner.
      
      This is still proof of concept code. Currently the tested feature set:-
      
      Quantizer,
      Loop Filter level,
      Reference frame,
      Prediction Mode,
      EOB end stop.
      
      TBD:-
      
      Add functions for setting and reading the feature data with range
      and validity checking.
      
      Handling of signed and unsigned feature data. At the moment all is assumed
      to be signed and a sign bit is coded but many cannot be negative.
      
      Correct handling of EOB feature with intra coded blocks.
      
      Testing/trapping of legal/illegal ref frame and mode combinations.
      
      Transform size switch plus merge and test with 8x8 DCT work.

      Merge and test with Suman's segmentation coding optimizations.
      
      Change-Id: Iee12e83661c7abbd1e0ce6810915eb4ec35e2d8e
  18. 22 Sep, 2011 1 commit
    • Install missing default_coef_probs.h · 4a6ac727
      John Koleszar authored
      Make sure that this header is listed as one of the sources, so that it
      will be installed if necessary.
      
      Change-Id: I2427e494488126b179151dc21043c1e2c8ba5991
  19. 16 Aug, 2011 1 commit
    • Faster vp8_default_coef_probs · 19987dcb
      Scott LaVarnway authored
      Copies from a generated table instead of building the
      default coeff probabilities during runtime.
      
      Change-Id: I4d9551ea3a2d7d4a4f7ce9eda006495221a8de50
  20. 02 Aug, 2011 1 commit
  21. 01 Aug, 2011 1 commit
  22. 28 Jun, 2011 1 commit
    • Adding support for independent partitions · 4cb0ebe5
      Stefan Holmer authored
      Adding support in the encoder for generating
      independent residual partitions by forcing
      equal probabilities over the prev coef entropy
      contexts.
      
      Change-Id: I402f5c353255f3ca20eae2620af739f6a498cd21
  23. 21 Jun, 2011 1 commit
  24. 27 Apr, 2011 1 commit
    • SSE2/SSSE3 optimizations for build_predictors_mbuv{,_s}(). · 1083fe49
      Ronald S. Bultje authored
      decoding
      
      before
      10.425
      10.432
      10.423
      =10.426
      
      after:
      10.405
      10.416
      10.398
      =10.406, 0.2% faster
      
      encoding
      
      before
      14.252
      14.331
      14.250
      14.223
      14.241
      14.220
      14.221
      =14.248
      
      after
      14.095
      14.090
      14.085
      14.095
      14.064
      14.081
      14.089
      =14.086, 1.1% faster
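
The percentages quoted above can be checked from the listed timings with a few lines of Python. The commit does not state the timing units, and the helper name `speedup_percent` is hypothetical; the speedup is taken here as the relative drop in mean run time.

```python
def speedup_percent(before, after):
    """Relative reduction in mean run time, in percent."""
    mean_before = sum(before) / len(before)
    mean_after = sum(after) / len(after)
    return 100.0 * (mean_before - mean_after) / mean_before


# Timings copied from the commit message above.
DECODE_BEFORE = [10.425, 10.432, 10.423]
DECODE_AFTER = [10.405, 10.416, 10.398]
ENCODE_BEFORE = [14.252, 14.331, 14.250, 14.223, 14.241, 14.220, 14.221]
ENCODE_AFTER = [14.095, 14.090, 14.085, 14.095, 14.064, 14.081, 14.089]
```

Rounded to one decimal place, `speedup_percent(DECODE_BEFORE, DECODE_AFTER)` gives 0.2 and `speedup_percent(ENCODE_BEFORE, ENCODE_AFTER)` gives 1.1, in line with the figures reported.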
      
      Change-Id: I483d3d8f0deda8ad434cea76e16028380722aee2
  25. 18 Mar, 2011 1 commit
    • Increase static linkage, remove unused functions · 429dc676
      John Koleszar authored
      A large number of functions were defined with external linkage, even
      though they were only used from within one file. This patch changes
      their linkage to static and removes the vp8_ prefix from their names,
      which should make it more obvious to the reader that the function is
      contained within the current translation unit. Functions that were
      not referenced were removed.
      
      These symbols were identified by:
      
        $ nm -A libvpx.a | sort -k3 | uniq -c -f2 | grep ' [A-Z] ' \
          | sort | grep '^ *1 '
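
The idea behind the pipeline is to find externally visible symbols whose name is listed only once across the whole archive. A rough Python equivalent is sketched below, assuming `nm -A`-style lines whose last two whitespace-separated fields are the symbol type and name. Unlike the shell pipeline, this sketch also skips undefined (`U`) entries; `singly_listed_symbols` is a hypothetical name.

```python
from collections import Counter


def singly_listed_symbols(nm_lines):
    """Return names listed exactly once whose entry has an external type.

    Uppercase type letters mark external symbols; 'U' (undefined) is skipped.
    """
    entries = []
    for line in nm_lines:
        fields = line.split()
        if len(fields) < 2:
            continue                          # not a symbol line
        sym_type, name = fields[-2], fields[-1]
        if len(sym_type) == 1:
            entries.append((sym_type, name))
    counts = Counter(name for _, name in entries)
    return sorted(
        name for sym_type, name in entries
        if counts[name] == 1 and sym_type.isupper() and sym_type != "U"
    )
```

A symbol that is defined in one object and referenced from another appears twice (once as `T`, once as `U`) and is therefore not reported, while a symbol that is defined but never referenced elsewhere is a candidate for static linkage.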
      
      Change-Id: I59609f58ab65312012c047036ae1e0634f795779
  26. 09 Mar, 2011 1 commit
  27. 18 Feb, 2011 2 commits
  28. 10 Feb, 2011 1 commit
    • Fix relative include paths · 02321de0
      John Koleszar authored
      Allow compiling without adding vp8/{common,encoder,decoder} to the
      include paths.
      
      Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c
  29. 09 Feb, 2011 1 commit
    • Adds armv6 optimized variance calculation · cb14764f
      Tero Rintaluoma authored
      Adds vp8_sub_pixel_variance16x16_armv6 function to encoder. Integrates
      ARMv6 optimized bilinear interpolations from vp8/common/arm/armv6
      and adds new assembly file for variance16x16 calculation.
       - vp8_filter_block2d_bil_first_pass_armv6   (integrated)
       - vp8_filter_block2d_bil_second_pass_armv6  (integrated)
       - vp8_variance16x16_armv6 (new)
       - bilinearfilter_arm.h (new)

      Change-Id: I18a8331ce7d031ceedd6cd415ecacb0c8f3392db
  30. 08 Feb, 2011 2 commits
    • clean up bilinear filter · e5aaac24
      Johann authored
      make reference version of bilinear_filters short.
      use reference versions of bilinear_filters and sub_pel_filters when
      possible.
      
      recognize that Width was being passed into
      filter_block2d_bil_first_pass multiple times. ARM version had already
      fixed this. propagate to C.
      
      change references to src_pixels_per_line to src_pitch and standardize on
      src/dst (instead of input/output).
      
      recognize that first_pass is only run in the vertical and second_pass
      only horizontal. ARM version had already fixed this. propagate to C
      
      Change-Id: I292d376d239a9a7ca37ec2bf03cc0720606983e2
    • clarify *_offsets.asm differences · 40dcae9c
      Johann authored
      it's difficult to mux the *_offsets.c files because of header conflicts.
      make three instead, name them consistently and partition the contents
      to allow building them as required.
      
      Change-Id: I8f9768c09279f934f44b6c5b0ec363f7943bb796
  31. 07 Feb, 2011 1 commit
    • move one of the offset files · 3273c7b6
      Johann authored
      common/arm/vpx_asm_offsets moves up a level. prepare for muxing with
      encoder/arm/vpx_vp8_enc_asm_offsets
      
      Change-Id: I89a04a5235447e66571995c9d9b4b6edcb038e24
  32. 13 Dec, 2010 1 commit
    • remove unused temporal preproc code · b1aa54ab
      John Koleszar authored
      This code is unused, as the current preproc implementation uses the
      same spatial filter that postproc uses.
      
      Change-Id: Ia06d5664917d67283f279e2480016bebed602ea7
  33. 16 Nov, 2010 1 commit
  34. 26 Oct, 2010 2 commits
    • make vp8_recon16x16mb{,y} RTCD functions · d6c67f02
      John Koleszar authored
      ARM NEON has a platform specific version of vp8_recon16x16mb, though
      it's just a stub to extract the various parameters from the
      MACROBLOCKD struct and pass them to vp8_recon16x16mb_neon(). Using
      that function's prototype directly will be a better long term solution,
      but it's quite an invasive change.
      
      Change-Id: I04273149e2ade34749e2d09e7edb0c396e1dd620
    • arm: move unrolled loops back to generic code · 19638c23
      John Koleszar authored
      Some of the ARM functions differed from their generic counterparts
      only by unrolling their loops. Since this change may be useful
      on other platforms, or might even supersede the looped version
      in the generic case, move it back to the generic file.
      
      This code is left under #if ARCH_ARM for now, but it may be worth
      considering a different (possibly new) conditional for these. If
      it turns out that this should be runtime selectable, these
      functions will have to move to the RTCD infrastructure. Don't want
      to take that step at this time without more profile data.
      
      Change-Id: I4612fdbc606fbebba4971a690fb743ad184ff15f
  35. 25 Oct, 2010 2 commits
    • reuse common loopfilter code · 1376f061
      Johann authored
      there were four versions for the regular and
      macroblock loopfilters:
      horizontal [y|uv]
      vertical [y|uv]
      
      this moves all the common code into 2 functions:
      vp8_loop_filter_neon
      vp8_mbloop_filter_neon
      
      this provides no gain in performance. there's a bit
      of jitter, but it trends down ~0.25-0.5%. however,
      this is a huge gain for maintenance. also, there is the
      potential to drop some stack usage in the macroblock
      loopfilter.
      
      Change-Id: I91506f07d2f449631ff67ad6f1b3f3be63b81a92
    • Add runtime CPU detection support for ARM. · b71962fd
      Timothy B. Terriberry authored
      The primary goal is to allow a binary to be built which supports
       NEON, but can fall back to non-NEON routines, since some Android
       devices do not have NEON, even if they are otherwise ARMv7 (e.g.,
       Tegra).
      The configure-generated flags HAVE_ARMV7, etc., are used to decide
       which versions of each function to build, and when
       CONFIG_RUNTIME_CPU_DETECT is enabled, the correct version is chosen
       at run time.
      In order for this to work, the CFLAGS must be set to something
       appropriate (e.g., without -mfpu=neon for ARMv7, and with
       appropriate -march and -mcpu for even earlier configurations), or
       the native C code will not be able to run.
      The ASFLAGS must remain set for the most advanced instruction set
       required at build time, since the ARM assembler will refuse to emit
       them otherwise.
      I have not attempted to make any changes to configure to do this
       automatically.
      Doing so will probably require the addition of new configure options.
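
The build-time and run-time selection described above can be pictured with a small dispatch table. The Python sketch below only illustrates the pattern and is not the libvpx RTCD code: the capability bit `HAS_NEON`, the `built_with_neon` switch (standing in for a configure-generated flag), and the function names are all hypothetical.

```python
HAS_NEON = 0x1  # hypothetical capability bit reported by CPU detection


def sad_c(a, b):
    """Generic version: sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def sad_neon(a, b):
    """Stand-in for a NEON-optimized version; must give the same result."""
    return sum(abs(x - y) for x, y in zip(a, b))


def build_dispatch(cpu_flags, built_with_neon=True):
    """Choose an implementation for each hook.

    built_with_neon models the build-time decision of which versions exist;
    cpu_flags models what the running CPU actually supports.
    """
    table = {"sad": sad_c}                    # always-safe default
    if built_with_neon and (cpu_flags & HAS_NEON):
        table["sad"] = sad_neon
    return table
```

Callers look the function up once, for example `sad = build_dispatch(flags)["sad"]`, so a single binary can use the optimized routine where it is supported and fall back to the generic one elsewhere.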
      
      Many of the hooks for RTCD on A...