1. 15 Mar, 2012 1 commit
    • Yaowu Xu's avatar
      WebM Experimental Codec Branch Snapshot · 6035da54
      Yaowu Xu authored
      This is a code snapshot of experimental work currently ongoing for a
      next-generation codec.
      
      The codebase has been cut down considerably from the libvpx baseline.
      For example, we are currently only supporting VBR 2-pass rate control
      and have removed most of the code relating to coding speed, threading,
      error resilience, partitions and various other features.  This is in
      part to make the codebase easier to work on and experiment with, but
      also because we want to have an open discussion about how the bitstream
      will be structured and partitioned and not have that conversation
      constrained by past work.
      
      Our basic working pattern has been to initially encapsulate experiments
      using configure options linked to #IF CONFIG_XXX statements in the
      code. Once experiments have matured and we are reasonably happy that
      they give benefit and can be merged without breaking other experiments,
      we remove the conditional compile statements and merge them in.
      
      Current changes include:
      * Temporal coding experiment for segments (though still only 4 max, it
        will likely be increased).
      * Segment feature experiment - to allow various bits of information to
        be coded at the segment level. Features tested so far include mode
        and reference frame information, limiting end of block offset and
        transform size, alongside Q and loop filter parameters, but this set
        is very fluid.
      * Support for 8x8 transform - 8x8 dct with 2nd order 2x2 haar is used
        in MBs using 16x16 prediction modes within inter frames.
      * Compound prediction (combination of signals from existing predictors
        to create a new predictor).
      * 8 tap interpolation filters and 1/8th pel motion vectors.
      * Loop filter modifications.
      * Various entropy modifications and changes to how entropy contexts and
        updates are handled.
      * Extended quantizer range matched to transform precision improvements.
      
      There are also ongoing further experiments that we hope to merge in the
      near future: For example, coding of motion and other aspects of the
      prediction signal to better support larger image formats, use of larger
      block sizes (e.g. 32x32 and up) and lossless non-transform based coding
      options (especially for key frames). It is our hope that we will be
      able to make regular updates and we will warmly welcome community
      contributions.
      
      Please be warned that, at this stage, the codebase is currently slower
      than VP8 stable branch as most new code has not been optimized, and
      even the 'C' has been deliberately written to be simple and obvious,
      not fast.
      
      The following graphs have the initial test results, numbers in the
      tables measure the compression improvement in terms of percentage. The
      build has  the following optional experiments configured:
      --enable-experimental --enable-enhanced_interp --enable-uvintra
      --enable-high_precision_mv --enable-sixteenth_subpel_uv
      
      CIF Size clips:
      http://getwebm.org/tmp/cif/
      HD size clips:
      http://getwebm.org/tmp/hd/
      (stable_20120309 represents encoding results of WebM master branch
      build as of commit#7a159071)
      
      They were encoded using the following encode parameters:
      --good --cpu-used=0 -t 0 --lag-in-frames=25 --min-q=0 --max-q=63
      --end-usage=0 --auto-alt-ref=1 -p 2 --pass=2 --kf-max-dist=9999
      --kf-min-dist=0 --drop-frame=0 --static-thresh=0 --bias-pct=50
      --minsection-pct=0 --maxsection-pct=800 --sharpness=0
      --arnr-maxframes=7 --arnr-strength=3(for HD,6 for CIF)
      --arnr-type=3
      
      Change-Id: I5c62ed09cfff5815a2bb34e7820d6a810c23183c
      6035da54
  2. 22 Sep, 2011 1 commit
    • John Koleszar's avatar
      Install missing default_coef_probs.h · 4a6ac727
      John Koleszar authored
      Make sure that this header is listed as one of the sources, so that it
      will be installed if necessary.
      
      Change-Id: I2427e494488126b179151dc21043c1e2c8ba5991
      4a6ac727
  3. 16 Aug, 2011 1 commit
    • Scott LaVarnway's avatar
      Faster vp8_default_coef_probs · 19987dcb
      Scott LaVarnway authored
      Copies from a generated table instead of building the
      default coeff probabilities during runtime.
      
      Change-Id: I4d9551ea3a2d7d4a4f7ce9eda006495221a8de50
      19987dcb
  4. 02 Aug, 2011 1 commit
  5. 01 Aug, 2011 1 commit
  6. 28 Jun, 2011 1 commit
    • Stefan Holmer's avatar
      Adding support for independent partitions · 4cb0ebe5
      Stefan Holmer authored
      Adding support in the encoder for generating
      independent residual partitions by forcing
      equal probabilities over the prev coef entropy
      contexts.
      
      Change-Id: I402f5c353255f3ca20eae2620af739f6a498cd21
      4cb0ebe5
  7. 21 Jun, 2011 1 commit
  8. 27 Apr, 2011 1 commit
    • Ronald S. Bultje's avatar
      SSE2/SSSE3 optimizations for build_predictors_mbuv{,_s}(). · 1083fe49
      Ronald S. Bultje authored
      decoding
      
      before
      10.425
      10.432
      10.423
      =10.426
      
      after:
      10.405
      10.416
      10.398
      =10.406, 0.2% faster
      
      encoding
      
      before
      14.252
      14.331
      14.250
      14.223
      14.241
      14.220
      14.221
      =14.248
      
      after
      14.095
      14.090
      14.085
      14.095
      14.064
      14.081
      14.089
      =14.086, 1.1% faster
      
      Change-Id: I483d3d8f0deda8ad434cea76e16028380722aee2
      1083fe49
  9. 18 Mar, 2011 1 commit
    • John Koleszar's avatar
      Increase static linkage, remove unused functions · 429dc676
      John Koleszar authored
      A large number of functions were defined with external linkage, even
      though they were only used from within one file. This patch changes
      their linkage to static and removes the vp8_ prefix from their names,
      which should make it more obvious to the reader that the function is
      contained within the current translation unit. Functions that were
      not referenced were removed.
      
      These symbols were identified by:
      
        $ nm -A libvpx.a | sort -k3 | uniq -c -f2 | grep ' [A-Z] ' \
          | sort | grep '^ *1 '
      
      Change-Id: I59609f58ab65312012c047036ae1e0634f795779
      429dc676
  10. 09 Mar, 2011 1 commit
  11. 18 Feb, 2011 2 commits
  12. 10 Feb, 2011 1 commit
    • John Koleszar's avatar
      Fix relative include paths · 02321de0
      John Koleszar authored
      Allow compiling without adding vp8/{common,encoder,decoder} to the
      include paths.
      
      Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c
      02321de0
  13. 09 Feb, 2011 1 commit
    • Tero Rintaluoma's avatar
      Adds armv6 optimized variance calculation · cb14764f
      Tero Rintaluoma authored
      Adds vp8_sub_pixel_variance16x16_armv6 function to encoder. Integrates
      ARMv6 optimized bilinear interpolations from vp8/common/arm/armv6
      and adds new assembly file for variance16x16 calculation.
       - vp8_filter_block2d_bil_first_pass_armv6   (integrated)
       - vp8_filter_block2d_bil_second_pass_armv6  (integrated)
       - vp8_variance16x16_armv6 (new)
       - bilinearfilter_arm.h (new)
      Change-Id: I18a8331ce7d031ceedd6cd415ecacb0c8f3392db
      cb14764f
  14. 08 Feb, 2011 2 commits
    • Johann's avatar
      clean up bilinear filter · e5aaac24
      Johann authored
      make reference version of bilinear_filters short.
      use reference versions of bilinear_filters and sub_pel_filters when
      possible.
      
      recognize that Width was being passed into
      filter_block2d_bil_first_pass multiple times. ARM version had already
      fixed this. propegate to C.
      
      change references to src_pixels_per_line to src_pitch and standardize on
      src/dst (instead of input/output).
      
      recognize that first_pass is only run in the verticle and second_pass
      only horizontal. ARM version had already fixed this. propegate to C
      
      Change-Id: I292d376d239a9a7ca37ec2bf03cc0720606983e2
      e5aaac24
    • Johann's avatar
      clarify *_offsets.asm differences · 40dcae9c
      Johann authored
      it's difficult to mux the *_offsets.c files because of header conflicts.
      make three instead, name them consistently and partititon the contents
      to allow building them as required.
      
      Change-Id: I8f9768c09279f934f44b6c5b0ec363f7943bb796
      40dcae9c
  15. 07 Feb, 2011 1 commit
    • Johann's avatar
      move one of the offset files · 3273c7b6
      Johann authored
      common/arm/vpx_asm_offsets moves up a level. prepare for muxing with
      encoder/arm/vpx_vp8_enc_asm_offsets
      
      Change-Id: I89a04a5235447e66571995c9d9b4b6edcb038e24
      3273c7b6
  16. 13 Dec, 2010 1 commit
    • John Koleszar's avatar
      remove unused temporal preproc code · b1aa54ab
      John Koleszar authored
      This code is unused, as the current preproc implementation uses the
      same spatial filter that postproc uses.
      
      Change-Id: Ia06d5664917d67283f279e2480016bebed602ea7
      b1aa54ab
  17. 26 Oct, 2010 2 commits
    • John Koleszar's avatar
      make vp8_recon16x16mb{,y} RTCD functions · d6c67f02
      John Koleszar authored
      ARM NEON has a platform specific version of vp8_recon16x16mb, though
      it's just a stub to extract the various parameters from the
      MACROBLOCKD struct and pass them to vp8_recon16x16mb_neon(). Using
      that function's prototype directly will be a better long term solution,
      but it's quite an invasive change.
      
      Change-Id: I04273149e2ade34749e2d09e7edb0c396e1dd620
      d6c67f02
    • John Koleszar's avatar
      arm: move unrolled loops back to generic code · 19638c23
      John Koleszar authored
      Some of the ARM functions differed from their generic counterparts
      only by unrolling their loops. Since this change may be useful
      on other platforms, or might even supercede the looped version
      in the generic case, move it back to the generic file.
      
      This code is left under #if ARCH_ARM for now, but it may be worth
      considering a different (possibly new) conditional for these. If
      it turns out that this should be runtime selectable, these
      functions will have to move to the RTCD infrastructure. Don't want
      to take that step at this time without more profile data.
      
      Change-Id: I4612fdbc606fbebba4971a690fb743ad184ff15f
      19638c23
  18. 25 Oct, 2010 2 commits
    • Johann's avatar
      reuse common loopfilter code · 1376f061
      Johann authored
      there were four versions for the regular and
      macroblock loopfilters:
      horizontal [y|uv]
      vertical [y|uv]
      
      this moves all the common code into 2 functions:
      vp8_loop_filter_neon
      vp8_mbloop_filter_neon
      
      this provides no gain in performance. there's a bit
      of jitter, but it trends down ~0.25-0.5%. however,
      this is a huge gain maintenance. also, there is the
      potential to drop some stack usage in the macroblock
      loopfilter.
      
      Change-Id: I91506f07d2f449631ff67ad6f1b3f3be63b81a92
      1376f061
    • Timothy B. Terriberry's avatar
      Add runtime CPU detection support for ARM. · b71962fd
      Timothy B. Terriberry authored
      The primary goal is to allow a binary to be built which supports
       NEON, but can fall back to non-NEON routines, since some Android
       devices do not have NEON, even if they are otherwise ARMv7 (e.g.,
       Tegra).
      The configure-generated flags HAVE_ARMV7, etc., are used to decide
       which versions of each function to build, and when
       CONFIG_RUNTIME_CPU_DETECT is enabled, the correct version is chosen
       at run time.
      In order for this to work, the CFLAGS must be set to something
       appropriate (e.g., without -mfpu=neon for ARMv7, and with
       appropriate -march and -mcpu for even earlier configurations), or
       the native C code will not be able to run.
      The ASFLAGS must remain set for the most advanced instruction set
       required at build time, since the ARM assembler will refuse to emit
       them otherwise.
      I have not attempted to make any changes to configure to do this
       automatically.
      Doing so will probably require the addition of new configure options.
      
      Many of the hooks for RTCD on ARM were already there, but a lot of
       the code had bit-rotted, and a good deal of the ARM-specific code
       is not integrated into the RTCD structs at all.
      I did not try to resolve the latter, merely to add the minimal amount
       of protection around them to allow RTCD to work.
      Those functions that were called based on an ifdef at the calling
       site were expanded to check the RTCD flags at that site, but they
       should be added to an RTCD struct somewhere in the future.
      The functions invoked with global function pointers still are, but
       these should be moved into an RTCD struct for thread safety (I
       believe every platform currently supported has atomic pointer
       stores, but this is not guaranteed).
      
      The encoder's boolhuff functions did not even have _c and armv7
       suffixes, and the correct version was resolved at link time.
      The token packing functions did have appropriate suffixes, but the
       version was selected with a define, with no associated RTCD struct.
      However, for both of these, the only armv7 instruction they actually
       used was rbit, and this was completely superfluous, so I reworked
       them to avoid it.
      The only non-ARMv4 instruction remaining in them is clz, which is
       ARMv5 (not even ARMv5TE is required).
      Considering that there are no ARM-specific configs which are not at
       least ARMv5TE, I did not try to detect these at runtime, and simply
       enable them for ARMv5 and above.
      
      Finally, the NEON register saving code was completely non-reentrant,
       since it saved the registers to a global, static variable.
      I moved the storage for this onto the stack.
      A single binary built with this code was tested on an ARM11 (ARMv6)
       and a Cortex A8 (ARMv7 w/NEON), for both the encoder and decoder,
       and produced identical output, while using the correct accelerated
       functions on each.
      I did not test on any earlier processors.
      
      Change-Id: I45cbd63a614f4554c3b325c45d46c0806f009eaa
      b71962fd
  19. 24 Sep, 2010 1 commit
    • John Koleszar's avatar
      move reconintra_mt to decoder (for now) · 48e76ff4
      John Koleszar authored
      reconintra_mt.c is only required for building the decoder right now.
      It could definitely be used for the encoder in the future, but it
      currently depends on decoder only data structures. (onyxd_int.h,
      VP8D_COMP, etc). Move it from common/ to decoder/ until the
      necessary changes to the common multithread code are complete.
      
      This patch is needed to build with --disable-vp8-decoder.
      
      Change-Id: I568c52221a2b309234d269675cba97131ce35c86
      48e76ff4
  20. 17 Sep, 2010 1 commit
    • Yunqing Wang's avatar
      Restructure multi-threaded decoder · f857a850
      Yunqing Wang authored
      On each MB, loopfiltering is done right after MB decoding. This
      combines two loops in multi-threaded code into one, which reduces
      number of synchronizations to half.
      
      The above-row/left-col data are saved in temp buffers for
      next-row/next MB decoding.
      
      Tests on 4-core gLucid machine showed 10% decoder performance
      gain with threads=4 (tulip clip). Testing on other platforms
      isn't done yet.
      
      Change-Id: Id18ea7c1e84965dabea65d4c01ca5bc056ddeac9
      f857a850
  21. 09 Sep, 2010 1 commit
  22. 02 Sep, 2010 1 commit
    • James Zern's avatar
      encoder: remove postproc dependency · 76640f85
      James Zern authored
      Remove the dependency on postproc.c for the encoder in general, the only
      unchecked need for it is when CONFIG_PSNR is enabled. All other cases
      are already wrapped in CONFIG_POSTPROC. In the CONFIG_PSNR case the file
      will still be included.
      
      Additionally, when VP8_SET_POSTPROC is used with the encoder when post
      processing has been disabled an error will be returned.
      
      This addresses issue #153.
      
      Change-Id: Ia6dfe20167f7077734a6058cbd1d794550346089
      76640f85
  23. 23 Aug, 2010 1 commit
    • Fritz Koenig's avatar
      Rework idct calling structure. · 93c32a55
      Fritz Koenig authored
      Moving the eob structure allows for a non-struct based
      function to handle decoding an entire mb of
      idct/dequant/recon data.  This allows for SIMD functions
      to idct/dequant/recon multiple blocks at once.
      
      SSE2 implementation gives 3% gain on Atom.
      
      Change-Id: I8a8f3efd546ea4e0535f517d94f347cfb737c9c2
      93c32a55
  24. 19 Aug, 2010 1 commit
  25. 18 Aug, 2010 1 commit
  26. 13 Aug, 2010 1 commit
    • John Koleszar's avatar
      move segmentation_common to encoder · 80d3923a
      John Koleszar authored
      vp8_update_gf_useage_maps() is only used by the encoder. This patch
      fixes the ability to build in decode-only or encode-only
      configurations.
      
      Change-Id: I3a5211428e539886ba998e09e8abd747ac55c9aa
      80d3923a
  27. 10 Aug, 2010 1 commit
  28. 26 Jul, 2010 1 commit
    • Johann's avatar
      update arm idct functions · 56f5a9a0
      Johann authored
      Jeff Muizelaar posted some changes to the idct/reconstruction c code.
      This is the equivalent update for the arm assembly.
      
      This shows a good boost on v6, and a minor boost on neon.
      Here are some numbers for highway in qcif, 2641 frames:
      HEAD neon: ~161 fps
      new neon:  ~162 fps
      HEAD v6:   ~102 fps
      new v6:    ~106 fps
      
      The following functions have been updated for armv6 and neon:
      vp8_dc_only_idct_add
      vp8_dequant_idct_add
      vp8_dequant_dc_idct_add
      
      Conflicts:
      
      	vp8/decoder/arm/armv6/dequantdcidct_v6.asm
      	vp8/decoder/arm/armv6/dequantidct_v6.asm
      
      Resolved by removing these files. When I rewrote the functions, I also
      moved the files to dequant_dc_idct_v6.asm/dequant_idct_v6.asm
      
      Change-Id: Ie3300df824d52474eca1a5134cf22d8b7809a5d4
      56f5a9a0
  29. 18 Jun, 2010 1 commit
    • John Koleszar's avatar
      cosmetics: trim trailing whitespace · 94c52e4d
      John Koleszar authored
      When the license headers were updated, they accidentally contained
      trailing whitespace, so unfortunately we have to touch all the files
      again.
      
      Change-Id: I236c05fade06589e417179c0444cb39b09e4200d
      94c52e4d
  30. 04 Jun, 2010 1 commit
  31. 25 May, 2010 1 commit
    • John Koleszar's avatar
      remove references to vp8/vp8.h · 4c627f56
      John Koleszar authored
      This file was moved to vpx/, currently this reference breaks the MSVS build.
      
      Change-Id: I2c90a7a1c09cb66055e3daf84facefcaee1085a1
      4c627f56
  32. 18 May, 2010 1 commit