1. 01 Aug, 2012 1 commit
    • Fix potential encoder dead-lock after picture resize · 03b3fcec
      Attila Nagy authored
      The sync interval for the multithreaded encoder was assumed to stay constant
      during encoding. This does not hold when the picture size changes.
      The encoder could dead-lock because the main thread and the other threads were
      using different sync intervals.
      
      Change-Id: I75232bbdbc6c02d77f830d870fd8b4e96697c64e
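      
      A minimal sketch of the fix described above, assuming a hypothetical
      EncoderState struct and illustrative width thresholds (neither is taken
      from the commit): the sync interval is derived from the picture width
      and recomputed on every resize, so all threads read the same value.

```c
/* Hypothetical encoder state; field names are illustrative only. */
typedef struct {
    int width;      /* current picture width in pixels */
    int sync_range; /* sync interval in macroblock columns, read by all threads */
} EncoderState;

/* Illustrative mapping from width to sync interval (assumed thresholds). */
static int sync_range_for_width(int width) {
    if (width < 640) return 1;
    if (width <= 1280) return 4;
    if (width <= 2560) return 8;
    return 16;
}

/* Call at init and again on every resize, so the main thread and the
 * worker threads never disagree about the interval. */
static void set_picture_width(EncoderState *enc, int width) {
    enc->width = width;
    enc->sync_range = sync_range_for_width(width);
}
```

      Keeping the interval in one shared place, rather than caching it per
      thread, is the design choice this sketch is meant to illustrate.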
  2. 27 Jul, 2012 1 commit
    • Optimizes updates of encoder block ptrs · e66e9ddf
      Attila Nagy authored
      Precalculated block ptrs do not need updates during encoding.
      Set these at init stage.
      
      Moved the allocation of 'mt_current_mb_col' (last encoded MB on each
      row) to vp8_alloc_compressor_data(), so that it is correctly
      reallocated when frame size is changing.
      
      Change-Id: Idcdaa2d0cf3a7f782b7d888626b7cf22a4ffb5c1
  3. 11 Jun, 2012 1 commit
    • Fix pedantic compiler warnings · 0164a1cc
      John Koleszar authored
      Allows building the library with the gcc -pedantic option, for improved
      portability. In particular, this commit removes usage of C99/C++ style
      single-line comments and dynamic struct initializers. This is a
      continuation of the work done in commit 97b766a4, which removed most
      of these warnings for decode only builds.
      
      Change-Id: Id453d9c1d9f44cc0381b10c3869fabb0184d5966
  4. 04 May, 2012 3 commits
    • Formalize encodeframe.c forward declarations · 22f56b93
      John Koleszar authored
      Change If4321cc5 fixed a bug caused by forward declarations not being
      kept in sync across C files, resulting in a function call with the
      wrong arguments. This commit moves the affected function declarations
      into a header file, along with the other symbols from encodeframe.c
      that were being sloppily shared.
      
      Change-Id: I76a7b4c66d4fe175f9cbef7e52148655e4bb9ba1
    • Fix multi-resolution threaded encoding · 3e32105d
      Attila Nagy authored
      mb_row and mb_col were not passed to vp8cx_encode_inter_macroblock in
      threaded encoding.
      
      Change-Id: If4321cc59bf91e991aa31e772f882ed5f2bbb201
    • Fix multi-resolution threaded encoding · 357800e7
      Attila Nagy authored
      mb_row and mb_col were not passed to vp8cx_encode_inter_macroblock in
      threaded encoding.
      
      Change-Id: If4321cc59bf91e991aa31e772f882ed5f2bbb201
  5. 23 Apr, 2012 1 commit
    • Shares one set of RD costs tables between all encoding threads · b41c17d6
      Attila Nagy authored
      RD costs were local to MACROBLOCK data and had to be copied all the
      time to each thread's MACROBLOCK data. Tables moved to a common place
      and only pointers are set up for each encoding thread.
      
      vp8_cost_tokens() generates 'int' costs so changed all types to be
      int (i.e. removed unsigned).
      
      NOTE: Could do some more cleaning in vp8cx_init_mbrthread_data().
      
      Change-Id: Ifa4de4c6286dffaca7ed3082041fe5af1345ddc0
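      
      The pointer-sharing idea can be sketched as follows. The struct names,
      table dimensions and setup function are hypothetical stand-ins, not the
      encoder's real definitions; only the pattern (one table, per-thread
      pointers, 'int' costs) follows the description above.

```c
#define NUM_BANDS 4  /* hypothetical table dimensions */
#define NUM_TOKENS 8

/* One table shared by every encoding thread; costs are plain 'int'. */
typedef struct {
    int token_costs[NUM_BANDS][NUM_TOKENS];
} SharedRdCosts;

/* Per-thread macroblock data holds a pointer instead of a private copy. */
typedef struct {
    const int (*token_costs)[NUM_TOKENS];
} ThreadMacroblock;

/* Point every thread at the shared table; nothing is copied. */
static void setup_thread_cost_pointers(ThreadMacroblock *threads, int count,
                                       const SharedRdCosts *shared) {
    int i;
    for (i = 0; i < count; ++i) {
        threads[i].token_costs = shared->token_costs;
    }
}
```

      Because every thread reads through the same pointer, an update to the
      shared table is visible to all of them without a per-thread copy.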
  6. 29 Feb, 2012 1 commit
    • Packing bitstream on-the-fly with delayed context updates · 52cf4dca
      Attila Nagy authored
      Produce the token partitions on-the-fly, while processing each MB.
      Context is updated at the beginning of each frame based on the
      previous frame's counters. Optimally, the encoder outputs partitions in
      separate buffers. For frame based output, partitions are concatenated
      internally.
      
      Limitations:
          - enabled just in combination with realtime-only mode
          - number of encoding threads has to be less than or equal to the
          number of token partitions. For this reason, by default the encoder
          will do 8 token partitions.
          - vpxenc supports partition output (-P) just in combination with
          IVF output format (--ivf)
      
      Performance:
          - Realtime encoder can be up to 13% faster (ARM) depending on the number
          of threads and bitrate settings. Constant gain over the 5-16 speed
          range.
          - Token buffer reduced from one frame to 8 MBs
      
      Quality:
          - quality is affected by the delayed context updates. This again
          depends on input material, speed and bitrate settings. For VC
          style input the loss seen is up to 0.2dB. If error-resilient=2
          mode is used then the effect of this change is negligible.
      
      Example:
      ./configure --enable-realtime-only --enable-onthefly-bitpacking
      ./vpxenc --rt --end-usage=1 --fps=30000/1000 -w 640 -h 480
      --target-bitrate=1000 --token-parts=3 --static-thresh=2000
      --ivf -P -t 4 -o strm.ivf tanya_640x480.yuv
      
      Change-Id: I127295cb85b835fc287e1c0201a67e378d025d76
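      
      A small sketch of the thread/partition limitation listed above. The
      --token-parts value is the log2 of the partition count, so
      --token-parts=3 gives 8 partitions. Both function names are
      hypothetical, and clamping (rather than rejecting) an oversized thread
      count is an assumption made for illustration.

```c
/* --token-parts is given as log2 of the number of token partitions. */
static int token_partition_count(int token_parts_log2) {
    return 1 << token_parts_log2;
}

/* Assumed policy: never run more encoding threads than partitions. */
static int clamp_encoder_threads(int requested_threads, int token_parts_log2) {
    int partitions = token_partition_count(token_parts_log2);
    return requested_threads > partitions ? partitions : requested_threads;
}
```

      With the example command line above (-t 4, --token-parts=3), the four
      requested threads fit within the eight partitions.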
  7. 16 Feb, 2012 1 commit
    • Multithreaded encoder, late sync loopfilter · 78071b3b
      Attila Nagy authored
      Second shot at this...
      
      Sync with loopfilter thread as late as possible, usually just at the
      beginning of next frame encoding. This returns control to application
      faster and allows a better multicore scaling.
      
      When PSNR packets are generated, the final filtered frame is needed
      immediately, so we cannot delay the sync. The same has to be done when
      the internal frame is previewed.
      
      Change-Id: I64e110c8b224dd967faefffd9c93dd8dbad4a5b5
  8. 02 Feb, 2012 2 commits
  9. 30 Jan, 2012 3 commits
    • RTCD: add arnr functions · 109b69a7
      John Koleszar authored
      This commit continues the process of converting to the new RTCD
      system. It removes the last of the VP8_ENCODER_RTCD struct references.
      
      Change-Id: I2a44f52d7cccf5177e1ca98a028ead570d045395
    • RTCD: add FDCT functions · 510e0ab4
      John Koleszar authored
      This commit continues the process of converting to the new RTCD
      system.
      
      Change-Id: I3f9c07db65eb206f6363d21bdb80e871570da767
    • RTCD: add recon functions · fdb61a45
      John Koleszar authored
      This commit continues the process of converting to the new RTCD
      system.
      
      Change-Id: I9bfcf9bef65c3d4ba0fb9a3e1532bad1463a10d6
  10. 06 Jan, 2012 1 commit
  11. 28 Dec, 2011 2 commits
  12. 22 Dec, 2011 1 commit
    • Remove legacy integer types · f56918ba
      John Koleszar authored
      Remove BOOL, INTn, UINTn, etc, in favor of C99-style fixed width
      types.
      
      Change-Id: I396636212fb5edd6b347d43cc940186d8cd1e7b5
  13. 28 Nov, 2011 1 commit
    • Populate q_index in multi-thread encoding · 06fc0f83
      Yunqing Wang authored
      This value needs to be copied to each thread's data structure.
      This fixes an artifact problem in the multi-thread encoder.
      
      Change-Id: Iab6d9745a1d44846aa503184705376f63a505597
  14. 08 Nov, 2011 1 commit
    • Fix checks in MB quantizer initialization · 4c14efd2
      Yunqing Wang authored
      vp8cx_mb_init_quantizer() needs to be called at least once to get
      all values calculated. This change added one check to decide if
      we could skip initialization or not.
      
      Change-Id: I3f65eb548be57580a61444328336bc18c25c085b
  15. 24 Oct, 2011 1 commit
  16. 13 Sep, 2011 1 commit
    • Fixed encoder crash · 5bc7b3a6
      Scott LaVarnway authored
      caused by the "Removed bmi copy to/from BLOCKD" commit.
      
      Change-Id: I9fae71bdc34c8ecc07bb81cd3ccf498b91ce3ec7
  17. 23 Jun, 2011 1 commit
    • Copy macroblock data to a buffer before encoding it · 0d87098e
      Yunqing Wang authored
      I got this idea from Pascal (Thanks). Before encoding a macroblock,
      copy it to a 16x16 buffer, and then read source data from there
      instead. This will help keep the source data in cache, and help
      with the performance.
      
      Change-Id: Id05f4cb601299150511d59dcba0ae62c49b5b757
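      
      A minimal sketch of the copy step described above, assuming 8-bit
      samples and covering only the 16x16 luma block. The function name and
      signature are hypothetical.

```c
#include <string.h>

enum { MB_SIZE = 16 };

/* Copy one 16x16 block out of a strided source frame into a compact,
 * contiguous buffer, so later reads of the source stay in cache. */
static void copy_macroblock_16x16(const unsigned char *src, int src_stride,
                                  unsigned char dst[MB_SIZE * MB_SIZE]) {
    int row;
    for (row = 0; row < MB_SIZE; ++row) {
        memcpy(dst + row * MB_SIZE, src + row * src_stride, MB_SIZE);
    }
}
```

      The compact buffer has a stride of 16, so the whole block spans 256
      contiguous bytes instead of 16 rows scattered across the frame.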
  18. 08 Jun, 2011 1 commit
    • Further activity masking changes: · 4e81a68a
      Paul Wilkins authored
      Some further re-structuring of activity masking code.
      Still has various experimental switches.
      Supports a metric based on intra encode.
      Experimental comparison against a fixed activity target rather
      than a frame average, for altering rd and zbin.
      
      Overall the SSIM performance is similar to TT's original
      code but there is a much smaller PSNR hit of circa
      0.5% instead of 3.2%.
      
      Change-Id: I0fd53b2dfb60620b3f74d7415e0b81c1ac58c39a
  19. 02 Jun, 2011 1 commit
    • Removed B_MODE_INFO · 773768ae
      Scott LaVarnway authored
      Declared the bmi in BLOCKD as a union instead of B_MODE_INFO.
      Then removed B_MODE_INFO completely.
      
      Change-Id: Ieb7469899e265892c66f7aeac87b7f2bf38e7a67
  20. 01 Jun, 2011 1 commit
    • neon fast quantize block pair · 61f0c090
      Tero Rintaluoma authored
      vp8_fast_quantize_b_pair_neon function added to quantize
      two adjacent blocks at the same time to improve performance.
       - Additional 3-6% speedup compared to neon optimized fast
         quantizer (Tanya VGA@30fps, 1Mbps stream, cpu-used=-5..-16)
      
      Change-Id: I3fcbf141e5d05e9118c38ca37310458afbabaa4e
  21. 24 May, 2011 2 commits
    • Removed unused variable warnings · cfab2cae
      Scott LaVarnway authored
      Change-Id: I6e5e921f03dc15a72da89a457848d519647677a3
    • MODE_INFO size reduction · e11f21af
      Scott LaVarnway authored
      Declared the bmi in MODE_INFO as a union instead of B_MODE_INFO.
      This reduced the memory footprint by 518,400 bytes for 1080
      resolutions.  The decoder performance improved by ~4% for the
      clip used and the encoder showed very small improvements. (0.5%)
      This reduction was first mentioned to me by John K. and in a
      later discussion by Yaowu.
      This is WIP.
      
      Change-Id: I8e175fdbc46d28c35277302a04bee4540efc8d29
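      
      To illustrate why a union shrinks the per-block record, here is a sketch
      with hypothetical layouts (the real structures may differ). It assumes a
      block needs either a prediction mode or a motion vector, never both, and
      that a macroblock carries 16 such records.

```c
#include <stdint.h>

/* Hypothetical motion vector: two 16-bit components. */
typedef struct { int16_t row, col; } MotionVector;

/* Struct form: stores both members side by side. */
typedef struct {
    int32_t mode;
    MotionVector mv;
} BlockInfoStruct;

/* Union form: the two members share the same storage. */
typedef union {
    int32_t mode;
    MotionVector mv;
} BlockInfoUnion;

/* Bytes saved across a frame, assuming 16 block records per macroblock. */
static long bytes_saved(long macroblocks) {
    const long per_block =
        (long)(sizeof(BlockInfoStruct) - sizeof(BlockInfoUnion));
    return macroblocks * 16 * per_block;
}
```

      Under these assumed layouts the saving is 4 bytes per block, or 64 bytes
      per macroblock. For 8,100 macroblocks (1920 x 1080 / 256) that comes to
      518,400 bytes, which is consistent with the figure quoted above.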
  22. 19 May, 2011 1 commit
  23. 13 May, 2011 1 commit
    • Restructure of activity masking code. · ff52bf36
      Paul Wilkins authored
      This commit restructures the mb activity masking code
      to better facilitate experimentation using different metrics
      etc. and also allows for adjustment of the zero bin either
      for encode only or both the encode and mode selection
      stages.
      
      It also uses information from the current frame rather than
      the previous frame and the default strength has been
      reduced.
      
      Change-Id: Id39b19eace37574dc429f25aae810c203709629b
  24. 06 May, 2011 2 commits
    • fix a bug related to gf_active_flags in multi-threaded encoder · 89c6017c
      Yaowu Xu authored
      Paul pointed out that the pointer to the gf_active_flags is not being
      properly incremented in multithreaded encoder. This commit fixes the
      issue by making sure the gf_active_ptr points to the start of the next
      group of mb rows.
      
      Change-Id: I3246e657d23beabb614dfb880733a68a5fd7e34c
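      
      The pointer arithmetic behind this fix can be sketched as below. It
      assumes a row-major flag array with one flag per macroblock and a
      thread that owns every Nth row; the helper names and the rows_in_group
      parameter are hypothetical.

```c
#include <stddef.h>

/* Offset of the first flag of `mb_row` in a row-major flag array. */
static size_t row_start_offset(int mb_row, int mb_cols) {
    return (size_t)mb_row * (size_t)mb_cols;
}

/* `end_of_row` is the pointer after a thread has walked one full row.
 * Skip the rows handled by the other threads so the pointer lands on the
 * start of this thread's next row. */
static const unsigned char *advance_to_next_owned_row(
    const unsigned char *end_of_row, int mb_cols, int rows_in_group) {
    return end_of_row + (size_t)mb_cols * (size_t)(rows_in_group - 1);
}
```

      Without the extra skip, the pointer would continue into the row that
      belongs to another thread, which is the kind of mismatch described
      above.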
    • Fix semaphore emulation on Windows · eeb81173
      Aron Rosenberg authored
      The existing emulation of posix semaphores on Windows uses SetEvent()
      and WaitForSingleObject(), which implements a binary semaphore, not a
      counting semaphore as implemented by posix. This causes deadlock when
      used with the expected posix semantics. Instead, this patch uses the
      CreateSemaphore() and ReleaseSemaphore() calls (introduced in Windows
      2000) which have the expected behavior.
      
      This patch also reverts commit eb16f00c, which split a semaphore that
      was being used with counting semantics into two binary semaphores.
      That commit is unnecessary with corrected emulation.
      
      Change-Id: If400771536a27af4b0c3a31aa4c4e9ced89ce6a0
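      
      The difference between the two semantics can be shown with a small
      single-threaded model. These types are hypothetical stand-ins written
      for illustration; they are not the Win32 or POSIX primitives and are
      not thread-safe.

```c
/* Model of an auto-reset event: repeated signals collapse into one. */
typedef struct { int signaled; } BinaryEvent;

/* Model of a counting semaphore: every post is remembered. */
typedef struct { int count; } CountingSemaphore;

static void event_signal(BinaryEvent *e) { e->signaled = 1; }

/* Returns 1 if the wait would succeed without blocking, else 0. */
static int event_try_wait(BinaryEvent *e) {
    if (!e->signaled) return 0;
    e->signaled = 0;
    return 1;
}

static void sem_model_post(CountingSemaphore *s) { s->count++; }

/* Returns 1 if the wait would succeed without blocking, else 0. */
static int sem_model_try_wait(CountingSemaphore *s) {
    if (s->count == 0) return 0;
    s->count--;
    return 1;
}
```

      After two signals, the event model lets only one waiter through while
      the semaphore model lets two through. A second waiter that relies on
      counting semantics would block forever on the event, which is the
      deadlock scenario described above.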
  25. 05 May, 2011 1 commit
    • Fix rare hang in multi-thread encoder on Windows · eb16f00c
      Yunqing Wang authored
      This patch is to fix a rare hang in multi-thread encoder that was
      only seen on Windows. Thanks for John's help in debugging the
      problem. More testing is needed.
      
      Change-Id: Idb11c6d344c2082362a032b34c5a602a1eea62fc
  26. 01 Apr, 2011 1 commit
    • Use full-pixel MV in mvsadcost calculation · 3d681581
      Yunqing Wang authored
      MV sad cost error is only used in full-pixel motion search,
      which only needs full-pixel resolution instead of quarter-pixel
      resolution. This change reduced mvsadcost table size, and
      removed unnecessary parameter passing since this table is
      constant once it is generated.
      
      Change-Id: I9f931e55f6abc3c99011321f1dfb2f3562e6f6b0
  27. 31 Mar, 2011 1 commit
    • Runtime detection of available processor cores. · 297b2765
      Attila Nagy authored
      Detect the number of available cores and limit the thread allocation
      accordingly. On the decoder side, limit the number of threads to the
      maximum number of token partitions.
      
      Core detection works on Windows and on POSIX platforms that define
      _SC_NPROCESSORS_ONLN or _SC_NPROC_ONLN.
      
      Change-Id: I76cbe37c18d3b8035e508b7a1795577674efc078
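      
      A minimal sketch of the POSIX side of this detection, using sysconf()
      with the two constants named above. The function names are
      hypothetical, the Windows path is not shown, and falling back to a
      single core when detection fails is an assumption.

```c
#include <unistd.h>

/* Number of online cores, or 1 if the platform offers no way to ask. */
static int detect_core_count(void) {
    long n = -1;
#if defined(_SC_NPROCESSORS_ONLN)
    n = sysconf(_SC_NPROCESSORS_ONLN);
#elif defined(_SC_NPROC_ONLN)
    n = sysconf(_SC_NPROC_ONLN);
#endif
    return n < 1 ? 1 : (int)n;
}

/* Limit the requested thread count to the number of available cores. */
static int limit_threads(int requested, int cores) {
    if (requested < 1) return 1;
    return requested > cores ? cores : requested;
}
```

      A caller would typically combine the two, for example
      limit_threads(requested, detect_core_count()).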
  28. 18 Mar, 2011 1 commit
  29. 11 Mar, 2011 1 commit
    • Encoder loopfilter running in its own thread · 3ae24657
      Attila Nagy authored
      In multithreaded mode the loopfilter is running in its own thread (filter level
      calculation and frame filtering). Filtering is mostly done in parallel with the
      bitstream packing. Before starting the packing the loopfilter level has
      to be calculated. Also any needed reference frame copying is done in the
      filter thread.
      
      Currently the encoder will create n+1 threads, where n > 1 is the number of
      threads specified by application and 1 is the extra filter thread. With n = 1
      the encoder runs in single thread mode. There will never be more than n threads
      running concurrently.
      
      Change-Id: I4fb29b559a40275d6d3babb8727245c40fba931b
  30. 10 Feb, 2011 1 commit
    • Fix relative include paths · 02321de0
      John Koleszar authored
      Allow compiling without adding vp8/{common,encoder,decoder} to the
      include paths.
      
      Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c
  31. 09 Feb, 2011 1 commit
  32. 01 Feb, 2011 1 commit
    • Improved encoder threading · 385c2a76
      Attila Nagy authored
      Reduce the number of sync points by letting each thread
      continue immediately with a new MB row.
      Better multicore scaling, improves performance by 5-20% on ARM multicore.
      
      Change-Id: Ic97e4d1c4886a842c85dd3539a93cb217188ed1b
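      
      One possible form of the row dependency that replaces per-row sync
      points is sketched below: a thread moves straight on to its next row
      and only checks, every few columns, that the row above is far enough
      ahead. The function names, the sync_range parameter and the exact rule
      are assumptions, not the commit's code.

```c
/* Nonzero when column `mb_col` of the current row may be encoded, given
 * the last finished column of the row above. */
static int may_encode_column(int mb_col, int last_col_done_above,
                             int sync_range) {
    return mb_col + sync_range <= last_col_done_above;
}

/* The check is only needed at every sync_range-th column. */
static int is_sync_point(int mb_col, int sync_range) {
    return (mb_col % sync_range) == 0;
}
```

      A real encoder would also need to treat a fully finished upper row as
      always far enough ahead. That case is left out here for brevity.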