1. 24 Feb, 2014 1 commit
    • added clamp of segment loop filter level · 05e850cb
      Yaowu Xu authored
      for ABSDATA mode, so the segment loop filter level always falls in
      the valid range for both absolute and delta modes.
      
      Change-Id: If90df3411479533dbdab63f8ae088d2f5dd174a9
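      A minimal sketch of the clamp described in this commit. The constant
      values, the enum names, and the helper segment_filter_level are
      assumptions chosen for illustration, not code taken from the patch.

```c
/* Assumed constants; the values are illustrative. */
enum { SEGMENT_DELTADATA = 0, SEGMENT_ABSDATA = 1 };
#define MAX_LOOP_FILTER 63

static int clamp_int(int value, int low, int high) {
  return value < low ? low : (value > high ? high : value);
}

/* Hypothetical helper: resolve the loop filter level for one segment.
 * In absolute mode the segment data is the level itself; in delta mode
 * it is added to the frame-level base. Both paths are clamped. */
static int segment_filter_level(int base_level, int seg_data, int abs_delta) {
  const int raw =
      (abs_delta == SEGMENT_ABSDATA) ? seg_data : base_level + seg_data;
  return clamp_int(raw, 0, MAX_LOOP_FILTER);
}
```

      Clamping after the mode selection, rather than only in the delta
      branch, is what keeps an out-of-range absolute value from escaping.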
  2. 21 Feb, 2014 1 commit
  3. 13 Feb, 2014 1 commit
  4. 01 Feb, 2014 1 commit
    • Rename a loopfilter parameter · 11a9366e
      Yunqing Wang authored
      As pointed out by Dmitry and James, "partial" is a
      Microsoft-specific C++ keyword, so the parameter is renamed.
      
      Change-Id: Ia0fc11ceb89e54b3195287f89f7e26edbbe9beb8
  5. 31 Jan, 2014 1 commit
    • vp9 decoder: row-based multi-threaded loopfilter · 903801f1
      Yunqing Wang authored
      Implemented parallel loopfiltering, which uses the existing
      tile-decoding threads. Each thread works on one row, and when that
      row is loopfiltered, it moves to the next unattended row. To ensure
      the correct filtering order, threads are synchronized and a
      superblock is filtered only if the superblocks it depends on have
      already been filtered.
      
      To reduce synchronization overhead and speed up the decoder, we use
      nsync > 1 for high resolution.
      
      Performance tests:
      1. on desktop:
      8-tile 4k video using 8 threads, speedup: 70% - 80%
      4-tile HD video using 4 threads, speedup: ~35%
      2. on mobile device (Nexus 7):
      4-tile 1080p video using 4 threads, speedup: 18% - 25%
      4-tile 1080p video using 2 threads, speedup: 10% - 15%
      
      Change-Id: If54b4a11960dd706c22d5ad145ad94156031f36a
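      The row-based scheme described in this commit can be sketched with
      POSIX threads. This is a simplified illustration, not the patch
      itself: the grid sizes, the LfSync layout, and the assumption that
      each superblock depends on the one above and to its right are all
      hypothetical. NSYNC plays the role of the nsync value mentioned
      above: progress is published only every NSYNC superblocks.

```c
#include <pthread.h>
#include <string.h>

/* Illustrative sizes; none of these values come from the commit. */
enum { SB_ROWS = 6, SB_COLS = 12, NUM_THREADS = 3, NSYNC = 4 };

typedef struct {
  pthread_mutex_t row_mutex[SB_ROWS];
  pthread_cond_t row_cond[SB_ROWS];
  int cur_col[SB_ROWS];        /* last published column per row, -1 = none */
  pthread_mutex_t job_mutex;   /* guards next_row */
  int next_row;                /* first row that no thread has claimed yet */
  int done[SB_ROWS][SB_COLS];  /* 1 once a superblock has been "filtered" */
  int violations[SB_ROWS];     /* dependency checks that failed, per row */
} LfSync;

/* Block until the row above has published column c + NSYNC or later. */
static void sync_read(LfSync *s, int r, int c) {
  if (r == 0 || c % NSYNC != 0) return;
  pthread_mutex_lock(&s->row_mutex[r - 1]);
  while (s->cur_col[r - 1] < c + NSYNC)
    pthread_cond_wait(&s->row_cond[r - 1], &s->row_mutex[r - 1]);
  pthread_mutex_unlock(&s->row_mutex[r - 1]);
}

/* Publish progress every NSYNC columns, and once more at the end of the row. */
static void sync_write(LfSync *s, int r, int c) {
  const int last = (c == SB_COLS - 1);
  if (!last && c % NSYNC != 0) return;
  pthread_mutex_lock(&s->row_mutex[r]);
  s->cur_col[r] = last ? SB_COLS + NSYNC : c;
  pthread_cond_broadcast(&s->row_cond[r]);
  pthread_mutex_unlock(&s->row_mutex[r]);
}

static void *lf_worker(void *arg) {
  LfSync *s = (LfSync *)arg;
  for (;;) {
    int r, c;
    pthread_mutex_lock(&s->job_mutex);
    r = s->next_row++;  /* claim the next unattended row */
    pthread_mutex_unlock(&s->job_mutex);
    if (r >= SB_ROWS) return NULL;
    for (c = 0; c < SB_COLS; ++c) {
      sync_read(s, r, c);
      if (r > 0) {  /* the above-right superblock must already be done */
        const int dep = (c + 1 < SB_COLS) ? c + 1 : SB_COLS - 1;
        if (!s->done[r - 1][dep]) ++s->violations[r];
      }
      s->done[r][c] = 1;  /* stand-in for the real filtering work */
      sync_write(s, r, c);
    }
  }
}

/* Runs the whole frame; returns ordering violations plus unfiltered blocks. */
static int run_parallel_loopfilter(void) {
  static LfSync s;
  pthread_t threads[NUM_THREADS];
  int r, c, t, errors = 0;
  memset(&s, 0, sizeof(s));
  pthread_mutex_init(&s.job_mutex, NULL);
  for (r = 0; r < SB_ROWS; ++r) {
    pthread_mutex_init(&s.row_mutex[r], NULL);
    pthread_cond_init(&s.row_cond[r], NULL);
    s.cur_col[r] = -1;
  }
  for (t = 0; t < NUM_THREADS; ++t)
    pthread_create(&threads[t], NULL, lf_worker, &s);
  for (t = 0; t < NUM_THREADS; ++t) pthread_join(threads[t], NULL);
  for (r = 0; r < SB_ROWS; ++r) {
    errors += s.violations[r];
    for (c = 0; c < SB_COLS; ++c) errors += !s.done[r][c];
    pthread_mutex_destroy(&s.row_mutex[r]);
    pthread_cond_destroy(&s.row_cond[r]);
  }
  pthread_mutex_destroy(&s.job_mutex);
  return errors;
}
```

      A larger NSYNC means fewer lock acquisitions per row at the cost of
      each thread trailing further behind the row above it, which is the
      trade-off the commit makes for high resolutions.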
  6. 18 Dec, 2013 1 commit
    • rename loop filter functions · b720ba16
      Jim Bankoski authored
      This renames all the loop filter functions so that they no
      longer refer to mb
      
      Change-Id: I8a58a8c7fd253d835cb619bde13913e896ece90b
  7. 15 Dec, 2013 1 commit
  8. 13 Dec, 2013 1 commit
  9. 04 Dec, 2013 1 commit
  10. 27 Nov, 2013 1 commit
    • Simplify mask checking in loop filters · 8f05e703
      Yunqing Wang authored
      Considering a horizontal edge, if mask_16x16 is 1 for an
      even-indexed 8x8 block, then mask_16x16 is 1 for the next 8x8 block
      in the same row. Similarly, for a vertical edge, if mask_16x16 is 1
      for an even-rowed 8x8 block, then mask_16x16 is 1 for the 8x8 block
      right below it in the next row. Based on that, the mask_16x16
      checking can be simplified to save cycles. The corresponding 8-pixel
      vp9_mb_lpf_horizontal_edge code can also be removed.
      
      Change-Id: Ic3fe7a5674322239208cbe2731dc3216ce2084f3
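      The pairing property in this commit can be illustrated with a small
      sketch. It assumes mask_16x16 bits are only ever set in aligned
      pairs (an even-indexed block together with its right neighbour);
      count_wide_calls is a hypothetical helper, not a libvpx function.

```c
/* Walk one row of 8x8 blocks, least significant bit first, and count how
 * many 16-pixel-wide filter calls are issued. Because bits are assumed to
 * come in aligned pairs, testing the low bit alone is enough: there is no
 * need to also test the neighbour's bit before consuming two blocks. */
static int count_wide_calls(unsigned int mask_16x16, int num_blocks) {
  int calls = 0;
  int i = 0;
  while (i < num_blocks) {
    if (mask_16x16 & 1) {
      ++calls;           /* one call covers blocks i and i + 1 */
      mask_16x16 >>= 2;
      i += 2;
    } else {
      mask_16x16 >>= 1;  /* nothing to do for this block */
      i += 1;
    }
  }
  return calls;
}
```

      With a mask of 0x33 (blocks 0, 1, 4 and 5 set) the walk issues two
      wide calls, and the separate 8-pixel path is never needed.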
  11. 22 Nov, 2013 1 commit
    • Do vertical loopfiltering in parallel · ed36720b
      Yunqing Wang authored
      This patch followed the "Add filter_selectively_vert_row2 to enable
      parallel loopfiltering" commit, and added an x86 SSE2 optimization
      to do 16-pixel filtering in parallel. For the other optimizations
      (neon and dspr2), the current 16-pixel functions were implemented by
      calling the 8-pixel functions twice; real 16-pixel functions could
      be added later.
      
      Decoder speedup:
      tulip clip:     2% speed gain;
      old_town_cross: 1.2% speed gain;
      bus:            2% speed gain.
      
      Change-Id: I4818a0c72f84b34f5fe678e496cf4a10238574b7
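      The "call the 8-pixel function twice" fallback mentioned in this
      commit can be sketched as follows. The toy filter and both function
      names are assumptions for illustration; the real filters are far
      more involved.

```c
#include <stdint.h>
#include <stdlib.h>

/* Toy 8-pixel filter across a horizontal edge. s points at the first row
 * below the edge, so s[i - pitch] is the pixel just above it. Pixels whose
 * difference is within the limit are replaced by their rounded average. */
static void toy_lpf_horizontal_8(uint8_t *s, int pitch, int limit) {
  int i;
  for (i = 0; i < 8; ++i) {
    const int p0 = s[i - pitch];
    const int q0 = s[i];
    if (abs(p0 - q0) <= limit) {
      const uint8_t avg = (uint8_t)((p0 + q0 + 1) >> 1);
      s[i - pitch] = avg;
      s[i] = avg;
    }
  }
}

/* 16-pixel fallback: two independent 8-pixel calls, each with its own
 * limit. An optimized version would process all 16 pixels at once. */
static void toy_lpf_horizontal_16(uint8_t *s, int pitch, int limit0,
                                  int limit1) {
  toy_lpf_horizontal_8(s, pitch, limit0);
  toy_lpf_horizontal_8(s + 8, pitch, limit1);
}
```

      Keeping the 16-pixel entry point even when it only forwards to the
      8-pixel code lets callers use one interface on every platform.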
  12. 21 Nov, 2013 1 commit
    • Add filter_selectively_vert_row2 to enable parallel loopfiltering · b5e6d6cc
      Yunqing Wang authored
      Added filter_selectively_vert_row2 to be ready for parallel
      loopfiltering in the vertical direction. This change filters 2 rows
      at a time. If 2 vertically adjacent 8x8 blocks use the same type of
      filtering, we can do 16-pixel filtering in parallel.
      
      Next, we need to provide 16-pixel loopfiltering functions in C
      and optimized versions for codec speedup.
      
      Change-Id: Idf97bbdd70566e55bd30e1fd25cb8544e33291be
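      A minimal sketch of the 2-row pairing decision described in this
      commit, assuming one mask bit per 8x8 column for each of the two
      rows. The CallCount type and plan_vert_row2 are hypothetical names
      used only to show the dispatch logic.

```c
typedef struct {
  int dual_calls;    /* both rows need the filter: one 16-pixel call */
  int single_calls;  /* only one of the two rows needs it: 8-pixel call */
} CallCount;

/* Walk the columns of two vertically adjacent rows. Where both rows have
 * the bit set, a single dual call covers both; otherwise fall back to an
 * 8-pixel call on whichever row needs it. */
static CallCount plan_vert_row2(unsigned int mask_row0,
                                unsigned int mask_row1) {
  CallCount n = { 0, 0 };
  while (mask_row0 | mask_row1) {
    if (mask_row0 & mask_row1 & 1) {
      ++n.dual_calls;
    } else if ((mask_row0 | mask_row1) & 1) {
      ++n.single_calls;
    }
    mask_row0 >>= 1;
    mask_row1 >>= 1;
  }
  return n;
}
```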
  13. 16 Nov, 2013 1 commit
    • Do horizontal loopfiltering in parallel · 64f728ca
      Yunqing Wang authored
      This patch followed the "Rewrite filter_selectively_horiz for
      parallel loopfiltering" commit, and added an x86 SSE2 optimization
      to do 16-pixel filtering in parallel. Also, corrected the declaration
      of aligned arrays. For the 8-pixel-in-parallel case, improved the
      calculation of the masks and filters. Updated the threshold loading
      since the thresholds were already duplicated. Updated the neon C
      functions to call the neon loopfilters twice.
      
      Using tulip clip, tests showed it gave a ~1.5% decoder speed gain.
      
      Change-Id: Id02638626ac27a4b0e0b09d71792a24c0499bd35
  14. 14 Nov, 2013 2 commits
  15. 12 Nov, 2013 2 commits
    • Use 1D array to store super block filter levels · ce89309b
      Yunqing Wang authored
      As Jim suggested, a 1D array is used to store filter levels instead
      of a 2D array. This uses shift_y in setup_mask directly, and saves
      a few cycles.
      
      Change-Id: If61ab298784861f1806b1cd396d4e4e2e0f097b9
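      A minimal sketch of the flat layout, assuming a 64x64 superblock
      split into an 8x8 grid of 8x8 cells. The SbLevels type, the field
      name, and the helpers are illustrative, not taken from the patch.

```c
#include <stdint.h>

/* One filter level per 8x8 cell of a 64x64 superblock, stored flat. */
typedef struct {
  uint8_t lfl_y[64];
} SbLevels;

/* Row-major index: 8 cells per row, so the row offset is a shift by 3. */
static int level_index(int row, int col) { return (row << 3) + col; }

static void set_level(SbLevels *l, int row, int col, uint8_t level) {
  l->lfl_y[level_index(row, col)] = level;
}

/* A whole row can be handed to a filter routine as a plain pointer. */
static const uint8_t *level_row(const SbLevels *l, int row) {
  return &l->lfl_y[row << 3];
}
```

      With a flat array, code that already tracks a shifted row offset can
      index directly instead of recomputing a 2D position.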
    • Rewrite filter_selectively_horiz for parallel loopfiltering · b4543818
      Yunqing Wang authored
      Added loop filter mask checking, and made the caller function
      ready for the implementation of parallel loopfiltering in the
      horizontal direction.
      
      Next, we need to go through the loopfilter functions (both C and
      optimized versions), and provide 16-byte wide loopfiltering for
      each filter type.
      
      Change-Id: Ifef47e7ef9086ebc2fd6ca7ede8f27c9bbf79e66
  16. 08 Nov, 2013 1 commit
    • Improve loopfilter function · 49cf335e
      Yunqing Wang authored
      This patch continued the work done in "Rewrite loop_filter_info_n
      struct" (commit: 00dbd369) to further improve the loopfilter
      function.
      
      1. Instead of storing pointers to thresholds, store loopfilter
      levels within the 64x64 SB;
      2. Since loopfilter levels are already calculated in setup_mask,
      we don't need to call build_lfi to look them up again. Just save
      the loopfilter levels in setup_mask.
      3. Reorganized and simplified filter_block_plane().
      
      Tests showed a ~0.8% decoder speedup.
      
      Change-Id: I723c7779738bbc2afcb9afa2c6f78580ee6c3af7
  17. 25 Oct, 2013 1 commit
    • Rewrite loop_filter_info_n struct · 00dbd369
      Yunqing Wang authored
      Restructured the storing of loopfilter information. Deleted the
      loop_filter_info struct and reduced the copying that happened in
      every superblock.
      
      Tests showed a 0.5% ~ 0.8% decoder speed gain.
      
      Change-Id: Ie6a8e46bae71dc3a3cd8c6054f5de540b8e0ef5e
  18. 30 Sep, 2013 1 commit
  19. 19 Sep, 2013 1 commit
  20. 11 Sep, 2013 1 commit
    • New mode_info_context storage -- undo revert · ac6093d1
      Scott LaVarnway authored
      mode_info_context was stored as a grid of MODE_INFO structs.
      The grid now consists of pointers to MODE_INFO structs.  The
      MODE_INFO structs are now stored as a stream (decoder only),
      which eliminates unnecessary copies and is a little more cache
      friendly.
      
      Change-Id: I031d376284c6eb98a38ad5595b797f048a6cfc0d
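      The storage change described in this commit can be sketched as a
      grid of pointers into a compact stream. The ModeInfo fields, the
      grid size, and the helper names below are assumptions made for
      illustration and do not mirror the real MODE_INFO layout.

```c
#include <stddef.h>
#include <string.h>

enum { GRID_ROWS = 4, GRID_COLS = 4, GRID_CELLS = GRID_ROWS * GRID_COLS };

/* Hypothetical per-block record; sizes are in units of 8x8 cells. */
typedef struct {
  int w8, h8;
  int filter_level;
} ModeInfo;

typedef struct {
  ModeInfo stream[GRID_CELLS];  /* one record per decoded block, in order */
  int stream_len;
  ModeInfo *grid[GRID_CELLS];   /* each 8x8 cell points at its block */
} ModeInfoStore;

static void store_init(ModeInfoStore *st) { memset(st, 0, sizeof(*st)); }

/* Append one record to the stream and point every covered cell at it,
 * instead of copying the record into each cell. Returns NULL when full. */
static ModeInfo *add_block(ModeInfoStore *st, int row, int col, int w8,
                           int h8, int filter_level) {
  ModeInfo *mi;
  int y, x;
  if (st->stream_len >= GRID_CELLS) return NULL;
  mi = &st->stream[st->stream_len++];
  mi->w8 = w8;
  mi->h8 = h8;
  mi->filter_level = filter_level;
  for (y = row; y < row + h8 && y < GRID_ROWS; ++y)
    for (x = col; x < col + w8 && x < GRID_COLS; ++x)
      st->grid[y * GRID_COLS + x] = mi;
  return mi;
}
```

      A block that spans several cells is written once and referenced
      many times, which is where the saved copies come from.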
  21. 09 Sep, 2013 1 commit
  22. 06 Sep, 2013 1 commit
    • New mode_info_context storage · dae17734
      Scott LaVarnway authored
      mode_info_context was stored as a grid of MODE_INFO structs.
      The grid now consists of a pointer to a MODE_INFO struct and
      an "in the image" flag.  The MODE_INFO structs are now stored
      as a stream, which eliminates unnecessary copies and is a little
      more cache friendly.
      
      For the test clips used, the decoder performance improved
      by ~4.3% (1080p) and ~9.7% (720p).
      
      Patch Set 2: Re-encoded clips with latest. Now ~1.7% (1080p)
      and 5.9% (720p).
      
      Change-Id: I846f29e88610fce2523ca697a9a9ef2a182e9256
  23. 05 Sep, 2013 1 commit
  24. 04 Sep, 2013 1 commit
  25. 30 Aug, 2013 1 commit
  26. 28 Aug, 2013 1 commit
  27. 27 Aug, 2013 1 commit
    • Fix Windows warning. · f1560ce0
      Frank Galligan authored
      Const is not needed on the function parameter.
      
      Change-Id: I38c2a7317cb6f42f70bbddfde9a2cd18d65ceb1c
  28. 24 Aug, 2013 1 commit
    • Renaming D27 to D207. · 50ee61db
      Dmitry Kovalev authored
      I've already renamed d27_predictor to d207_predictor but forgot about the
      corresponding constant.
      
      Change-Id: Id312aa80fc5b5a1ab8a709a33418a029552a6857
  29. 19 Aug, 2013 1 commit
    • Further correct bug in loopfilter initialization · 5a1a269f
      Adrian Grange authored
      The intent was to initialize the deltas for the
      segment to the computed value, irrespective of mode
      and reference frame if (mode_ref_delta_enabled == 0).
      
      (In response to bug posted by Manjit Hota to codec-devel
      and webm-discuss lists)
      
      Change-Id: I10435cb63d0f88359bb4c14f22181878a1988e72
  30. 15 Aug, 2013 1 commit
  31. 14 Aug, 2013 1 commit
    • Renaming in MB_MODE_INFO · 26fead7e
      Paul Wilkins authored
      The macroblock mode info context originally contained an
      entry for each 16x16 macroblock. In VP9 each entry refers
      to an 8x8 region, not a macroblock, so the naming is misleading.
      
      This first stage clean up changes the names of 3 entries in the
      structure to remove the mb_ prefix.
      
      TODO clean up the nomenclature more widely in respect of
      mbmi and bmi.
      
      Change-Id: Ia7305c6d0cb805dfe8cdc98dad21338f502e49c6
  32. 10 Aug, 2013 1 commit
  33. 09 Aug, 2013 3 commits
    • Moving loopfilter struct to VP9_COMMON. · 816d6c98
      Dmitry Kovalev authored
      Loop filter configuration doesn't belong to macroblock, so moving it from
      MACROBLOCKD to VP9_COMMON. Also moving the declaration of loopfilter struct
      from vp9_blockd.h to vp9_loopfilter.h.
      
      Change-Id: I4b3e34be9623b47cda35f9b1f9951f8c5b1d5d28
    • Correct bug in loopfilter initialization · 12eb2d02
      Adrian Grange authored
      The memset sets 16 bytes rather than the correct size of the
      final array dimension (MAX_MODE_LF_DELTAS).
      
      (In response to bug posted by Manjit Hota to codec-devel
      and webm-discuss lists)
      
      Change-Id: I8980f5aa71ddc9d7ef57c5b4700bc28ddf8651c7
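      The class of bug described in this commit, a hard-coded memset
      length that does not match the array being filled, can be avoided by
      deriving the size from the array itself. The dimensions and names
      below are assumptions for illustration only.

```c
#include <stdint.h>
#include <string.h>

/* Assumed dimensions; the values are illustrative. */
enum { MAX_SEGMENTS = 8, MAX_REF_FRAMES = 4, MAX_MODE_LF_DELTAS = 2 };

typedef struct {
  uint8_t lvl[MAX_SEGMENTS][MAX_REF_FRAMES][MAX_MODE_LF_DELTAS];
} LfLevels;

/* Fill every (reference frame, mode) entry of one segment with the same
 * level. sizeof(lfi->lvl[seg]) is the size of that segment's 2D slice, so
 * the length stays correct if either dimension changes later. With these
 * dimensions the slice is 8 bytes; a literal 16 would spill into the next
 * segment. */
static void set_segment_level(LfLevels *lfi, int seg, uint8_t lvl_seg) {
  memset(lfi->lvl[seg], lvl_seg, sizeof(lfi->lvl[seg]));
}
```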
    • Added lpf level picking using partial frame · 6ec2b85b
      Yaowu Xu authored
      Change-Id: I599ab1bd22b5f3f10d5962c609952abdef8ff67a
  34. 05 Aug, 2013 1 commit
  35. 02 Aug, 2013 2 commits