Commits · 52cf4dcaea10f97d25d8a3585704a1e47b384751 · Xiph.Org / aom-rav1e

Feb 29, 2012

Packing bitstream on-the-fly with delayed context updates · 52cf4dca

Attila Nagy authored 13 years ago

Produce the token partitions on-the-fly, while processing each MB.
Context is updated at the beginning of each frame based on the
previous frame's counters. Optionally, the encoder outputs partitions in
separate buffers. For frame-based output, partitions are concatenated
internally.

Limitations:
    - enabled just in combination with realtime-only mode
    - number of encoding threads has to be less than or equal to the
    number of token partitions. For this reason, by default the encoder
    will do 8 token partitions.
    - vpxenc supports partition output (-P) just in combination with
    IVF output format (--ivf)

Performance:
    - Realtime encoder can be up to 13% faster (ARM) depending on the number
    of threads and bitrate settings. Constant gain over the 5-16 speed
    range.
    - Token buffer reduced from one frame to 8 MBs

Quality:
    - quality is affected by the delayed context updates. This again
    depends on input material, speed and bitrate settings. For VC
    style input the loss seen is up to 0.2dB. If error-resilient=2
    mode is used then the effect of this change is negligible.

Example:
./configure --enable-realtime-only --enable-onthefly-bitpacking
./vpxenc --rt --end-usage=1 --fps=30000/1000 -w 640 -h 480
--target-bitrate=1000 --token-parts=3 --static-thresh=2000
--ivf -P -t 4 -o strm.ivf tanya_640x480.yuv

Change-Id: I127295cb85b835fc287e1c0201a67e378d025d76

52cf4dca
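
A minimal sketch in C of the thread/partition limitation described in the commit above. It assumes that --token-parts is the base-2 logarithm of the partition count, so that --token-parts=3 gives the 8 partitions mentioned in the message. Both function names are hypothetical and are not taken from the libvpx source.

```c
#include <assert.h>

/* Hypothetical helper: converts a --token-parts value (assumed to be
 * log2 of the partition count, in the range 0..3) into the number of
 * token partitions. */
int token_partition_count(int token_parts_log2)
{
    assert(token_parts_log2 >= 0 && token_parts_log2 <= 3);
    return 1 << token_parts_log2;
}

/* Hypothetical check for the stated limitation: the number of encoding
 * threads must not exceed the number of token partitions.
 * Returns 1 if the combination is allowed, 0 otherwise. */
int threads_fit_partitions(int num_threads, int token_parts_log2)
{
    return num_threads <= token_partition_count(token_parts_log2);
}
```

With the example command line (-t 4 and --token-parts=3), threads_fit_partitions(4, 3) holds because 4 threads fit within 8 partitions.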

Feb 28, 2012

Merge changes Ifb450710,I61c4a132 · ce328b85

Scott LaVarnway authored 13 years ago

* changes:
  Eliminated reconintra_mt.c
  Eliminated vp8mt_build_intra_predictors_mbuv_s

ce328b85

Merge "Removed duplicate code in threading.c" · aab70f4d
Scott LaVarnway authored 13 years ago

aab70f4d

Eliminated reconintra_mt.c · bcba86e2

Scott LaVarnway authored 13 years ago

Reworked the code to use vp8_build_intra_predictors_mby_s,
vp8_intra_prediction_down_copy, and vp8_intra4x4_predict_d_c
functions instead.  vp8_intra4x4_predict_d_c is a decoder-only
version of vp8_intra4x4_predict.  Future commits will fix this
code duplication.

Change-Id: Ifb4507103b7c83f8b94a872345191c49240154f5

bcba86e2

Removed duplicate code in threading.c · 9a4052a4
Scott LaVarnway authored 13 years ago

Change-Id: Id7e44950ceda67b280e410e541510106ef02f1da

9a4052a4
Merge "Only do uv intra-mode evaluation when intra mode is checked" · b1bfd0ba
Yunqing Wang authored 13 years ago

b1bfd0ba

Only do uv intra-mode evaluation when intra mode is checked · 019384f2

Yunqing Wang authored 13 years ago

When we encode slide-show clips, for the majority of the time,
only ZEROMV mode is checked, and all other modes are skipped.
This change delayed uv intra-mode evaluation until intra mode is
actually checked. This gave big performance gain for slide-show
video encoding (2nd pass gain: 18% to 28%). But, this change
doesn't help other types of videos.

Also, zbin_mode_boost is adjusted in mode-checking loop, which
causes bitstream mismatch before/after this change when --best
or --good with --cpu-used=0 are used.

Change-Id: I582b3e69fd384039994360e870e6e059c36a64cc

019384f2

Feb 27, 2012
- bugfix: use oxcf width/height for reinit check · e2c6b05f
  James Berry authored 13 years ago
  
  Use oxcf instead of common in the check to reinit the lookahead buffer
  if the frame size changes. Prior behavior would cause an assertion
  fail/crash, first observed in: "support changing resolution with
  vpx_codec_enc_config_set".

  Change-Id: Ib669916ca9b4f206d4cc3caab5107e49d39a36aa
  e2c6b05f
- Merge "Fix skippable evaluation in mode decision" · 61c5e31c
  Yunqing Wang authored 13 years ago
  
  61c5e31c
- Merge "vpxenc: initial implementation of multistream support" · ad121615
  John Koleszar authored 13 years ago
  
  ad121615
- Merge "decoder: reset segmentation map on keyframes" · 02a31e6b
  John Koleszar authored 13 years ago
  
  02a31e6b
- Fix skippable evaluation in mode decision · 84be08b0
  Yunqing Wang authored 13 years ago
  
  Yaowu fixed the skippable evaluation by correcting 2nd order block's eob.

  Change-Id: Id47930cbc74a90a046c0c0e324efb03477639ee0
  84be08b0
Feb 23, 2012
- Merge "Add unit tests for idctllm_test and idctllm_mmx" · 313bfbb6
  James Berry authored 13 years ago
  
  313bfbb6
- Merge "Remove the frame rate factor for key frame size." · 2089f26b
  Jim Bankoski authored 13 years ago
  
  2089f26b
Feb 22, 2012

Remove the frame rate factor for key frame size. · 507ee87e

Marco Paniconi authored 13 years ago

When temporal layers are used (i.e., number_of_layers > 1),
we don't use the frame rate boost for setting the key
frame target size. The factor was forcing the target size to be
always at its minimum (2 * per_frame_bandwidth) for low frame rates
(i.e., base layer frame rate).

Generally we should modify or remove this frame rate factor;
for now we turn it off for number_of_layers > 1.

Change-Id: Ia5acf406c9b2f634d30ac2473adc7b9bf2e7e6c6

507ee87e

Feb 21, 2012
- Eliminated vp8mt_build_intra_predictors_mbuv_s · f2bd11fa
  Scott LaVarnway authored 13 years ago
  
  Reworked the code to use vp8_build_intra_predictors_mbuv_s instead.
  This is WIP with the goal of eliminating all functions in
  reconintra_mt.h.

  Change-Id: I61c4a132684544b24a38c4a90044597c6ec0dd52
  f2bd11fa
- Add unit tests for idctllm_test and idctllm_mmx · 0c1cec22
  James Berry authored 13 years ago
  
  Add unit tests for vp8_short_idct4x4llm_c.

  Change-Id: I472b7c0baa365ba25dc99a3f6efccc816d27c941
  0c1cec22
- Merge changes I0341554f,I64e110c8 · dadc9189
  John Koleszar authored 13 years ago
  
  * changes:
    Consolidate C version of token packing functions
    Multithreaded encoder, late sync loopfilter
  dadc9189
- Merge "Remove redundant init of segment_counts in vp8_encode_frame" · f05feab7
  Scott LaVarnway authored 13 years ago
  
  f05feab7
- Merge "Update encoder mb_skip_coeff and prob_skip_false calculation" · 02360dd2
  John Koleszar authored 13 years ago
  
  02360dd2
Feb 17, 2012

Refine offset pattern · b0a12a28

Johann Koenig authored 13 years ago

When compiling with -ggdb3 the output includes an extraneous EQU from
vpx_ports/asm_offsets.h

https://trac.macports.org/ticket/33285

Change-Id: Iba93ddafec414c152b87001a7542e7a894781231

b0a12a28

Merge changes Idf1a05f3,If227b29b,Iac784d39 · b5ce9456

John Koleszar authored 13 years ago

* changes:
  vpxenc: factor out input open/close
  vpxenc: add warning()/fatal() helpers
  vpxenc: factor out global config options

b5ce9456

Merge "OS X shell is incompatible with echo -n" · e6047a17
Johann Koenig authored 13 years ago

e6047a17
Merge "Fix incorrect use of uv eobs in intra modes" · f93b1e7b
Yunqing Wang authored 13 years ago

f93b1e7b

Fix incorrect use of uv eobs in intra modes · 04b9e0d7

Yunqing Wang authored 13 years ago

In vp8_rd_pick_inter_mode(), if total of eobs is zero, rate needs
to be adjusted since there are no non-zero coefficients for
transmission. The uv intra eobs calculated in
rd_pick_intra_mbuv_mode() need to be saved before they are
overwritten by inter-mode eobs.

Change-Id: I41dd04fba912e8122ef95793d4d98a251bc60e58

04b9e0d7

Update encoder mb_skip_coeff and prob_skip_false calculation · ce42e79a

Attila Nagy authored 13 years ago

mode_info_context->mbmi.mb_skip_coeff has to always reflect the
existence or not of coeffs for a certain MB. The loopfilter needs this
info.
mb_skip_coeff is either set by the vp8_tokenize_mb or has to be set to
1 when the MB is skipped by mode selection. This has to be done
regardless of the mb_no_coeff_skip value.

prob_skip_false is needed just when mb_no_coeff_skip is 1. No need to
keep count of both skip_false and skip_true as they are complementary
(skip_true+skip_false = total_mbs)

Change-Id: I3c74c9a0ee37bec10de7bb796e408f3e77006813

ce42e79a
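
A minimal sketch in C of why a single counter is enough, as the commit above argues: since skip_true + skip_false = total_mbs, the probability can be derived from skip_true alone. The function name is hypothetical. The 8-bit probability scale and the clamp to the range 1..255 are assumptions and are not stated in the commit message.

```c
#include <assert.h>

/* Hypothetical helper: derives prob_skip_false from the skip_true count
 * only, using skip_false = total_mbs - skip_true.
 * Assumption: probabilities are 8-bit values clamped to 1..255. */
int prob_skip_false_from_counts(int skip_true_count, int total_mbs)
{
    int prob;

    assert(total_mbs > 0);
    assert(skip_true_count >= 0 && skip_true_count <= total_mbs);

    /* Share of MBs that were not skipped, scaled to 0..256. */
    prob = (total_mbs - skip_true_count) * 256 / total_mbs;

    if (prob < 1) prob = 1;
    if (prob > 255) prob = 255;
    return prob;
}
```

For a 640x480 frame (1200 MBs) with 300 skipped MBs, this gives 900 * 256 / 1200 = 192.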

Remove redundant init of segment_counts in vp8_encode_frame · 565d0e6f

Attila Nagy authored 13 years ago

segment_counts was zero init twice in the beginning of vp8_encode_frame.

Change-Id: Ibc29f6896dabd9aab1d0993f3941cf6876022e70

565d0e6f

Feb 16, 2012

Clarify 'max_sad' usage · 6b151d43

Johann Koenig authored 13 years ago

Depending on implementation the optimized SAD functions may return early
when the calculated SAD exceeds max_sad.

Change-Id: I05ce5b2d34e6d45fb3ec2a450aa99c4f3343bf3a

6b151d43
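
A minimal sketch in C of the behavior the commit above documents: a SAD function that may stop early once the running sum exceeds max_sad. This is a plain C illustration with a hypothetical name and a per-row check. It is not one of the optimized libvpx functions, and where those check the threshold is implementation dependent.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical 16x16 SAD with early termination. After each row the
 * running sum is compared against max_sad. If it is already larger,
 * the partial sum is returned, so the result is only guaranteed to be
 * greater than max_sad, not to be the exact SAD of the block. */
unsigned int sad16x16_early_exit(const unsigned char *src, int src_stride,
                                 const unsigned char *ref, int ref_stride,
                                 unsigned int max_sad)
{
    unsigned int sad = 0;
    int r, c;

    assert(src != NULL && ref != NULL);

    for (r = 0; r < 16; r++) {
        for (c = 0; c < 16; c++)
            sad += (unsigned int)abs(src[c] - ref[c]);

        if (sad > max_sad)
            return sad; /* early exit with a partial sum */

        src += src_stride;
        ref += ref_stride;
    }
    return sad;
}
```

A caller can therefore rely on the return value only for the comparison against max_sad. A value above max_sad means "worse than the current best" and should not be treated as an exact distortion.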

OS X shell is incompatible with echo -n · 5f0b303c

Johann Koenig authored 13 years ago

The built-in echo in 'sh' on OS X does not support -n (exclude trailing
newline). It's not necessary, so just leave it off. Fixes issue 390.

Build include guard using 'symbol' so that it is more likely to be
unique.

Change-Id: I4bc6aa1fc5e02228f71c200214b5ee4a16d56b83

5f0b303c

Include path fix for building against Android NDK. · 3653fb47

Fritz Koenig authored 13 years ago

cpu-features.h is not in the common paths, so add its location
to the cflags for Android.

Change-Id: Icbafc7600d72f6b59ffb030f6ab80ee6860332bb

3653fb47

vpxenc: initial implementation of multistream support · 9e50ed7f

John Koleszar authored 13 years ago

Add the ability to specify multiple output streams on the command line.
Streams are delimited by --, and most parameters inherit from previous
streams.

In this implementation, resizing streams is still not supported. It
does not make use of the new multistream support in the encoder either.
Two pass support runs all streams independently, though it's
theoretically possible that we could combine firstpass runs in the
future. The logic required for this is too tricky to do as part of this
initial implementation. This is mostly an effort to get the parameter
passing and independent streams working from the application's
perspective, and a later commit will add the rescaling and
multiresolution support.

Change-Id: Ibf18c2355f54189fc91952c734c899e5c072b3e0

9e50ed7f
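
A toy sketch in C of the command-line convention the commit above describes: "--" starts a new stream, and a new stream begins as a copy of the previous one, so options that are not given again are inherited. The struct, the function name and the reduced option set are hypothetical. This is not vpxenc's actual parser or data structure.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, heavily simplified per-stream settings. */
struct stream_cfg {
    int target_bitrate;
    int width;
};

/* Parses a toy argument list in which "--" delimits streams.
 * Only --target-bitrate=N and --width=N are recognized here;
 * anything else is ignored. Returns the number of streams,
 * or -1 if more than max_streams are requested. */
int parse_streams(int argc, const char **argv,
                  struct stream_cfg *out, int max_streams)
{
    int n = 1; /* the first stream always exists */
    int i;

    assert(out != NULL && max_streams >= 1);
    memset(&out[0], 0, sizeof(out[0]));

    for (i = 0; i < argc; i++) {
        struct stream_cfg *cur = &out[n - 1];
        int v;

        if (strcmp(argv[i], "--") == 0) {
            if (n == max_streams)
                return -1;
            out[n] = out[n - 1]; /* inherit from the previous stream */
            n++;
        } else if (sscanf(argv[i], "--target-bitrate=%d", &v) == 1) {
            cur->target_bitrate = v;
        } else if (sscanf(argv[i], "--width=%d", &v) == 1) {
            cur->width = v;
        }
    }
    return n;
}
```

Given the arguments --width=640 --target-bitrate=500 -- --target-bitrate=1000, this yields two streams. The second one overrides the bitrate and inherits the width.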

vpxenc: factor out input open/close · 732cb9a6

John Koleszar authored 13 years ago

Simplify some of the file I/O for later commits which will add multistream
support

Change-Id: Idf1a05f3a29c95331d0c4a6ea5960904e4897fd4

732cb9a6

vpxenc: add warning()/fatal() helpers · c535025c

John Koleszar authored 13 years ago

Cosmetic. Allows exiting with an error message without opening a new
scope.

Change-Id: If227b29b825f0241acea79dd38f19e524552ee18

c535025c

decoder: reset segmentation map on keyframes · e8223bd2

John Koleszar authored 13 years ago

Refactoring some of the mode decoding logic introduced a bug where
the segmentation maps would not be properly reset on keyframes.

http://code.google.com/p/webm/issues/detail?id=378

The text of the bug is somewhat misleading as I initially read it to
imply the bug was present in v0.9.7-p1 (Cayuga), but note the text
"master", which indicates this was something subsequent. This issue
bisects back to v0.9.7-p1-84-ga99c20c, so unfortunately it was broken
during the Duclair release.

Thanks to Alexei Leonenko for investigating the root cause.

Change-Id: I9713c9f070eb37b31b3b029d9ef96be9b6ea2def

e8223bd2

Support Android x86 NDK build · 7989bb7f

Makoto Kato authored 13 years ago

On the Android NDK, rand() is an inlined function. But for our SSE
optimizations, we need a symbol for rand().

Change-Id: I42ab00e3255208ba95d7f9b9a8a3605ff58da8e1

7989bb7f

Simplify mb_to_x_edge calculation during mode decoding · 6776bd62
Scott LaVarnway authored 13 years ago

Change-Id: Ibcb35c32bf24c1d241090e24c5e2320e4d3ba901

6776bd62
Merge "decodemv cleanup/improvements" · a5879f7c
Scott LaVarnway authored 13 years ago

a5879f7c

decodemv cleanup/improvements · 12ee845e

Scott LaVarnway authored 13 years ago

Removed unnecessary variables, unrolled functions, eliminated
unnecessary mv bounds checks and branches.

Change-Id: I02d034c70cd97b65025d59dd67c695e1db529f0b

12ee845e

Consolidate C version of token packing functions · d02e74a0

Attila Nagy authored 13 years ago

Replace inner loops of pack_mb_row_tokens_c and
pack_tokens_into_partitions_c with a call to pack_tokens_c.

Change-Id: I0341554fb154a14a5dadb63f8fc78010724c2c33

d02e74a0

Multithreaded encoder, late sync loopfilter · 78071b3b

Attila Nagy authored 13 years ago

Second shot at this...

Sync with loopfilter thread as late as possible, usually just at the
beginning of next frame encoding. This returns control to application
faster and allows a better multicore scaling.

When PSNR packets are generated, the final filtered frame is needed
immediately, so we cannot delay the sync. The same has to be done when
the internal frame is previewed.

Change-Id: I64e110c8b224dd967faefffd9c93dd8dbad4a5b5

78071b3b