Guillaume Martres / aom-rav1e
Commit a808dfe3
Authored Jun 03, 2014 by Jingning Han
Committed by Gerrit Code Review, Jun 03, 2014
Merge "Fix potential overflow issue in SSSE3 forward 8x8 2D-DCT"
Parents: bf168d54, 540d9103
Changes: 1
vp9/encoder/x86/vp9_dct_ssse3_x86_64.asm
@@ -23,6 +23,7 @@ pw_%1_%2: dw %1, %2, %1, %2, %1, %2, %1, %2
 pw_%2_m%1: dw %2, -%1, %2, -%1, %2, -%1, %2, -%1
 %endmacro
 
+TRANSFORM_COEFFS 11585, 11585
 TRANSFORM_COEFFS 15137, 6270
 TRANSFORM_COEFFS 16069, 3196
 TRANSFORM_COEFFS 9102, 13623
@@ -83,7 +84,7 @@ SECTION .text
 %endmacro
 
 ; 1D forward 8x8 DCT transform
-%macro FDCT8_1D 0
+%macro FDCT8_1D 1
   SUM_SUB 0, 7, 9
   SUM_SUB 1, 6, 9
   SUM_SUB 2, 5, 9
@@ -92,14 +93,21 @@ SECTION .text
   SUM_SUB 0, 3, 9
   SUM_SUB 1, 2, 9
   SUM_SUB 6, 5, 9
+%if %1 == 0
   SUM_SUB 0, 1, 9
+%endif
 
   BUTTERFLY_4X 2, 3, 6270, 15137, m8, 9, 10
 
   pmulhrsw m6, m12
   pmulhrsw m5, m12
+%if %1 == 0
   pmulhrsw m0, m12
   pmulhrsw m1, m12
+%else
+  BUTTERFLY_4X 1, 0, 11585, 11585, m8, 9, 10
+  SWAP 0, 1
+%endif
 
   SUM_SUB 4, 5, 9
   SUM_SUB 7, 6, 9
@@ -150,10 +158,10 @@ cglobal fdct8x8, 3, 5, 13, input, output, stride
   psllw m7, 2
 
   ; column transform
-  FDCT8_1D
+  FDCT8_1D 0
   TRANSPOSE8X8 0, 1, 2, 3, 4, 5, 6, 7, 9
 
-  FDCT8_1D
+  FDCT8_1D 1
   TRANSPOSE8X8 0, 1, 2, 3, 4, 5, 6, 7, 9
 
   DIVIDE_ROUND_2X 0, 1, 9, 10
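
The commit title mentions a potential overflow, but the page does not spell out the mechanism. One plausible reading is that in the second pass the 16-bit `SUM_SUB 0, 1, 9` can wrap before `pmulhrsw` scales the result, whereas the `BUTTERFLY_4X 1, 0, 11585, 11585` path multiplies first and keeps the sum in 32 bits. The scalar C sketch below models a single lane under that reading. The helper names, the assumption that `m12` holds 2 * 11585 = 23170, the assumption that `BUTTERFLY_4X` rounds with a 14-bit shift, and the input values are all hypothetical and not taken from the diff.

```c
#include <stdint.h>

/* Hypothetical scalar model of one PMULHRSW lane:
 * ((a * b >> 14) + 1) >> 1, truncated to 16 bits.
 * Relies on arithmetic right shift of negative values, which is
 * implementation-defined in C but standard on mainstream compilers. */
static int16_t pmulhrsw_lane(int16_t a, int16_t b) {
    int32_t t = ((int32_t)a * (int32_t)b) >> 14;
    return (int16_t)((t + 1) >> 1);
}

/* 16-bit wrapping add, as PADDW does per lane. */
static int16_t paddw_lane(int16_t a, int16_t b) {
    return (int16_t)(uint16_t)((uint16_t)a + (uint16_t)b);
}

/* Assumed shape of the old path: add in 16 bits, then scale.
 * 23170 = 2 * 11585 is an assumed value for the constant in m12. */
static int16_t old_path(int16_t a, int16_t b) {
    return pmulhrsw_lane(paddw_lane(a, b), 23170);
}

/* Assumed shape of the new path: multiply-add in 32 bits (as PMADDWD
 * would), round, shift by 14, then saturate to 16 bits. */
static int16_t new_path(int16_t a, int16_t b) {
    int32_t t = (int32_t)a * 11585 + (int32_t)b * 11585;
    t = (t + (1 << 13)) >> 14;
    if (t > INT16_MAX) t = INT16_MAX;
    if (t < INT16_MIN) t = INT16_MIN;
    return (int16_t)t;
}
```

With small inputs such as `(100, 50)` both helpers return 106. With `(20000, 20000)` the 16-bit sum wraps, so `old_path` returns -18056 while `new_path` returns 28284.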