Xiph.Org / aom-rav1e
Commit 901d2036
Authored 9 years ago by Jian Zhou
Committed by Gerrit Code Review 9 years ago
Merge "Speed up tm_predictor_8x8"
Parents: adb033b5, f4621c5c
Showing 1 changed file: vpx_dsp/x86/intrapred_sse2.asm (17 additions, 19 deletions)
--- a/vpx_dsp/x86/intrapred_sse2.asm
+++ b/vpx_dsp/x86/intrapred_sse2.asm
@@ -545,33 +545,31 @@ cglobal tm_predictor_4x4, 4, 4, 5, dst, stride, above, left
   RET

 INIT_XMM sse2
-cglobal tm_predictor_8x8, 4, 4, 4, dst, stride, above, left
+cglobal tm_predictor_8x8, 4, 4, 5, dst, stride, above, left
   pxor                  m1, m1
   movd                  m2, [aboveq-1]
   movq                  m0, [aboveq]
   punpcklbw             m2, m1
-  punpcklbw             m0, m1
-  pshuflw               m2, m2, 0x0
+  punpcklbw             m0, m1        ; t1 t2 t3 t4 t5 t6 t7 t8 [word]
+  pshuflw               m2, m2, 0x0   ; [63:0] tl tl tl tl [word]
   DEFINE_ARGS dst, stride, line, left
   mov                lineq, -4
-  punpcklqdq            m2, m2
-  add                leftq, 8
-  psubw                 m0, m2
-.loop:
-  movd                  m2, [leftq+lineq*2]
-  movd                  m3, [leftq+lineq*2+1]
-  punpcklbw             m2, m1
-  punpcklbw             m3, m1
-  pshuflw               m2, m2, 0x0
-  pshuflw               m3, m3, 0x0
-  punpcklqdq            m2, m2
-  punpcklqdq            m3, m3
-  paddw                 m2, m0
+  punpcklqdq            m2, m2        ; tl tl tl tl tl tl tl tl [word]
+  psubw                 m0, m2        ; t1-tl t2-tl ... t8-tl [word]
+  movq                  m2, [leftq]
+  punpcklbw             m2, m1        ; l1 l2 l3 l4 l5 l6 l7 l8 [word]
+.loop
+  pshuflw               m4, m2, 0x0   ; [63:0] l1 l1 l1 l1 [word]
+  pshuflw               m3, m2, 0x55  ; [63:0] l2 l2 l2 l2 [word]
+  punpcklqdq            m4, m4        ; l1 l1 l1 l1 l1 l1 l1 l1 [word]
+  punpcklqdq            m3, m3        ; l2 l2 l2 l2 l2 l2 l2 l2 [word]
+  paddw                 m4, m0
   paddw                 m3, m0
-  packuswb              m2, m3
-  movq      [dstq        ], m2
-  movhps    [dstq+strideq], m2
+  packuswb              m4, m3
+  movq      [dstq        ], m4
+  movhps    [dstq+strideq], m4
   lea                 dstq, [dstq+strideq*2]
+  psrldq                m2, 4
   inc                lineq
   jnz .loop
   REP_RET
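
For readers who do not read SSE2 assembly, the arithmetic in this hunk can be modelled in plain C. The sketch below is not part of the commit and is not the project's C fallback. The function names (`tm_predictor_8x8_ref`, `tm_predictor_8x8_rows2`, `tm_predictor_8x8_check`) are hypothetical, and the formula is inferred from the register comments in the new code (`t1-tl ... t8-tl`, then one `l` value added per row). The second function mirrors the shape of the new loop: the eight left pixels are widened once before the loop, two rows are produced per iteration from words 0 and 1, and the consumed words are then shifted out, as `psrldq m2, 4` does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp to the 0..255 pixel range; packuswb saturates each word the same way. */
static uint8_t clip_pixel(int v) {
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Scalar model: dst[r][c] = clip(left[r] + above[c] - above[-1]). */
static void tm_predictor_8x8_ref(uint8_t *dst, ptrdiff_t stride,
                                 const uint8_t *above, const uint8_t *left) {
    const int tl = above[-1];
    for (int r = 0; r < 8; ++r)
        for (int c = 0; c < 8; ++c)
            dst[r * stride + c] = clip_pixel(left[r] + above[c] - tl);
}

/* Same result, arranged like the new assembly loop (assumed mapping:
 * diff[] plays the role of m0, l[] the role of m2). */
static void tm_predictor_8x8_rows2(uint8_t *dst, ptrdiff_t stride,
                                   const uint8_t *above, const uint8_t *left) {
    int16_t diff[8]; /* t[c] - tl, computed once before the loop */
    int16_t l[8];    /* left pixels widened to words, loaded once */
    const int tl = above[-1];
    for (int c = 0; c < 8; ++c) diff[c] = (int16_t)(above[c] - tl);
    for (int r = 0; r < 8; ++r) l[r] = left[r];

    for (int line = -4; line != 0; ++line) {
        for (int c = 0; c < 8; ++c) {
            dst[c] = clip_pixel(l[0] + diff[c]);          /* row from word 0 */
            dst[stride + c] = clip_pixel(l[1] + diff[c]); /* row from word 1 */
        }
        dst += 2 * stride;
        /* Drop the two consumed words, like shifting m2 right by 4 bytes. */
        for (int i = 0; i < 6; ++i) l[i] = l[i + 2];
        l[6] = l[7] = 0;
    }
}

/* Compare both versions on one hand-picked input; aborts on a mismatch. */
static void tm_predictor_8x8_check(void) {
    static const uint8_t edge[9] = {100, 10, 50, 200, 255, 0, 30, 90, 180};
    static const uint8_t left[8] = {0, 255, 100, 120, 60, 240, 5, 130};
    uint8_t a[64], b[64];
    tm_predictor_8x8_ref(a, 8, edge + 1, left);   /* edge[0] is above[-1] */
    tm_predictor_8x8_rows2(b, 8, edge + 1, left);
    for (int i = 0; i < 64; ++i) assert(a[i] == b[i]);
}
```

Calling `tm_predictor_8x8_check()` from a test harness exercises both functions on one input that hits both clamping limits. It only shows that the two C arrangements agree with each other; it says nothing about the speed of the assembly.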