Re: [Script] VDR-SC FFdecsa optimization
With the current version (just released), performance on certain systems might be a little better still.
Yes!
./cpuoptv7.sh
FFdecsa optimization helper (benchmark) V7
### CPU-INFO ###
System: x86
Vendor-ID: AuthenticAMD
CPU-Family: 16
CPU-Model: 6
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt npt lbrv svm_lock nrip_save
gcc version 4.5.0 20100604 [gcc-4_5-branch revision 160292] (SUSE Linux)
Using compilers "native" flags
### FFdeCSA TEST ###
Using compiler: g++
with flags: -O3 -march=native -mmmx -msse -msse2 -fexpensive-optimizations -funroll-loops
Trying various FFdecsa optimizations...
PARALLEL_32_INT: test failed
PARALLEL_32_4CHAR: 67
PARALLEL_32_4CHARA: 40
PARALLEL_64_8CHAR: 60
PARALLEL_64_8CHARA: 38
PARALLEL_64_2INT: test failed
PARALLEL_64_LONG: 129
PARALLEL_64_MMX: 313
PARALLEL_128_16CHAR: 59
PARALLEL_128_16CHARA: 42
PARALLEL_128_4INT: 183
PARALLEL_128_2LONG: 169
PARALLEL_128_2MMX: 261
PARALLEL_128_SSE: 370
PARALLEL_128_SSE2: 417
Choosing PARALLEL_MODE = PARALLEL_128_SSE2 (417 Mbit/s)
### VDR-SC FFdeCSA MAKEFILE OPTS ###
CPUOPT ?= native
PARALLEL ?= PARALLEL_128_SSE2
CSAFLAGS ?= -O3 -march=native -mmmx -msse -msse2 -fexpensive-optimizations -funroll-loops
### GENERIC FFdeCSA MAKE OPTS ###
FLAGS="-O3 -march=native -mmmx -msse -msse2 -fexpensive-optimizations -funroll-loops" PARALLEL_MODE=PARALLEL_128_SSE2
./cpuoptv7.sh -n
FFdecsa optimization helper (benchmark) V7
### CPU-INFO ###
System: x86
Vendor-ID: AuthenticAMD
CPU-Family: 16
CPU-Model: 6
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt npt lbrv svm_lock nrip_save
gcc version 4.5.0 20100604 [gcc-4_5-branch revision 160292] (SUSE Linux)
Compilers "native" flags disabled or unsupported
### FFdeCSA TEST ###
Using compiler: g++
with flags: -O3 -march=amdfam10 -fexpensive-optimizations -funroll-loops
Trying various FFdecsa optimizations...
PARALLEL_32_INT: test failed
PARALLEL_32_4CHAR: 68
PARALLEL_32_4CHARA: 41
PARALLEL_64_8CHAR: 60
PARALLEL_64_8CHARA: 38
PARALLEL_64_2INT: test failed
PARALLEL_64_LONG: 130
PARALLEL_64_MMX: 314
PARALLEL_128_16CHAR: 59
PARALLEL_128_16CHARA: 42
PARALLEL_128_4INT: 183
PARALLEL_128_2LONG: 168
PARALLEL_128_2MMX: 260
PARALLEL_128_SSE: 373
PARALLEL_128_SSE2: 418
Choosing PARALLEL_MODE = PARALLEL_128_SSE2 (418 Mbit/s)
### VDR-SC FFdeCSA MAKEFILE OPTS ###
CPUOPT ?= amdfam10
PARALLEL ?= PARALLEL_128_SSE2
CSAFLAGS ?= -O3 -march=amdfam10 -fexpensive-optimizations -funroll-loops
### GENERIC FFdeCSA MAKE OPTS ###
FLAGS="-O3 -march=amdfam10 -fexpensive-optimizations -funroll-loops" PARALLEL_MODE=PARALLEL_128_SSE2
However, it gets even faster (429 Mbit/s) with
FLAGS ?= -O3 -march=amdfam10 -msse2 -fomit-frame-pointer -fexpensive-optimizations -funroll-loops and PARALLEL_MODE=PARALLEL_128_SSE2
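
In case someone wants to compare several runs like the two above without reading through the list by hand, here is a minimal sketch that picks the fastest mode from the benchmark lines. It assumes the output format shown in this post ("MODE: number", or "test failed" for broken modes). The function name best_mode is my own and is not part of cpuoptv7.sh.

```shell
#!/bin/sh
# best_mode: hypothetical helper, NOT part of cpuoptv7.sh.
# Reads benchmark output on stdin and prints the fastest PARALLEL_* mode
# followed by its throughput value.
best_mode() {
    awk -F': ' '
        # Match only lines such as "PARALLEL_128_SSE2: 417".
        # Lines ending in "test failed" do not match and are skipped.
        /^PARALLEL_[A-Z0-9_]+: [0-9]+$/ {
            if ($2 + 0 > best) { best = $2 + 0; mode = $1 }
        }
        END { if (mode != "") print mode, best }
    '
}

# Example with a few result lines taken from the first run above.
printf '%s\n' \
    'PARALLEL_32_INT: test failed' \
    'PARALLEL_64_MMX: 313' \
    'PARALLEL_128_SSE: 370' \
    'PARALLEL_128_SSE2: 417' | best_mode
```

The mode it prints is the value that would go into PARALLEL_MODE. Differences of 1 to 2 Mbit/s, as between the native and amdfam10 runs above, are small enough that repeating the benchmark a few times before settling on flags seems sensible.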