[MPlayer-dev-eng] [OT] C-code Optimization Contest

Arpi arpi at thot.banki.hu
Tue Jul 15 02:12:32 CEST 2003


Hi,

> if you like to have some fun, try optimizing the attached simple matrix 
> multiply and post your results.
> 
> The Rules:
> 1. you may only modify multiply_d.[ch] (NUM should stay 512 though)
> 2. you may change compiler and optims in Makefile
> 3. precision must stay the same
> 4. to compare, you should compile the original code:
>     make && cp matrix matrix.org && (./matrix.org >res.org)
> 5. then later you can compare results via:
>    make && (./matrix >res.txt) && diff -q res.org res.txt
> 
> Have fun!
> 
> Btw, my current result is 738% of the original speed; Arpi's results are even
> better, as he included my tips in his already optimized code =))))

OK, here is my (well, our :)) contribution:

To get the best results (19.628 times faster than the original), set DIM to 516 in the .h file.

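/* simple matrix multiply, unrolled 4x over k and 4x over j;
 * accumulates into c (c += a*b) and needs NUM to be a multiple of 4 */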
#include "multiply_d.h"

void multiply(double a[][DIM], double b[][DIM], double c[][DIM]) {
  double* ap=a[0];
  double* cp=c[0];
  unsigned int i;
  for(i=0; i<NUM; i++) {
    unsigned int k;
    double* bp=b[0];
    for(k=0; k<NUM; k+=4, bp+=DIM*4) {
      double ap_k0=ap[k+0];
      double ap_k1=ap[k+1];
      double ap_k2=ap[k+2];
      double ap_k3=ap[k+3];
      register unsigned int j;
      for(j=0; j<NUM; j+=4) {
	cp[j+0] += ap_k0*bp[j+0] + ap_k1*bp[j+0+DIM] + ap_k2*bp[j+0+2*DIM] + ap_k3*bp[j+0+3*DIM];
	cp[j+1] += ap_k0*bp[j+1] + ap_k1*bp[j+1+DIM] + ap_k2*bp[j+1+2*DIM] + ap_k3*bp[j+1+3*DIM];
	cp[j+2] += ap_k0*bp[j+2] + ap_k1*bp[j+2+DIM] + ap_k2*bp[j+2+2*DIM] + ap_k3*bp[j+2+3*DIM];
	cp[j+3] += ap_k0*bp[j+3] + ap_k1*bp[j+3+DIM] + ap_k2*bp[j+3+2*DIM] + ap_k3*bp[j+3+3*DIM];
	//cp[j+4] += ap_k0*bp[j+4] + ap_k1*bp[j+4+DIM] + ap_k2*bp[j+4+2*DIM] + ap_k3*bp[j+4+3*DIM];
	//cp[j+5] += ap_k0*bp[j+5] + ap_k1*bp[j+5+DIM] + ap_k2*bp[j+5+2*DIM] + ap_k3*bp[j+5+3*DIM];
	//cp[j+6] += ap_k0*bp[j+6] + ap_k1*bp[j+6+DIM] + ap_k2*bp[j+6+2*DIM] + ap_k3*bp[j+6+3*DIM];
	//cp[j+7] += ap_k0*bp[j+7] + ap_k1*bp[j+7+DIM] + ap_k2*bp[j+7+2*DIM] + ap_k3*bp[j+7+3*DIM];
      }
    }
    cp+=DIM;
    ap+=DIM;
  }
}
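
For anyone who wants to sanity-check a kernel like the one above without the contest files, here is a minimal sketch. It assumes NUM is 512 and DIM is 516 as mentioned in this thread; the contents of multiply_d.h and of the original multiply() are not shown here, so multiply_ref() is only a plain triple loop standing in for it, and multiply_opt() and results_match() are hypothetical names. multiply_opt() keeps the same 4-row blocking over k but leaves the j loop to the compiler. The test data are small integers so that every sum is exact in a double and the two summation orders must agree bit for bit; with arbitrary data the reordered additions could round differently.

```c
#include <assert.h>
#include <string.h>

#define NUM 512   /* matrix size, fixed by the contest rules */
#define DIM 516   /* padded row length (assumed from the post above) */

static double A[NUM][DIM], B[NUM][DIM], C_ref[NUM][DIM], C_opt[NUM][DIM];

/* Plain triple loop; a stand-in for the original code, which is not shown. */
static void multiply_ref(double a[][DIM], double b[][DIM], double c[][DIM]) {
  unsigned int i, j, k;
  for (i = 0; i < NUM; i++)
    for (j = 0; j < NUM; j++)
      for (k = 0; k < NUM; k++)
        c[i][j] += a[i][k] * b[k][j];
}

/* Same idea as the posted kernel: consume four rows of b per pass over c. */
static void multiply_opt(double a[][DIM], double b[][DIM], double c[][DIM]) {
  unsigned int i, j, k;
  assert(NUM % 4 == 0);               /* the 4x blocking over k relies on this */
  for (i = 0; i < NUM; i++) {
    const double *ap = a[i];
    double *cp = c[i];
    for (k = 0; k < NUM; k += 4) {
      const double *b0 = b[k], *b1 = b[k + 1], *b2 = b[k + 2], *b3 = b[k + 3];
      for (j = 0; j < NUM; j++)
        cp[j] += ap[k] * b0[j] + ap[k + 1] * b1[j]
               + ap[k + 2] * b2[j] + ap[k + 3] * b3[j];
    }
  }
}

/* Returns 1 if both kernels produce identical results, 0 otherwise. */
int results_match(void) {
  unsigned int i, j;
  for (i = 0; i < NUM; i++)
    for (j = 0; j < NUM; j++) {
      A[i][j] = (double)((i + j) % 7);      /* small integers: exact sums */
      B[i][j] = (double)((3 * i + j) % 5);
    }
  memset(C_ref, 0, sizeof C_ref);           /* both kernels accumulate into c */
  memset(C_opt, 0, sizeof C_opt);
  multiply_ref(A, B, C_ref);
  multiply_opt(A, B, C_opt);
  return memcmp(C_ref, C_opt, sizeof C_ref) == 0;
}
```

Calling results_match() from a small main() and checking that it returns 1 is a quick substitute for the diff step in the rules, but it does not replace it: the contest comparison uses the original program's own input and output.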



A'rpi / Astral & ESP-team

--
Developer of MPlayer G2, the Movie Framework for all - http://www.MPlayerHQ.hu


