[MPlayer-dev-eng] SFENCE really needed?
Nick Kurshev
nickols_k at mail.ru
Tue Jan 21 17:51:00 CET 2003
Hello, Michael!
On Tue, 21 Jan 2003 17:23:27 +0100 you wrote:
> Hi
>
> On Tuesday 21 January 2003 17:10, D Richard Felker III wrote:
> > On Tue, Jan 21, 2003 at 10:21:23AM +0100, Michael Niedermayer wrote:
> > > Hi
> > >
> > > On Tuesday 21 January 2003 06:22, D Richard Felker III wrote:
> > > > Hmm, I was reading some of the fast/agp memcpy code and other related
> > > > stuff, and I'm wondering...what's the performance penalty for using
> > > > the SFENCE instruction, and is it really necessary after writes to
> > > > video memory? I could understand it being useful when writing to
> > > > system memory, when another procedure might need to read the data
> > > > after you write it. But from what I understand, it's nonsense when
> > > > writing to video mem unless you plan on reading from video mem (slow
> > > > and pointless!). Any ideas?
> > >
> > > its needed, btw if u really dont read the data from there then why do u
> > > write it in the video mem? (hint the grafix card reads it & displays it
> > > on screen, so the last few pixels could be missing ..., i dunno if that
> > > would happen in practice, but it could IMHO according to the manuals)
> >
> > Hmm. I was under the impression that the special 'non-temporal' writes
> > or whatever caused the data to be written directly through the cache,
> > with only a minimal amount being cached for write combining, and that
> > sfence was for making sure the cpu wouldn't read stale values for
> > those addresses from the cache. However, you probably know a good deal
> > more about it than I do. It's been a long while since I coded much
> > asm, and back in the day it was all different -- cpu wasn't 10x as
> > fast as the memory so number of cycles actually mattered. :)
> well, i only know whats written in the docs from intel & amd and these are not
> completely clear to me, so IMHO its saver to add the sfence stuff ...
=================================================
SFENCE Store Fence
Acts as a barrier to force strong memory ordering (serialization) between store
^^^^^^^^^^^^^
instructions preceding the SFENCE and store instructions that follow the SFENCE.
A weakly-ordered memory system allows hardware to reorder reads and writes between
^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
the processor and memory. The SFENCE instruction guarantees that the system completes
all previous stores before executing subsequent stores...
============================================
According to Mans RullGard:
mb() is supposed to block until all outstanding memory operations have
completed. On some machines (e.g. Alpha) memory operations are
allowed to complete out of order. An mb instruction prevents any new
loads/stores from issuing before all previous load/stores are
completed. This is necessary for MMIO to work.
============================================
From the K8 manuals:
Memory-Mapped I/O.
Application software that needs to force memory ordering to memory-mapped
I/O devices can do so using the read/write barrier instructions: LFENCE,
SFENCE, and MFENCE.
============================================
From me:
If you write array[10] with weakly-ordered movnt** insns, the cpu may
actually perform the writes in an order like this:
1. array[2]
2. array[6]
3. array[1]
4. array[9]
and so on.
This is useful when writing into graphics memory, which can be temporarily
locked by the video DAC (SGRAM allows two units to access the memory chip
simultaneously, except for locked memory cells).
But it is not acceptable when writing into MMIO, for example. When array[10]
is a set of memory-mapped PCI ports, the order of the writes DOES MATTER.
IMHO that's clear!
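To illustrate the point, here is a minimal sketch in C. It is not the
MPlayer fast_memcpy code: the name stream_copy is made up for this example,
and it assumes SSE2, a 16-byte aligned destination and a length that is a
multiple of 16. It only shows where the sfence goes relative to the movnt
stores.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides _mm_sfence */
#include <stddef.h>

/* Hypothetical helper (not from MPlayer): copy n bytes using
 * non-temporal stores (movntdq).
 * Assumptions: dst is 16-byte aligned, n is a multiple of 16. */
static void stream_copy(void *dst, const void *src, size_t n)
{
    __m128i *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;
    size_t i;

    for (i = 0; i < n / 16; i++) {
        __m128i v = _mm_loadu_si128(s + i); /* ordinary, possibly unaligned load */
        _mm_stream_si128(d + i, v);         /* weakly-ordered store (movntdq) */
    }

    /* Store fence: every store issued above becomes globally visible
     * before any store issued after this point. */
    _mm_sfence();
}
```

Without the final _mm_sfence() the streamed stores may still complete in any
order relative to later stores, which is the situation described above.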
>
> [...]
>
> Michael
> _______________________________________________
> MPlayer-dev-eng mailing list
> MPlayer-dev-eng at mplayerhq.hu
> http://mplayerhq.hu/mailman/listinfo/mplayer-dev-eng
>
WBR! Nick