One of my coworkers directed my attention to the implementation of spinlocks on IA-32. In spin_lock_string, we can read:

	"cmpb $0,%0\n\t" \
	"rep;nop\n\t" \
	"jle 2b\n\t" \

The "rep;nop" line looks dubious, since the IA-32 programmer's manual from Intel (year 2001) mentions that the behaviour of REP is undefined when it is not used with string opcodes. BTW, according to the same manual, REP is supposed to modify ecx, but it looks like that is not the case here... which is fortunate, since ecx is never saved. :-)
What is the intent behind this "rep;nop" ? Does it really rely on an undocumented behaviour ?
Regards,
-- Jean-Marc Saffroy - Research Engineer - Silicomp Research Institute mailto:saffroy@ri.silicomp.fr
@@@@@@@@@@@
On Mon, 17 Sep 2001, Jean-Marc Saffroy wrote:
> What is the intent behind this "rep;nop" ? Does it really rely on an
> undocumented behaviour ?

It's used to stop Pentium 4's from cooking themselves. See the P4 manuals for more info.
regards,
Dave.
-- | Dave Jones. | SuSE Labs
@@@@@@@@@@@
On Mon, 17 Sep 2001, Dave Jones wrote:
> On Mon, 17 Sep 2001, Jean-Marc Saffroy wrote:
>
> > What is the intent behind this "rep;nop" ? Does it really rely on an
> > undocumented behaviour ?
>
> Its used to stop Pentium 4's from cooking themselves.
> See the P4 manuals for more info.

Ok, I found it: actually it is the PAUSE opcode in the P4 instruction set, and the doc for PAUSE mentions that it is equivalent to a NOP on older IA-32 processors.
So no black magic here, except that "rep;nop" is a bit misleading, since the Intel docs for REP and NOP do not mention PAUSE...
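
To make the point concrete, here is a minimal sketch of a busy-wait lock whose inner loop issues "rep; nop". It is not the kernel's spin_lock_string: the names demo_lock, demo_spin_lock and demo_spin_unlock are made up for illustration, rep_nop is written from scratch here, and the sketch assumes GCC or Clang with C11 atomics. The assembler emits the bytes F3 90 for "rep; nop", which is the PAUSE encoding.

```c
#include <stdatomic.h>

/* Spin-wait hint. On x86 "rep; nop" assembles to F3 90 (PAUSE);
 * per the PAUSE documentation it acts as a plain NOP on older IA-32 CPUs.
 * On other architectures this sketch falls back to a compiler barrier. */
static inline void rep_nop(void)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("rep; nop" ::: "memory");
#else
    __asm__ __volatile__("" ::: "memory");
#endif
}

/* Hypothetical lock type for illustration only. */
typedef struct {
    atomic_int locked;   /* 0 = free, 1 = held */
} demo_lock;

static inline void demo_spin_lock(demo_lock *l)
{
    for (;;) {
        /* Try to take the lock: succeeds if the old value was 0. */
        if (!atomic_exchange_explicit(&l->locked, 1, memory_order_acquire))
            return;
        /* Lock is held: spin on plain reads, hinting the CPU each time. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            rep_nop();
    }
}

static inline void demo_spin_unlock(demo_lock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

The hint sits only in the read-only inner loop, so an uncontended acquire never executes it.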
Thanks all for your help.
Regards,
-- Jean-Marc Saffroy - Research Engineer - Silicomp Research Institute mailto:saffroy@ri.silicomp.fr
@@@@@@@@@@@
On Mon, 17 Sep 2001, Jean-Marc Saffroy wrote:
> Hi all,
>
> One of my coworkers directed my attention to the implementation of
> spinlocks on IA-32. In spin_lock_string, we can read:
>
> "cmpb $0,%0\n\t" \
> "rep;nop\n\t" \
> "jle 2b\n\t" \
>
> The "rep;nop" line looks dubious, since the IA-32 programmer's manual from
> Intel (year 2001) mentions that the behaviour of REP is undefined when it
> is not used with string opcodes. BTW, according to the same manual, REP is
> supposed to modify ecx, but it looks like is is not the case here... which
> is fortunate, since ecx is never saved. :-)
>
> What is the intent behind this "rep;nop" ? Does it really rely on an
> undocumented behaviour ?
>
> Regards,

Well, it's now documented, although you have to search a web site to find it. Basically, it runs the CPU at low clock-speed when it's busy-waiting. Since almost all spin-locks are held for mere microseconds, it's unlikely that it does anything useful, but it can't hurt.
@@@@@@@@@@@
From: Alan Cox (alan@lxorguk.ukuu.org.uk) Date: Mon Sep 17 2001 - 12:27:44 EST
> The "rep;nop" line looks dubious, since the IA-32 programmer's manual from
> Intel (year 2001) mentions that the behaviour of REP is undefined when it
> is not used with string opcodes. BTW, according to the same manual, REP is
> supposed to modify ecx, but it looks like is is not the case here... which
> is fortunate, since ecx is never saved. :-)

rep nop is a Pentium IV operation. It's retroactively, after testing, defined to be portable and OK.
Alan
@@@@@@@@@@@
Followup to:
By author: Alan Cox
In newsgroup: linux.dev.kernel
>
> > The "rep;nop" line looks dubious, since the IA-32 programmer's manual from
> > Intel (year 2001) mentions that the behaviour of REP is undefined when it
> > is not used with string opcodes. BTW, according to the same manual, REP is
> > supposed to modify ecx, but it looks like is is not the case here... which
> > is fortunate, since ecx is never saved. :-)
>
> rep nop is a pentium IV operation. Its retroactively after testing defined
> to be portable and ok.
>

Now, the example brought up was assembly, but in general I really think we should have a processor-independent wait_loop(); inline. Right now we have a rep_nop(); inline which only works on x86 (and presumably x86-64).
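
A rough sketch of what such an inline could look like, under stated assumptions: the name wait_loop comes from the paragraph above, but the body, the choice of per-architecture instructions, and the helper wait_until_set are illustrative guesses, not existing kernel code. It assumes GCC or Clang.

```c
/* Hypothetical processor-independent spin-wait hint.
 * Each architecture supplies its own cheapest "I am busy-waiting" hint;
 * anything unknown falls back to a compiler barrier so the loop
 * condition is still re-read on every iteration. */
static inline void wait_loop(void)
{
#if defined(__i386__) || defined(__x86_64__)
    /* F3 90: PAUSE on Pentium 4, plain NOP on older IA-32 CPUs. */
    __asm__ __volatile__("rep; nop" ::: "memory");
#elif defined(__aarch64__)
    /* AArch64 YIELD hint (assumed here as the closest equivalent). */
    __asm__ __volatile__("yield" ::: "memory");
#else
    __asm__ __volatile__("" ::: "memory");
#endif
}

/* Illustrative caller: spin until *flag becomes non-zero or max_spins
 * iterations have passed. Returns the number of iterations spent waiting. */
static inline unsigned wait_until_set(const volatile int *flag,
                                      unsigned max_spins)
{
    unsigned spins = 0;

    while (!*flag && spins < max_spins) {
        wait_loop();
        spins++;
    }
    return spins;
}
```

Callers would then write wait_loop() in their busy-wait loops and never mention rep_nop() directly, which keeps the x86-specific detail in one place.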
-hpa
-- at work, in private! "Unix gives you enough rope to shoot yourself in the foot."
@@@@@@@@@@@
Alan Cox wrote:
> > The "rep;nop" line looks dubious, since the IA-32 programmer's manual from
> > Intel (year 2001) mentions that the behaviour of REP is undefined when it
> > is not used with string opcodes. BTW, according to the same manual, REP is
> > supposed to modify ecx, but it looks like is is not the case here... which
> > is fortunate, since ecx is never saved. :-)
>
> rep nop is a pentium IV operation. Its retroactively after testing defined
> to be portable and ok.

Are we sure that the value of ECX doesn't matter on a 386? Or does it count down doing nops ECX times on a 386?
-- Jamie
@@@@@@@@@@@
Jamie Lokier wrote:
>
> Alan Cox wrote:
> > > The "rep;nop" line looks dubious, since the IA-32 programmer's manual from
> > > Intel (year 2001) mentions that the behaviour of REP is undefined when it
> > > is not used with string opcodes. BTW, according to the same manual, REP is
> > > supposed to modify ecx, but it looks like is is not the case here... which
> > > is fortunate, since ecx is never saved. :-)
> >
> > rep nop is a pentium IV operation. Its retroactively after testing defined
> > to be portable and ok.
>
> Are we sure that the value of ECX doesn't matter on a 386? Or does it
> count down doing nops ECX times on a 386?

Older processors ignore the rep prefix when used with non-string opcodes. %ecx should not be affected.
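
For anyone who wants to check this on a machine at hand, the following is a small sketch that preloads %ecx, executes "rep; nop", and reads %ecx back. The function name ecx_after_rep_nop is made up for illustration, and the sketch assumes GCC or Clang. It only reports what the CPU it runs on does, so by itself it says nothing about a 386 unless it is actually run on one.

```c
#include <stdint.h>

/* Load `initial` into %ecx, execute "rep; nop", and return the value
 * found in %ecx afterwards. The "+c" constraint pins the variable to
 * %ecx and tells the compiler the asm may modify it.
 * On non-x86 builds the asm is skipped and `initial` is returned as-is. */
static inline uint32_t ecx_after_rep_nop(uint32_t initial)
{
    uint32_t ecx = initial;

#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("rep; nop" : "+c"(ecx));
#endif
    return ecx;
}
```

If the returned value equals the value passed in, the prefix did not touch %ecx on that processor.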