-spin parameter discussion for virtualised CPUs

steveprog

New Member
Hello,
I am a Progress and VMware novice who is the IT caretaker for a small business using an Openedge application. We have no specific performance issues at present.
I am simply curious, from an academic viewpoint, about the '-spin' parameter and how the optimum setting is determined for a virtualised guest OS with vCPU resources provisioned in different permutations:

I read that the recommended starting point for the -spin parameter for a multicore CPU is 8000 per CPU
(Progress KB P23850), and various advice on this forum recommends a much higher value on 'modern' CPUs.
For "CPU" I read "socket" - I infer from the Progress recommendation that DB performance will scale with an increased number of sockets, since we can increase the -spin parameter accordingly - at least if 'resource waits' were a factor constraining DB performance.
In ESXi 4.x+ we can provision vCPUs with more than one core per vCPU:
Imagine we have host hardware consisting of 2x 4-core physical CPUs (2 sockets populated, 4 cores per socket) hosting multiple guest VMs, and we have 4 physical core resources to assign to our Openedge DB guest OS.
1) I assume we would therefore benefit from provisioning the DB guest OS with 4x vCPUs of one core each, instead of 1x vCPU of 4 cores (our OS licence allows for 4 physical processors), as it allows us to scale '-spin' accordingly?
2) If the trick above is valid, is the effect negated if the vCPUs are actually provisioned from the same physical CPU on the host (for example, if the host were a single-socket, multi-core physical CPU)? Despite dispatching requests to separate vCPUs, each request would still 'wait' for the resource as the physical CPU is busy.
3) Perhaps this is all a moot point if there are no issues with 'resource waits' under the Promon utility, and one should only adjust -spin if necessary? (To put it into perspective, we are getting ~600 resource waits per day with a spin of 16,000 on a 2x vCPU VM at 2.4GHz, which by my calculation is only an extra, ooh, 4ms of delay?)
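That "4ms" figure can be sanity-checked with a back-of-the-envelope calculation, assuming (hypothetically) that each resource wait burns a full -spin count of loop iterations and that one iteration costs roughly one CPU cycle:

```python
# Rough daily cost of spinning, under the (hypothetical) assumptions
# that every wait spins for the full -spin count and that one spin
# iteration costs about one CPU cycle.
waits_per_day = 600
spin = 16_000
cpu_hz = 2.4e9  # 2.4 GHz

seconds_per_day = waits_per_day * spin / cpu_hz
print(f"{seconds_per_day * 1000:.1f} ms of spinning per day")  # → 4.0 ms
```

Under those assumptions the total comes to about 4 ms of spinning per day, which matches the estimate above.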
I welcome any comments.
Steve

For the curious:
Openedge 10.2B 32bit on Windows Server 2008R2 running virtualised under VMWare ESXi4, 2 vCPU assigned
DB size ~ 25GB. 100 concurrent users.
Openedge trained by Google.
 

TomBascom

Curmudgeon
I don't know where you are reading that "much higher" is the general advice on "this forum" or any other. In general, modern CPUs and modern releases of Progress benefit from lower values than may have been used in the past.

The notion that -spin should be in any way related to the number of cores or their speed was deprecated by the original author (Gus) about 10 seconds after he said it. Nonetheless it refuses to die and continues to be propagated. It's like a bad penny -- it keeps coming back. It even made it into the product as the default setting.

For almost all people, pretty much all the time, any 4-digit value will work about as well as any other. If you'd like to invest a few months benchmarking your application and testing various scenarios, you might find something that is slightly better for you. It is *very* unlikely that you will find a value that users notice is any better.

I use spin = ( BirthYear_of_DBA * pi ) as my formula for optimizing -spin.

With regards to virtualization -- the best performance comes from concentrating activity on as few cores as possible (but no fewer than necessary). These cores should be physically contained in as few CPUs ("sockets", "packages" or "boards") as possible. You want to reduce the physical distance between the cores in order to minimize the effects of "cache ping pong". Paying attention to this is a much greater return on investment than fiddling with -spin.

"Resource waits" are the wrong metric to be looking at. "Resources" are higher level constructs than the "latches" (mutext locks) that -spin applies to. "Latch timeouts" tell you how often processes "spin out" (exceed -spin and then nap) but they do not tell you how long the naps were (they geometrically increases according to -nap and -napmax) nor does it tell you how much time went into spinning successfully. All of which is wait time. The _Latch VST provides much (but not all) of this data but it is difficult to extract much meaning out of it.
 

TomBascom

Curmudgeon
One more thing... if latch timeouts and such are an actual problem with your application (it takes *very* high read activity, and tuning to mostly eliminate disk IO, to get to this point...) then you should make sure to be running 10.2B SP08 or better and enable -lruskips. This enormously reduces latch activity and has a very beneficial impact on concurrency in high-read environments.

If you are seeing latch timeouts in excess of a few hundred per second you may be a candidate.
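The idea behind -lruskips is that a hot buffer does not need to be moved to the most-recently-used end of the LRU chain (an operation that needs the LRU latch) on every access; skipping most of those moves costs little in replacement accuracy but removes the bulk of the latch traffic. A toy sketch of the concept, with invented names and structure (not how OpenEdge actually implements it):

```python
from collections import OrderedDict

class LRUWithSkips:
    """Toy buffer-pool LRU illustrating the -lruskips idea: only move a
    block to the MRU end (which requires the LRU latch) on every
    `skips`-th access, so hot blocks trigger far fewer latch
    acquisitions. Names and structure are invented for illustration."""

    def __init__(self, skips: int = 100):
        self.skips = skips
        self.chain = OrderedDict()   # block id -> accesses since last move
        self.latch_acquisitions = 0  # times we had to touch the LRU chain

    def access(self, block: str) -> None:
        count = self.chain.get(block, self.skips)  # new blocks move at once
        if count >= self.skips:
            self.latch_acquisitions += 1     # latch needed: move to MRU end
            self.chain[block] = 0
            self.chain.move_to_end(block)
        else:
            self.chain[block] = count + 1    # latch-free fast path

pool = LRUWithSkips(skips=100)
for _ in range(10_000):
    pool.access("hot-block")
print(pool.latch_acquisitions)  # → 100
```

In this sketch, 10,000 accesses to one hot block cost only 100 chain updates instead of 10,000, which is the concurrency win in read-heavy workloads.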
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
... then you should make sure to be running 10.2B SP08 or better and enable -lruskips.

I might amend that to "SP07 or better", as SP08 won't be released until later this quarter. ;) The -lruskips param has been in the product since SP06. That aside, great post. :)
 

Cringer

ProgressTalk.com Moderator
Staff member
You might want to look at a 64 bit install of Progress on the server too IMO.
 

TomBascom

Curmudgeon
Yes, if it is a database server then it should be a 64 bit OS and 64 bit Progress.

If it is a combination database server / app server it can sometimes get complicated if the vendor has unwisely chosen to implement background processes with prowin32.exe rather than _progres.exe. Sometimes all you need to do to fix that is to change the name of the executable in the scripts. But sometimes they also spuriously make calls to 32 bit libraries -- which then ties you to 32 bit code :(
 

steveprog

New Member
This is all useful information, thank you for your replies. I shall worry less about optimising -spin; our latch timeouts value is averaging 2 per minute...
It is evident from my post that it is tricky for a novice such as myself to interpret the various sources of information regarding this parameter, and I welcome the feedback given above.
For example, Progress knowledgebase articles 000020011/P23850 and 000022638/P9992 recommend reviewing 'resource waits' when judging -spin performance, and a value of 10,000 (here I admit my lack of clarity over whether the advice came from 'this forum' or the Progress knowledgebase), whilst article 000022260/P92353 refers to 'latch waits', as you mention above, for determining spin.
The Openedge 11.2 deployment guide online (the forum prohibits me posting links) recommends a starting value of 6000 * #CPUs for multiprocessor systems, and this was the motive for my original topic for discussion (that - whilst acknowledging '-spin' is not related to cores on a CPU - it was somehow intrinsically dependent on, and scalable with, the number of CPUs/sockets/boards, and investigating how a virtual CPU would be interpreted in this regard).

As an aside, we are currently restricted by 32bit appserver components, but with the option to migrate these onto a separate server once budgets allow. There are no perceived performance issues at present, therefore little motivation from management.
Thanks again
Steve
 

TomBascom

Curmudgeon
Like I said... it's like a bad penny. It keeps coming back.

There is a very strong need to provide a formula. The kbase and documentation people apparently just cannot resist the urge.
 

Cringer

ProgressTalk.com Moderator
Staff member
Yes there would - for all those folks who insist on doing things wrong because they can. :eek:
 