OpenEdge Management: Daily Reboots, Request Duration

Brian Cooney

New Member
Good morning! I have been lurking on this forum for a week or two. Great content; you guys have much to offer.

Our company recently deployed OpenEdge Management (OEM). At this time, our upstream vendor has very little expertise with the product, and we are their first client to use it (we insisted), so the support we get from them is limited.

I have found OEM to be a nice interface for checking app servers in real time and making simple configuration changes, but I think it could be much more useful if I just knew where to look.

My first task is to reduce the number of "garbage" alerts the system sends out, so that I feel compelled to actually look at an alert when one arrives. Once per day, half of our app servers reboot (Windows boxes). Is there any way to have OEM suppress the alert, or not log the app server reboots at all, so I don't get these false positives? Alternatively, perhaps there is a different way I should be shutting them down to keep OEM happy?

My second task is a bonus, because we are already doing it with out-of-band custom tools: is there a way to use OEM to get performance snapshots of the app servers at 10 or 15 minute intervals? I have a script that I wrote (before OEM) that runs asbman on each of the app servers and logs all of the stats every 5 minutes. This lets me go back and "see" whether an app server was suffering or fine at a given time. The problem is that asbman shows the average since the server was last restarted (I think), and I don't know how to shorten the interval. To get better data, our vendor built a program that reads the debug-level log files, works out the same thing (average request duration), and plots it on a chart, but their tool depends on Excel and is not automated. Can I do something similar, or better, leveraging OEM and the Trend database? Is this functionality hiding under the hood, or is it more akin to a feature request?
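If the logged figures really are cumulative since the broker started (the post above is not certain of this), the average for each 5 minute interval can be recovered from two consecutive snapshots, provided each snapshot records both the number of completed requests and the cumulative average duration. The following is only a sketch under that assumption; the `Snapshot` class and its field names are hypothetical and do not correspond to actual asbman output labels.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snapshot:
    """One logged sample. Field names are hypothetical, not asbman labels."""
    requests_completed: int   # cumulative count since the broker started
    avg_duration_ms: float    # cumulative average since the broker started


def interval_average(prev: Snapshot, curr: Snapshot) -> Optional[float]:
    """Average duration of the requests completed between two snapshots.

    Returns None when no requests completed in the interval, or when the
    counter went backwards (the broker was restarted between samples).
    """
    delta = curr.requests_completed - prev.requests_completed
    if delta <= 0:
        return None
    # Cumulative average times count gives total time spent so far.
    total_prev = prev.avg_duration_ms * prev.requests_completed
    total_curr = curr.avg_duration_ms * curr.requests_completed
    return (total_curr - total_prev) / delta


if __name__ == "__main__":
    earlier = Snapshot(requests_completed=1000, avg_duration_ms=50.0)
    later = Snapshot(requests_completed=1200, avg_duration_ms=60.0)
    print(interval_average(earlier, later))
```

In the example, the cumulative average only moves from 50 ms to 60 ms, but the 200 requests in the interval averaged 110 ms, which is the kind of spike a since-restart average hides.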
 
If the app server agents are restarted at the same time every day, you could look at changing the times when the system is monitored. Inside the Default_Schedule_Plan is the Default_Schedule, which is set to 24 x 7 by default.

You have the option to trend certain app server stats. I don't trend app server stats, only db stats, so I'll let someone else comment on that.

One caution - don't use the "Kill" button in OEM to disconnect an app server agent (it is available if you drill down on the pid of an agent). That kills the app server with a kill -9 on Unix / Linux systems and can bring down the db. Not sure about Windows. There is a KB article about it.

I wish you lots of luck in playing around with OEM - it seems very few OE customers use this product so there is little activity around it. We were heavy users of OEM at one time, but have scaled back.
 
CJ_Brandt,

Default_Schedule might be just the ticket. I think if I change it to monitor from 2:15AM to 2:00AM, it should "miss" the reboots and not report them. Sound right?
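A window that runs from 2:15AM to 2:00AM wraps past midnight, so it covers everything except the 15 minutes between 2:00 and 2:15. The sketch below only illustrates that logic; it is not how OEM evaluates its schedules (those are configured in the OEM console), and the function name `is_monitored` is made up for this example.

```python
from datetime import time


def is_monitored(now: time, start: time, end: time) -> bool:
    """True when `now` falls inside the monitoring window [start, end).

    When start is later than end, the window wraps past midnight.
    In this sketch, start == end is treated as an empty window.
    """
    if start <= end:
        return start <= now < end
    return now >= start or now < end


if __name__ == "__main__":
    start, end = time(2, 15), time(2, 0)
    for t in (time(1, 59), time(2, 5), time(2, 15), time(14, 0)):
        print(t, is_monitored(t, start, end))
```

With these values, a reboot that starts at 2:00 and finishes before 2:15 falls entirely in the unmonitored gap. A reboot that runs long would still be caught once monitoring resumes.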

Can you tell me a little about how you trend your DB stats? I am betting app server stats are similar, and I don't have a script to trend the DB anyway, so OEM might be a good place to do it.

Kill -9... ouch. One of the Progress guys who helped set up OEM told us that kill -9 on an app server can bring the DB down IF the DB and app server are using shared memory. In our environment they are not, and are in fact on separate hosts. Would that make kill -9 safe (i.e., only kill the app server), or should I still be wary of letting people use OEM to kill an app server? For what it's worth, it's been a good long while since that has been necessary anyway.

You say you have scaled back, what are you using now?

Tom,

I have seen your posts, and you impress me. I might download ProTop and give it a try one of these days, but I think I will be stuck with OEM for a while, mainly because I very much doubt that our upstream vendor will agree to support anything else (getting them to provide OEM required arm twisting and extra licensing). I certainly wouldn't mind getting to know DBAppraise and may do so one day, but I need to know what I have much better before I can make a good comparison and ask our company to switch directions.

Thank you both for your replies!

--Brian
 
I had week_day, week_end and a few other schedules and the point was to stop monitoring around times when we knew things would be down.

There are many items to trend for DB stats, but only a few for the app server, so growth of the trend db shouldn't be as big of an issue. We collect table stats every 15 minutes for a few of our big clients; for most others it was every 60 minutes. Table stats are what consume a bunch of space.
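As a rough illustration of why the collection interval matters for trend db growth, here is the sample-count arithmetic for the two intervals mentioned. The helper name is made up, and actual row counts depend on how many tables and resources are being trended, which this sketch does not model.

```python
def samples_per_day(interval_minutes: int) -> int:
    """Number of collections per 24 hours at a fixed polling interval."""
    return (24 * 60) // interval_minutes


if __name__ == "__main__":
    every_15 = samples_per_day(15)   # 96 collections per day
    every_60 = samples_per_day(60)   # 24 collections per day
    print(every_15, every_60, every_15 // every_60)
```

So a 15 minute interval collects four times as many samples per resource as a 60 minute one.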

I would manually kill the agents rather than use OEM. You are running app server, but not via shared memory, bummer. I used to work for a medical company that had to do that because the app server ran in state reset mode so it chewed up so much memory the app servers had to be hosted on other servers, so I have been in that boat before.

We scaled back because we had too large of an environment for 2 OEM servers to handle. We are a Software As A Service provider and have hundreds of clients, each with app servers. We basically swamped the OEM servers until they started impacting the admin servers running on the prod server - which impacted our clients. We started writing our own monitoring scripts and only using OEM for the largest clients.
 