How to unravel a "black hole" setup?

Confucedius

New Member
Hi all,

I'm looking for some tips / pointers on how to approach a system I've run into which is, for me but also for other people in the company, a bit of a mystery. I have no Progress experience at the moment, but am working on it. I have an (advanced) MS SQL Server background and very little Linux knowledge.

The situation is as follows:
There is a Red Hat Enterprise Linux server with QAD software and a small database: 50 GB. It runs Progress OpenEdge 9 as the database server (or is it OpenEdge RDBMS 9?). This server regularly freezes without a clear cause (many consultants have looked into it).

As hardly anyone will step up to the plate, I will give it a go and bring the system into the light, so everyone knows how it's built, what it does and where to look in case of issues.

So where to start? <-- need tips ;)
1) I have set up Progress Explorer on a Windows machine (version 10.2B) to have a sort of "SQL Management Studio", but I can't add the Linux box ("ensure admin service is running").

2) I have the Progress dashboard (http://localhost:9090/fathom.htm) but can't add / discover the server there either; it only sees the QAD site.

How can I fire a query at the database so that I can confirm exactly which version I'm running? I only know it's somewhere in the 9 range.

A yes/no question: can I set up a Windows machine with the same version and restore the database on that machine, or is that not possible and should I spare myself the time of googling it?

I think I need the proper 'tools' to get myself familiar with the setup so I can start working on the daily issues after some time.


Many thanks for any pointers.
 
If your company pays for annual maintenance to QAD, you would be better off getting someone from QAD Support to help you analyse/fix the freezing issue. You should be able to learn the system and maintain it in the long run.

If you think the version is 9, then it could be Progress 9. It has been called OpenEdge since Progress v10. To confirm it, you can look for a dlc folder. You would see an environment variable "DLC" in a few places: database startup scripts, client scripts that let users log into QAD, database backup scripts. That would tell you where the Progress software is installed. There you can find a "version" file, or you can run $DLC/bin/proenv.
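
As a rough sketch of the check described above, something like the following could be run on the Linux box. It assumes the install directory holds a plain-text "version" file as mentioned; the function names and the idea of searching a script directory for "DLC=" are only illustrative, and the directory you search depends on where your startup and backup scripts live.

```shell
# Print the Progress version recorded in the install directory.
# Argument: the install directory (defaults to $DLC if it is set).
# Assumption: the directory contains a plain-text file called "version".
show_progress_version() {
    dlc_dir="${1:-${DLC:-}}"
    if [ -z "$dlc_dir" ]; then
        echo "DLC is not set and no directory was given" >&2
        return 1
    fi
    if [ -r "$dlc_dir/version" ]; then
        cat "$dlc_dir/version"
    else
        echo "no readable version file under $dlc_dir" >&2
        return 1
    fi
}

# List the files under a directory that assign DLC, to find out
# which startup, client or backup scripts point at the install.
# Argument: the directory to search (hypothetical, pick your own).
find_dlc_references() {
    grep -rl 'DLC=' "$1" 2>/dev/null
}
```

For example, `find_dlc_references /path/to/your/scripts` would list the candidate scripts, and `show_progress_version` with the DLC value found in them would print the version line.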

You can move the database from Linux to Windows by dumping the data and loading it into a new database. You wouldn't need to do that if you are going to stay with Red Hat. You would need Progress licenses for your Windows machine, and if you want to run QAD on the Windows machine you would have to get licenses for that as well.

You could also read up on the "promon" documentation on the Progress website. This is the database monitoring tool that you would use.

HTH
 
Progress is not a SQL database. (Progress does support SQL but it is not the "natural" interface and QAD is a 4GL oriented application.) The db engine is nothing like MS SQL Server and Progress on Linux is going to be very different from your experiences. Trying to administer this system from a Windows PC with a GUI using a SQL mindset isn't going to work well. You're just digging the hole deeper trying to go that way. (Technically, it can be done -- but it won't "just happen". It will take significant effort.)

Apparently you've had some bad experiences with some consultants but I would strongly suggest that you get some strong mentoring from a good one. Or take some training classes.
 
Thanks guys for the comments.

I have started a search for consultancy (have some new companies looking into it now) and I looked into the system myself.

I'm afraid I'm looking at a failed upgrade from version 9 to 10. I ran a $DLC command (haven't got it with me) that showed that it's version 10, but the application/db man says it's version 9 for sure. That will need correcting first, of course.

Anyway, I'm looking into a high availability setup, first of all at the hardware level. I'm sure the consultant will make proposals, but I'd like to gain some knowledge on the topic myself.

It's important that the (QAD) database is highly available.

a) Could I use HA from VMware's vSphere? This covers hardware and OS failure, although if the 2 machines are identical at the same moment you couldn't say it's protected against OS failure. At least there is an instant 'new' machine.


b) Should I use a cluster setup: either a passive passive server (have the users re-connect to a different config / db) in combination with something like transaction log shipping, or an active passive server?

Also, regarding QAD, I wonder if separating the tasks (web server / print / mail server / QAD app server) is the way to go; at this moment it's all running on the same machine. I'd rather have the users connect to an application server / middle tier instead of the DB server directly. But I have to say my QAD knowledge is 0.

Money is not a consideration at the moment, just looking into the most HA solution possible.
 
For a highly available system your first and most important line of defense is after-imaging. See this: http://dbappraise.com/ppt/ai.pptx for details.

After-imaging is the basis for more advanced solutions -- either log based replication ("log shipping" in MS SQL Server speak) or OE Replication (a hot spare).
 
Take a look at your [databasename].lg
Search for "KILL".
If you find it ...
search for all scripts that "kill" Progress process PIDs. Comment those lines out (#).
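
A minimal sketch of that search, assuming GNU grep on the Red Hat box. The function names are made up for illustration, and the script directory is whatever location holds your own admin / cron scripts.

```shell
# Count the lines in a database log that mention KILL (any case).
# Argument: path to the [databasename].lg file.
count_kill_entries() {
    grep -ci 'kill' "$1"
}

# List the files under a directory that contain the word "kill".
# Argument: the directory holding your admin scripts (your choice).
# Note: this also matches lines that are already commented out,
# so check each hit by hand before changing anything.
find_kill_scripts() {
    grep -rlw 'kill' "$1"
}
```

A count above zero from `count_kill_entries` is the cue to go through the files listed by `find_kill_scripts`.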
 
Right, well, what is happening? There is a KBase article on the Progress site that explains this.

Brief explanation:
The telnet server has a keep-alive time-out to expire the zombies (2 hours).

While telnet keeps the process alive, the Progress server is not "informed".
In this case the "system admin" KILLs the "bad session".
When the Progress server receives a KILL, it says ... ooo-la-laaaaa, we have a huge system error
and I have to roll back and "abnormally shut down" the server, because otherwise
I will lose data !!!!

And this is what causes the server to go up to 100% CPU and freeze.

Fix: remove all these KILLs from the unix scripts and use a Progress tool, the "watchdog".

After starting the database server ...
$: proserve [databaseName] ... etc.
do a $: prowdog [databaseName]
This starts a background process that cleans up all zombie connections by itself.

And if, for any reason, you need to KILL a _progres process, you can do it properly
with the Progress command: $: proshut [databaseName]
You will see a list of connected users and PIDs; you choose the "bad one" and you say ->
Option 1 -> Disconnect a User ....

Capisci ?
 
Ok, color me confused... what is this "kill" sub-thread about? It doesn't seem at all related to unraveling black holes or establishing a high availability system. Maybe it should be a stand-alone thread?
 
Cannot agree more.

Jumping between topics is bad, bad, bad culture - just like responding to an Email like "RE: I hate you" instead of creating a new one "I really love you!" - and a pain for everybody volunteering to help.

RealHeavyDude.
 