Frequent space issue on the server where the application logs are written.

chandukarla

New Member
Hello All,

We have an application in which Progress connects to Oracle through the DataServer. The business logic is written in Progress 4GL and the front end in Java. We are on Progress 10.2B.

I am using the classic OpenEdge AppServer, running in state-free mode. We have a name server set up, with several application brokers running under it.

Question: Recently, the disk space for a directory on the server has been filling up every 3-4 days. This directory holds the application server logs and batch logs. We have not seen this behavior before. I am not sure whether temp files or hidden files are being created and taking up the space. Re-mounting the space every time on the production server is tedious. Could someone please help me work out how to monitor these application broker logs and find out what is consuming the space?
 

TomBascom

Curmudgeon
What operating system is the appserver running on?

If it is Linux you can use "ls", "df", "du", and "lsof" to explore how much disk space is in use and what files are using it.

If it is Windows you should, of course, apply the Linux patch.

But while you wait for the Linux patch you can use Windows Explorer, and the DIR command to get an idea what is using your disk space. You should also get the Sysinternals Tools from Microsoft and use them to get a better picture of things.

Regardless - if the problem is that the log files themselves have recently become much larger, then the content of those log files will likely tell you why. They might, for instance, be rapidly filling with error messages related to something that you ought to fix. Or maybe they are reporting something innocent about application usage that is only a "problem" because the business activity has recently increased dramatically (which would usually be a good thing).
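
As a rough way to check whether the logs themselves have grown, a small shell sketch like the one below lists the largest files under a directory. The function name `largest_files` is made up for illustration, and it assumes file names without spaces, which is usually true of log files.

```shell
# largest_files DIR [N]
# Print the N largest regular files under DIR (default 10), biggest first,
# as "size-in-bytes path". Hypothetical helper, not part of any product.
# Assumes paths contain no spaces (the path is taken from the last ls field).
largest_files() {
    dir=$1
    n=${2:-10}
    find "$dir" -type f -exec ls -l {} + |
        awk '{ print $5, $NF }' |
        sort -rn |
        head -n "$n"
}
```

Running `largest_files /path/to/logdir 20` on successive days shows which files are actually growing.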

If temp files are the issue then we really can't say much without knowing *what* those temp files are.

Lastly (which should really be "firstly"...) what changed? Have you "upgraded" something? Patched a component? Tweaked a parameter? Promoted some new application logic? Retired an old server? Hired 500 customer service reps? Cut prices by 30% to bring in more business? Won a big contract that requires a huge expansion of the warehouse? Acquired a competitor and integrated all of their orders into your system? etc...
 

chandukarla

New Member
Thanks Tom for addressing my question.

I am using an AIX server. When I run df -g it shows 100% used, whereas at the DB level only 240GB out of 605GB is in use. I am not sure where the remaining space has gone.

Yes. The client has changed its infrastructure maintenance vendor from our company to another one. The new vendor handles DB activities such as DB health checks, application log monitoring, and other services on the production server. We have been live with this product for around 9 years and have never had this situation. The new vendor blames the space issue on the application logs alone, which has left me clueless.

192.67.78.24:/pliLog 669.00 4.42 100% 135507 10% /pliLog

Please let me know if this helps to identify the issue or if you need further information.

Thanks in advance.
 

TomBascom

Curmudgeon
It sounds like something a bit more than changing vendors has occurred... does this new company know anything about Progress? Or are they just a general purpose infrastructure provider that is now going through "Progress who?"

Anyhow...

At least it is a flavor of UNIX :)

"df" says the disk is full but you don't see enough "stuff" in the filesystem to add up to being full? And if you reboot or restart Progress the disk space magically reappears?

That would indicate "invisible temp files". When a Progress session is running it opens a number of temp or "scratch" files to do things like sorting result sets or "swapping" in-memory structures that get too big (not literal "swapping" in the UNIX sense but a similar idea).
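
One way to check for the symptom described above is to compare what df reports as used with what du can actually find. This is only a sketch: the function names are invented here, it assumes `df -P` and `du -x` are available, and du needs enough privileges to read every directory on the filesystem. A small gap is normal; a gap of many gigabytes suggests space held by files that have no directory entry.

```shell
# Print the "Used" figure (in KB) from POSIX-format df output read on stdin.
# With "df -Pk" the columns are: Filesystem, 1024-blocks, Used, Available, ...
df_used_kb() {
    awk 'NR == 2 { print $3 }'
}

# Print the total size (in KB) of everything du can see under DIR,
# staying on one filesystem (-x). Unreadable directories are skipped silently.
du_visible_kb() {
    du -skx "$1" 2>/dev/null | awk '{ print $1 }'
}

# Print how many KB df counts as used that du cannot find under MOUNTPOINT.
unaccounted_kb() {
    used=$(df -Pk "$1" | df_used_kb)
    seen=$(du_visible_kb "$1")
    echo $((used - seen))
}
```

For example, `unaccounted_kb /pliLog` run as a privileged user prints the gap in KB for that filesystem.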

The temporary files are located wherever the session "-T /some/directory/name" parameter points. If the session does NOT have -T specified then the current directory is used. By default the appserver current directory is $WRKDIR which is also probably where your log files are.

If the "-t" (lower case tee) parameter is NOT specified the temporary files are created "unlinked". That makes them invisible so that you cannot see them with "ls". That is annoying because you cannot see how big they are except by using something like "lsof" (see below). OTOH, it means that they are automatically deleted if the session crashes or the db restarts or the server is rebooted etc.
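
The unlinked-file behaviour is easy to reproduce with nothing but the shell. The sketch below is not Progress code; it just opens a scratch file, removes its name while the descriptor is still open, and shows that the directory then looks empty even though the file keeps existing until the descriptor is closed. It assumes `mktemp` is installed and that the temp directory is on a local filesystem.

```shell
# Demonstrate an "invisible" (unlinked but still open) file.
# Hypothetical demo function; uses file descriptor 3 of the current shell.
demo_unlinked() {
    dir=$(mktemp -d) || return 1
    exec 3> "$dir/scratch.tmp"     # create the file and keep it open on fd 3
    rm "$dir/scratch.tmp"          # remove its directory entry (unlink)
    echo "still writable" >&3      # the open descriptor keeps working
    # The directory now has no entries, so ls and du see nothing here.
    echo "visible entries: $(ls -A "$dir" | wc -l | tr -d ' ')"
    exec 3>&-                      # closing the descriptor frees the space
    rmdir "$dir"
}
```

Any space written through the open descriptor stays allocated until the owning process closes the file or exits, which matches the "space reappears after a restart" symptom.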

If -t is enabled to make the temp files visible they do NOT get automatically cleaned up when a session dies for whatever reason. It then becomes your responsibility to clean up stale temp files. So one possibility is that -t IS turned on but the cleanup is not happening and so you are running out of space for no good reason. If your -T directory is full of old files with crazy names ("ls -lr" will tell you if they are old) then you might just need to purge old files.
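
If -t is in use, something like the following could list candidates for cleanup. It is a sketch under assumptions: the function name is invented, the prefixes are the temp file names mentioned later in this thread (DBI, lbi, rcd, srt), and it only lists files. Check that no running session still owns a file before deleting it.

```shell
# stale_temp_files DIR DAYS
# List files in DIR (and below) whose names start with one of the session
# temp-file prefixes and that were last modified more than DAYS days ago.
# Hypothetical helper; it never deletes anything.
stale_temp_files() {
    dir=$1
    days=$2
    find "$dir" -type f \
        \( -name 'DBI*' -o -name 'lbi*' -o -name 'rcd*' -o -name 'srt*' \) \
        -mtime +"$days"
}
```

For example, `stale_temp_files /some/directory/name 7` lists matching files untouched for more than a week.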

But if there are invisible temp files you still don't know which ones nor how big and the only thing you can do without digging into which ones and how big is to make the filesystem bigger. And hope that you guess right regarding how much bigger.

To dig deeper with "lsof" get it installed, run it, and look at the output to determine which files are taking up all of your disk space. Report back here and we might be able to tell you what your findings mean.
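
Once lsof is available, its `+L1` option selects open files whose link count is below one, in other words unlinked files, and `-F` produces field-per-line output that is easier to parse than the default columns. The sketch below totals the sizes per process. The function name is invented, and the exact invocation should be checked against the lsof man page on your system.

```shell
# unlinked_by_process
# Read "lsof -F pcsn" style output on stdin (one field per line, each line
# prefixed by its field letter: p = PID, c = command, s = size in bytes)
# and print "PID COMMAND TOTAL_BYTES" per process, largest total first.
# Hypothetical helper. Intended use, with sufficient privileges:
#   lsof +L1 -F pcsn | unlinked_by_process
unlinked_by_process() {
    awk '
        /^p/ { pid = substr($0, 2) }
        /^c/ { cmd[pid] = substr($0, 2) }
        /^s/ { total[pid] += substr($0, 2) }
        END  { for (p in total) printf "%s %s %.0f\n", p, cmd[p], total[p] }
    ' | sort -k3,3nr
}
```

The processes at the top of the output are the ones holding the most space in unlinked files.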

Or... find the startup parameters for your appservers. They are _probably_ in $DLC/properties/ubroker.properties.

workDir=/some/directory
srvrStartupParam=

srvrStartupParam may have -T and/or -t on it or it might point to a -pf file with more parameters. In any event these options allow you to change the defaults so if -T and -t are missing you can add them and you will then be able to see what space the temp files are using (without "lsof").
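
A quick way to pull the relevant settings out of the properties file is a grep along these lines. The function name is hypothetical; the default path is the location suggested above, and the bracketed section headers are included so that each setting can be matched to its broker.

```shell
# show_startup_params [FILE]
# Print section headers plus the workDir and srvrStartupParam lines from a
# ubroker.properties file. FILE defaults to $DLC/properties/ubroker.properties.
# Hypothetical helper; it only reads the file.
show_startup_params() {
    file=${1:-$DLC/properties/ubroker.properties}
    grep -E '^[[:space:]]*(\[|workDir=|srvrStartupParam=)' "$file"
}
```

If srvrStartupParam refers to a -pf file, that file needs the same inspection for -T and -t.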


The "lsof" command is generally an optional command on AIX. If you don't have it you _could_ ask the sys admins to install it and give you access (it requires privileges); it would shed light on which particular files are involved.
 

chandukarla

New Member
The new vendor knows nothing about Progress; they are just an infrastructure provider.

The -T parameter is set in ubroker.properties, and the directory it points to contains some logs that total only 6MB. I don't think this is causing the issue.

One more thing I want to highlight: the DataServer is losing its connection to the Oracle DB, and we are getting the ORACLE -3112 error in the broker logs. The application brokers keep retrying the connection, and because of this the log file keeps growing.

After checking dataserver.log I noticed the error below. I also checked the dataserver.log from previous days and found that the connection fails in a particular time frame, 10AM to 11:30AM. Restarting the schema holder is a temporary fix.

Tom -- could you please share your thoughts on how I can resolve this issue?

Do we need to contact PSC for support?

[Image: screenshot of the error from dataserver.log]
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
chandukarla said: "The -T parameter is set in ubroker.properties, and the directory it points to contains some logs that total only 6MB. I don't think this is causing the issue."
You should re-read Tom's earlier post about OpenEdge session temporary files (DBI*, lbi*, rcd*, srt*). By default (i.e. without -t) they are created as unlinked files. That means they are invisible in directory listings and to utilities like ls and du, even though df still counts the space they occupy. I don't know how you are determining that there are only 6 MB of files in -T but I strongly suspect that is not correct. As Tom said, you should install and use lsof to see the temp files and their sizes.

chandukarla said: "Do we need to contact PSC for support?"
Contacting tech support versus getting help from the community does not have to be an either/or decision. If you have the ability to open a case with TS, I suggest you do so in parallel with trying to get help from the community, documentation, and knowledge base.
 

TomBascom

Curmudgeon
As Rob said, invisible temp-files are not ruled out by anything that you have said. In fact you seem to be strengthening that theory. If there are a mere 6MB of log files then _something_ else has to be taking up the space and if -T points to that filesystem then invisible temp files are a very likely culprit.

The Oracle error should also be addressed but unless the Oracle database is also on the same filesystem I don't think that it would be directly related. (If it *is* on that filesystem then it is odd that you haven't reported anything about how much space it is using.)

Did the Oracle error start occurring at the same time that you started running out of disk space? Presumably this Oracle database is being used for some purpose - are the users complaining that it is not working as expected?
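
To answer the timing question, it may help to measure how much of each log consists of the error. This sketch counts matching lines per file; the function name is invented and the search string is whatever text the error actually appears as in your logs.

```shell
# error_share PATTERN FILE...
# For each FILE print "FILE MATCHING_LINES TOTAL_LINES", where MATCHING_LINES
# is the number of lines containing the fixed string PATTERN.
# Hypothetical helper; -F treats PATTERN literally and -e allows a leading "-".
error_share() {
    pattern=$1
    shift
    for f in "$@"; do
        hits=$(grep -c -F -e "$pattern" "$f")
        total=$(wc -l < "$f" | tr -d ' ')
        echo "$f $hits $total"
    done
}
```

Running it over the dated broker logs shows on which day the error count jumped, which can then be compared with the day the filesystem first filled up.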

This is obviously a filthy commercial pitch but... you might also want to consider hands-on expertise. We do just happen to have consulting services available that could take a look at this for you. Visit Progress OpenEdge Database Services - White Star Software for more info about that.
 

chandukarla

New Member
Thanks Rob and Tom for sharing your thoughts on the issue. I will further analyze the issue based on your inputs. Thank you so much for your time.
 