Data warehouse on openedge feasibility

MrGinge

New Member
Hi,

does anyone have any experience of building / implementing a data warehouse (star schema not 3nf) on an openedge platform?
All reporting output would be generated via a reporting tool e.g crystal, cognos ...etc

Just looking for any pointers / pros & cons etc

Thanks,

MrG
 

RealHeavyDude

Well-Known Member
IMHO and knowing that I probably will be blamed for saying this:

I would never build a data warehouse which will be used almost exclusively from SQL reporting tools with a Progress database. Don't get me wrong, the Progress database is, again IMHO, one of the best databases in combination with the ABL - but not with SQL. Although it gets better every release there still many things that either won't work or work different compared to other so-called "standard SQL" databases. Plus, you won't many people that have such experience. Most SQL access use cases for Progress databases are coming from an ABL based application running against a Progress database in the first place to provide functionality like reporting where there are not many ABL based reporting systems on the market that can compare with SQL based ones like, you name it, Cognos.

Heavy Regards, RealHeavyDude.
 

Marian EDU

Member
Hi,

does anyone have any experience of building / implementing a data warehouse (star schema not 3nf) on an openedge platform?
All reporting output would be generated via a reporting tool e.g crystal, cognos ...etc

Just looking for any pointers / pros & cons etc

Thanks,

MrG

Well, given the processing power available nowadays and the general improvement in joining in SQL databases there is now a school of thought that favors a regular 3NF form even for data warehouse environments, instead of what used to be the norm of highly de-normalized star-schema (snow-flake was used/accepted even before for some cases). The major advantages will be that the 3NF schema in more easily to keep clean, and easier data load/transformation processes... but, back to how viable a Progress database will be in such environment.

As you seems to be aware most of the reporting tools were designed for SQL databases, so it boils down to using either ODBC or JDBC against the Progress SQL engine... which is not quite the same animal as the 4GL one. This means you'll have to make room for the SQL engine, set-up authentication and privileges, constantly fix the sql-width issue for character fields and analyze table all the time to give food for the SQL engine cost-base optimizer.

Bottom line, it can be done if you really want to... damn, you can even set-up Mondrian (in memory OLAP) over a Progress database but... I would go with one of the usual suspects in SQL, MySQL or PostgreSQL, move data in there with an ETL tool and use reporting tools against one of those SQL engines.
 

MrGinge

New Member
Thanks chaps.

really struggling to find any examples or evidence that anyone has tried to do this before - which in itself tells me that is probably not the most sensible idea.
 

tamhas

ProgressTalk.com Sponsor
There are a couple of points here. One is that it was common in the past to use specialized databases, architecture, and query engines in order to achieve acceptable performance. Today, I think it is less common because one can use conventional databases, architecture, and query engines and still achieve good performance. In particular, there are advantages to querying the production database instead of a specialized warehouse database because it is real time accurate and one can drill down as needed. A common alternative is to do queries on a read only replicate database, thus providing the security of replication and near real time data while moving the query load to a different DB and platform.
 

MrGinge

New Member
There are a couple of points here. One is that it was common in the past to use specialized databases, architecture, and query engines in order to achieve acceptable performance. Today, I think it is less common because one can use conventional databases, architecture, and query engines and still achieve good performance. In particular, there are advantages to querying the production database instead of a specialized warehouse database because it is real time accurate and one can drill down as needed. A common alternative is to do queries on a read only replicate database, thus providing the security of replication and near real time data while moving the query load to a different DB and platform.

There are also advantages to simplifying data structures, consolidating data from multiple sources, providing conformed dimensions, pre-aggregation ...etc
Which bits are more important depends as much on individual business process as it does on IT Strategy. Your example may be a solution for a number of scenarios, but it isn't for mine. Not without fundamental redesign and huge hardware/infrastructure costs anyway.

I suppose regardless of whether we are talking about star schemas, normalised or replicated dbs, the point i'm trying to get at is whether it's practical to build such a database on progress and then *only* use it as a reporting database via an SQL-based reporting application.

The due diligence i'm doing and some of the testing i've done so far says not, however i'm more than happy to be swayed if the evidence is there.

MrG
 

Storm trooper

New Member
MrG,

Why would you want to waste the time to do this exercise when there is already overwhelming credence to simply create one in SQL and leave your source live database in progress? This way you get the best out of both.

"The force is strong in this one"
 

TomBascom

Curmudgeon
"Practical" is subjective. It all depends on your needs and you capabilities. We don't really know anything about those.

I do know of several large customers who have things that they refer to as a "Data Warehouse" implemented in Progress. But, IMHO, they are really more in the line of a "data mart". But, again, it is all kind of subjective.

If the objective is to provide a view of the data for reporting purposes then, yes, you can do that.

Is Progress the "best" technology to support that? That depends... it certainly isn't a "traditional" choice. But that doesn't really mean much.
 

FrancoisL

Member
Put me in camp that would never use a openEdge DB for a Datawarehouse that would be used for SQL reporting.

2 reasons ...

1) I always found SQL with Openedge to be sub par and subject to alot of little annoying bugs ... Workaround2=8192 anyone or db name suddendly becoming case sensitive between 2 minor versions :p

2) It just too expensive compared to other database... OpenEdge is one of the most expensive DB (why do you think it so hard to get a price list) and has a license model that is more then a decade old... It really a very bad model that is hard to even understand.
 

tamhas

ProgressTalk.com Sponsor
Actually, compared to things like Oracle and other major databases, Progress consistently scores *lower* cost of ownership.

One really good reason to do one's reporting on an OpenEdge database is because there is where the production data is. Using the production database or a replicate means real time data with no possibility of it getting out of sync and always being able to drill down into details. It isn't for everyone, but it is something I've done repeatedly.
 

FrancoisL

Member
Actually, compared to things like Oracle and other major databases, Progress consistently scores *lower* cost of ownership.

One really good reason to do one's reporting on an OpenEdge database is because there is where the production data is. Using the production database or a replicate means real time data with no possibility of it getting out of sync and always being able to drill down into details. It isn't for everyone, but it is something I've done repeatedly.

That interesting Tamhas, we are maybe getting riped off by progress (and we are supposed to be a big reseller with bulk discounts) but each time that i compared the price we pay and what we would pay with another product (ORACLE, MS SQL) without any special bulk discount (standard price list) i always pay more or about the same (ORACLE). Are they're any actual data that we can compare or studies that shows these ownership "scores" ?

I would love to see a actual case studies that show progress cost versus other databases.
 

Marian EDU

Member
I suppose regardless of whether we are talking about star schemas, normalised or replicated dbs, the point i'm trying to get at is whether it's practical to build such a database on progress and then *only* use it as a reporting database via an SQL-based reporting application.

Nope, not only unpractical but why would you do it since Progress is not an SQL RDBMS to start with and there are other alternatives more cost effective out there... supposing you don't have such a 'big data' to start talking about NCR's Teradata you can easily start with either MySQL or PostgreSQL, or go with MS SQL if you are a fan of MS$ (hell, at least you'll have the analysis services in the package).

Sure Thomas is right that real-time reporting has it's place but won't advice anyone to run something like year-2-year reports on large history tables against the production OLTP system... might still work ok if you can have all your database in memory but otherwise better try something else :)

To move data from one system to another try to use an ETL tool if you can, better yet let the BL export the data you need (otherwise if you go directly to the database raw level you might have to duplicate the application BL and the ETL will become dependent on both DB and BL changes), if possible export data in a 'dump' format the target system can easily import (the mysql's load data infile forr instance), try to dump incremental as much as you can and do data consolidation/clean-up in a 'staging' area to put as little workload as possible on source production systems.
 

MrGinge

New Member
"Practical" is subjective. It all depends on your needs and you capabilities. We don't really know anything about those.

I do know of several large customers who have things that they refer to as a "Data Warehouse" implemented in Progress. But, IMHO, they are really more in the line of a "data mart". But, again, it is all kind of subjective.

If the objective is to provide a view of the data for reporting purposes then, yes, you can do that.

Is Progress the "best" technology to support that? That depends... it certainly isn't a "traditional" choice. But that doesn't really mean much.


Yes there's a lot more detail that we are also considering, much of which is why we started with a perception that progress would be the most suitable platform for us to sit this database on. Ultimately though, the deal breaker will be how does this thing perform from a user perspective.
If it was dirt cheap, quick to produce, easy to manage, etc, but ran like a dog in six months because it has 200 concurrent users querying via ODBC then no-one is going to say well done MrGinge you did a great job there.
and when i respond with 'but i wrote the ETL in half the time because it was in progress' they will probably kick me out the door.
The business couldn't care less as long as it's a cost effective and performant solution. The fact that our ERP and most of our in-house knowledge is progress is a big consideration. It's really the fact that it's not a 'traditional' approach that's causing me a headache, because there's nothing around that i can compare to.

MrG

Mods - this looks more like a general discussion than development so feel free to move somewhere more appropriate.
 

TomBascom

Curmudgeon
Ok...

So you basically need to gather your requirements and do some serious feasibility analysis with real scenarios and real numbers rather than vague impressions from "the internet". It will be just as bad if you reject Progress simply because it is non-traditional and end up with a technology that you do not understand as well and cannot properly support.
 

tamhas

ProgressTalk.com Sponsor
Starting with, do you even need a separate data warehouse at all or is reporting based on the production database or a replicate going to get you want you want and need without the hassle and expense of creating a separate data warehouse.
 

MrGinge

New Member
Ok...

So you basically need to gather your requirements and do some serious feasibility analysis with real scenarios and real numbers rather than vague impressions from "the internet". It will be just as bad if you reject Progress simply because it is non-traditional and end up with a technology that you do not understand as well and cannot properly support.

Thanks Tom & Tamhas.
The requirement is to build a reporting database based on a star schema methodology, containing data from multiple sources (mainly progress databases), which will be accessed by end users purely via an SQL-based reporting tool and will support around 300 concurrent users. The database will be in the region of 100GB upon inception, with an annual growth rate of maybe 10%. Real-time reporting is not required, so an overnight ETL process will populate it.

My question is therefore - has anyone on this board had any experience of doing anything similar on a progress platform and if so, what advice would they be able to give to assist me in my evaluation of whether this requirement can be achieved via a progress database. I would be interested to know of any potential challenges in terms of licencing costs and performance as these are key areas of concern for my organisation.
So far i have had some useful comments regarding these two key concerns but i am struggling to find real-life examples of companies who have acheived such a requirement that i can use in comparison. There are also other considerations that my organisation is aware of which will help us in our decisions, however for the purpose of this thread i am only concerned about licencing cost and performance factors.

Thanks,

MrG
 

TomBascom

Curmudgeon
0) I don't know of anyone who has done that in that particular manner. That doesn't mean that it cannot be done.

1) The performance considerations would be all of the usual suspects. Good storage area design, up to date release of Progress, 64 bit OS & OE, lots of RAM, excellent disk (think SSD...); plus, since this will be accessed with SQL, a really, really good network, routine use of UPDATE STATISTICS, monitoring of common queries to determine if new indexes are needed and so forth.

2) Licensing? Yuck. If you can convince yourself that this is a "bolt-on" to an existing app then maybe there is no licensing cost. Otherwise you're building a 300 user system. Then you need to figure out if those are 300 "named users" or "client access" users or what the heck they are.

Whatever you do, do NOT say "300" where a Progress sales person might hear you! That will immediately become the lowest number of the highest priced option that you will ever see a quote for. The first thing the sales person will do after listening to your voice mail (they never actually answer the phone) will be to forecast the revenue for the current quarter while heading out on to the golf course with his boss to celebrate. (His boss will then increase his quota for the next quarter...) The next morning, after a hard night celebrating, the sales rep will have an assistant put together a quote that starts with whatever he thinks he remembers you said plus several feverish fantasies that will result in a new car, an upgrade to his yacht and, with any luck, a fresh lakeside mansion. The boss will check in over lunch (probably from the golf course where he is celebrating his good fortune with his boss...) and want to know when the deal will be closed.

Start by telling them that you think you may launch with 10 and build up towards 20 users and ask what your options are. Then ask what sorts of discounts you might be able to get in the unlikely event that you can convince a total of 30 users that the system is any good... you might want to mention the down economy and any recent downsizing or threats thereof (anywhere in your own company is best but competitors are also a good source of bad news).
 
Top