Partitioning for archiving

rfutil

New Member
Every December we delete about 200,000 records from our DB. Each record takes about 3 seconds to delete via a vendor-supplied deletion tool, and the process runs out of hours and at weekends. This is followed up by a dump and reload/reindex on New Year's Eve, run by the vendor.

This process consumes our DBA team across December and the holidays. As a process it's clearly hugely inefficient and also very costly.

I've canned the d&r this year, as last year's dbanalys showed the scatter factor didn't justify it. Instead we are compacting indexes and will d&r a table at a time later on to free up space.
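(For reference, the scatter figures come from the standard dbanalys report; something along these lines, where the database name and output file are just placeholders:)

    proutil proddb -C dbanalys > lastyear-dbanalys.txt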

It seems that, rather than deleting records one by one, a better approach would be to mark the records that need to be deleted, create a partitioning scheme based on the contents of a field, move those records into a partition, and truncate that partition.

Has anyone done anything similar?
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
You could look at the Table Partitioning add-on product as an alternative approach to purging data. But my intuition says it isn't the right solution here.

Every December we delete about 200,000 records from our DB. Each record takes about 3 seconds to delete via a vendor-supplied deletion tool, and the process runs out of hours and at weekends. This is followed up by a dump and reload/reindex on New Year's Eve, run by the vendor.
This raises red flags for me. First, 200,000 records is very little data in the grand scheme of things. Second, deleting at a rate of 0.33 records/second is ridiculously slow. It makes me think that one or more of the following is possible:
  • the hardware is woefully inadequate;
  • the database is poorly configured for transaction throughput;
  • the database is on some shared infrastructure (e.g. virtualization host or SAN) that is overloaded;
  • the software is doing more than you think it's doing (e.g. deleting many related child records for each of these 200,000 records, or executing some complicated business logic to determine what to delete);
  • the queries in the deletion tool are poorly written or poorly bracketed -- in the worst case, if there isn't an index that matches the where clause, it could be doing a full table scan on what is presumably a large table.
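To put that last bullet in concrete terms, here is a made-up sketch (table, field, and index names are hypothetical): if the deletion code runs something like the first FOR EACH below against a field with no suitable index, the AVM has to walk the entire table to find its victims; if the WHERE clause can bracket on an indexed field, it only touches the records in range.

    /* no index on "status": the whole (large) table is scanned to
       find the records to delete                                   */
    FOR EACH order-history
        WHERE order-history.status = "closed"
        EXCLUSIVE-LOCK:
        DELETE order-history.
    END.

    /* an index on close-date lets the query bracket, so only the
       qualifying records are read                                  */
    FOR EACH order-history
        WHERE order-history.close-date < DATE(1, 1, 2020)
        EXCLUSIVE-LOCK:
        DELETE order-history.
    END.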
You say you have a DBA team; you're lucky! They should be able to provide a detailed picture of what this utility does, even if they don't have access to the source code. They should also understand the platform on which the database is running. In other words, they should be able to tell you which of the above bullet points are relevant here.

As a process it's clearly hugely inefficient and also very costly.
Absolutely!

You should ask yourself (or whoever is calling for this process to be done annually): what is the value of this process? As you've canned it for this year, clearly you don't see business value. Is there technical value? Someone, either internally or at the vendor, designed this process and put it into effect. What was their intent? Conserving disk space? Improving application performance? Punishing DBAs? ;)

Some people have the mistaken notion that "if I purge data from this table then code that reads it will run faster". In fact, somewhat counter-intuitively, deleting records from a table can actually make code slower. When I encounter a purge process it generally involves, at a minimum, several GB to tens of GB of data. Even if these are very large records, 200K amounts to what, maybe a few hundred MB of data? A maintenance activity to deal with that little data seems pointless to me.

If you do purge data, you do have to do maintenance afterward to avoid the problem I described above. Your decision to go with index compaction is good. It can be done online or offline and isn't too disruptive. But a full dump and load of one table, on a continual basis, is overkill. If your data storage areas are structured properly, you really shouldn't need to dump and load your data often; think once every several years. And when you do D&L you should have specific reasons for doing so and specific, measurable desired outcomes.
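For what it's worth, index compaction is just a proutil qualifier and can be run against a live database; something like the following, where the database, table, and index names are placeholders and the trailing number is the target block fill percentage (80 is the default):

    proutil proddb -C idxcompact order-history.close-date-idx 80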

a dump and reload/reindex on New Year's Eve, run by the vendor.
Just curious: if you have not just one DBA but a DBA team, why was the vendor doing the D&L?
 

TomBascom

Curmudgeon
I agree whole-heartedly with Rob.

Additional note: if a purge makes things run faster, it is because your WHERE clauses are poorly written and reading data that the code does not need.

Purging data only makes sense if your data retention policy has a business reason for it. Purging for the sake of purging is not a well thought out strategy.
 

rfutil

New Member
The purge dates from the days of Type I storage areas and spinning disks. The vendor traditionally did the dump and reload, possibly due to a lack of knowledge on the DBAs' part and likely due to tradition; someone once mentioned an issue with the data dictionary as the reason why the DBAs didn't do the dump and reload. The DB is 370 GB, with the largest two tables having 2 billion rows and 610 million rows; the next largest has 94 million rows. It's a CRM system, so it seems that every record update is going into a giant table.

Today we are on SSDs and have a 16 core system hardly being utilised. Can't remember what the ram is but it's sufficient.

ProTop tells me there are hardly any OS reads happening, and index reads and record reads are very close, so the application is accessing its data via indexes from the buffer pool. There's no latch contention. The APWs are managing writes to the disks effectively. Checkpoints are happening every three to four minutes. The system is in a happy place. The issue we are left with is dead space in the storage areas, which we've said we would remove throughout the year to replace the end-of-year dump and reload. The deletions are of records that are no longer active, so they really don't need to be in the system; with tables of this size it's clearly reasonable to remove data that won't be accessed. The pre-deletion database is also mounted elsewhere.

I looked at last year's tabanalys output prior to the "annual d&r" and the scatter factor wasn't above 2 anywhere; it's currently 1 on the largest tables. Yeah, a couple of million rows (assuming each of those records corresponds to multiple rows) removed from a 2-billion-row table isn't going to increase the scatter factor significantly.

I'm not exactly certain what the delete process is doing. I was going to see what I could gather via the statement cache this week to see if that provided any insight, but really the question was around using partitioning for archiving. If the data that is being deleted can be identified, then rather than use an archaic process it seems more logical to mark the records with an archive flag, create a partitioning scheme based on that flag, split the data into a live partition and an archive partition, truncate the archive partition, and merge the partitions back into one.
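(The marking step itself should be trivial; something like the following, with a made-up table, flag field, and cut-off date:)

    /* flag everything older than the cut-off for archiving;
       FOR EACH scopes a transaction per iteration, so this does
       not build one giant transaction                              */
    FOR EACH order-history
        WHERE order-history.close-date < DATE(1, 1, 2020)
        EXCLUSIVE-LOCK:
        order-history.archive-flag = TRUE.
    END.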
 

TomBascom

Curmudgeon
That empty space will get reused over time, so unless the business is shrinking dramatically there probably isn't much point in reclaiming it.

My personal opinion is that "scatter" is an almost useless metric in a type 2 storage area. "Logical scatter" is interesting but is not what is being reported and each index on the table will have a different logical scatter. When you improve one you are probably hurting another.

Nonetheless -- if you are going to purge data, then your approach of using table partitioning to accomplish it seems workable to me. OTOH you could also do the same thing without TP: add an "active" flag and remove inactive data at your leisure. There's really no special need for TP to do that. Doing it with TP means you are actually going to delete each record twice and recreate it once along the way (moving it from the original partition is a create and a delete), so you're generating a bunch of extra I/O for no real benefit, since you're just going to delete those records anyway. (At least that is what you seem to be saying.) Yes, the truncate at the end would be more efficient than doing all of those discrete deletions, but you don't actually avoid them.
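A rough sketch of the no-TP variant, just to show the shape of it (table, field, and batch size are made up -- tune to taste): flag the records however you like, then trickle the deletes through in small transactions so the BI file and lock table stay small.

    DEFINE VARIABLE deleted AS INTEGER NO-UNDO.

    outer-loop:
    REPEAT:
        deleted = 0.
        batch:
        DO TRANSACTION:
            /* delete up to 500 flagged records per transaction */
            FOR EACH order-history
                WHERE order-history.archive-flag = TRUE
                EXCLUSIVE-LOCK:
                DELETE order-history.
                deleted = deleted + 1.
                IF deleted >= 500 THEN LEAVE batch.
            END.
        END.
        IF deleted = 0 THEN LEAVE outer-loop.
    END.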
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
You can use Table Partitioning, and at 370 GB you're at a size where that could make sense. And it does allow you to remove large chunks of partitioned data without a language client, which is beneficial. Just be aware that TP is not a transparent feature like, say, AI or TDE, which a DBA can implement independently of development.

It requires careful planning and close collaboration with development for several reasons.
  1. There are many options for partitioning schemes and you need to choose one that makes sense for the application and the desired maintenance scenario.
  2. Given the scheme you choose, you need to audit the code to ensure that when it creates records in the partitioned table(s) it always fully populates the partition key during the create transaction and not in a subsequent update transaction. Otherwise you will see runtime errors in the application as the storage engine won't know which partition the record belongs in.
  3. If you use the recid function anywhere in your code, you need to replace it with rowid, as the latter is partition-aware and the former is not (see the sketch after this list).
  4. There is a bit more management overhead with TP as you need to ensure that your partition policies, if they are date-based, remain up to date as new data is written. For example, if you have policies where each year's worth of data is stored in different partitions for 2018, 2019, and 2020, you need to ensure you create a new policy for 2021 data and create the appropriate extents, before December 31 2020.
  5. If you have any code that reports on data in meta-schema tables, it may need to be updated as Table Partitioning changes some long-true paradigms, e.g. that a table is a single storage object that resides in a single storage area, or that each index definition corresponds to a single b-tree.
  6. Implementing TP, as with implementing any optional feature, will necessitate changes to some of your database management procedures/scripts, e.g. for executing a dump and load.
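On point 3, the change is usually mechanical; a made-up before/after, assuming an order-history record is already available in the buffer:

    /* before: RECID values are not partition-aware */
    DEFINE VARIABLE recOrder AS RECID NO-UNDO.
    recOrder = RECID(order-history).
    FIND order-history WHERE RECID(order-history) = recOrder NO-LOCK.

    /* after: ROWID identifies the record correctly in a partitioned table */
    DEFINE VARIABLE rowOrder AS ROWID NO-UNDO.
    rowOrder = ROWID(order-history).
    FIND order-history WHERE ROWID(order-history) = rowOrder NO-LOCK.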
I'm not trying to talk you out of buying TP. I just want you to be aware of the implications of using it.

someone once mentioned an issue with the data dictionary as the reason why the DBAs didn't do the dump and reload
In my opinion, a D&L of a DB that size shouldn't involve the data dictionary (directly). The dump should be binary and the portions that require dictionary code, like dumping schema, sequence values, etc., can be scripted. Sounds like a process that should be revisited.
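The scripted version boils down to a handful of proutil commands per table; roughly this, with placeholder database names and paths, looping over the table list and rebuilding indexes at the end:

    # binary dump of one table from the source DB
    proutil proddb -C dump order-history /dumpdir

    # binary load into the newly created target DB
    proutil newdb -C load /dumpdir/order-history.bd

    # rebuild indexes once all tables are loaded
    proutil newdb -C idxbuild all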

Today we are on SSDs and have a 16 core system hardly being utilised. Can't remember what the ram is but it's sufficient.
The fact that you are on newish hardware makes the delete performance all the more suspicious.

ProTop tells me there are hardly any OS reads happening, and index reads and record reads are very close, so the application is accessing its data via indexes from the buffer pool. There's no latch contention. The APWs are managing writes to the disks effectively. Checkpoints are happening every three to four minutes. The system is in a happy place.
No obvious bad news there but it also doesn't rule out poorly-bracketed queries in the deletion utility.

I was going to see what I could gather via the statement cache this week to see if that provided any insight
That may provide insight if you have access to source and you want to see which parts of the code path are being followed. I'd start simply, looking at the CRUD stats during a deletion run (in a test environment with prod-like data). Make sure your stats parameters are set appropriately for your schema; ProTop will warn you if they aren't. Then look at your total reads and deletes. If you do 200,000 deletes on the table in question and have more than 200,000 reads in it, or a very large number of index reads, or a significant number of reads elsewhere, then the deletion code is doing more than you thought.
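If you want to capture those numbers yourself rather than read them off a ProTop screen, a quick-and-dirty snapshot of the per-table stats looks something like this; run it before and after the deletion and diff the output (it assumes the default -basetable of 1 and a -tablerangesize large enough to cover your schema):

    /* dump cumulative per-table CRUD stats from the VSTs */
    FOR EACH _File NO-LOCK
        WHERE _File._File-Number > 0 AND _File._File-Number < 32000,
        EACH _TableStat NO-LOCK
        WHERE _TableStat._TableStat-id = _File._File-Number:
        DISPLAY _File._File-Name FORMAT "x(30)"
                _TableStat._TableStat-read
                _TableStat._TableStat-create
                _TableStat._TableStat-update
                _TableStat._TableStat-delete
            WITH WIDTH 132.
    END.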
 