the wizard of z/os: May 2008

Wednesday 28 May 2008

HSM in a Virtual Tape Environment - Part 2

Yet another entry regarding HSM in a Virtual Tape Environment.

This time we will be addressing the impact of PtP implementations on backup and migrate (ML2) processing.

First of all, let's take a look at a timeline for creating a single-file-single-volume dataset on a PtP volume with the copy mode set to immediate. In this case it is an HSM migrate level 2 dataset, but the same is true for other single-file-single-volume datasets.

The data is first written at the local site. Before completing the O/C EOV (rewind unload) the volume is copied (in compressed format) to the remote VTS. The penalty for creating this 'synchronous' copy is the time required to perform it (on average between 40% and 60% of the 'original' job I/O time).

Now what happens if an HSM migrate task results in a dataset spanning multiple (let's assume 2) volumes?

At the point in time where HSM needs to get the second volume a rewind-unload is issued. Analogous to the first example this is followed (due to the immediate copy mode) by the copy to the remote site before this rewind-unload is completed. The time it takes to complete this copy causes a delay for the HSM task. As soon as the copy has completed, the second volume is mounted. At completion of writing to the ML2 dataset another rewind-unload will be issued, followed by another copy process to the remote site.

As a result, the 'penalty' for creating the immediate copy is yet again roughly 40%-60% of the I/O time.

For non-HSM multivolume datasets, specifying unit=(,,2) will improve elapsed time because the PtP copy can then be made in parallel with the local write to the second volume (see figure below). Unfortunately HSM allocates only one unit to each migrate task. Maybe we will see a SETSYS MAXCONCURRENTML2UNITS in the future, giving us the option to fully exploit the VTS capabilities.
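To put some numbers on the two timelines, here is a minimal back-of-the-envelope sketch in Python. It is not anything HSM or the VTS provides: the function names are made up, the copy time is assumed to be a fixed fraction of the local write time (0.5, picked from the middle of the 40%-60% range above), and mount times are ignored.

```python
def elapsed_one_unit(write_secs, copy_ratio=0.5):
    """One unit: every volume is written, then the task waits for its remote copy."""
    return sum(w + copy_ratio * w for w in write_secs)


def elapsed_two_units(write_secs, copy_ratio=0.5):
    """Two units: the copy of one volume overlaps the write to the next volume."""
    copy_end = []   # moment the remote copy of each volume completes
    clock = 0.0     # moment the local write to the current volume may start
    for i, w in enumerate(write_secs):
        if i >= 2:
            # Volume i reuses the unit of volume i-2, so that copy must be done.
            clock = max(clock, copy_end[i - 2])
        clock += w                              # local write of volume i
        copy_end.append(clock + copy_ratio * w)  # its copy starts right away
    return max(copy_end) if copy_end else 0.0


if __name__ == "__main__":
    volumes = [100, 100]   # hypothetical local write time per volume, in seconds
    print("one unit :", elapsed_one_unit(volumes))
    print("two units:", elapsed_two_units(volumes))
```

Under these assumptions only the copy of the last volume still adds to the elapsed time when two units are available; with one unit every volume pays the full copy penalty.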

For reference, a figure showing how single-file-multiple-volume tape datasets are created in a VTS-PtP environment with unit=(,,2) specified:

For the sake of completeness, all three figures in one handy A4-sized picture. This image can be freely pinned in your cubicle.

Thursday 15 May 2008

Whadda-ya-mean-D/R?

Just so I will remember the link myself, but maybe interesting for the 1.5 (on average) readers of this blog: recoveryspecialties.com has some nice texts on PtP, PPRC and the like.

Wednesday 14 May 2008

Whadda-ya-mean-z/os?

For those of you who stumbled upon this site "by accident" but who are curious about the whole z/OS world, or maybe for the 3 regular readers, here is the following link.

IBM just released a new redbook entitled IBM System z Strengths and Values. You can read the abstract (and even the complete redbook) right here: http://www.redbooks.ibm.com/abstracts/sg247333.html

It's not too technical, maybe even a bit too commercial, but good reading material nonetheless.

Friday 2 May 2008

REPORT DAILY FUNCTION

Ever had a wee look at the output from a 'HSEND REPORT DAILY FUNCTION ODS('YOUR.DATASET')' ?

I want to use today's post to rant a bit about the values presented here. As you see in the picture, I have boxed some of the values with a very nice red box :)

After reading my rant about these (fictitious) figures, feel free to leave your comments.

This might very well be the first in a series of 'how do I rate my figures'. So without further ado, here's my 2 cents on the matter.....

Average Age. For migrates and recalls it's quite useful to know the average age. Low values in the 'recall' rows might indicate HSM thrashing (recalls shortly after a migrate). The values in the 'migrate' rows should match your gut feeling for the average migration settings (14 days for ML1, 30 days for ML2, or whatever your shop's guidelines are).
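If you have the age at recall for individual datasets at hand, the thrashing idea can be made a bit more concrete than a single average. The sketch below is only an illustration: the input list, the function name and the 2-day cut-off are all assumptions of mine, not something the REPORT output gives you directly.

```python
def recall_thrash_share(ages_in_days, max_age_days=2):
    """Fraction of recalls that hit datasets migrated at most max_age_days ago."""
    if not ages_in_days:
        return 0.0
    recent = sum(1 for age in ages_in_days if age <= max_age_days)
    return recent / len(ages_in_days)


if __name__ == "__main__":
    # Hypothetical ages (in days since migration) of five recalled datasets.
    ages = [0, 1, 1, 40, 90]
    print("average age :", sum(ages) / len(ages))
    print("thrash share:", recall_thrash_share(ages))
```

A high share of very young recalls points at datasets being migrated too eagerly, even when the average age looks respectable because of a few old recalls.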

Average Queued Time. Now this is an odd figure to interpret. It represents the amount of time an average request was held up in the queue; for recalls, it's the average amount of time a request to recall an ML2 dataset waited before HSM served it. Generally speaking: if it's high (> 120 secs is my RoT atm) you've got more requests coming in than HSM may/can handle. This might be due to SETSYS TAPEMAXRECALLTASKS, the available units, or even misuse :)
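To illustrate what this figure boils down to, a small Python sketch that averages the wait per request and compares it with the 120-second rule of thumb. The (queued, started) pairs are invented sample data and the function names are hypothetical; this is not how HSM itself calculates or stores the value.

```python
from statistics import mean

QUEUE_ROT_SECONDS = 120  # the rule of thumb mentioned above


def average_queued_seconds(requests):
    """requests: (queued_at, started_at) pairs, both in seconds."""
    return mean(started - queued for queued, started in requests)


def queue_looks_overloaded(requests, threshold=QUEUE_ROT_SECONDS):
    """True when the average wait exceeds the rule-of-thumb threshold."""
    return average_queued_seconds(requests) > threshold


if __name__ == "__main__":
    # Hypothetical recall requests: waits of 30, 240 and 180 seconds.
    sample = [(0, 30), (10, 250), (20, 200)]
    print("average queued:", average_queued_seconds(sample))
    print("overloaded    :", queue_looks_overloaded(sample))
```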

User Initiated Backups. Another nice figure for determining user behaviour. I've seen reports where 400 user backups were being made every Monday. User migrates are another thing to look at. Then it's the hunt for Red October, or whoever the user is that is doing these actions. Trying to find out why they are doing this, without assuming it's the wrong thing to do, is a tricky but fine art. Remember, user migrates cause partial tapes!
