Linear regression, Error analysis, Forecasting growth – HP Matrix Operating Environment Software User Manual

The following are examples of events that Capacity Advisor can recognize (and disregard) as
potential sources of invalid points:
• System downtime during the collection period.
• Out-of-the-ordinary activity designated by you. You can manually designate time periods as
invalid when you know resource usage has been outside the norm that you want to consider
in your capacity planning.
• Partial collection from a virtual machine or a VM host. When Capacity Advisor is unable to
apply a correction that accounts for all activity on a VM host, it marks any partial data collection
as invalid.
How this relates to setting a validity threshold
The Validity Threshold that you set should reflect how much valid data you require
in the collection period that you designate. If the reports that you run show that the
given threshold cannot be met for the designated time period, this may indicate that many of
the data points in the designated collection period are invalid.
In this case, you can choose a lower Validity Threshold with the understanding that the report
outcome may be a less reliable indicator of probable resource usage, or you can select a different
or longer data collection period to improve the likelihood of obtaining a sufficient percentage of
valid points for a good report.
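The threshold check described above can be sketched as follows. This is only an illustration, assuming the threshold is expressed as a percentage of valid points in the collection period; the function names and the sample data are hypothetical and are not part of Capacity Advisor.

```python
def valid_percentage(flags):
    """Return the percentage of points marked valid (True) in a collection period."""
    if not flags:
        return 0.0
    valid = sum(1 for flag in flags if flag)
    return 100.0 * valid / len(flags)


def meets_threshold(flags, threshold_percent):
    """True when the percentage of valid points reaches the chosen threshold."""
    return valid_percentage(flags) >= threshold_percent


# Hypothetical collection period: 10 points, 3 of them marked invalid.
points_valid = [True, True, False, True, True, False, True, True, False, True]

print(meets_threshold(points_valid, 90))  # threshold not reached: 7 of 10 are valid
print(meets_threshold(points_valid, 60))  # a lower threshold is reached
```

Lowering the threshold makes the check pass with the same data, which is the trade-off described above: the report can be produced, but it rests on fewer valid points.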
Linear regression
The linear regression is based on a least squares fit that minimizes the sum of the squares of the
vertical offsets between each of the aggregate points and the trend line that describes them.
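As a minimal sketch of an ordinary least squares fit of this kind, the following computes the slope and intercept of the trend line from a set of points. The function name and the sample values are assumptions made for illustration; this is the textbook closed-form solution, not necessarily how Capacity Advisor implements it.

```python
def least_squares_line(xs, ys):
    """Fit y = slope * x + intercept by minimizing the sum of squared vertical offsets."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two points and equal-length inputs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical aggregate points: interval index and utilization value.
intervals = [0, 1, 2, 3, 4]
utilization = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = least_squares_line(intervals, utilization)
print(slope, intercept)  # 2.0 10.0 for this perfectly linear sample
```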
TIP:
Regressions performed over small data sets are not always meaningful and can be misleading.
Any trend analysis based on fewer than a dozen aggregate points should be carefully compared
with the historical data to see if it "makes sense." The maximum number of data points for the trend
analysis is the total time for the report divided by the business interval. The actual number can
be lower, because business intervals can be excluded if they do not meet the validity criteria.
Because the trend is reported as an annual growth rate, it is best to have more than a year of
historical data before trying to analyze trends.
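The arithmetic in this tip can be sketched as follows. The function names and the eight-week example are hypothetical; the minimum of 12 points simply mirrors the "dozen aggregate points" guidance above.

```python
from datetime import timedelta


def max_aggregate_points(report_span, business_interval):
    """Upper bound on trend points: total report time divided by the business interval."""
    return report_span // business_interval


def has_enough_points(point_count, minimum=12):
    """True when the count reaches the dozen-point guideline from the tip."""
    return point_count >= minimum


# Hypothetical report: eight weeks of data with a one-week business interval.
upper_bound = max_aggregate_points(timedelta(weeks=8), timedelta(weeks=1))
print(upper_bound)                     # 8
print(has_enough_points(upper_bound))  # False: compare the trend with the historical data
```

Because intervals that fail the validity criteria are excluded, the number of points actually used can be smaller than this upper bound.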
Error analysis
You can choose to include error analysis in the report. The following error value is available:
r-squared: r² is the square of the correlation coefficient (r), and is used in the "goodness of fit"
analysis of trend estimations. r is a value between -1 and +1, where values approaching +/- 1 indicate
increasing validity of the data representation. r² therefore lies between 0 and 1.
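To make the relationship between r and r-squared concrete, the following sketch computes the Pearson correlation coefficient for a small set of points and squares it. The function name and the sample values are assumptions chosen for illustration only.

```python
import math


def correlation_coefficient(xs, ys):
    """Pearson correlation coefficient r of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two points and equal-length inputs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical aggregate points that rise steadily with a little scatter.
intervals = [0, 1, 2, 3, 4]
utilization = [10.0, 12.0, 15.0, 15.0, 18.0]

r = correlation_coefficient(intervals, utilization)
r_squared = r ** 2
print(round(r_squared, 2))  # 0.95: the trend line describes these points well
```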
Forecasting growth
Capacity Advisor forecasting allows you to combine a range of historical data with a predicted
trend to produce a forecast model. The forecast model can be used to provide an estimate of
future utilization.
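One simple way to picture combining historical data with a predicted trend is to scale the historical values by a growth factor. The sketch below assumes linear (non-compounding) annual growth applied uniformly to every historical value; that assumption, the function name, and the sample numbers are all hypothetical, and the manual does not state that Capacity Advisor computes its projection this way.

```python
def project_utilization(historical_values, annual_growth_rate, years_ahead):
    """Scale historical utilization by an assumed linear annual growth rate.

    annual_growth_rate is a fraction (0.10 means 10% per year).
    """
    factor = 1.0 + annual_growth_rate * years_ahead
    return [value * factor for value in historical_values]


# Hypothetical historical utilization values and a 10% annual growth rate.
historical = [2.0, 4.0, 3.0]
projected = project_utilization(historical, 0.10, years_ahead=2)
print(projected)  # each value scaled by a factor of about 1.2
```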
Whenever a Capacity Advisor report or profile is generated with an end date later than the current
date, the historical utilization data must be projected into the future. The projection is indicated in
the utilization graphs by a colored background. This projection is based on a forecast model.
Forecast models can be defined globally, for individual workloads or systems, for a scenario, and
for individual workloads within a scenario. Because the process for defining a forecast model is
basically the same regardless of where it is in the hierarchy of forecast models, the procedures
below are broken into two parts: accessing the forecast model and defining it.