Frequently Asked Questions

General

How is the physical layout calculated?

The physical data centre layout is based on the completion of the PRQ and the server/device connectivity. Any mistakes at this stage may affect the source and target diagrams on the Analysis Engine Report (AER). The PRQ should be double-checked for layout errors as well as spelling mistakes, which may otherwise be displayed on the final report.

Why are the libraries in the wrong location?

The collection is from the backup server perspective, and all attached devices are assumed to be in the same location unless specifically stated otherwise. Where backup servers are in a different location to the device, there is potential for misalignment. The free-form text area should state where this is the case. This will not affect the logic or commercial components of the AER, but it will ensure the libraries are accurately positioned.

What does the operational issue XYZ mean?

The operational issues on the AER are gathered from the source backup product error logs or infrastructure logs. These can be quite technical and product specific. The customer can use them to search their own system to locate the actual issue and its effect. They should be used to highlight any risk to the customer in their current environment, and the operational value of completing the AER.

Please explain the operational issue XYZ?

Detailed explanation of these issues will require subject matter expertise in the source product and environment. The best way to address a specific operational issue is to state that it has been reported by the source product and should be investigated. The issues appear on the report when they pose an immediate risk to backup or recovery operations. Butterfly can provide further details.

Backup

What products do Butterfly support?

The current support matrix is published as the interoperability matrix on the website, or can be provided by the sales team. It shows the platforms, products and versions that are supported for collection.

Can Butterfly map the backup system back to a backup image?

Yes. The Analysis Engine Database (AEDB) contains all the information down to the granularity of the backup image. However, this is not displayed, as there can be many millions of images in active indexing. The information is condensed into the graphs, charts and tables displayed on the AER to make it usable and accessible.

How is the data on the retention graph calculated?

It is first important to understand the x-axis labels: if the label states ‘30 days’, the data represented in that column has a retention of 16 to 30 days. The right-hand column covers retention from over 7 years to infinity. Retention is calculated by looking at the images in the backup product and the difference between their backup and expiry dates, which effectively demonstrates the retention of the image. Where backup products do not store data at the image level, e.g. TSM, this is a calculated number based on the domain and associated bu or ar_copygroup definitions. Where occupancy is active data, or a no-limit retention is specified, the occupancy is placed in the over 7 year column. Data that cannot be matched is also positioned here, so that sizing is accurate and conservative.
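
The bucketing described above can be sketched roughly as follows. This is an illustration only, not the Analysis Engine's actual logic: the function name and the bucket boundaries are assumptions, apart from the ‘30 days’ column (16 to 30 days) and the final over 7 year column, which come from the answer above.

```python
from datetime import date
from typing import Optional

# Hypothetical (upper bound in days, x-axis label) pairs. Only the "30 days"
# column covering 16 to 30 days and the final column are taken from the text.
BUCKETS = [
    (15, "15 days"),
    (30, "30 days"),
    (90, "90 days"),
    (365, "1 year"),
    (7 * 365, "7 years"),
]
OVER_7_YEARS = "over 7 years"


def retention_bucket(backup_date: Optional[date], expiry_date: Optional[date]) -> str:
    """Return the label of the retention column an image falls into."""
    # Active data, no-limit retention and unmatched data have no usable
    # backup/expiry pair, so they are placed in the last column.
    if backup_date is None or expiry_date is None:
        return OVER_7_YEARS
    retention_days = (expiry_date - backup_date).days
    for upper_bound, label in BUCKETS:
        if retention_days <= upper_bound:
            return label
    return OVER_7_YEARS


# An image kept for 20 days lands in the "30 days" column.
print(retention_bucket(date(2024, 1, 1), date(2024, 1, 21)))
```

Passing None for the expiry date models the conservative placement of data that cannot be tied to a retention policy.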

The number of TSM servers has increased from source to target, why?

Despite the benefits of server consolidation and improvements in server and product instance capability, the target solution is sized to give 36 months of capability. Growth is included in the target side of the solution, but only represented in the source business case infrastructure list. These lists are the true infrastructure comparison.

How is the Data Domain/VTL being used today?

The collection of data from the source server will see the virtual configuration being presented by the VTL, or the network attached storage that is visible to the server. The configuration, virtual drives and virtual volumes will be represented on the AER. It is important that the PRQ is completed with the VTL raw and usable storage as these figures drive the disk capacity in the business case. The VTL will be positioned in the source as part of the storage hierarchy.

How is the disk used on the backup server, do you show this?

Disk being used by the source backup servers for storing occupant data will be shown on the source diagram and factored into the business case. OS disk storage is not included as this is assumed to be part of the server build.

Why did a certain data type increase on the ‘waterfall chart’?

The waterfall chart shows the differential between the occupant data and the actual capacity required to hold it. The capacity is reduced through the use of data reduction technology: incremental backup, compression, de-duplication etc. In the event that data is being migrated, or active backup operations are being moved off a more capable medium to a less efficient technology, the capacity may increase. This may only be for the specific data type that resides on that storage medium.

How is the labour reduction determined on the TCO chart?

The staff time required to manage the environment is based on two metrics: the number and type of elements in the solution (e.g. server, backup product, library etc.) and the management efficiency of each element. This allows simpler management elements to show tangible benefit over more complex elements. It is important that their function is also represented here, e.g. manual vaulting will be less efficient than automated vaulting. All management roles are included as a function of the solution.

Can you provide the number of clients by backup server?

The total number of clients in the environment is calculated but not broken down by backup server as the target solution looks at total consolidation at the AER level. If there is a requirement to break the environment into separate areas, multiple AERs should be completed.

Why are there no schedules on the backup environment?

In certain cases, external schedulers are used for backup operations. In this case, it will be stated that the backup server has no active schedules.

Why is Oracle RAC not listed as a source data type?

The Oracle RAC environment will be shown as the data type that actually exists within the backup occupancy. This may be unstructured filesystem, or RMAN/Oracle, depending on the backup method and the source product reporting.

What % of the saving comes from changing the backup software and what % comes from the hardware change?

The savings can be seen in detail in the TCO chart, but the total savings are really about solution efficiency. The real benefits come where the software and hardware layers are built to work together for data reduction and performance efficiency. In general terms, hardware savings are made by a more intelligent software layer and more capable, cost-effective hardware.

What is the difference between capacity and occupancy?

Occupancy is the actual volume of managed data. The capacity is the infrastructure footprint required to hold this occupant data. For example, if there is 10TB of managed occupancy on a physical tape with 2:1 compression, the capacity would be 5TB. This can get more complex when client, server and device de-duplication are part of the solution.
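
The arithmetic in the example above can be written as a short sketch. The function name is hypothetical, and it models a single reduction ratio only; it does not attempt the more complex case where client, server and device de-duplication are combined.

```python
def capacity_tb(occupancy_tb: float, reduction_ratio: float) -> float:
    """Capacity needed to hold occupancy_tb at a given ratio (2.0 means 2:1)."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return occupancy_tb / reduction_ratio


# 10TB of managed occupancy with 2:1 compression needs 5TB of capacity.
print(capacity_tb(10, 2.0))  # 5.0
```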

The customer is using client compression, how has this been modelled?

Client compression is reflected in the data backup size and the image size, and therefore in the capacity required to store the data occupancy.

Please can you provide more insight into XYZ client?

Due to the scale of the environments that are analysed, the granularity of a single client can only be found by running specific queries against the Analysis Engine database. The AER is at an environment level, with only the top clients being represented.

What is the projected de-duplication rate of the VTL in the target design?

The de-duplication rate for the VTL can be calculated from the data on the AER. It is not displayed, as the key factor is the total data reduction ratio of the solution as a whole, which takes all data types and features into account. Where data is backed up directly to the VTL, e.g. Oracle data, the only data reduction effect is the VTL de-duplication, and this can be seen in the waterfall chart.

How can you reduce the number of tape drives by that much!!?

Drive reduction is not just about faster drives; it is a function of re-architecting the solution. Disk storage may be used for short-term retention, and VTL may be used for structured data. This means the drives are addressed through a TSM process, so the aggregate throughput is considerably higher. Drive numbers in many environments can be very high, as client backup sessions use drive throughput very inefficiently while still requiring drive mounts. Alternative solutions to manage the actual client sessions and drive usage can massively increase aggregate drive utilisation, and therefore hugely reduce the aggregate drive bandwidth and the number of mount points (peak and average) required.

Please explain how the target solution has adequate IO capability?

The IO capability is based on network capability (as part of the server tier), disk performance and tape throughput, including peak mount points where required. Additional requirements around replication are also calculated. The backup window is assumed to be 8 hours in target design cases.
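
As a rough illustration of what the 8 hour window implies, the sustained throughput needed to move a nightly backup volume can be estimated as below. This is a simplified sketch and not the sizing model itself: the function name, the use of decimal units and the single aggregate figure are all assumptions, and replication and peak mount points are not modelled.

```python
BACKUP_WINDOW_HOURS = 8  # window assumed in target design cases


def required_throughput_mb_s(nightly_volume_tb: float,
                             window_hours: float = BACKUP_WINDOW_HOURS) -> float:
    """Sustained MB/s needed to move nightly_volume_tb within the window."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    megabytes = nightly_volume_tb * 1_000_000  # decimal units: 1TB = 1,000,000MB
    seconds = window_hours * 3600
    return megabytes / seconds


# 36TB per night in an 8 hour window needs 1250 MB/s sustained.
print(required_throughput_mb_s(36))
```

The network, disk and tape tiers would each need to sustain at least this rate for the target design to complete backups within the window.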

How can you reduce the number of backup servers by that much?

Backup server reduction is achieved through software upgrade efficiency and backup operation efficiency, i.e. object number reduction and throughput requirement reduction. This is completed on a mathematical basis and does not take into account any additional strategy or constraints of the customer. It may be that additional servers are required to cope with an alternative strategy. The mathematical model allows an accurate infrastructure baseline of the two solutions.

Please explain how TSM for Sharepoint works?

The AER will demonstrate where specific products are required and the benefit of using them. However, it is up to the vendor to understand the operation of their own products at a more detailed level.

Please explain how the TSM for VE solution works?

Again, the vendor should be positioned to understand the key architectural changes of transforming the backup technique for virtual environments using their own product set. The AER will position where this is required, and introduce the benefits in terms of data reduction and performance; however, detailed features should be explained to the customer by the vendor.

Where is the software license cost on the business case?

The business case is a pure differential infrastructure business case, which provides an accurate baseline of the infrastructure required to provide data protection services. Neither source nor target software licensing is included, but the key cost drivers from a capacity perspective are presented on the AER.

What is a structured data type?

The source occupancy is separated into structured and unstructured data types. This allows more accurate modelling when alternative data protection strategies are chosen. A structured data type is one where an API-driven agent is used and the backups are either in an image form or managed and indexed outside of the backup system. Examples include Oracle RMAN, MSSQL etc.

How was the “Target Data Occupancy by Retention Period” determined?

This graph is calculated by looking at client policies in the source for the different data types. This is projected out over the three years. This is important, as the occupancy landscape will change towards the data types with longer-term retention. This is a projected position that will change if retention configuration is altered.

Where did you get the number “XX% Reported aggregate backup success rate”?

Aggregate backup success rate is calculated by looking at all data that is in scope of a data protection operation, and creating a percentage based on the actual successful attempts. Backups that are mis-configured, or drives that are not protected, will therefore not be part of this figure. It is based at the object level and is a function of successful data backup attempts completed within the configured window.

How do you define an “Active Backup Cycle”?

Active backup cycles are the retention periods and backup schedules that are current and in active operation in the configuration. Schedules with no clients, or policies that do not have active data protection operations are not considered a part of the active configuration.

Does “aggregate success” include missed schedules, missed files, skipped files, other errors, etc.?

Missed and Skipped files are considered a failure. They are considered as objects that are configured to be in scope of a data protection policy, but are un-recoverable as they have not been successfully processed by the backup operation. These therefore fall into the failure section.
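
The object-level calculation can be sketched as follows. This is a minimal illustration under stated assumptions: the state names and the function are hypothetical, and objects that are not in scope of any data protection policy are simply ignored, in line with the description of the aggregate figure.

```python
from collections import Counter
from typing import Iterable, Optional

# Hypothetical state names. Missed and skipped objects count as failures.
SUCCESS_STATE = "success"
FAILURE_STATES = {"failed", "missed", "skipped"}


def aggregate_success_rate(object_states: Iterable[str]) -> Optional[float]:
    """Percentage of in-scope objects that were backed up successfully."""
    counts = Counter(object_states)
    successes = counts[SUCCESS_STATE]
    failures = sum(counts[state] for state in FAILURE_STATES)
    in_scope = successes + failures
    if in_scope == 0:
        return None  # nothing in scope, so no rate can be reported
    return 100.0 * successes / in_scope


states = (["success"] * 90 + ["missed"] * 5 + ["skipped"] * 3
          + ["failed"] * 2 + ["out_of_scope"] * 10)
print(aggregate_success_rate(states))  # 90.0
```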

Customer does not “think” they have data that they retain over 7 years – how do you calculate the SOURCE retention occupancy in the different categories from days > 7 years?

In certain cases, data will be positioned in the over 7 year column. Active data in TSM (which is not aligned to a retention policy), data that cannot be matched to a retention policy, and reporting errors in the product are all scenarios where data cannot be bound to a specific policy. To ensure sizing and the business case are conservative, these data volumes are positioned in the over 7 year retention column. This represents a worst-case scenario and ensures that the capacity will be sufficient.

Can you take out the data from the NDMP servers as a what-if scenario?

The lowest level of granularity at the report level is the backup server. These can be included or excluded as part of the PRQ completion process. Any further, more detailed investigation should be conducted after the AER has presented the baseline position.

Data in the source environment is reported as 100% unstructured. Is this because they are not using storage agents to back up their databases?

Correct. Even though the source data may be a structured data type, if the backup is completed at the file level the backup server will report the data as unstructured file data. This can be seen with mis-configured clients, incorrect use of agents, or dump file backups of databases.

Network issues are mentioned. In what way will the target solution resolve the problems with the network?

Reducing data volume at source reduces the volume of data being transmitted across the network. This is also valid for replication and offsite recovery features. Network utilisation can be reduced by efficient data control, and where observed network issues are caused by congestion, these will be reduced. Where processing limitations on the backup servers are seen as causing network issues, this will be improved by the transformed server infrastructure.