Summary: Data Domain Active tier capacity is increasing rapidly, is near 100%, or needs to be reduced below 85% before a planned IDPA upgrade. There are many factors to consider when analyzing increasing or high capacity usage. The Resolution path below explores them systematically.

Issue:
Data Domain Active tier capacity is increasing rapidly, is near 100%, or needs to be reduced below 85% before a planned IDPA upgrade.
There are many factors to consider when analyzing increasing or high capacity usage. The Resolution path below explores them systematically.
 
Note: Some of the KB articles linked below have limited visibility. If needed, please open a Service Request to troubleshoot Data Domain Active tier capacity.

 

Resolution:

Step 1. Note the current Data Domain Active tier usage:

DDBoostUser@ddhostname# df
 
Example Output:
Active Tier:
Resource           Size GiB   Used GiB   Avail GiB   Use%   Cleanable GiB*
----------------   --------   --------   ---------   ----   --------------
/data: pre-comp           -   926097.1           -      -                -
/data: post-comp     7309.6     6598.4       711.2    90%             13.3
/ddvar                 49.2        7.6        39.1    16%                -
/ddvar/core           158.5        0.2       150.2     0%                -
----------------   --------   --------   ---------   ----   --------------
 * Estimated based on last cleaning of 2019/11/19 09:11:08.
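
The 85% pre-upgrade threshold can be checked against saved df output. The following is a minimal sketch, not a supported tool: the function names are hypothetical, and it assumes the df output was copied to a host with a POSIX shell and awk, that the "/data: post-comp" row has the column layout shown above, and that the Active tier row is the first such row in the output.

```shell
# Hypothetical helper: print the Use% value (digits only) of the first
# "/data: post-comp" row read from standard input.
active_tier_use_pct() {
    awk '/^\/data: post-comp/ { gsub(/%/, "", $6); print $6; exit }'
}

# Hypothetical helper: compare the value against the 85% pre-upgrade threshold.
# Returns 0 when below 85%, 1 when at or above 85%, 2 when no value was found.
check_upgrade_threshold() {
    pct=$(active_tier_use_pct)
    case "$pct" in
        ''|*[!0-9]*) echo "Could not read Use% from input"; return 2 ;;
    esac
    if [ "$pct" -ge 85 ]; then
        echo "Active tier at ${pct}%: reduce below 85% before the upgrade"
        return 1
    fi
    echo "Active tier at ${pct}%: below the 85% threshold"
}
```

Example use, assuming the output was saved as df_output.txt (a hypothetical file name): check_upgrade_threshold < df_output.txt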
 
 
Step 2. Look for non-Avamar mtrees on Data Domain:

DDBoostUser@ddhostname# mtree list
 
Example output with additional mtree:
Name                           Pre-Comp (GiB)   Status
----------------------------   --------------   ------
/data/col1/avamar-15XXXXXXXX         565211.0   RW
/data/col1/backup                         0.0   RW
/data/col1/sql                        63859.7   RW
----------------------------   --------------   ------
 
If there are non-Avamar mtrees with significant pre-comp data contributing to the overall capacity usage, refer to KB 333479.
This KB article is only intended to assist with Avamar mtree capacity usage.
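
On systems with many mtrees, the non-Avamar mtrees that hold data can be picked out of saved mtree list output. This is a minimal sketch under stated assumptions: the function name is hypothetical, Avamar mtrees are assumed to be named /data/col1/avamar-<id> as in the example, and the Pre-Comp (GiB) value is assumed to be the second whitespace-separated column.

```shell
# Hypothetical helper: read "mtree list" output on standard input and print
# each non-Avamar mtree whose Pre-Comp (GiB) value is greater than zero.
non_avamar_mtrees() {
    awk '$1 ~ /^\/data\/col1\// && $1 !~ /\/avamar-/ && ($2 + 0) > 0 { print $1, $2 }'
}
```

For the example output above, this prints only /data/col1/sql with its pre-comp size, since /data/col1/backup holds 0.0 GiB.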
 
 
Step 3. Verify health of Avamar maintenance activities:

admin@avamar:~/>: status.dpn
 
Example output snippet:
Last checkpoint: cp.20200106140522 finished Mon Jan  6 09:05:59 2020 after 00m 37s (OK)
Last GC: finished Mon Jan  6 08:07:25 2020 after 06m 48s >> recovered 2.59 KB (OK)
Last hfscheck: finished Mon Jan  6 09:00:58 2020 after 47m 41s >> checked 9903 of 9903 stripes (OK)
 
 
If (OK) does not appear beside each maintenance activity, that activity is in progress or has failed. Any failure in the most recent maintenance cycle causes Data Domain capacity usage to swell and must be resolved before proceeding. Maintenance activity failures have many possible causes, each with associated KB articles.
 
Note: If Avamar maintenance activities are failing because Data Domain capacity has reached 100% with 0 bytes available, see KB 477305.
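
The (OK) check can be applied to saved status.dpn output. This is a minimal sketch: the function name is hypothetical, and it assumes the three "Last ..." lines have the form shown in the example snippet, with the result in parentheses at the end of the line.

```shell
# Hypothetical helper: read status.dpn output on standard input and print any
# "Last checkpoint", "Last GC", or "Last hfscheck" line that does not end in (OK).
# Prints nothing when all three activities report (OK).
maintenance_not_ok() {
    grep -E '^Last (checkpoint|GC|hfscheck):' | grep -v '(OK)[[:space:]]*$'
}
```

Any line printed by this helper identifies an activity that is still running or has failed and needs attention before continuing.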
 
 
Step 4. Confirm Data Domain snapshots of Avamar mtree are expiring:

DDBoostUser@ddhostname# snapshot list mtree *
 
Example output:
Snapshot Summary
-------------------
Total:         6
Not expired:   2
Expired:       4
Snapshot Information for MTree: /data/col1/avamar-15XXXXXXXXX
----------------------------------------------
Name                Pre-Comp (GiB)   Create Date         Retain Until        Status
-----------------   --------------   -----------------   -----------------   -------
cp.20200217160108              0.0   Feb 17 2020 11:01   Feb 18 2020 11:04   expired
cp.20200217160515              0.0   Feb 17 2020 11:05   Feb 18 2020 11:05   expired
cp.20200218160037              0.0   Feb 18 2020 11:01   Feb 19 2020 11:05   expired
cp.20200218160445              0.0   Feb 18 2020 11:05   Feb 19 2020 11:06   expired
cp.20200219160045              0.0   Feb 19 2020 11:01
cp.20200219160552              0.0   Feb 19 2020 11:06
-----------------   --------------   -----------------   -----------------   -------
 
 
The Avamar mtree should have 2 snapshots from the most recent Avamar cycle and the remaining snapshots should be expired.
If there are more than two non-expired snapshots:
- If Data Domain was recently upgraded from v6.0 or IDPA was recently upgraded from v2.0, consult KB 502996.
- Otherwise, engage Data Domain support team for assistance.
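
The count of non-expired snapshots can be taken from saved snapshot list output. This is a minimal sketch: the function name is hypothetical, and it assumes that snapshot names begin with "cp.", that expired snapshots show "expired" as the last column, and that the output covers a single Avamar mtree, as in the example.

```shell
# Hypothetical helper: read "snapshot list mtree" output on standard input and
# print the number of cp.* snapshots whose last column is not "expired".
count_unexpired_snapshots() {
    awk '/^cp\./ && $NF != "expired" { n++ } END { print n + 0 }'
}
```

For the example output above, this prints 2. A value greater than 2 matches the condition described in this step.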
 
 
Step 5. Confirm Data Domain cleaning has run within the last 7 days:

DDBoostUser@ddhostname# filesys clean status
 
To check cleaning schedule:
DDBoostUser@ddhostname# filesys clean show schedule
Filesystem cleaning is scheduled to run "Tue" at "0600".
 
 
Note: No space will be reclaimed on Active tier disks until the following steps occur:
1. Backups are expired according to Avamar policy or are manually deleted
2. A full Avamar maintenance cycle runs which will mark backups for deletion and expire old snapshots on DD (occurs every morning by default)
3. Checkpoint backups with references to deleted/expired backup data are rolled off (explained in step 6)
4. Data Domain filesystem cleaning runs and reclaims space on disk (occurs every Tuesday by default)
- It is not advisable to regularly run Data Domain cleaning more than once a week, though cleaning can be run manually on occasion to reclaim space. See KB 470633 for more information.
 
 
Step 6. Consider checkpoint backups:

- On IDPA systems, checkpoint backups are enabled by default. They allow for a checkpoint restore in the very rare scenario where Avamar does not have a validated checkpoint and is unable to produce one.
- If checkpoint backups are enabled, capacity reclamation will be delayed an additional 7 days. This means that as backups are marked for deletion on Avamar, space will not be reclaimed until the Tuesday after next (7-14 days).
- If Data Domain capacity hits 100%, all backups and maintenance activities will fail. If the capacity situation is critical, and provided Avamar maintenance activities are succeeding, it is typically worth the minuscule risk to temporarily disable checkpoint backups and delete the existing ones to avoid the 7-day space reclamation delay. This will help buy time to reconsider the current Data Domain sizing, Avamar retention policies, registered clients, and datasets in use which contributed to the high capacity situation.
 
a) Determine if checkpoint backups are enabled:
admin@avamar:~/>: mccli dd show-prop --name=<DD hostname> | grep Target
Target For Avamar Checkpoint Backups          Yes
 
b) Checkpoint backups can be disabled by editing the Data Domain settings in either the Avamar Administrator or AUI.
c) For information on removing checkpoint backups, refer to KB 476252 (internal article).
 
 
Step 7. Identify changes in client count:

- When new clients are registered to Avamar and begin to back up, unique data is written during backups but old data is not yet expired and deleted. This is particularly important to note for monthly and yearly backups, which will result in capacity growth for the length of the retention period (for example, a 1-year retention for monthly/yearly backups).
 
root@avamar:~/#: find /usr/local/avamar/var/ddrmaintlogs/ -type f | sort -Vr | xargs grep -E "gc-finish:.*[0-9]+ client directories on Data Domain" | cut -d ":" -f 2,3,7
Example output:
Dec  4 08:00:     8 client directories on Data Domain
Dec  5 08:00:     8 client directories on Data Domain
Dec  6 08:00:     8 client directories on Data Domain
Dec  7 08:00:    37 client directories on Data Domain
Dec  8 08:00:    37 client directories on Data Domain
Dec  9 08:00:    51 client directories on Data Domain
Dec 10 08:00:    93 client directories on Data Domain
Dec 11 08:00:    93 client directories on Data Domain
 
There is one client directory on the Data Domain filesystem dedicated to each Avamar client that has at least one backup.
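
In a long log history, the days on which the client count changed can be isolated from the output of the command above. This is a minimal sketch: the function name is hypothetical, and it assumes each input line has the form shown in the example output (month, day, time, count).

```shell
# Hypothetical helper: read lines such as
#   "Dec  7 08:00:    37 client directories on Data Domain"
# on standard input and print one line for each day on which the client
# directory count differs from the previous line.
client_count_changes() {
    awk 'NR > 1 && ($4 + 0) != prev { printf "%s %s: %d -> %d\n", $1, $2, prev, $4 }
         { prev = $4 + 0 }'
}
```

For the example output above, this reports changes on Dec 7 (8 -> 37), Dec 9 (37 -> 51), and Dec 10 (51 -> 93). These dates can then be compared with the capacity growth observed in Step 1.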
 
 
Step 8. Utilize capacity.sh script:

a) Identify highest change rate clients:
admin@avamar:~/>: capacity.sh --ddr --gb --days=30
 
Flags:
- 'days' value can be edited to match common retention period. Note: If clients were registered within the "days" window, decrease the days value to prevent these new clients from being listed with disproportionately high 'NEW DATA'
- 'top' flag can be used to list more than 5 clients
See KB 336542 for additional capacity.sh information
 
b) Consider removing older backups for highest change rate clients (requires customer approval!!):
Deleting older backups for clients listed under the 'Top Change Rate Clients' section is the most effective means of recovering space on disk.
- Backups of highest change rate clients are best deleted from the Avamar Administrator or AUI.
 
The following additional options reclaim space more slowly than backup deletion:
- Reduce the retention period for high change rate clients.
Note: Changing the retention policy will only change the retention for new backups. Backups created before the retention policy change will still have the original retention period. See KB 333438 for information about changing expiration dates for existing backups in bulk with the modify-snapups tool.
- Migrate clients off of this grid to be backed up elsewhere, such as another Avamar/DD or IDPA system. 
- Review and modify Avamar datasets for highest change rate clients to exclude any data that does not need to be backed up.
 
c) Observe trends in daily post-comp data ingest:
- capacity.sh "DDR NEW" column lists the daily post-comp data ingest (disk space after deduplication and compression take place).
Note: Spikes in post-comp data ingest are caused by new client registration or increased change rate for existing clients.
 
 
Step 9. Consider removing the oldest backups on the system (requires customer approval!!):

- Older monthly and yearly backups contain significantly more unique data than recent daily backups, and are strong candidates for space reclamation. However, sometimes regulatory requirements mandate certain retention periods. 
 
a) Review KB 333438 for information on deleting backups created before a certain date with modify-snapups tool
- Example command syntax to delete backups older than 60 days is provided in Notes section below.
 
Flag info:
'--before' affects backups created X number of days ago.  
'--domain=/' can be used to affect backups in all client domains.
 
b) Note the number of backups to be affected. MODIFY means the backup will be deleted or expired. CONSERVE means the backup did not meet the flag criteria and will be unaffected by running the script.
Ex:
admin@avamar:~/>: grep -c MODIFY modifyoutput.txt
751
admin@avamar:~/>: grep -c CONSERVE modifyoutput.txt
1363
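
The two counts can be combined into a single figure to present for customer approval. This is a minimal sketch: the function name is hypothetical, and it assumes, as the grep -c commands above do, that each affected backup appears on one line containing either MODIFY or CONSERVE.

```shell
# Hypothetical helper: read the modify-snapups output on standard input and
# print how many backups would be modified, out of the total listed.
modify_summary() {
    awk '/MODIFY/   { m++ }
         /CONSERVE/ { c++ }
         END {
             t = m + c
             if (t > 0)
                 printf "%d of %d backups (%.1f%%) would be modified\n", m, t, 100 * m / t
             else
                 print "No MODIFY or CONSERVE lines found"
         }'
}
```

With the counts shown above (751 MODIFY, 1363 CONSERVE), modify_summary < modifyoutput.txt prints: 751 of 2114 backups (35.5%) would be modified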
 
c) Per modify-snapups KB, add executable permission to script, and run script.
 
 
Step 10. De-reference deleted backups and reclaim space on Active tier disks:

- After backups have been deleted, wait for Avamar maintenance cycle, checkpoint backup expiration (if applicable), and Data Domain filesystem cleaning to occur as scheduled, or initiate these activities manually.
- It is generally best to allow Avamar maintenance activities to run during the next maintenance window, and start Data Domain cleaning manually after Avamar maintenance activities complete.
 
Option A) Monitor Avamar maintenance activity status:
admin@avamar:~/>: status.dpn
 
Option B) Run Avamar maintenance activities manually:
1) Suspend maintenance scheduler:
admin@avamar:~/>: dpnctl stop maint
2) Run manual garbage collection:
admin@avamar:~/>: avmaint garbagecollect --ava --kill=20 --maxpass=0 --refcheck=true --throttlelevel=0 --usehistory=false --maxtime=7200
3) Monitor garbage collection status:
admin@avamar:~/>: watch avmaint gcstatus
4) Once garbage collection shows result="OK", take a checkpoint:
admin@avamar:~/>: avmaint checkpoint --ava  
5) Monitor checkpoint status and make note of checkpoint name:
admin@avamar:~/>: watch avmaint cpstatus
6) Once status="completed" and result="OK", run checkpoint validation (HFS check) on the manual checkpoint:
admin@avamar:~/>: avmaint hfscheck --rolling --checkpoint=cp.xxxxxxxxxxxxxx --ava
7) Once the prompt returns, monitor HFS check status:
admin@avamar:~/>: watch avmaint hfscheckstatus
8) Once status="completed" and result="OK", take an additional checkpoint:
admin@avamar:~/>: avmaint checkpoint --ava  
9) Monitor checkpoint status:
admin@avamar:~/>: watch avmaint cpstatus
10) Once status="completed" and result="OK", resume the maintenance scheduler:
admin@avamar:~/>: dpnctl start maint
11) List checkpoints on Avamar:
admin@avamar:~/>: cplist
cp.20200224122949 Mon Feb 24 07:29:49 2020   valid --- ---  nodes   1/1 stripes     50
cp.20200224230843 Mon Feb 24 18:08:43 2020   valid rol ---  nodes   1/1 stripes     50
12) Confirm that all snapshots on Data Domain are expired except the two that correspond to the new Avamar checkpoints:
DDBoostUser@ddhostname# snapshot list mtree *
 
Example output:
Snapshot Information for MTree: /data/col1/avamar-xxxxxxxxxx
----------------------------------------------
Name                Pre-Comp (GiB)   Create Date         Retain Until        Status
-----------------   --------------   -----------------   -----------------   -------
cp.20200221140020          35030.3   Feb 21 2020 09:00   Feb 24 2020 07:27   expired
cp.20200221140258          35030.3   Feb 21 2020 09:03   Feb 24 2020 07:27   expired
cp.20200222001825          35030.3   Feb 21 2020 19:18   Feb 24 2020 18:25   expired
cp.20200224122316          35030.3   Feb 24 2020 07:23   Feb 24 2020 18:25   expired
cp.20200224122949          35030.3   Feb 24 2020 07:30
cp.20200224230843          30918.9   Feb 24 2020 18:09
-----------------   --------------   -----------------   -----------------   -------
 
 
 
Review step 6 for information on checkpoint backups (if applicable).
 
Start Data Domain filesystem cleaning:
DDBoostUser@ddhostname# filesys clean start