Implementing Lazy Garbage Collection

Last modified: October 01, 2009

Applies to: SharePoint Foundation 2010

When no item in SharePoint Foundation (an active document, an old version, or a recycle bin document) still references a BLOB file in the external BLOB store, that file is orphaned and you can delete it. To reclaim the space, run garbage collection on orphaned BLOBs whenever the proportion of orphaned files in the external BLOB store exceeds practical limits for your application.

Follow these steps to implement garbage collection for the external BLOB store when you use the EBS Provider. Several steps include example code that illustrates a specific task. The code examples are simplified and make the following assumptions:

  • External BLOB files for a given site are stored in a single directory.

  • The list of BLOB files in that directory fits into an in-memory hash table.

  • No new BLOBs are created while garbage collection is running.

Although these assumptions might not apply to your site, you should be able to adapt the examples to your special conditions.
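If you cannot guarantee the third assumption, one possible adaptation is to consider only files older than a grace period, so that BLOBs written shortly before or during collection are never deletion candidates. The following is a minimal sketch under that assumption. The `GcCandidates` class, the `OlderThan` method, and the choice of cutoff are hypothetical and are not part of the EBS Provider interface.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class GcCandidates
{
    // Hypothetical helper: returns the files in dirName that were last written
    // before cutoffUtc, keyed by file name. Newer files are skipped entirely.
    public static Dictionary<string, FileInfo> OlderThan(string dirName, DateTime cutoffUtc)
    {
        // Assumption: file names are compared case-insensitively, as on NTFS.
        var candidates = new Dictionary<string, FileInfo>(StringComparer.OrdinalIgnoreCase);
        foreach (FileInfo file in new DirectoryInfo(dirName).GetFiles())
        {
            if (file.LastWriteTimeUtc < cutoffUtc)
            {
                candidates[file.Name] = file;
            }
        }
        return candidates;
    }
}
```

For example, `GcCandidates.OlderThan(dirName, DateTime.UtcNow.AddHours(-24))` would restrict collection to files at least a day old. The 24-hour window is an arbitrary illustration; choose a period longer than any upload could plausibly take in your environment.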

Important

You should follow these steps in the sequence presented; not doing so can cause problematic race conditions.

To implement garbage collection for the external BLOB store

  1. Enumerate all BLOB files in the EBS Provider namespace that corresponds to a given SPSite identifier. Add these BLOB files to a hash table.

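    // Index every BLOB file in the site's directory by file name.
    // Utility.DirFromSiteId is a helper that you supply.
    Hashtable ht = new Hashtable();
    String dirName = Utility.DirFromSiteId(site.ID);
    // DirectoryInfo.GetFiles returns FileInfo[]; Directory.GetFiles returns only path strings.
    FileInfo[] files = new DirectoryInfo(dirName).GetFiles();
    foreach (FileInfo file in files)
    {
        ht.Add(file.Name, file);
    }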
    
  2. Locate all documents in the content database that corresponds to the SPSite identifier. Remove these entries from the hash table.

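    // Remove every BLOB that the content database still references.
    // Utility.FileFromBlobid is a helper that you supply.
    foreach (SPExternalBinaryId blobid in site.ExternalBinaryIds)
    {
        String fileName = Utility.FileFromBlobid(blobid);
        if (ht.Contains(fileName))
        {
            ht.Remove(fileName);
        }
    }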
    
  3. Entries that remain in the hash table are files in the external BLOB store that are no longer referenced by any document in the content database. These are orphaned files, and you can delete them.

    // Everything still in the hash table is orphaned; delete it from the store.
    foreach (FileInfo file in ht.Values)
    {
        file.Delete();
    }
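The three steps can also be combined into one routine. The following is a minimal sketch that uses only the file system, so that you can try the logic outside SharePoint. The `OrphanSweep` class and the `DeleteOrphans` method are hypothetical, and the `referencedFileNames` parameter stands in for the file names that you would derive from `site.ExternalBinaryIds`.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class OrphanSweep
{
    // Hypothetical routine: deletes every file in dirName whose name is not in
    // referencedFileNames, and returns the names of the deleted files.
    public static List<string> DeleteOrphans(string dirName, IEnumerable<string> referencedFileNames)
    {
        // Step 1: index every file in the directory by name.
        // Assumption: file names are compared case-insensitively, as on NTFS.
        var candidates = new Dictionary<string, FileInfo>(StringComparer.OrdinalIgnoreCase);
        foreach (FileInfo file in new DirectoryInfo(dirName).GetFiles())
        {
            candidates[file.Name] = file;
        }

        // Step 2: drop every file that is still referenced.
        // Dictionary.Remove does nothing if the key is absent.
        foreach (string name in referencedFileNames)
        {
            candidates.Remove(name);
        }

        // Step 3: whatever remains is orphaned; delete it.
        var deleted = new List<string>();
        foreach (FileInfo file in candidates.Values)
        {
            file.Delete();
            deleted.Add(file.Name);
        }
        return deleted;
    }
}
```

Returning the deleted names lets the caller log what was removed or compare the count against the size of the store. Like the examples above, this sketch assumes that no new BLOBs are created while it runs.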
    