Records Management features in SharePoint 2010: Part 12 – Scalability and Performance

Overview

  • Think scalability and performance first!
    • Invest time performing an in-depth current content inventory of Records (are we looking at terabytes?)
    • What are your growth estimates for the size of your electronic record inventories?
  • Large collections of Records require careful planning of the number and location of content databases, site collections, sites and document libraries in relation to the file plan
  • Bottom Line: Invest time in planning the SharePoint Logical Architecture for your Records Center
  • Some more thoughts:
    • Try to limit the size of each content database to between 50 GB and 100 GB
    • For very large archives of records, organize your Records Center repositories as independent site collections rather than sub-sites and document libraries (consider making a site collection per category in your file plan)
    • Consider Remote BLOB Storage (RBS)
    • Think through your backup, restore and disaster recovery strategy
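
The inventory and growth questions above come down to simple arithmetic. The following is a minimal planning sketch, not a SharePoint API: the function names, the 2,000 GB starting inventory and the 25% annual growth rate are hypothetical values chosen for illustration, and the 100 GB target is the upper end of the range suggested above.

```python
import math


def project_size_gb(current_gb, annual_growth_rate, years):
    """Project the archive size with compound annual growth."""
    return current_gb * (1 + annual_growth_rate) ** years


def databases_needed(total_gb, target_db_gb=100):
    """Number of content databases needed to keep each one at or below the target size."""
    return math.ceil(total_gb / target_db_gb)


if __name__ == "__main__":
    current_gb = 2000      # hypothetical current record inventory
    growth_rate = 0.25     # hypothetical growth estimate (25% per year)
    for year in range(6):
        size = project_size_gb(current_gb, growth_rate, year)
        print(f"year {year}: {size:,.0f} GB -> {databases_needed(size)} content databases")
```

Running the projection for several years gives a rough idea of how many content databases (and therefore site collections) the logical architecture needs to accommodate before you build it.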

 

How SharePoint 2010 can help with scaling [1]

SharePoint 2010 has many features that make it easier to scale to massive archives, such as:

  • Database query performance optimizations [2]
  • SQL Server 2008's Remote BLOB Storage (RBS), which decreases the size of content databases [3]
    • RBS moves binary data out of your content databases and onto the file system, leaving only the metadata in the database. This reduces database size and improves scalability and performance
  • Internal timer job processing improvements
  • Highly scalable search along with new database indexing strategies [4]
    • Compound indexing, index management, and content-by-query optimizations
    • SharePoint now supports multiple index servers
    • Content index can now be divided into multiple index partitions
    • Each index server can be configured to run multiple crawlers
    • Multiple crawlers can crawl content in parallel
    • Index servers are now stateless. The crawlers build the content index and propagate it directly to the query servers
    • Multiple query servers can be deployed for redundancy and parallel query performance
    • Crawl management and property store data tables have been split into separate databases, and multiple databases of each kind can be configured
  • List optimizations
    • Tens of millions of docs in a single list
  • Service Applications Architecture
  • New Send to connections allow moving of records instead of just copying
  • Multiple Records Center Site Collections
  • Internal database improvements (e.g. lock ordering, throttling, IOPS efficiency)
  • Background per-item processing throughput maximization
  • Content Organizer is able to organize your Records Repositories
  • Content Type Syndication provides a central location from which content types are published and inherited

 

This allows:

  • Millions of records in a single Records Center
  • Multiple Records Centers! (new in SharePoint 2010; in 2007 you were only allowed one)
  • A distributed archive in which many Records Centers bind together to act as one logical repository
  • Fast searching through your archives of records
  • An easy mechanism to move records to the archive of your choice and leave a reference to their new location

But this does not excuse you from planning your architecture for scalability and performance!!
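
As one way to approach that planning, the sketch below groups file plan categories into Records Center repositories so that each stays under a target size. It is a planning aid only and does not call any SharePoint API: the function name, the category names and their sizes are hypothetical, and the grouping uses a simple first-fit-decreasing heuristic rather than anything SharePoint does itself.

```python
def assign_categories(category_sizes_gb, target_gb=100):
    """Group file plan categories into repositories using first-fit decreasing.

    Returns a list of dicts, each with the category names placed in that
    repository and their combined size. A category larger than the target
    gets a repository of its own and should be reviewed manually.
    """
    repositories = []
    # Place the largest categories first so the smaller ones fill the gaps.
    ordered = sorted(category_sizes_gb.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in ordered:
        for repo in repositories:
            if repo["size_gb"] + size <= target_gb:
                repo["categories"].append(name)
                repo["size_gb"] += size
                break
        else:
            # No existing repository has room, so start a new one.
            repositories.append({"categories": [name], "size_gb": size})
    return repositories


if __name__ == "__main__":
    # Hypothetical file plan categories and estimated sizes in GB.
    file_plan = {"HR": 60, "Finance": 50, "Legal": 40, "Contracts": 30}
    for i, repo in enumerate(assign_categories(file_plan), start=1):
        print(f"repository {i}: {repo['categories']} ({repo['size_gb']} GB)")
```

Each resulting group could map to its own site collection and content database, in line with the earlier suggestion of a site collection per file plan category.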

Visit Microsoft's TechNet article "SharePoint Server 2010 capacity management: Software boundaries and limits" [5] to see more of SharePoint 2010's boundaries, recommendations and thresholds that affect scaling, capacity and performance for your Records Management solution. I have listed some of them here:

 

| Limit | Threshold or maximum |
|---|---|
| Zone | 5 per Web application |
| Managed path | 20 per Web application |
| Solution cache size | 300 MB per Web application |
| Site collection | 250,000 per Web application |
| Application pools | 10 per Web server |
| Content database size (general usage scenarios) | 200 GB per content database |
| Content database size (all usage scenarios) | 4 TB per content database |
| Content database size (document archive scenario) | No explicit content database limit |
| Content database items | 60 million items, including documents and list items |
| Site collections per content database | 2,000 recommended, 5,000 maximum |
| Remote BLOB Storage (RBS) storage subsystem on Network Attached Storage (NAS) | Time to first byte of any response from the NAS cannot exceed 20 milliseconds |
| Web site | 250,000 per site collection |
| Site collection size | Maximum size of the content database |
| List row size | 8,000 bytes per row |
| File size | 2 GB |
| Documents | 30,000,000 per library |
| Major versions | 400,000 maximum |
| Items | 30,000,000 per list |
| Row size limit | 6 table rows internal to the database used for a list or library item |
| Bulk operations | 100 items per bulk operation |
| List view lookup threshold | 8 join operations per query |
| List view threshold | 5,000 maximum |
| List view threshold for auditors and administrators | 20,000 maximum |
| Subsite | 2,000 per site view |
| Coauthoring in Microsoft Word and Microsoft PowerPoint for .docx, .pptx and .ppsx files | 10 concurrent editors per document |
| Security scope | 1,000 per list |
| Web parts | 25 per wiki or Web part page |
| Number of SharePoint groups a user can belong to | 5,000 |
| Users in a site collection | 2 million per site collection |
| Active Directory Principals/Users in a SharePoint group | 5,000 per SharePoint group |
| SharePoint groups | 10,000 per site collection |
| Security principal: size of the Security Scope | 5,000 per Access Control List (ACL) |
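
A few of the limits listed above can be written down as data and used to sanity-check a planned Records Center design before it is built. This is a minimal sketch under stated assumptions: the dictionary keys, the `check_plan` function and the sample plan are hypothetical names invented for illustration, and only the numeric values come from the list above.

```python
# Selected limits from the list above; the key names are made up for this sketch.
LIMITS = {
    "content_db_size_gb": 200,                # general usage scenarios
    "items_per_content_db": 60_000_000,
    "site_collections_per_content_db": 5_000,
    "documents_per_library": 30_000_000,
    "security_scopes_per_list": 1_000,
}


def check_plan(plan, limits=LIMITS):
    """Return (name, planned value, limit) for every planned value over its limit."""
    return [
        (name, value, limits[name])
        for name, value in plan.items()
        if name in limits and value > limits[name]
    ]


if __name__ == "__main__":
    # Hypothetical plan for a single content database in a Records Center.
    plan = {
        "content_db_size_gb": 350,
        "items_per_content_db": 45_000_000,
        "documents_per_library": 32_000_000,
    }
    for name, value, limit in check_plan(plan):
        print(f"{name}: planned {value:,} exceeds limit {limit:,}")
```

A check like this is only as good as the numbers fed into it, so always confirm the current values in the TechNet article before relying on them.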

 
 

 

The complete list of posts in this series:

1. Introduction
2. Document IDs
3. Managed Metadata Service (Term Store)
4. In-Place Records Declarations
5. Site Collection Auditing
6. Content Organizer
7. Compliance Details
8. Hold and eDiscovery
9. Content Type Publishing Hubs
10. Multi-Level Retention
11. Virtual folders and metadata based navigation
12. Scaling
13. Send To…
14. Document Sets

 

 

[1] http://blogs.msdn.com/b/ecm/archive/2010/02/13/introducing-records-management-in-sharepoint-2010.aspx

[2] http://blogs.msdn.com/b/enterprisesearch/archive/2010/06/09/sharepoint-2010-search-dogfood-part-3-query-performance-optimization.aspx

[3] http://technet.microsoft.com/en-us/library/ee748607.aspx

[4] http://www.houberg.com/2009/10/sp2010_scalability_2_of_4_sharepoint_search/

[5] http://technet.microsoft.com/en-us/library/cc262787.aspx
