Tuesday, 30 April 2013

Tuning SCCM 2007 & 2012 Site-to-Site Replication with Thread Settings.

For the longest time I listened to people referring to SMS\SCCM as 'slow management server', and for a while I agreed, that is until I discovered the settings that I believe can dramatically improve the time it takes to replicate configuration changes throughout a multi-site hierarchy. I have looked around on the web and spoken with a Microsoft PFE, and there does not appear to be any publicly available guidance from Microsoft on best practices for tuning site and software distribution thread settings. The approach I outline below was discovered more through trial and observation than anything else, so your mileage may vary if applied to your environment.

To further illustrate my point I have added the below graphs and related notes:

Fig 1. Multiple concurrent active package distributions and metadata replication backlogging, indicating possible thread misconfiguration.
          Top - Schedule.box\outboxes\lan (Site-to-site metadata)
          Bottom - Distribution Manager.box\Incoming (Package distribution)

Notice in Fig 1 the relationship between the number of concurrent package distributions (below) and the minor backlogging in the Schedule.box\outboxes\lan inbox (above). Once the active package distributions completed, the backlog would clear almost immediately. This was because software package distributions were consuming all available threads for a child site, causing normal site-to-site metadata replication to backlog. Ideally, with all thread settings tuned correctly, both package distributions and site-to-site metadata changes will flow concurrently, with neither adversely impacting the other.

Fig 2. The results of tuning the thread settings; note the sharp drop in the top graph immediately after the change.
          Top - Schedule.box\outboxes\lan (Site-to-site metadata)
          Bottom - Distribution Manager.box\Incoming (Package distribution)

Notice in Fig 2 that as soon as the four configurable thread settings were tuned, we immediately saw a noticeable reduction in metadata backlogging, which improved the efficiency of site-to-site communication and the reliability of package distribution. The graphs above represent a second-tier primary site with over a dozen child primary and secondary sites supporting 25k clients under normal load conditions. Historically, the expectation was that configuration changes could take hours or days to replicate to the lowest site tier (5 levels); after tuning these settings, the average is now 15 minutes or less for a change made at the central site to fully replicate to all sites globally. This has also had a positive impact on the upstream replication of client 'state' and site 'status' messages.

Standard Sender Properties:

  • 'All Sites': number of directly connected child sites multiplied by 10 threads.
  • 'Per Site': 10 threads.
Note: The above assumes three directly connected child sites; the sketch below shows the arithmetic.
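
As a quick illustration of that arithmetic, here is a minimal Python sketch. The function name and variables are illustrative only; in practice these numbers are typed into the Standard Sender properties dialog, they are not set through any API.

PER_SITE_THREADS = 10  # suggested 'Per Site' maximum concurrent sendings

def standard_sender_settings(direct_child_sites):
    """Return the suggested 'All Sites' and 'Per Site' sender thread values."""
    return {
        "All Sites": direct_child_sites * PER_SITE_THREADS,
        "Per Site": PER_SITE_THREADS,
    }

# Three directly connected child sites, as assumed in the note above:
print(standard_sender_settings(3))  # {'All Sites': 30, 'Per Site': 10}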

Software Distribution Properties:

  • 'Max number of packages' multiplied by 'Max threads per package' = 'Per Site' - 2

For example: 4 maximum concurrent package distributions multiplied by 2 threads per package means software distribution (packages) is limited to a maximum of 8 threads, always leaving 2 spare threads for site-to-site replication.

The above settings leave 2 spare threads per site, so site-to-site configuration\metadata can always flow and is never blocked by active package distributions. The sketch below makes that check explicit.
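
This is a minimal Python sketch (illustrative only; the real values are set in the Software Distribution component properties, not through code) that checks whether a proposed configuration leaves the two spare threads per site described above:

def package_threads_ok(max_packages, threads_per_package, per_site_threads=10):
    """True if package distribution leaves at least 2 sender threads free per site."""
    return max_packages * threads_per_package <= per_site_threads - 2

# The worked example from the text: 4 packages x 2 threads = 8 of 10 per-site threads.
print(package_threads_ok(4, 2))  # True  (2 threads left for metadata replication)
print(package_threads_ok(5, 2))  # False (metadata replication would be starved)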

Key Points:

  • Rate limiting on addresses should be avoided, as its use results in only 1 thread being available for site-to-site metadata replication and package content transfers. Wherever possible, rely on other networking technologies such as QoS or Riverbed appliances to manage WAN link utilization.
  • Sender 'Maximum Concurrent Sendings' thread settings should be set based on the number of directly connected child sites and reviewed periodically.
  • Sender thread settings per site should exceed what is configured for package distribution by at least 2 threads, so there is always headroom for site-to-site configuration metadata replication.
  • As a result of the above tuning, issues or abnormal inbox traffic trends are much easier to identify.

Disclaimer: The above has been tested in a 60k production SCCM 2007 environment. ConfigMgr 2012 has the same configurable settings, so I assume the same principles can be applied.

1 comment:

  1. Hey Ben,

    Do DPs count as directly connected child sites? Or, is that designation only for CASes, primaries, and secondaries?