Dec 6, 2009

Raising the Bar in Testing


It is important for NSS Labs to periodically raise the bar as the industry advances, both offensively and defensively. In other words, we increase the difficulty of the test to match the needs (as driven by cyber criminals' innovative new attacks), and the capabilities of vendors who have developed new approaches to countering the threats.

This is what NSS Labs has been doing over the past 2 years in particular: raising the bar. As such, our reports and the scores vendors receive on them deserve to be put into context. We perform a number of different types of tests over the past few years:

1. Product Certification - a full review of the whole product; including security effectiveness, performance, manageability and stability.

2. Group Tests - comparative testing of a class of products from leading vendors. Due to the volume of vendors, the depth of the testing may be abbreviated and focused. The browser security, endpoint protection, and Network IPS tests are great examples of these.

3. Security Update Monitor (SUM) Tests - unique to the industry, NSS Labs has been testing IPS products on a monthly basis. Every quarter, we tally the average scores and bestow awards. The attacks in this test set are focused on vulnerability disclosures made the previous month. As such, the volume of new additions is generally in the couple dozen range. Overall, there are currently 300 entries.

4. Individual Product Reports - these are brand new detailed reports that were created during group testing. They capture the nitty gritty details that are rolled up into the group test.

5. Exposure Reports - An industry and NSS first. These reports list specific vulnerabilities that are NOT shielded by an IPS. This information is critical in helping organizations knowing where their protection is and is not. Contact us for confidential demonstration. (Knowing can help identify appropriate mitigations, such as writing custom signatures, implementing new rules, resegmenting the network, or ultimately switching or adding a security product.)

By the Numbers:
Exploit selection and validation is a serious matter. Our engineers take care to identify the greatest risks to enterprises, based on commonly deployed systems, utilizing a modified CVSS.


Certifications performed in 2008 and 2009 tested 600 vulnerabilities. Compare this with other tests, e.g. ICSA IPS tests only 120 vulnerabilities (all server side). Our 600 is more than 480 better, because vendors did not know what they were, thus preventing the 'gaming of the test'.

SUM testing generally adds 20-30 vulnerabilities per month. Relatively small, directly reflective of popular attacks.

Our latest group test utilized 1,159 live exploits. More than 2x the number we used previously for certifications (and nearly 10x more than the next lab).

NSS Labs Tests are Harder Than The Other Guys'...
Difficult, real-world tests are an important part of raising the bar. Marketers like to have big numbers, and when it comes to scores, 100% is the target. Unfortunately, that's not the reality of information security products. We at NSS do not expect any product to catch 100% of the attacks in any of our tests. If they do, we probably are not working hard enough (or the bad guys gave up and went home - unlikely). There are more threats than can be protected against, and depending on the vendor, the margin can be acceptable, or pretty significant.

Take the difficulty of the test into consideration when comparing products. The lower the bar, the easier it is to score 100%. And the less meaningful it is. A 70% score on an NSS Labs Group Test (1,159 exploits) is still 6.7 times more validation than a 100% of 120 exploits in an ICSA Labs(r) test. And a 95% score on an NSS certification is 4.9 times more.

When we at NSS Labs raised the bar on this Q4 2009 IPS Group Test, we really cranked it up. So, if you're wondering why a vendor who previously scored in the 90's is scoring lower on the group test, it's not necessarily because they are slacking off. In most cases it is quite the opposite. Thus, one should be sure to compare products within the test set and methodology. Ergo, the 80% product definitely bested the 17% product by a wide berth. Be very cautious of the latter. And reward those vendors who submitted to this rigorous testing in the first place.