Indicators for Universal Design of Web solutions
As a supervisory authority, Difi supervises compliance with the regulations on universal design of ICT. For this we need methods for evaluation. This article provides an account of the Authority's methods and indicators for universal design. We also give a brief overview of the data model for the analysis of test data/test results, and the method for selection of test pages and web solutions for verification and area surveillance.
Digitalisation lays the foundation for growth and interaction. The use of ICT can make everyday activities easier for everyone. But the transition to digital everyday life is demanding for many of us. Exclusion from digital social arenas may have major consequences. For example, failure to reach all the users may have major repercussions for companies in the private and public sector, associations and organisations.
Universal design is a priority in the Government's ICT policy. The Government's action plan for universal design 2015-2019 builds on a vision of a society in which everyone can participate. The focus areas are ICT and welfare technology.
The aim is that as many people as possible - regardless of their abilities - must be able to use the same digital solutions, without special adaptation or special solutions. Universal design is therefore a smart approach to digitalisation. The more users and customers use digital solutions to receive information and perform self-service, the greater the gain.
The Agency for Public Management and eGovernment, Difi, supervises public and private companies’ compliance with the Norwegian regulations on universal design. For this we need methods to measure/verify to what degree individual solutions are in line with the regulations. The methods for measuring universal design are used for both supervision and area surveillance. They can also be used by companies that wish to evaluate their own solutions or purchase new ones.
In this article we will provide an account of the following:
- the Authority’s method for verification of web solutions and indicators for measurement of universal design of ICT for different industry groups and sectors
- an overview of the data model for the analysis of test data/test results, together with metadata and other information sources.
- a method for selection of test pages and web solutions for verification and status measurements
The project, whose purpose has been to develop the Authority’s indicators for universal design of web solutions, was started in 2016 and will run until the end of 2017. The project has been managed by Brynhild Runa Sterri and Dagfinn Rømen. The project group is interdisciplinary and consists of technologists Dagfinn Rømen and Geir Sindre Fossøy, lawyer Anne Marie Colban and economists Espen Tjøstolvsen and Brynhild Runa Sterri.
This article is written by Brynhild Runa Sterri and quality assured by Dagfinn Rømen. The article will be extended with the following as soon as they are completed:
- documentation of the Authority’s interpretation of WCAG, with associated indicators
- overview of the sources that form the basis of each indicator (techniques/failures)
- list of metadata for the indicator set
The standard that is incorporated into the Norwegian regulations for the design of web solutions, WCAG 2.0, has a hierarchical structure with four principles followed by 12 guidelines linked to the overall principles.
The four principles of WCAG 2.0 are:
- Perceivable: Information and user interface components must be presentable to users in ways they can perceive.
- Operable: User interface components and navigation must be operable.
- Understandable: Information and the operation of user interface must be understandable.
- Robust: Content must be robust enough that it can be interpreted reliably by a wide variety of user agents, including assistive technologies.
Each of the guidelines encompasses several success criteria formulated as technology-independent statements. The success criteria are designed in such a manner that they can be verified. The 61 success criteria (individual requirements) are classified into three levels: A, AA and AAA. The Regulation on universal design of ICT stipulates that web solutions in the private and public sector, targeting the Norwegian general public, should be designed in accordance with the success criteria at level A and AA, with some exceptions. All in all, 35 of the 61 success criteria in WCAG 2.0 are mandatory according to the regulation.
WCAG 2.0 also refers to a range of recommendations and techniques for how to make the contents of websites more accessible. If we adhere to these recommendations, we can make the content available to individuals with disabilities, such as blindness, impaired vision, deafness and impaired hearing, reduced cognitive functional ability and reduced motor skills. Just as importantly, when the guidelines in the standard are adhered to, the web solutions become easier to use for everyone.
The Authority’s interpretation of every single success criterion/requirement in WCAG 2.0, with associated methods for testing/verification, will soon be documented and included in an extended version of this report. The choice of test objects/test pages, test procedures and basis for evaluation of compliance/non-compliance is based on the interpretation. The qualitative and quantitative test results, which are the core of the Authority’s indicators for universal design, are also based on this.
The Authority uses indicators to measure universal design. The word indicator derives from the verb "to indicate" which means point out or show. Indicators are used to gather information about conditions that are not directly measurable, or that are too costly or complex to measure directly. By simplifying complex conditions, an indicator will provide a clear signal about a condition or a change of condition. We often want an indicator to specify or reveal a phenomenon or a condition, using numbers or other measurable quantities.
It is not possible to tell directly from an ICT solution whether it has been universally designed. We therefore require indicators that provide indications about, or specify to what degree, the requirements for universal design have been complied with. When measuring the universal design of ICT solutions, we also look into whether there is a particular type of solution, type of content or functionality that stands out in terms of compliance/non-compliance with the regulations. We are also interested in measuring the results of universal design of ICT within different industries and sectors of society. It is also important to point out what consequences lack of universal design might have for users with different prerequisites. These perspectives are safeguarded by the Authority’s indicators for universal design.
The prerequisites for the work are to meet the needs for:
- Validity: methods of measurement with a high degree of validity and directly based on a documented interpretation of each single success criterion.
- Reliability: methods of measurement and test data must be designed in such a manner that reliable test results are achieved through the tests.
A high degree of validity and reliability is therefore the core of the Authority’s work with indicators. By documenting the work, we will facilitate openness and make the method easy to use for any authority, supplier of web solutions or company that is specifying or evaluating its own ICT solution.
The needs to be met by the full set of indicators are:
- By using the indicators, the requirements are interpreted and described in a manner that makes them measurable.
- By using the indicators, we receive test or measurement results for each individual requirement
- By collating the information or the test results from several or all the indicators, we get information about to what extent a web solution in general is designed in accordance with the regulation.
- Test results for individual solutions can be presented as qualitative information about the solutions and as statistics that provide information about all the ICT solutions tested in a measurement/area surveillance.
- Statistics generated from the test are also analysed together with other data from the ICT solutions and the companies utilising the solutions in contact with customers and the public.
The indicators must as far as possible provide the same result regardless of who performs the tests. It must be apparent what type of content, functionality and other properties are to be evaluated by each individual indicator. It must be apparent what is required to fulfil each individual requirement. The measurement methods can therefore be used for both supervision and area surveillance.
4.1 About the indicators
We currently have a total of 50 individual indicators to measure compliance/non-compliance (breach) with the 35 success criteria in WCAG 2.0 that are mandatory according to the regulation. The set of indicators must be adjusted where required, for example when the EU web directive is implemented in the Norwegian regulations. The indicators will also be adjusted to apply to testing mobile applications.
The indicators are prepared in an Excel spreadsheet, with one Excel workbook for each indicator. The approach is based on a risk assessment. We considered that the work to interpret WCAG, develop test procedures, methods to manage the qualitative and quantitative test results and facilitation of a holistic data model, should not take place in parallel with the development of a new tool to safeguard all these requirements. The work so far has provided a good basis for further development of the information contained in the Excel spreadsheets, the data model and the associated documentation, in a more coherent and accessible format.
The Authority’s set of indicators for measurement of universal design for individual solutions, is largely based on manual testing by experts on a selection of web pages in one website. Some indicators are also based on semi-automated tests. We aim to automate as much as possible of the test procedures, with the prerequisite that automated testing is based on a documented and agreed interpretation of WCAG 2.0, and additionally generates qualitative and quantitative test data suitable for evaluation for the purpose of both audits and analysis.
The work is interdisciplinary:
- The standard WCAG 2.0 forms part of the Norwegian regulations, and the interpretation of success criteria/requirements assumes legal expertise.
- To understand the technical content of the requirements and prepare test procedures, we require technological expertise.
- To ensure that the measurement methods reflect the Authority’s interpretation of the requirements, and that they are documented so that tests and measurements generate reliable and relevant qualitative and quantitative data, we require analytical expertise.
4.2 The elements that form part of the Authority’s set of indicators for web solutions
The work can be related to the following W3C documents concerning testing of web solutions against the requirements in WCAG 2.0:
- Accessibility Conformance Testing Framework Requirements (ACT)
- Website Accessibility Conformance Evaluation Methodology (WCAG-EM) 1.0
- Evaluation and Report Language (EARL) 1.0 Schema
The Authority’s test methods, the format of the test result and method of selection of web solutions and web pages/types of content have been devised to meet the needs of both individual supervisions/checks and evaluation of an overall status. Therefore, not all the elements outlined in the above documents are covered. The Authority’s need to quantify the test results and connect them to metadata for further analysis is another reason why the methods include elements other than those shown in the documents above.
The elements in the method are:
- Interpretation of the success criteria in WCAG with emphasis on identifying what content the requirement applies to, and what is required to ensure that a solution is designed in accordance with the requirements.
- Description of step-by-step test procedures and a standardised template for registration of test data.
- Standardised and pre-defined results/test results including an evaluation of what result indicates compliance/non-compliance (and any other test results).
- A graded scale that quantifies test results so that they are suitable for aggregation, comparison/benchmarking and other analysis.
- A data model for indicators, test result and metadata. A connection to metadata, and other data sources, so that the test results can be analysed against different industry groups and indicate which use situations/user groups particularly experience digital barriers etc.
In the following, we present an explanation of the different steps and elements of the method.
4.3 Interpretation of success criteria
The requirements of the regulations that have been put into operation by the indicators, are varied and complex and cover everything from detailed technical coding and structural requirements, to more judgement-based requirements for formulation of links, error messages, labels and headings.
The Authority’s interpretation of success criteria is based on:
- the wording of every individual requirement/success criterion
- articles presented by WCAG, giving information about how we are to understand and comply with the requirements
- advisory documents (techniques) providing the basis for defining what criteria must be met in order for a web solution to be regarded as designed in accordance with WCAG 2.0.
Besides the wording of the success criteria, the articles connected to each of the success criteria with information about how to understand the requirement, are also an important source. These articles explain in detail the purpose of the requirements, list what specific benefits the requirement is intended to safeguard, and provide examples of solutions which are compliant or in breach of the success criterion. They also contain a list of techniques linked to the success criterion. The techniques provide detailed information about the requirements and include tips for test methods. Information is also provided regarding the expected result of a test, and therefore provides important input on how to define what is required to comply with the requirement.
Part of the interpretation is to define what type of content, functionality and structural elements the success criterion is relevant for. This follows from the success criterion, the articles for understanding it and the techniques. Nevertheless, it is necessary to clarify in the test procedures what type of test object it is appropriate to verify.
For example, success criterion 1.3.1 requires that information, structures and relationships that can be perceived visually should also be described in the code. For this success criterion we have prepared a total of three indicators, for headings, tables and lists respectively. 1.3.1 also stipulates requirements for coding of other types of content, for example form elements. Requirements for coding of form elements are also detailed in success criterion 4.1.2. By linking the coding of the form elements to 4.1.2, we achieve a more holistic approach to the test of the form elements. With indicators to test form elements linked to both 1.3.1 and 4.1.2, we would end up with more individual indicators and a more complex test situation.
The Authority has based an interpretation of each individual success criterion on an overall assessment of the sources together with an evaluation of what is expedient and possible to test. Clarifications and any limitations of the success criteria/requirements are documented. These can be perceived directly from the formulation of WCAG 2.0 with its associated documents and are also based on evaluations carried out by the Authority.
4.4 Test method and registration
The test methods are described at a level that ensures that the Authority’s tests/verifications can be verified and accommodated. The type of content and functionality targeted by each of the indicators has been specified, based on an interpretation of the requirement and whether the test object is an individual item or a complete webpage. Different testers should easily be able to identify the relevant content and functionality, carry out the same tests, register the same type of data through the test and produce consistent and reliable test results.
Step-by-step methods of testing will make the actual testing more efficient because the individual tester is guided through the entire test procedure. The steps in the test procedure are also formulated with a view to the fact that the tester should have to register as little data as possible during the test. Therefore, to a large degree, the steps are formulated as Yes/No questions.
As an example, we use success criterion 2.4.6, which stipulates that the headings and labels must describe topic or purpose. The questions the tester must consider are:
- Does the website have visual content that is perceived as a heading?
- Does the heading describe the subject or purpose of the content?
The tester considers each question, step by step, and registers a yes or no for each question.
The tester is required to register some additional information, such as identification of the test object. This could be, for example, (as in success criterion 2.4.6) to note down some key words about the topic or purpose of the content that belongs to a given heading. The type of web browser used for the test and the test page URL will also be copied in the registration page.
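A registration of this kind could be represented roughly as follows. This is only a sketch: the class name `Registration`, its field names, the step keys and the example values are illustrative assumptions, and are not taken from the Authority's Excel templates.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Registration:
    """Data registered by a tester for one test object (hypothetical layout)."""
    indicator_id: str   # hypothetical identifier for the indicator
    page_url: str       # URL of the test page
    browser: str        # web browser used for the test
    test_object: str    # identification of the test object, e.g. the heading text
    notes: str = ""     # key words about the topic or purpose of the content
    answers: Dict[str, bool] = field(default_factory=dict)

    def answer(self, step: str, yes: bool) -> None:
        """Register a yes (True) or no (False) for one test step."""
        self.answers[step] = yes


# Invented example values for a test of success criterion 2.4.6.
reg = Registration(
    indicator_id="2.4.6-headings",
    page_url="https://example.org/",
    browser="Firefox",
    test_object="Opening hours",
    notes="opening hours, contact",
)
reg.answer("visual_heading_present", True)
reg.answer("heading_describes_topic", True)
print(reg.answers)
```

Keeping the answers as plain yes/no values per step means that the test result can later be derived from the registration without further input from the tester.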
The indicator for the test of contrast (success criterion 1.4.3) is far more complex. The test procedure considers the fact that the requirement can be met in various ways. In addition to measuring the contrast between text and background, we also measure font size. If the combination of these indicates that the requirement has not been met, the tester is guided to evaluate a possible high contrast version. The steps in the test include an evaluation of the placement and design of the mechanism to activate a high contrast version, followed by the same test of the high contrast version as was carried out for the website in the normal display. Generally, testing the solutions against the requirements for contrast is fairly extensive. Through the step-by-step test procedures and yes/no questions, the test can be carried out efficiently and with good quality.
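The core measurement in the contrast test can be illustrated with a short sketch. The contrast ratio and relative luminance formulas, and the thresholds of 4.5:1 for normal text and 3:1 for large text (18 point, or 14 point bold), follow the definitions in WCAG 2.0. The function names are hypothetical, and the sketch does not model the Authority's steps for evaluating a high contrast version.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to its linear value (WCAG 2.0 definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: RGB) -> float:
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground: RGB, background: RGB) -> float:
    """Ratio between the lighter and the darker colour, from 1 to 21."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def meets_1_4_3(foreground: RGB, background: RGB,
                size_pt: float, bold: bool = False) -> bool:
    """Check the contrast against the threshold that applies to the font size."""
    large_text = size_pt >= 18 or (bold and size_pt >= 14)
    required = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= required


white = (255, 255, 255)
print(round(contrast_ratio((0, 0, 0), white), 2))   # black on white
print(meets_1_4_3((119, 119, 119), white, size_pt=12))
print(meets_1_4_3((119, 119, 119), white, size_pt=18))
```

The last two calls show why font size has to be registered together with the colours: the same grey text on a white background falls short of the threshold at 12 point but meets it at 18 point.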
Once the test is completed and the test data has been registered, the tester receives automatically generated test data/results derived from the registrations.
4.5 Test results and evaluation of compliance/non-compliance
The test results are generated automatically. These are designed as pre-defined, standard formulations and are the results of the combinations of data the individual tester has registered in the various test steps.
The result of the tests using the Authority’s set of indicators should give the following outcome:
- Compliance: Test result indicates compliance with the success criterion.
- Non-compliance/breach: The test result indicates breach of the success criterion.
- No occurrence: Several situations may produce a “no occurrence” test result. This is because the type of content or the functionality the success criterion applies to:
  - is not actually present on the test pages (for example the pages do not contain any video clips),
  - or the type of content exists, but contains properties resulting in it falling outside the requirement (for example a video clip exists, but has not been recorded in advance).
- Not testable: This is the result if the content or functionality the success criterion applies to is present on the website, but it is not possible to test. Criterion 2.1.1 is an example because it requires that you must be able to use a keyboard to navigate the site. If there is no visible focus indicator, it will not be possible to navigate the site, and it is therefore not possible to test.
- Not tested: This is the case if the success criterion, with its associated indicators, is not included in the test. We have, for example, chosen a set of key indicators that are relevant for status measurement, consisting of about 20 indicators covering 16 success criteria (see Section 4.7 for more information about the key indicators). Success criteria/indicators that are not included in the measurement will generate the result "not tested".
The different types of outcome are closely linked to the W3C document "Evaluation and Report Language (EARL)". More detailed information can be found in the Authority’s documentation of the data model for the set of indicators.
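The five outcome types could be represented as an enumeration, as in the sketch below. The correspondence to the EARL 1.0 outcome values (`earl:passed`, `earl:failed`, `earl:inapplicable`, `earl:cantTell`, `earl:untested`) is an assumption made for illustration; the Authority's own documentation of the data model is the authoritative source for the actual link. The names `Outcome`, `EARL_OUTCOME` and `counts_towards_score` are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    COMPLIANCE = "compliance"
    NON_COMPLIANCE = "non-compliance"
    NO_OCCURRENCE = "no occurrence"
    NOT_TESTABLE = "not testable"
    NOT_TESTED = "not tested"


# Assumed correspondence to EARL 1.0 outcome values (not confirmed by the source).
EARL_OUTCOME = {
    Outcome.COMPLIANCE: "earl:passed",
    Outcome.NON_COMPLIANCE: "earl:failed",
    Outcome.NO_OCCURRENCE: "earl:inapplicable",
    Outcome.NOT_TESTABLE: "earl:cantTell",
    Outcome.NOT_TESTED: "earl:untested",
}


def counts_towards_score(outcome: Outcome) -> bool:
    """Only compliance and non-compliance say something about the requirement."""
    return outcome in (Outcome.COMPLIANCE, Outcome.NON_COMPLIANCE)


print(EARL_OUTCOME[Outcome.NO_OCCURRENCE])
print(counts_towards_score(Outcome.NOT_TESTED))
```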
In the following we focus on a more detailed explanation of the results of compliance and non-compliance.
Compliance/non-compliance with the requirements is measured on several levels:
- Success criterion level: The test result indicates whether the test objects have been designed in accordance with every individual success criterion that is included in a measurement. Selection of the test objects follows from the content of the success criterion. Depending on the requirement, the test object can be a single element, such as a table or a heading, or a complete website.
- Website level: Aggregated test results indicate to what degree the website (represented by the sample of pages) is designed in accordance with the success criteria that are included in a test.
Complex indicators can contain many possible outcomes within the categories of compliance and non-compliance, while other indicators only include two (one for compliance and one for non-compliance). This depends on the degree of complexity in the requirement/success criterion.
In the example with success criterion 2.4.6, only two outcomes are defined as possible for this category. Registration of the answer "Yes" generates the outcome "Heading describes the topic or purpose of the content", while a "No" response generates the opposite, "Heading does not describe the topic or purpose of the content".
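How registered answers might be turned into a pre-defined result for 2.4.6 is sketched below. The two result texts for compliance and non-compliance are the ones quoted above. Treating a page without any visual heading as "no occurrence", the wording of that result text and the function name `outcome_2_4_6` are assumptions made for illustration.

```python
from typing import Optional, Tuple


def outcome_2_4_6(has_visual_heading: bool,
                  describes_topic: Optional[bool] = None) -> Tuple[str, str]:
    """Return (category, standard result text) from the registered answers."""
    if not has_visual_heading:
        # Assumption: without a heading there is nothing for the requirement to apply to.
        return ("no occurrence", "The page has no visual heading")
    if describes_topic is None:
        raise ValueError("The second test step must be answered when a heading exists")
    if describes_topic:
        return ("compliance",
                "Heading describes the topic or purpose of the content")
    return ("non-compliance",
            "Heading does not describe the topic or purpose of the content")


print(outcome_2_4_6(True, True))
print(outcome_2_4_6(True, False))
print(outcome_2_4_6(False))
```

Because the result text is looked up rather than written by the tester, two testers who register the same answers always produce the same result.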
The requirement for contrast (success criterion 1.4.3) can be met in various ways. Testing contrast can therefore produce many possible outcomes within each of the categories of compliance/non-compliance. Situations that may result in breach/non-compliance are, for example, that the contrast is too low in comparison to the font size, that the contrast is too low in the high contrast version, that there is a lack of coding of the mechanism for activation of the high contrast version etc. There can also be several possible results indicating that the test page complies with the requirements. The reason for this is that there are several ways to meet the requirement.
When the test results are standardised and reflect several possible outcomes, we are also able to evaluate whether, for example, the same type of breach occurs on many websites, or whether the breaches are more or less equally distributed over several types of situations. Similarly, we receive information regarding whether there is a large spread of variation in the way the requirements are met, or if some methods for meeting the requirement stand out as the most widely used. This is important additional information and increases the value of the tests, both for checking individual solutions and for larger status measurements.
All the potential outcomes of a test (that are not "no occurrence", "not testable" or "not tested") are categorised as either compliance or non-compliance. This means that the evaluation of the test result is as objective as possible, and is to a lesser degree dependent on the individual tester’s judgment.
In the same way as the design of the test steps using yes/no questions, auto-generated, standardised test results will provide a more efficient test execution and better quality in the results. This makes it possible to define a graded scale that reflects the test results, which can then be used to generate quantitative data/statistics.
4.6 Graded scale for measuring universal design
The test results we refer to in the previous section are formulated as text. We use a simple method to quantify the test results, in order to establish a data model for benchmarking and analysis. The purpose is to quantify to what degree websites are designed in compliance/non-compliance with the regulations, both at the general level and a somewhat more detailed level. The graded scale is used for status measurements/area surveillance and not for supervisions.
The graded scale is generic and calculates the score per indicator/success criterion. For success criteria that contain several individual indicators, the score at the individual level can be added to a point score per success criterion and added up to give a total score for a website. The results can also be aggregated for all the websites included in one measurement, and represent a total score for universal design. By using metadata, we can also evaluate what area or topic, seen as a whole, is most in compliance/non-compliance with the regulations, such as navigation, forms, media content etc.
We measure the results at two levels:
- Overall level (compliance/non-compliance)
- More detailed level (various degrees of non-compliance)
At the overall level, we categorically measure whether a test results in compliance or non-compliance. At the more detailed level, we measure how far away a web solution is from complete compliance with each individual requirement or topic included in the test. Is there, for example, a breach of the requirements for contrast on all the pages tested, or have we uncovered a contrast error in a small minority? This is useful information when evaluating the overall universal design status.
The score for the overall level has the values 0 and 1:
- 1 point: all test objects are in compliance with the requirement
- 0 points: one or more test objects do not comply with the requirement
This means that the website gets 0 points for a requirement if there is a single result that indicates non-compliance with the requirement. This applies regardless of whether a majority of test objects on a website have been evaluated as in compliance with the requirement. If the website receives 1 point for, for example, test of images, this means that all the tested images on the website have a text alternative in compliance with the requirement. By measuring points per success criterion and not for each web page, we identify, in a simple manner, what requirements/success criteria have been fulfilled in the solution and what breaches of requirements have been revealed.
Seen as a whole, this means that if a website has any occurrences of 0 points, the website has not been designed in compliance with WCAG 2.0.
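The overall score can be expressed in a few lines. The sketch below is a minimal illustration of the rule, not the Authority's implementation: the function names are hypothetical, and returning `None` when no test object was scored (for example "no occurrence") is an assumption about how such cases are kept out of the score.

```python
from typing import Dict, Iterable, Optional


def overall_score(results: Iterable[str]) -> Optional[int]:
    """Return 1 if every scored test object complies, 0 if at least one does not.

    Results other than "compliance" and "non-compliance" are ignored.
    None means that no test object was scored for the requirement.
    """
    scored = [r for r in results if r in ("compliance", "non-compliance")]
    if not scored:
        return None
    return 1 if all(r == "compliance" for r in scored) else 0


def has_breach(scores: Dict[str, Optional[int]]) -> bool:
    """A website with any 0-point requirement is not in compliance."""
    return any(score == 0 for score in scores.values())


# Invented example results for one website.
scores = {
    "images": overall_score(["compliance", "compliance", "compliance"]),
    "headings": overall_score(["compliance", "non-compliance", "compliance"]),
    "video": overall_score(["no occurrence"]),
}
print(scores)
print(has_breach(scores))
```

In the example, a single non-compliant heading is enough to give 0 points for headings, even though two of the three headings comply.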
For the purposes of analysis, it is interesting to dive a little deeper into the test results than we are able to do by just categorising test results in terms of compliance/non-compliance.
The degree of universal design for a website is measured in points achieved as a percentage of the maximum achievable number of points. We use a percentage score to evaluate and compare results, both at the detailed and the more overall level. The reason for using percentage of achieved points and not the number of achieved points, is that the maximum achievable points will vary from website to website.
Maximum achievable points is the score accomplished if all the test results indicate compliance with all the requirements in a test, corrected for the occurrence of the types “no occurrence”, “not testable” and “not tested”.
If a measurement covers, for example, the content types headings, forms and video, the maximum achievable points for the websites included in the measurement, will be equal only if all the tested websites contain all the three content types. As we are going to compare the results of the solutions as a whole, and not specifically per topic, we have therefore chosen to use a percentage score and not achieved points as the indicator for universal design.
By calculating what percentage of the test object (for example visual headings) is not coded in compliance with the requirement, we receive information regarding the degree of non-compliance.
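Both percentage measures can be sketched as follows. The function names and the example values are invented for illustration. A value of `None` stands for a requirement that does not count towards the maximum achievable points ("no occurrence", "not testable" or "not tested").

```python
from typing import Dict, List, Optional


def percentage_score(points: Dict[str, Optional[int]]) -> Optional[float]:
    """Points achieved as a percentage of the maximum achievable points."""
    scored = [p for p in points.values() if p is not None]
    if not scored:
        return None
    return 100.0 * sum(scored) / len(scored)


def non_compliance_share(results: List[str]) -> Optional[float]:
    """Percentage of scored test objects that do not comply with a requirement."""
    scored = [r for r in results if r in ("compliance", "non-compliance")]
    if not scored:
        return None
    failed = sum(1 for r in scored if r == "non-compliance")
    return 100.0 * failed / len(scored)


# Site A contains all three content types; site B has no video.
site_a = {"headings": 1, "forms": 0, "video": 1}
site_b = {"headings": 1, "forms": 0, "video": None}
print(percentage_score(site_a))
print(percentage_score(site_b))

# Eight visual headings, two of which are not coded as headings.
headings = ["compliance"] * 6 + ["non-compliance"] * 2
print(non_compliance_share(headings))
```

Site A can achieve at most 3 points and site B at most 2, so the raw point totals are not comparable, while the percentages are.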
By analysing the test data in more detail, as this permits, we will get a more detailed understanding of the status of universal design. This is also important in terms of the need for information about how substantial different digital barriers are - and not least - how substantial the measures to achieve universally designed websites will need to be.
4.7 Data model for analysis of test data and metadata/other data sources
The test data that is quantified with the graded scale, provides information regarding to what degree one or more web solutions are designed in compliance with the requirements imposed by the standard WCAG 2.0. By linking test data to metadata about the indicators and the companies/web solutions, we increase the information value of the measurements significantly.
This is the background to all the individual indicators being defined with metadata that shows what content type, structural elements or functionality the indicator applies to. We have also, based on the articles for understanding the success criteria in WCAG, defined user situations/user groups as the meta information linked to the set of indicators. Other examples of metadata are the test method, such as whether the tests are manual or semi-automatic, and what the purpose of the test is (supervision or status measurement).
The indicators are marked to indicate the categories of user situations or user groups the requirements are mainly intended to safeguard:
- Visual impairment (blind, impaired vision, impaired colour vision/colour blind, deaf-blind)
- Hearing impairment (deaf, deaf-blind)
- Impaired motor skills
- Reduced cognitive ability
- Norwegian as a second language
Similarly, the individual indicators are categorised by what topic, type of content or functionality they belong to.
The indicators have been sorted in topics using categories of types of content and functionality, such as:
- Images and illustrations
- Media content
- Keyboard navigation
There are several topics in each category. The category of coding will, for example, cover headings, lists, tables and forms. The category of keyboard navigation includes indicators such as the possibilities of reaching everything with a keyboard, focus selection, keyboard traps and skip links.
Some indicators and success criteria will belong to several categories. The coding of form elements can, for example, be related to both the category of coding and forms. We are aware of this in the analysis and therefore only compare categories that are mutually exclusive.
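Whether two categories can be compared could be checked as in the sketch below. The assignment of indicators to categories is invented for illustration and does not reproduce the Authority's categorisation. Only the overlap between coding and forms is taken from the text above.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

# Hypothetical indicator names per category.
categories: Dict[str, Set[str]] = {
    "coding": {"headings", "lists", "tables", "form-coding"},
    "forms": {"form-coding", "error-messages"},
    "keyboard navigation": {"keyboard-access", "focus", "keyboard-trap", "skip-links"},
}


def comparable_pairs(cats: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Return the category pairs that share no indicators."""
    return [
        (a, b)
        for a, b in combinations(sorted(cats), 2)
        if cats[a].isdisjoint(cats[b])
    ]


print(comparable_pairs(categories))
```

Here coding and forms both contain the indicator for coding of form elements, so that pair is left out of the comparison.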
In addition to this type of meta information linked to success criteria and indicators, we also have data on the companies and the web solutions that are appropriate for test/verification. Industry and social sector are of prime interest in this respect. In addition, we also have information that can indicate the usage volume of the solutions we verify. This is, for example, the number of employees in the companies responsible for the web solutions, the number of residents in local municipalities (who make use of the web solution provided by the local authority), and some data on the volume of users visiting the largest websites.
We have established a data model/database to systematise the test result and indicators, information about the solutions that have been tested and the companies. The data model is flexible allowing us at a later point in time to add new metadata and link it to, for example, existing test results.
The data model shall serve several purposes:
- Assisted by the meta information regarding what content types and functionality are included for each individual indicator, we will easily be able to identify a set of individual indicators to test, for example, forms. Similarly, following measurement, we will use the same data model to analyse the test data and evaluate the degree of compliance/non-compliance with the regulations for all the forms included in a measurement.
- Assisted by the meta information that shows the different user situations each success criterion (and associated indicator) is to safeguard, we can analyse the test data to get information regarding the consequences of lack of universal design. In this manner, we will get information about whether digital barriers are evenly distributed across user groups, or whether, for example, it is mainly the success criteria intended to protect individuals with cognitive disabilities that are not being met.
- By linking the test data from the verification of the solutions to information about what industry group and sector the companies responsible for the solutions belong to, we can uncover what areas of society have the most extensive digital barriers. Data about industry group/sector and size of companies is also used to identify web solutions appropriate for measurement.
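The three purposes above can be illustrated with a minimal Python sketch. It is not the Authority's actual data model: the class names, field names, indicator identifiers, URLs and test outcomes are all hypothetical and chosen only to show how test results can be linked to indicator metadata and company metadata and then aggregated.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    indicator_id: str        # hypothetical identifier
    success_criterion: str   # WCAG 2.0 success criterion number
    content_type: str        # e.g. "forms", "images"
    user_groups: frozenset   # user situations the criterion safeguards


@dataclass(frozen=True)
class Solution:
    url: str
    sector: str              # "public" or "private"
    industry: str


@dataclass(frozen=True)
class TestResult:
    url: str
    indicator_id: str
    passed: bool


def failure_rate_by(results, keys_of):
    """Return {key: share of failed tests}; keys_of maps a result to one or more keys."""
    totals, failures = defaultdict(int), defaultdict(int)
    for result in results:
        for key in keys_of(result):
            totals[key] += 1
            if not result.passed:
                failures[key] += 1
    return {key: failures[key] / totals[key] for key in totals}


# Illustrative data only; not taken from any real measurement.
indicators = {
    "form-label": Indicator("form-label", "3.3.2", "forms", frozenset({"blind", "cognitive"})),
    "img-alt": Indicator("img-alt", "1.1.1", "images", frozenset({"blind"})),
}
solutions = {
    "bank.example": Solution("bank.example", "private", "banks/finance"),
    "kommune.example": Solution("kommune.example", "public", "local administration"),
}
results = [
    TestResult("bank.example", "form-label", False),
    TestResult("bank.example", "img-alt", False),
    TestResult("kommune.example", "form-label", True),
    TestResult("kommune.example", "img-alt", False),
]

# Link test data to company metadata (sector) and to indicator metadata (user groups).
by_sector = failure_rate_by(results, lambda r: [solutions[r.url].sector])
by_user_group = failure_rate_by(results, lambda r: indicators[r.indicator_id].user_groups)
```

Because the metadata lives in separate records that are joined at analysis time, new metadata (for example usage volume) could be added later without changing the stored test results, which mirrors the flexibility described for the data model.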
The Authority analyses the test results in combination with these types of metadata in order to uncover areas of risk which need follow-up in connection with inspections. This has resulted in identification of text alternatives for images/illustrations, contrast and coding of web solutions (and thereby forms) as especially important areas. This is shown in the report from the Authority’s measurement of the universal design status of approximately 300 websites in 2014 and in analysis of the data from the legal inspections. In the status measurements we also found that the public sector has a greater degree of universal design than the private sector, and that there was a lack of compliance with the regulations in all industry groups. The biggest challenges were found in banks/finance, travel/accommodation and in quite a few media websites.
Metadata is thus an important part of the Authority’s indicators for universal design. A significant degree of targeted measures can be achieved by linking these data sources with test data from evaluation of the web solutions. This applies both to targeting the audits towards the areas of society and topics with the highest risk of lack of compliance, and when we evaluate the consequences of a lack of compliance in the light of what user situations/user groups are especially affected.
The collected documentation on metadata and the Authority’s data model are explained in more detail and documented separately.
4.8 Key indicators for area surveillance
The Authority’s set of indicators contains methods for measurement of all the requirements in WCAG 2.0 which are mandatory in accordance with the Norwegian regulations. It is seldom necessary to verify ICT solutions against all the requirements. We therefore make a selection both for supervision and status measurements.
We have chosen a set of key indicators covering a selection of the requirements in the standard to use for measuring status. The selection is made on the basis of risk and materiality, using the criteria listed below:
- The four main principles in WCAG 2.0 are represented
- The different user situations/user groups that must be safeguarded (blind/impaired vision, deaf/impaired hearing, reduced motor skills and reduced cognitive functional ability) are represented
- The most important types of content and properties of the web solutions are represented in the requirements which are included in the key indicators
- The most important areas of risk identified in earlier status measurements and supervision must be included
Based on the criteria listed above, success criteria and indicators for the set of key indicators have been selected as shown in the table below.
|No||Success criterion||Indicator/test theme|
The key indicators are thought to provide a reasonably good overview of the status of universal design for a website. The set encompasses in all 20 indicators which cover 16 of the 35 success criteria in WCAG 2.0 as referred to in the Norwegian regulations.
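The selection criteria and the coverage figure above lend themselves to a simple mechanical check. The following is a minimal Python sketch under stated assumptions: the dictionary layout, the short user-group labels and the three example indicators are hypothetical, and the mapping of criteria to user groups is illustrative rather than the Authority's own classification.

```python
WCAG_PRINCIPLES = {"perceivable", "operable", "understandable", "robust"}
USER_GROUPS = {"vision", "hearing", "motor", "cognitive"}


def missing_coverage(key_indicators):
    """Return (principles, user groups) not yet represented in the selection."""
    covered_principles = {ind["principle"] for ind in key_indicators}
    covered_groups = set().union(*(ind["user_groups"] for ind in key_indicators))
    return WCAG_PRINCIPLES - covered_principles, USER_GROUPS - covered_groups


def coverage_share(covered, total):
    """Percentage of success criteria covered, rounded to one decimal."""
    return round(covered / total * 100, 1)


# Hypothetical partial selection, used only to exercise the check.
selection = [
    {"criterion": "1.1.1", "principle": "perceivable", "user_groups": {"vision"}},
    {"criterion": "2.1.1", "principle": "operable", "user_groups": {"motor", "vision"}},
    {"criterion": "3.3.2", "principle": "understandable", "user_groups": {"cognitive", "vision"}},
]

missing = missing_coverage(selection)
share = coverage_share(16, 35)  # 16 of the 35 success criteria, as stated above
```

With this partial selection the check reports that the principle "robust" and the user group "hearing" are still unrepresented. The share for the full set of key indicators works out to roughly 45.7 per cent of the success criteria.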
We consider that the key indicators provide a good basis for the status measurements, where we can verify a relatively large volume of web solutions.
The key indicators can become a fixed basis for all measurements. Depending on the focus area and requirements, we can add more requirements and rotate these so that over a longer period of time we can verify all the requirements of the standard. This is important to uncover any potential new areas of risk.
The Authority measures the status of universal design based on a sample of web solutions. The purpose of the selection is to make sure that the measurement gives us information about the status of universal design of ICT within different sectors and industries. In a similar way, the selection of individual pages in a web solution must contain the most important web pages, content types and functionality.
Our starting point is the recommendations regarding testing the solutions against the requirements of WCAG 2.0, Website Accessibility Conformance Evaluation Methodology (WCAG-EM) 1.0. We have adapted the method for selection of websites to the Authority’s requirements for information, both about individual websites and on the status regarding universal design at a more general level, within different sectors and industries.
5.1 Selection of web solutions
Risk and materiality are the main principles in the selection of web solutions. We therefore emphasise identification of the areas of society (industry groups/industries) where:
- the risk of non-compliance with the regulations is high and
- a breach of the regulations has consequences for many users
We use different sources of information for identification of which sectors and industries in particular ought to be weighted in the selection. We emphasise services that have large volume usage on the internet, combined with risk areas uncovered in earlier status measurements and previous reviews.
Data from Statistics Norway reveals the following about the population’s internet use in 2017:
- 91 per cent use banking services on the internet
- 89 per cent read or download newspapers and magazines
- 54 per cent have purchased/booked travel/accommodation
- 44 per cent have bought clothes/sports articles
According to Difi’s survey of the population, at least three quarters of the population use national and municipal websites to find information and perform services.
Based on this information, it is especially important to focus on the parts of public sectors engaged in services for individuals (central and local government), and industries in the private sector, such as banks/finance, media, travel/accommodation and retail, in the selection of web solutions.
The evaluation is also supported by the Authority’s status measurements for 2014, where banks/finance and media in particular showed results that indicated relatively widespread digital barriers. A relatively high degree of non-compliance with WCAG 2.0 was also uncovered within travel/accommodation.
In Norway, as in several other countries, we do not have a complete register of public and private web solutions suitable to make a selection from. Reviews carried out by the Authority show that 92 per cent of Norwegian companies (with more than 3 employees) have websites. In a sample survey of companies with more than 50 employees, all the companies had websites. We have therefore concluded that we can use the public register of companies as a basis for identifying companies and web solutions to be included in a sample for area surveillance.
Since risk and materiality are the guiding principles for selection of web solutions for status measurement, the sample will not necessarily be representative of all the web solutions in Norway. In a representative sample of Norwegian companies, the majority would be relatively small. To ensure the sample covers web solutions/companies with assumed large user volumes, we use the number of employees, the number of residents (for municipal websites) and general market knowledge as criteria for selection. In addition, we compose the sample to ensure that we include a sufficient number of web solutions to be able to specify the result within the industries and sectors included in the Authority’s primary target groups.
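A non-representative, size-weighted sample of this kind can be sketched in Python as follows. This is an illustration under assumptions, not the Authority's procedure: the company names and sizes are invented, the `size` field stands for number of employees (or residents for municipalities), and "take the largest N per prioritised industry" is one simple way to guarantee a sufficient number of solutions in each group.

```python
from collections import defaultdict


def select_sample(companies, prioritised_industries, per_industry):
    """Pick the largest companies within each prioritised industry."""
    by_industry = defaultdict(list)
    for company in companies:
        if company["industry"] in prioritised_industries:
            by_industry[company["industry"]].append(company)

    sample = []
    for industry in prioritised_industries:
        # Size is used as a proxy for assumed user volume.
        largest_first = sorted(by_industry[industry], key=lambda c: c["size"], reverse=True)
        sample.extend(largest_first[:per_industry])
    return sample


# Hypothetical register extract; names and sizes are invented.
companies = [
    {"name": "Bank A", "industry": "banks/finance", "size": 9000},
    {"name": "Bank B", "industry": "banks/finance", "size": 150},
    {"name": "Bank C", "industry": "banks/finance", "size": 2500},
    {"name": "Travel A", "industry": "travel and accommodation", "size": 700},
    {"name": "Bakery A", "industry": "food production", "size": 12},
]

sample = select_sample(companies, ["banks/finance", "travel and accommodation"], per_industry=2)
sample_names = [company["name"] for company in sample]
```

Here the small bank and the company outside the prioritised industries are left out, which illustrates why the resulting sample is not representative of all Norwegian companies.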
Prioritised industry groups and sectors in the selection are:
- travel and accommodation
- retail trade
- central and local administration
As a part of selecting websites for status measurement, we also perform other reviews. Through surveys, we record (among other things) the time of acquisition of the solutions. This forms the basis for identifying companies that have acquired solutions after the deadline for compliance with the regulations on universal design, and that are therefore appropriate for supervision. In these surveys we also ask if the companies have (or have plans to develop) mobile applications. We use this information to put together a sample of mobile applications for measuring the status of universal design for these solutions at a later stage.
The sample and our selection methods will be detailed in the documentation and reports we are preparing in conjunction with each separate status measurement.
5.2 Selection of websites/test pages
For status measurement/area surveillance we put together a sample containing approximately 10 test pages which are the same for all the web solutions included in one measurement. This is to create a consistent basis for comparison and to ensure that we do not weight type of content and functionality differently in different solutions.
In connection with supervision, the sample is adapted to the topic of the inspection/supervision and to information/documentation about the most used pages on the website in question. Furthermore, the sample must reflect the page types/templates used on the website, and cover the most usual user tasks.
We put together a sample built on the following criteria:
- The sample must cover the most important pages that convey the purpose and content of the website, for example the front page or another page with information about the company, the main content of the pages and other significant information
- The sample must cover pages that contain the most important user tasks in the web solution, with content and functionality which is decisive for self-service, such as a form. As far as possible, without completing any financial transactions, all pages included in a selected process must be included in the sample.
- The sample must also cover other important pages on the website, with content that is relevant for the topic/success criteria included in the measurement. In the selection of key indicators, we have weighted both the types of content/topic and user situations that the success criteria are intended to safeguard. Consequently, the sample of websites, in addition to the criteria listed above, must include pages containing images/illustrations, tables and media content.
- Other themes that are less dependent on a specific page selection, such as links, headings, body text, keyboard navigation, possibilities for enlargement etc., will for the most part be evaluated in pages selected based on the criteria listed in points 1-3. Should this not be the case, the sample of pages will be supplemented with pages making it possible to verify all the types of content and functionality that are relevant for the measurement.
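The last criterion amounts to a coverage check on the page sample. A minimal Python sketch, assuming each test page is annotated with the content types it contains (the page names and annotations below are hypothetical):

```python
def missing_content_types(pages, required):
    """Return the required content types that no page in the sample contains."""
    covered = set()
    for page in pages:
        covered |= page["content_types"]
    return required - covered


# Content types relevant for the measurement, taken from the criteria above.
required = {"images", "tables", "media", "forms", "links", "headings"}

# Hypothetical sample of test pages.
pages = [
    {"name": "front page", "content_types": {"headings", "links", "images"}},
    {"name": "contact form", "content_types": {"forms", "headings", "links"}},
]

to_supplement = missing_content_types(pages, required)
```

In this example the sample still lacks pages with tables and media content, so it would need to be supplemented before the measurement.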
In addition to the selection of test pages, we also specify, in a routine description for performing tests as part of a status measurement, how many test objects (for example, the number of images, links etc.) must be verified for the different categories of success criteria.
As with the selection of web solutions, the selection of websites/test pages will be shown in the reports prepared in connection with each separate status measurement.