LinkScan: Overview & Guide to its Usage at Queen's
Since the launch of the central web service in January 1997, the QUB web site has grown at a rapid pace. In January 1997 there was approx. 1 gigabyte of data to be managed, in October approx. 8 gigabytes and in January 2004 approx. 34 gigabytes. This data is owned by circa 200 website maintainers. These staff are referred to as Core Information Providers (CIPs), and each of them needs to manage their own individual site's web data.
From the CIPs' point of view, they needed a tool to:
- help maintain their site's integrity e.g. flag broken hyperlinks
- provide a graphical representation of their site i.e. a Sitemap
From the web server manager's perspective, the tool should:
- help maintain the integrity of the global site www.qub.ac.uk
- provide global Sitemaps
- identify files not linked to from anywhere on www.qub.ac.uk (these files can consume significant amounts of disk space)
- provide automated reporting for CIPs
LinkScan provides all of these features and more. We are currently licensed to report on one server only, www.qub.ac.uk, but it is possible to negotiate a reduced rate for more than one server. Our license enables us to scan multiple web sites on www.qub.ac.uk, with a maximum of 5,000 documents per site.
LinkScan scans the web site and maintains a database from which reports can be produced. It is a powerful, highly customizable link checking and website management tool that:
- performs HTML validation
- validates all good links
- finds all broken links
- creates SiteMaps and interactive TapMaps
- identifies 'Orphan' files, i.e. pages not linked by some route to the site home page and therefore possibly redundant
- can send reports to web site owners via email
- produces reports either statically or dynamically, for viewing in a web browser
- validates hyperlinks for all major protocols including HTTP:, HTTPS:, FTP:, MAILTO: and LDAP:
- validates links within drop-down lists
- parses and extracts any hyperlinks embedded in ShockWave/Flash files
- scans sites with complex dynamic content, including Active Server Pages (ASP), Cold Fusion (CFM), Java Servlets (JSP), PHP, and Vignette, to optimize test coverage.
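To make the link-related features above more concrete, the following is a minimal Python sketch of how hyperlinks might be collected from a page (including targets held in drop-down lists) and grouped by protocol before being checked. It is an illustration only and not LinkScan's own implementation: the class and function names, the page path and the example.org addresses are all hypothetical, and it assumes that drop-down navigation stores its target in the value attribute of each option.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects link targets from <a href> and <option value> attributes."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            target = attrs["href"]
        elif tag == "option" and attrs.get("value"):
            # Assumption: the drop-down list keeps its target URL in "value".
            target = attrs["value"]
        else:
            return
        # Resolve relative links against the page they were found on.
        self.links.append(urljoin(self.base_url, target))


def links_by_scheme(html, base_url):
    """Return the page's links grouped by protocol (http, mailto, ftp, ...)."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    grouped = defaultdict(list)
    for link in collector.links:
        grouped[urlsplit(link).scheme].append(link)
    return dict(grouped)


# Hypothetical page content and location, for illustration only.
SAMPLE_PAGE = """
<a href="staff.html">Staff</a>
<a href="mailto:webmaster@example.org">Contact</a>
<a href="ftp://ftp.example.org/pub/guide.txt">Guide</a>
<select><option value="/library/">Library</option></select>
"""

if __name__ == "__main__":
    grouped = links_by_scheme(SAMPLE_PAGE, "http://www.qub.ac.uk/unit/index.html")
    for scheme, links in grouped.items():
        print(scheme, links)
```

Each group would then need its own kind of check (for example, an HTTP request for web links); that step is left out here because it depends on network access and on the checking tool's configuration.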
A more detailed description of LinkScan can be found at http://www.elsop.com/
LinkScan: Usage at Queen's
All websites held on the Queen's central web server www.qub.ac.uk are scanned weekly. Reports are then produced for all units/CIPs. Reports can be found at http://www.qub.ac.uk/scan/unitname/ where unitname has been preallocated to the CIP/website maintainer. Although it is possible to generate 18 types of report in total, only eight types are being made available initially. These eight reports are adequate for CIPs to maintain the integrity of their sites. The reports produced are:
- SiteMap Report - displays the structure of the website. Use this report to review the site's structure and the clarity of the Document Titles. The SiteMap may be displayed either in Directory Order (a simple sort of the URLs scanned) or in Link Order (based on the structure of the hyperlinks within the site). The QUB default is Link Order, but this can be changed.
- Critical Errors Report - displays the most important errors on the site. Use this report to quickly identify the most critical errors on your website.
- Detailed Errors Report - displays the status of each selected link in the LinkScan database. Use this report to examine the website errors.
- Document Detail Report - displays a one-line summary for each HTML document scanned. Use this report to get an overview of the scope of a scan.
- Problem Documents Report - lists documents that may contain problems and require attention. Use this report to rapidly identify documents with potential problems.
- Search Documents Report - displays lists of Documents from the LinkScan database. Use this report to perform ad hoc queries on a document-centric view of the database.
- Search Links Report - displays lists of Links from the LinkScan database. Use this report to perform ad hoc queries on a link-centric view of the database.
- Orphaned Files Report - displays a list of files that were found on the server's file system but had no hyperlinks pointing at them. Use this report to identify files that are no longer in use. In some cases you will want to archive or delete them, and in others you may need to restore some navigational links so they may be found by users.
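The ideas behind the SiteMap and Orphaned Files reports can be sketched in a few lines of Python. The example below is a simplified illustration under stated assumptions, not a description of how LinkScan works internally: the function names, file paths and link data are hypothetical, Link Order is approximated by a breadth-first walk from the home page, and an orphan is taken to be any file on disk that cannot be reached from the home page by following links.

```python
from collections import deque


def link_order(home, links):
    """Pages in the order first reached by a breadth-first walk from home."""
    ordered = [home]
    visited = {home}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in visited:
                visited.add(target)
                ordered.append(target)
                queue.append(target)
    return ordered


def directory_order(files_on_disk):
    """A simple sort of the paths found."""
    return sorted(files_on_disk)


def find_orphans(files_on_disk, home, links):
    """Files present on disk that no chain of links from home leads to."""
    reachable = set(link_order(home, links))
    return sorted(set(files_on_disk) - reachable)


# Hypothetical site, for illustration only.
FILES = ["/index.html", "/about.html", "/courses/list.html", "/old/news1999.html"]
LINKS = {
    "/index.html": ["/courses/list.html", "/about.html"],
    "/courses/list.html": ["/index.html"],
}

if __name__ == "__main__":
    print("Link Order:     ", link_order("/index.html", LINKS))
    print("Directory Order:", directory_order(FILES))
    print("Orphans:        ", find_orphans(FILES, "/index.html", LINKS))
```

In this made-up site, /old/news1999.html is the only orphan: it exists on disk, but nothing links to it, so it is a candidate either for archiving or for having a navigational link restored.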