-
What does this service do?
Nikita is a Web site quality checker. She checks
your Web site's pages for valid (X)HTML, broken links, malformed HTTP headers,
encoding conflicts and more. Here's a very thorough list of
Nikita's features.
-
Who is this service for?
This service is of most use to Webmasters and SEO professionals.
It is especially helpful for Webmasters of sites with multiple authors or automatically
generated content. Nikita stays on the lookout for entropy's little
helpers – link rot, tag soup, encoding inconsistencies and the like – and
lets you know about them in concise, complete reports.
-
Will Nikita start crawling my site right away?
Yes, if possible. Nikita can only crawl a certain number of sites at
one time, so if she's already running at full capacity you might have to wait
for a while before she can start crawling yours. Also note that if she's
already crawling a site, she needs to complete that crawl before she can
start another on the same site.
-
How long will it take Nikita to crawl my site?
It depends on the size of your site. As a very rough estimate,
expect it to take 20 to 30 seconds per page. If this sounds slow to you, please note
two things. First, Nikita spends most of her
time waiting between requests so as not to overload your server.
This wait is governed by the "politeness delay" option which defaults to five
seconds.
Second, the time it takes to spider your site also depends on the number of elements
that each page refers to (like images, CSS files, PDFs, scripts, etc.). Nikita
has to investigate every item referenced by your site in order to report
accurately about it. There might be more of these than you realize,
which means that it might take Nikita longer than you expect to crawl your site.
While Nikita is working, she gives you an estimate of the time remaining. That
estimate reflects only what Nikita knows about your site at that point. It might
grow as she discovers more of your site.
The "elapsed time" on
the completed statistics report
tells you exactly how long Nikita spent spidering your site.
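To turn the rough per-page figure above into a ballpark for a whole site, a small calculation like the following may help. It is only a sketch: the function name is made up for illustration, and it simply multiplies the page count by the 20 to 30 second range quoted above, ignoring how many images, stylesheets and other items each page refers to.

```python
def estimate_crawl_minutes(pages, seconds_per_page=(20, 30)):
    """Return a (low, high) estimate, in minutes, for crawling `pages` pages.

    The default range is the rough 20-30 seconds per page quoted in this FAQ.
    Real crawls can take longer when pages reference many other items.
    """
    low, high = seconds_per_page
    return (pages * low / 60, pages * high / 60)


low, high = estimate_crawl_minutes(300)
print(f"300 pages: roughly {low:.0f} to {high:.0f} minutes")
```

Treat the result as a starting point; the estimate Nikita shows while she works reflects what she has actually found on your site.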
-
How much does this service cost?
Two cents (US $0.02) per page that Nikita finds, with the first
125 pages absolutely free. You might find it useful to look at
some precalculated sample costs.
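If you would rather work out a cost yourself, the arithmetic can be sketched as below. This assumes that only the pages beyond the first 125 are billed, which is one reading of the pricing described above; the function and constant names are illustrative, not part of the service.

```python
FREE_PAGES = 125      # first 125 pages are free
CENTS_PER_PAGE = 2    # two US cents per page after that


def crawl_cost_cents(pages_found):
    """Return the cost in US cents for a crawl that finds `pages_found` pages.

    Assumption: only pages beyond the free allowance are charged.
    """
    billable = max(0, pages_found - FREE_PAGES)
    return billable * CENTS_PER_PAGE


for pages in (100, 125, 500, 1000):
    print(f"{pages} pages: ${crawl_cost_cents(pages) / 100:.2f}")
```

Working in whole cents avoids floating-point rounding until the final display step.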
-
Why should I create an account?
Creating an account with Nikita is free and gives you a few benefits.
First, Nikita will remember which sites you've asked her to crawl. You'll get a
list of them when you log in.
You can also ask Nikita to follow up
on a crawl that she's done for you before. (You can read more
about followup crawls just below.) In addition, as a registered user, you won't have
to type in your email address when you start a crawl.
-
How large a site can Nikita crawl?
The largest site that Nikita has spidered was over 200,000 pages.
Nikita is capable of quite a lot, but you should also consider whether you will be
able to process all of the information she generates.
If you think your site will have few validation errors, encoding problems,
broken links and so forth, then don't hesitate to ask Nikita to crawl it
all at once. If you have a large site and you expect Nikita to have a lot
to say about it, consider asking her to cover it in multiple crawls.
Supplying a path in the seed URL is a
simple, effective way of dividing your site into chunks.
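As an illustration of dividing a site by path, the sketch below builds one seed URL per section. The site address and section names are placeholders, and the helper is hypothetical; you would submit each resulting URL as the seed of a separate crawl.

```python
from urllib.parse import urljoin


def seed_urls(site_root, section_paths):
    """Build one seed URL per section of the site."""
    return [urljoin(site_root, path) for path in section_paths]


# Placeholder site and sections, for illustration only.
for seed in seed_urls("http://www.example.com/", ["blog/", "docs/", "shop/"]):
    print(seed)
```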
-
What is a followup crawl?
In a followup crawl, Nikita recrawls a site that she has already
crawled for you.
This allows Nikita to narrow the focus of her reports so that
she only reports on pages changed since her previous visit.
Followup crawls are particularly useful when you want to see if you've fixed the
problems that Nikita found on her previous visit.
Another benefit of followup crawls is that Nikita takes advantage of HTTP caching
and won't refetch a page if (a) your server
indicated that it was cacheable and (b) Nikita found no errors in the
page during the previous crawl. This means less traffic on your server
and potentially faster crawls.
When starting a followup crawl, you can still alter any of Nikita's advanced
options (such as lowering the politeness delay or providing a URL filter).
You must be logged in to start a followup crawl.
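The two caching conditions described above amount to a simple test, sketched here. This is not Nikita's actual code; the function and parameter names are invented to show the logic, and how she decides that a response was cacheable is not covered.

```python
def can_skip_refetch(was_cacheable, errors_found):
    """Decide whether a page can be skipped during a followup crawl.

    was_cacheable: True if the server indicated the page was cacheable.
    errors_found:  number of errors reported for the page last time.
    Both conditions must hold for the page to be skipped.
    """
    return was_cacheable and errors_found == 0


print(can_skip_refetch(True, 0))   # cacheable and clean: can be skipped
print(can_skip_refetch(True, 3))   # had errors last time: fetched again
print(can_skip_refetch(False, 0))  # not cacheable: fetched again
```

Requiring a clean previous result means that pages with problems are always refetched, so a followup crawl can confirm whether you fixed them.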
-
I'm a Webmaster. How do I keep Nikita out of my site?
Nikita obeys the
robots.txt standard. She will obey any rule that uses a user-agent of
Nikita The Spider or NikitaTheSpider.
For instance, a rule like this would keep Nikita out of your entire site:
User-agent: Nikita The Spider
Disallow: /
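If you want to sanity-check a rule like this before publishing it, Python's standard `urllib.robotparser` module can evaluate it locally. Note that this uses Python's matching rules, which are not necessarily identical to Nikita's, and the example.com URL and the second user-agent name are placeholders.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Nikita The Spider
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "http://www.example.com/index.html"  # placeholder URL

# The rule names Nikita, so she should be refused...
nikita_allowed = parser.can_fetch("Nikita The Spider", url)
# ...while an agent that no rule mentions is still allowed.
other_allowed = parser.can_fetch("SomeOtherBot", url)

print("Nikita allowed:", nikita_allowed)
print("Other bot allowed:", other_allowed)
```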
-
Does Nikita understand custom doctypes?
No. When Nikita encounters an unfamiliar formal public identifier,
she falls back to using HTML 4.01 Transitional (or XHTML 1.0 Transitional for
documents delivered with an XHTML media type). There is a
list of the FPIs that Nikita recognizes.
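The fallback rule above can be sketched as follows. The set of recognized identifiers here is a small illustrative subset, not Nikita's real list, and treating only `application/xhtml+xml` as an XHTML media type is an assumption; the function name is hypothetical.

```python
HTML_401_TRANSITIONAL = "-//W3C//DTD HTML 4.01 Transitional//EN"
XHTML_10_TRANSITIONAL = "-//W3C//DTD XHTML 1.0 Transitional//EN"

# Illustrative subset only; the actual list of recognized FPIs is longer.
KNOWN_FPIS = {
    "-//W3C//DTD HTML 4.01//EN",
    HTML_401_TRANSITIONAL,
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    XHTML_10_TRANSITIONAL,
}

# Assumption: which media types count as XHTML.
XHTML_MEDIA_TYPES = {"application/xhtml+xml"}


def effective_fpi(declared_fpi, media_type):
    """Return the FPI to validate against, applying the fallback rule."""
    if declared_fpi in KNOWN_FPIS:
        return declared_fpi
    if media_type.lower() in XHTML_MEDIA_TYPES:
        return XHTML_10_TRANSITIONAL
    return HTML_401_TRANSITIONAL


print(effective_fpi("-//Example//DTD Custom 1.0//EN", "text/html"))
print(effective_fpi("-//Example//DTD Custom 1.0//EN", "application/xhtml+xml"))
```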
-
My site uses ASP.Net. Is there a browser capabilities file for Nikita?
Yes. Here's an ASP.Net browser caps file
for Nikita kindly submitted by Kevin F. Note that this file lies a little bit
because Nikita supports neither Java applets nor JavaScript. However, setting these
options to "true" makes ASP.Net create pages for Nikita that are similar to those
created for an ordinary Web browser, which is probably the content that you want Nikita
to validate.
-
Who is behind this service?
Philip Semanchuk is the designer, programmer, server monkey, cat wrangler, chief cook
and bottle washer.