Class WebRobots
In: lib/webrobots.rb
lib/webrobots/robotstxt.rb
Parent: Object

Methods

allowed?   create_cache   disallowed?   error   error!   flush_cache   new   option   options   reset   sitemaps  

Classes and Modules

Class WebRobots::Error
Class WebRobots::ParseError
Class WebRobots::RobotsTxt

Attributes

user_agent  [R]  Returns the robot name initially given.

Public Class methods

new(user_agent, options = nil)

Creates a WebRobots object for a robot named user_agent, with optional options.

  • :http_get => a custom method, proc, or anything that responds to .call(uri), used for fetching robots.txt. It must return the response body on success, return an empty string if the resource is not found, and return nil or raise an error on failure. Redirects should be handled within this proc; see the sketch below.
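A minimal sketch of such a fetcher built on Net::HTTP, assuming absolute Location headers on redirects (the bot name and the redirect limit are placeholders; a production fetcher would also want timeouts and proxy support):

  require 'webrobots'
  require 'net/http'
  require 'uri'

  # Contract: body on success, "" when robots.txt is absent, nil on failure.
  fetcher = lambda do |uri|
    5.times do                                   # follow at most 5 redirects
      response = Net::HTTP.get_response(uri)
      case response
      when Net::HTTPSuccess     then return response.body
      when Net::HTTPNotFound    then return ''
      when Net::HTTPRedirection                  # assumes an absolute Location
        uri = URI(response['location'])
      else
        return nil
      end
    end
    nil                                          # redirect loop counts as failure
  end

  robots = WebRobots.new('MyBot/1.0', :http_get => fetcher)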

Public Instance methods

allowed?(url)

Tests if the robot is allowed to access a resource at url. If a malformed URI string is given, URI::InvalidURIError is raised. If a relative URI or a non-HTTP/HTTPS URI is given, ArgumentError is raised.

disallowed?(url)

Equivalent to !allowed?(url).
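For illustration (the example.com URLs are placeholders, and the true/false results depend on the site's actual robots.txt):

  robots = WebRobots.new('MyBot/1.0')

  robots.allowed?('http://www.example.com/index.html')   # => true or false
  robots.disallowed?('http://www.example.com/private/')  # == !allowed?(...)

  robots.allowed?('/no-scheme/path')     # raises ArgumentError (relative URI)
  robots.allowed?('ftp://example.com/')  # raises ArgumentError (non-HTTP/HTTPS)
  robots.allowed?('http://bad uri/')     # raises URI::InvalidURIError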

error(url)

Returns the error object if there was an error in fetching or parsing robots.txt of the site url, or nil if there was none.

error!(url)

Raises the error if there was an error in fetching or parsing robots.txt of the site url.
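One way to use the pair, assuming error! raises the same object that error returns (e.g. a WebRobots::ParseError for a malformed file; the site URL is a placeholder):

  site = 'http://www.example.com/'

  if (e = robots.error(site))
    warn "robots.txt could not be used: #{e.message}"
  end

  begin
    robots.error!(site)          # does nothing when there was no error
  rescue WebRobots::ParseError => e
    warn "robots.txt is malformed: #{e.message}"
  end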

flush_cache()

Flushes the robots.txt cache.

options(url)

Returns extended option values for a resource at url in a hash with each field name lower-cased. See allowed?() for the errors that may be raised.
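For example, a site whose robots.txt contains a "Crawl-delay: 10" line for this robot might be queried like this (placeholder URL; the stored value type is whatever the parser keeps, so the conversion below is defensive):

  opts = robots.options('http://www.example.com/')
  delay = opts['crawl-delay']    # field names are lower-cased
  sleep(delay.to_f) if delay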

reset(url)

Removes the robots.txt cache for the site url.
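The two cache operations side by side (placeholder URL):

  robots.reset('http://www.example.com/')   # forget this site's robots.txt only
  robots.flush_cache                        # forget every cached robots.txt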

sitemaps(url)

Returns an array of Sitemap URLs listed in the site's robots.txt. See allowed?() for the errors that may be raised.
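For instance (placeholder URL; an empty array is expected when the file lists no Sitemap fields):

  robots.sitemaps('http://www.example.com/').each do |sitemap_url|
    puts sitemap_url
  end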
