Class Addressable::URI
In: lib/addressable/uri.rb
Parent: Object

This is an implementation of a URI parser based on <a href="http://www.ietf.org/rfc/rfc3986.txt">RFC 3986</a>, <a href="http://www.ietf.org/rfc/rfc3987.txt">RFC 3987</a>.

Methods

Classes and Modules

Module Addressable::URI::CharacterClasses
Class Addressable::URI::InvalidURIError

Constants

SLASH = '/'
EMPTYSTR = ''
URIREGEX = /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?$/
PORT_MAPPING = { "http" => 80, "https" => 443, "ftp" => 21, "tftp" => 69, "sftp" => 22, "ssh" => 22, "svn+ssh" => 22, "telnet" => 23, "nntp" => 119, "gopher" => 70, "wais" => 210, "ldap" => 389, "prospero" => 1525 }
NORMPATH = /^(?!\/)[^\/:]*:.*$/
PARENT1 = '.'

Resolves paths to their simplest form.

@param [String] path The path to normalize.

@return [String] The normalized path.

PARENT2 = '..'
NPATH1 = /\/\.\/|\/\.$/
NPATH2 = /\/([^\/]+)\/\.\.\/|\/([^\/]+)\/\.\.$/
NPATH3 = /^\.\.?\/?/
NPATH4 = /^\/\.\.?\/|^(\/\.\.?)+\/?$/
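
The capture groups of URIREGEX line up with the five top-level URI components; the pattern is the one given in RFC 3986, Appendix B. The sketch below illustrates that mapping in plain Ruby. The helper split_uri is hypothetical and is not a method of this class; the regular expression is copied from the constant above.

```ruby
# Pattern copied from the URIREGEX constant listed above.
URIREGEX = /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?$/

# Hypothetical helper (not part of Addressable): maps the capture groups
# of URIREGEX onto named components.
def split_uri(str)
  m = URIREGEX.match(str)
  {
    scheme:    m[2],
    authority: m[4],
    path:      m[5],
    query:     m[7],
    fragment:  m[9]
  }
end

parts = split_uri("http://example.com:8080/path?query=1#frag")
```

Components that are absent from the input come back as nil, except the path, which is an empty String.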

External Aliases

encode_component -> encode_component
unencode -> unescape
unencode -> unencode_component
unencode -> unescape_component
encode -> escape

Public Class methods

Converts a path to a file scheme URI. If the path supplied is relative, it will be returned as a relative URI. If the path supplied is actually a non-file URI, it will parse the URI as if it had been parsed with Addressable::URI.parse. Handles all of the various Microsoft-specific formats for specifying paths.

@param [String, Addressable::URI, to_str] path

  Typically a <code>String</code> path to a file or directory, but
  will return a sensible return value if an absolute URI is supplied
  instead.

@return [Addressable::URI]

  The parsed file scheme URI or the original URI if some other URI
  scheme was provided.

@example

  base = Addressable::URI.convert_path("/absolute/path/")
  uri = Addressable::URI.convert_path("relative/path")
  (base + uri).to_s
  #=> "file:///absolute/path/relative/path"

  Addressable::URI.convert_path(
    "c:\\windows\\My Documents 100%20\\foo.txt"
  ).to_s
  #=> "file:///c:/windows/My%20Documents%20100%20/foo.txt"

  Addressable::URI.convert_path("http://example.com/").to_s
  #=> "http://example.com/"

Percent encodes any special characters in the URI.

@param [String, Addressable::URI, to_str] uri

  The URI to encode.

@param [Class] returning

  The type of object to return.
  This value may only be set to <code>String</code> or
  <code>Addressable::URI</code>. All other values are invalid. Defaults
  to <code>String</code>.

@return [String, Addressable::URI]

  The encoded URI.
  The return type is determined by the <code>returning</code> parameter.

Percent encodes a URI component.

@param [String, to_str] component The URI component to encode.

@param [String, Regexp] character_class

  The characters which are not percent encoded. If a <code>String</code>
  is passed, the <code>String</code> must be formatted as a regular
  expression character class. (Do not include the surrounding square
  brackets.)  For example, <code>"b-zB-Z0-9"</code> would cause
  everything but the letters 'b' through 'z' and the numbers '0' through
  '9' to be percent encoded. If a <code>Regexp</code> is passed, the
  value <code>/[^b-zB-Z0-9]/</code> would have the same effect. A set of
  useful <code>String</code> values may be found in the
  <code>Addressable::URI::CharacterClasses</code> module. The default
  value is the reserved plus unreserved character classes specified in
  <a href="http://www.ietf.org/rfc/rfc3986.txt">RFC 3986</a>.

@return [String] The encoded component.

@example

  Addressable::URI.encode_component("simple/example", "b-zB-Z0-9")
  #=> "simple%2Fex%61mple"
  Addressable::URI.encode_component("simple/example", /[^b-zB-Z0-9]/)
  #=> "simple%2Fex%61mple"
  Addressable::URI.encode_component(
    "simple/example", Addressable::URI::CharacterClasses::UNRESERVED
  )
  #=> "simple%2Fexample"
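
The character class argument can be read as "everything outside this class gets percent encoded". The following is a minimal plain-Ruby sketch of that idea, not the library's implementation; percent_encode is a hypothetical name, and it only accepts the String form of the character class.

```ruby
# Hypothetical helper (not part of Addressable): percent encodes every
# character that falls outside the given character class.
# character_class is a String such as "b-zB-Z0-9", without square brackets.
def percent_encode(component, character_class)
  component.gsub(/[^#{character_class}]/) do |char|
    # Encode each byte of the character as %XX (uppercase hex).
    char.bytes.map { |byte| format("%%%02X", byte) }.join
  end
end

encoded = percent_encode("simple/example", "b-zB-Z0-9")
```

Because 'a' is left out of the class in this call, it is encoded along with the slash.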

Encodes a set of key/value pairs according to the rules for the application/x-www-form-urlencoded MIME type.

@param [to_hash, to_ary] form_values

  The form values to encode.

@param [TrueClass, FalseClass] sort

  Sort the key/value pairs prior to encoding.
  Defaults to <code>false</code>.

@return [String]

  The encoded value.
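
For comparison, Ruby's standard library encodes the same MIME type through URI.encode_www_form. The sketch below uses only the standard library to show what sorting the pairs before encoding changes; whether this method produces byte-identical output is not confirmed here.

```ruby
require "uri"

pairs = [["q", "ruby uri"], ["lang", "en"], ["q", "addressable"]]

# Pairs are emitted in the order given; spaces become "+".
unsorted = URI.encode_www_form(pairs)

# Sorting first yields a stable, order-independent encoding.
sorted = URI.encode_www_form(pairs.sort)
```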

Decodes a String according to the rules for the application/x-www-form-urlencoded MIME type.

@param [String, to_str] encoded_value

  The form values to decode.

@return [Array]

  The decoded values.
  This is not a <code>Hash</code> because of the possibility for
  duplicate keys.
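
The reason for returning an Array can be seen with the standard library's URI.decode_www_form, which decodes the same MIME type. This is an illustration in plain Ruby, not a call into this class.

```ruby
require "uri"

# Decoding keeps every pair, including repeated keys.
pairs = URI.decode_www_form("tag=ruby&tag=uri&q=hello+world")

# Converting to a Hash silently keeps only the last value per key.
collapsed = pairs.to_h
```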

Converts an input to a URI. The input does not have to be a valid URI; the method will use heuristics to guess what URI was intended. This is not standards-compliant, merely user-friendly.

@param [String, Addressable::URI, to_str] uri

  The URI string to parse.
  No parsing is performed if the object is already an
  <code>Addressable::URI</code>.

@param [Hash] hints

  A <code>Hash</code> of hints to the heuristic parser.
  Defaults to <code>{:scheme => "http"}</code>.

@return [Addressable::URI] The parsed URI.
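
One heuristic of this kind is to supply the hinted scheme when the input has none. The sketch below shows only that single rule in plain Ruby; guess_uri is a hypothetical helper, and the library's actual heuristics are not reproduced here.

```ruby
# Hypothetical helper (not part of Addressable): prepends the hinted
# scheme when the input does not already start with "scheme://".
def guess_uri(input, hints = { scheme: "http" })
  text = input.strip
  return text if text.match?(/\A[a-z][a-z0-9+.\-]*:\/\//i)
  "#{hints[:scheme]}://#{text}"
end

guessed = guess_uri("example.com/path")
```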

Returns an array of known IP-based schemes. These schemes typically use a similar URI form: //<user>:<password>@<host>:<port>/<url-path>

Joins several URIs together.

@param [String, Addressable::URI, to_str] *uris

  The URIs to join.

@return [Addressable::URI] The joined URI.

@example

  base = "http://example.com/"
  uri = Addressable::URI.parse("relative/path")
  Addressable::URI.join(base, uri)
  #=> #<Addressable::URI:0xcab390 URI:http://example.com/relative/path>

Creates a new URI object from component parts.

@option [String, to_str] scheme The scheme component.

@option [String, to_str] user The user component.

@option [String, to_str] password The password component.

@option [String, to_str] userinfo

  The userinfo component. If this is supplied, the user and password
  components must be omitted.

@option [String, to_str] host The host component.

@option [String, to_str] port The port component.

@option [String, to_str] authority

  The authority component. If this is supplied, the user, password,
  userinfo, host, and port components must be omitted.

@option [String, to_str] path The path component.

@option [String, to_str] query The query component.

@option [String, to_str] fragment The fragment component.

@return [Addressable::URI] The constructed URI object.
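
How the component parts fit together follows the recomposition rule in RFC 3986, section 5.3. A minimal plain-Ruby sketch of that rule is shown below; recompose is a hypothetical helper that returns a String, whereas this method returns a URI object.

```ruby
# Hypothetical helper (not part of Addressable): joins components into a
# URI string following RFC 3986, section 5.3.
def recompose(scheme: nil, authority: nil, path: "", query: nil, fragment: nil)
  [
    scheme && "#{scheme}:",
    authority && "//#{authority}",
    path,
    query && "?#{query}",
    fragment && ("#" + fragment)
  ].compact.join
end

uri_string = recompose(
  scheme: "http", authority: "example.com",
  path: "/path", query: "q=1", fragment: "top"
)
```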

Normalizes the encoding of a URI component.

@param [String, to_str] component The URI component to encode.

@param [String, Regexp] character_class

  The characters which are not percent encoded. If a <code>String</code>
  is passed, the <code>String</code> must be formatted as a regular
  expression character class. (Do not include the surrounding square
  brackets.)  For example, <code>"b-zB-Z0-9"</code> would cause
  everything but the letters 'b' through 'z' and the numbers '0'
  through '9' to be percent encoded. If a <code>Regexp</code> is passed,
  the value <code>/[^b-zB-Z0-9]/</code> would have the same effect. A
  set of useful <code>String</code> values may be found in the
  <code>Addressable::URI::CharacterClasses</code> module. The default
  value is the reserved plus unreserved character classes specified in
  <a href="http://www.ietf.org/rfc/rfc3986.txt">RFC 3986</a>.

@return [String] The normalized component.

@example

  Addressable::URI.normalize_component("simpl%65/%65xampl%65", "b-zB-Z")
  #=> "simple%2Fex%61mple"
  Addressable::URI.normalize_component(
    "simpl%65/%65xampl%65", /[^b-zB-Z]/
  )
  #=> "simple%2Fex%61mple"
  Addressable::URI.normalize_component(
    "simpl%65/%65xampl%65",
    Addressable::URI::CharacterClasses::UNRESERVED
  )
  #=> "simple%2Fexample"
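
The results above are consistent with decoding every percent escape first and then re-encoding whatever falls outside the character class. The sketch below shows that two-step reading in plain Ruby; normalize_sketch is a hypothetical helper, handles ASCII input only, and is not the library's implementation.

```ruby
# Hypothetical helper (not part of Addressable), ASCII input only:
# 1. decode every %XX escape, 2. re-encode characters outside the class.
def normalize_sketch(component, character_class)
  decoded = component.gsub(/%([0-9A-Fa-f]{2})/) { $1.hex.chr }
  decoded.gsub(/[^#{character_class}]/) do |char|
    char.bytes.map { |byte| format("%%%02X", byte) }.join
  end
end

normalized = normalize_sketch("simpl%65/%65xampl%65", "b-zB-Z")
```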

Normalizes the encoding of a URI. Characters within a hostname are not percent encoded to allow for internationalized domain names.

@param [String, Addressable::URI, to_str] uri

  The URI to encode.

@param [Class] returning

  The type of object to return.
  This value may only be set to <code>String</code> or
  <code>Addressable::URI</code>. All other values are invalid. Defaults
  to <code>String</code>.

@return [String, Addressable::URI]

  The encoded URI.
  The return type is determined by the <code>returning</code> parameter.

Returns a URI object based on the parsed string.

@param [String, Addressable::URI, to_str] uri

  The URI string to parse.
  No parsing is performed if the object is already an
  <code>Addressable::URI</code>.

@return [Addressable::URI] The parsed URI.

Returns a hash of common IP-based schemes and their default port numbers. Adding new schemes to this hash, as necessary, will allow for better URI normalization.

Unencodes any percent encoded characters within a URI component. This method may be used for unencoding either components or full URIs; however, the unencode_component alias is recommended when unencoding components.

@param [String, Addressable::URI, to_str] uri

  The URI or component to unencode.

@param [Class] returning

  The type of object to return.
  This value may only be set to <code>String</code> or
  <code>Addressable::URI</code>. All other values are invalid. Defaults
  to <code>String</code>.

@return [String, Addressable::URI]

  The unencoded component or URI.
  The return type is determined by the <code>returning</code> parameter.

Public Instance methods

+(uri)

Alias for join

Returns true if the URI objects are equal. This method normalizes both URIs before doing the comparison.

@param [Object] uri The URI to compare.

@return [TrueClass, FalseClass]

  <code>true</code> if the URIs are equivalent, <code>false</code>
  otherwise.

Returns true if the URI objects are equal. This method normalizes both URIs before doing the comparison, and allows comparison against Strings.

@param [Object] uri The URI to compare.

@return [TrueClass, FalseClass]

  <code>true</code> if the URIs are equivalent, <code>false</code>
  otherwise.

Determines if the URI is absolute.

@return [TrueClass, FalseClass]

  <code>true</code> if the URI is absolute. <code>false</code>
  otherwise.

The authority component for this URI. Combines the user, password, host, and port components.

@return [String] The authority component.
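
The combination follows the RFC 3986 form [userinfo@]host[:port]. A minimal plain-Ruby sketch is shown below; build_authority is a hypothetical helper and is not part of this class.

```ruby
# Hypothetical helper (not part of Addressable): assembles an authority
# string of the form [user[:password]@]host[:port].
def build_authority(host:, user: nil, password: nil, port: nil)
  userinfo = user && (password ? "#{user}:#{password}" : user)
  [
    userinfo && "#{userinfo}@",
    host,
    port && ":#{port}"
  ].compact.join
end

authority = build_authority(
  host: "example.com", user: "user", password: "pass", port: 8080
)
```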

Sets the authority component for this URI.

@param [String, to_str] new_authority The new authority component.

The basename, if any, of the file in the path component.

@return [String] The path's basename.

This method allows you to make several changes to a URI at once that would each cause a validation error if made separately but are valid in combination. The URI is revalidated as soon as the entire block has been executed.

@param [Proc] block

  A set of operations to perform on a given URI.
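
The deferred-validation pattern can be illustrated with a small stand-alone class. TinyURI below is hypothetical, and its rule that a port requires a host is an assumption chosen for the example; it is a sketch of the pattern, not this class's code.

```ruby
# Hypothetical class (not part of Addressable) showing deferred validation.
class TinyURI
  class InvalidError < StandardError; end

  attr_reader :host, :port

  def initialize
    @host = nil
    @port = nil
    @deferring = false
  end

  def host=(value)
    @host = value
    validate
  end

  def port=(value)
    @port = value
    validate
  end

  # Suspends validation while the block runs, then validates once.
  def defer_validation
    @deferring = true
    yield
    @deferring = false
    validate
  ensure
    @deferring = false
  end

  private

  # Assumed rule for this sketch: a port is only valid alongside a host.
  def validate
    return if @deferring
    raise InvalidError, "port given without host" if @port && @host.nil?
  end
end

uri = TinyURI.new
uri.defer_validation do
  uri.port = 8080            # would raise on its own: no host yet
  uri.host = "example.com"   # the combination is valid
end
```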

Creates a URI suitable for display to users. If semantic attacks are likely, the application should try to detect these and warn the user. See <a href="http://www.ietf.org/rfc/rfc3986.txt">RFC 3986</a>, section 7.6 for more information.

@return [Addressable::URI] A URI suitable for display purposes.

Clones the URI object.

@return [Addressable::URI] The cloned URI.

Returns true if the URI objects are equal. This method does NOT normalize either URI before doing the comparison.

@param [Object] uri The URI to compare.

@return [TrueClass, FalseClass]

  <code>true</code> if the URIs are equivalent, <code>false</code>
  otherwise.

The extname, if any, of the file in the path component. Empty string if there is no extension.

@return [String] The path's extname.

The fragment component for this URI.

@return [String] The fragment component.

Sets the fragment component for this URI.

@param [String, to_str] new_fragment The new fragment component.

Freezes the URI, initializing instance variables.

@return [Addressable::URI] The frozen URI object.

A hash value that will make a URI equivalent to its normalized form.

@return [Integer] A hash of the URI.

The host component for this URI.

@return [String] The host component.

Sets the host component for this URI.

@param [String, to_str] new_host The new host component.

The inferred port component for this URI. This method will normalize to the default port for the URI's scheme if the port isn't explicitly specified in the URI.

@return [Integer] The inferred port component.
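
The inference amounts to falling back to the scheme's default port when none is given. A plain-Ruby sketch follows; DEFAULT_PORTS is a subset of the PORT_MAPPING values listed under Constants, and infer_port is a hypothetical helper.

```ruby
# Subset of the PORT_MAPPING values listed under Constants.
DEFAULT_PORTS = { "http" => 80, "https" => 443, "ftp" => 21 }.freeze

# Hypothetical helper (not part of Addressable): returns the explicit
# port if present, otherwise the default for the scheme (or nil).
def infer_port(scheme, explicit_port = nil)
  explicit_port || DEFAULT_PORTS[scheme.to_s.downcase]
end

port = infer_port("http")
```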

Returns a String representation of the URI object's state.

@return [String] The URI object's state, as a String.

Determines if the scheme indicates an IP-based protocol.

@return [TrueClass, FalseClass]

  <code>true</code> if the scheme indicates an IP-based protocol.
  <code>false</code> otherwise.

Joins two URIs together.

@param [String, Addressable::URI, to_str] uri The URI to join with.

@return [Addressable::URI] The joined URI.

Destructive form of join.

@param [String, Addressable::URI, to_str] uri The URI to join with.

@return [Addressable::URI] The joined URI.

@see Addressable::URI#join

Merges a URI with a Hash of components. This method has different behavior from join. Any components present in the hash parameter will override the original components. The path component is not treated specially.

@param [Hash, Addressable::URI, to_hash] hash The components to merge with.

@return [Addressable::URI] The merged URI.

@see Hash#merge

Destructive form of merge.

@param [Hash, Addressable::URI, to_hash] hash The components to merge with.

@return [Addressable::URI] The merged URI.

@see Addressable::URI#merge

Returns a normalized URI object.

NOTE: This method does not attempt to fully conform to specifications. It exists largely to correct other people's failures to read the specifications, and also to deal with caching issues since several different URIs may represent the same resource and should not be cached multiple times.

@return [Addressable::URI] The normalized URI.

Destructively normalizes this URI object.

@return [Addressable::URI] The normalized URI.

@see Addressable::URI#normalize

The authority component for this URI, normalized.

@return [String] The authority component, normalized.

The fragment component for this URI, normalized.

@return [String] The fragment component, normalized.

The host component for this URI, normalized.

@return [String] The host component, normalized.

The password component for this URI, normalized.

@return [String] The password component, normalized.

The path component for this URI, normalized.

@return [String] The path component, normalized.
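
Part of path normalization is resolving "." and ".." segments, as described in RFC 3986, section 5.2.4. The sketch below does this with a segment stack in plain Ruby. It is not the library's implementation, which is built on the NPATH regular expressions listed under Constants, and remove_dot_segments is a hypothetical helper.

```ruby
# Hypothetical helper (not part of Addressable): resolves "." and ".."
# segments using a stack of path segments.
def remove_dot_segments(path)
  segments = path.split("/", -1)
  output = []
  segments.each_with_index do |segment, index|
    last = (index == segments.length - 1)
    case segment
    when "."
      output << "" if last          # keep the trailing slash
    when ".."
      # Never pop the leading empty segment of an absolute path.
      output.pop unless output.empty? || output == [""]
      output << "" if last          # keep the trailing slash
    else
      output << segment
    end
  end
  output.join("/")
end

simplified = remove_dot_segments("/a/b/c/./../../g")
```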

The port component for this URI, normalized.

@return [Integer] The port component, normalized.

The query component for this URI, normalized.

@return [String] The query component, normalized.

The scheme component for this URI, normalized.

@return [String] The scheme component, normalized.

The normalized combination of components that represent a site. Combines the scheme, user, password, host, and port components. Primarily useful for HTTP and HTTPS.

For example, "http://example.com/path?query" would have a site value of "http://example.com".

@return [String] The normalized components that identify a site.

The user component for this URI, normalized.

@return [String] The user component, normalized.

The userinfo component for this URI, normalized.

@return [String] The userinfo component, normalized.

Omits components from a URI.

@param [Symbol] *components The components to be omitted.

@return [Addressable::URI] The URI with components omitted.

@example

  uri = Addressable::URI.parse("http://example.com/path?query")
  #=> #<Addressable::URI:0xcc5e7a URI:http://example.com/path?query>
  uri.omit(:scheme, :authority)
  #=> #<Addressable::URI:0xcc4d86 URI:/path?query>

Destructive form of omit.

@param [Symbol] *components The components to be omitted.

@return [Addressable::URI] The URI with components omitted.

@see Addressable::URI#omit

The origin for this URI, serialized to ASCII, as per draft-ietf-websec-origin-00, section 5.2.

@return [String] The serialized origin.

The password component for this URI.

@return [String] The password component.

Sets the password component for this URI.

@param [String, to_str] new_password The new password component.

The path component for this URI.

@return [String] The path component.

Sets the path component for this URI.

@param [String, to_str] new_path The new path component.

The port component for this URI. This is the port number actually given in the URI. This does not infer port numbers from default values.

@return [Integer] The port component.

Sets the port component for this URI.

@param [String, Integer, to_s] new_port The new port component.

The query component for this URI.

@return [String] The query component.

Sets the query component for this URI.

@param [String, to_str] new_query The new query component.

Converts the query component to a Hash value.

@option [Symbol] notation

  May be one of <code>:flat</code>, <code>:dot</code>, or
  <code>:subscript</code>. The <code>:dot</code> notation is not
  supported for assignment. Default value is <code>:subscript</code>.

@return [Hash, Array] The query string parsed as a Hash or Array object.

@example

  Addressable::URI.parse("?one=1&two=2&three=3").query_values
  #=> {"one" => "1", "two" => "2", "three" => "3"}
  Addressable::URI.parse("?one[two][three]=four").query_values
  #=> {"one" => {"two" => {"three" => "four"}}}
  Addressable::URI.parse("?one.two.three=four").query_values(
    :notation => :dot
  )
  #=> {"one" => {"two" => {"three" => "four"}}}
  Addressable::URI.parse("?one[two][three]=four").query_values(
    :notation => :flat
  )
  #=> {"one[two][three]" => "four"}
  Addressable::URI.parse("?one.two.three=four").query_values(
    :notation => :flat
  )
  #=> {"one.two.three" => "four"}
  Addressable::URI.parse(
    "?one[two][three][]=four&one[two][three][]=five"
  ).query_values
  #=> {"one" => {"two" => {"three" => ["four", "five"]}}}
  Addressable::URI.parse(
    "?one=two&one=three").query_values(:notation => :flat_array)
  #=> [['one', 'two'], ['one', 'three']]
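
The nesting produced by the :subscript notation can be sketched in plain Ruby as follows. nested_query_values is a hypothetical helper built on the standard library; it covers only bracketed Hash keys, not the "[]" array form or conflicting keys, and it is not this method's implementation.

```ruby
require "uri"

# Hypothetical helper (not part of Addressable): turns keys such as
# "one[two][three]" into nested Hashes. Array subscripts ("[]") and
# conflicting keys are not handled.
def nested_query_values(query)
  URI.decode_www_form(query).each_with_object({}) do |(key, value), result|
    *parents, leaf = key.scan(/[^\[\]]+/)
    # Walk (and create) the intermediate Hashes, then set the leaf.
    node = parents.inject(result) { |hash, name| hash[name] ||= {} }
    node[leaf] = value
  end
end

nested = nested_query_values("one[two][three]=four")
```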

Sets the query component for this URI from a Hash object. This method produces a query string using the :subscript notation. An empty Hash will result in a nil query.

@param [Hash, to_hash, Array] new_query_values The new query values.

Determines if the URI is relative.

@return [TrueClass, FalseClass]

  <code>true</code> if the URI is relative. <code>false</code>
  otherwise.

The HTTP request URI for this URI. This is the path and the query string.

@return [String] The request URI required for an HTTP request.

Sets the HTTP request URI for this URI.

@param [String, to_str] new_request_uri The new HTTP request URI.

Returns the shortest normalized relative form of this URI that uses the supplied URI as a base for resolution. Returns an absolute URI if necessary. This is effectively the opposite of route_to.

@param [String, Addressable::URI, to_str] uri The URI to route from.

@return [Addressable::URI]

  The normalized relative URI that is equivalent to the original URI.

Returns the shortest normalized relative form of the supplied URI that uses this URI as a base for resolution. Returns an absolute URI if necessary. This is effectively the opposite of route_from.

@param [String, Addressable::URI, to_str] uri The URI to route to.

@return [Addressable::URI]

  The normalized relative URI that is equivalent to the supplied URI.

The scheme component for this URI.

@return [String] The scheme component.

Sets the scheme component for this URI.

@param [String, to_str] new_scheme The new scheme component.

The combination of components that represent a site. Combines the scheme, user, password, host, and port components. Primarily useful for HTTP and HTTPS.

For example, "http://example.com/path?query" would have a site value of "http://example.com".

@return [String] The components that identify a site.
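
Put differently, the site is the scheme and authority with everything from the path onward dropped. A plain-Ruby sketch is shown below; site_of is a hypothetical helper, and URI_PATTERN is copied from the URIREGEX constant listed under Constants.

```ruby
# Pattern copied from the URIREGEX constant listed under Constants.
URI_PATTERN = /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?$/

# Hypothetical helper (not part of Addressable): returns the scheme and
# authority portion of a URI string, or nil if neither is present.
def site_of(str)
  match = URI_PATTERN.match(str)
  scheme = match[2]
  authority = match[4]
  return nil unless scheme || authority
  [scheme && "#{scheme}:", authority && "//#{authority}"].compact.join
end

site = site_of("http://user:pass@example.com:8080/path?query")
```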

Sets the site value for this URI.

@param [String, to_str] new_site The new site value.

Returns a Hash of the URI components.

@return [Hash] The URI as a Hash of components.

Converts the URI to a String.

@return [String] The URI's String representation.

to_str()

Alias for to_s

The user component for this URI.

@return [String] The user component.

Sets the user component for this URI.

@param [String, to_str] new_user The new user component.

The userinfo component for this URI. Combines the user and password components.

@return [String] The userinfo component.

Sets the userinfo component for this URI.

@param [String, to_str] new_userinfo The new userinfo component.
