Class UrlValidator

java.lang.Object
org.apache.commons.validator.routines.UrlValidator
All Implemented Interfaces:
Serializable

public class UrlValidator extends Object implements Serializable

URL Validation routines.

Behavior of validation is modified by passing in options:
  • ALLOW_2_SLASHES - [FALSE] Allows double '/' characters in the path component.
  • NO_FRAGMENTS - [FALSE] By default fragments are allowed; if this option is included, fragments are flagged as illegal.
  • ALLOW_ALL_SCHEMES - [FALSE] By default only http, https, and ftp are considered valid schemes. Enabling this option will let any scheme pass validation.

Originally based on a PHP script by Debbie Dyer, validation.php v1.2b, Date: 03/07/02, http://javascript.internet.com. However, this validation now bears little resemblance to the PHP original.

   Example of usage:
   Construct a UrlValidator with valid schemes of "http", and "https".

    String[] schemes = {"http","https"};
    UrlValidator urlValidator = new UrlValidator(schemes);
    if (urlValidator.isValid("ftp://foo.bar.com/")) {
       System.out.println("url is valid");
    } else {
       System.out.println("url is invalid");
    }

    prints "url is invalid"
   If instead the default constructor is used:

    UrlValidator urlValidator = new UrlValidator();
    if (urlValidator.isValid("ftp://foo.bar.com/")) {
       System.out.println("url is valid");
    } else {
       System.out.println("url is invalid");
    }

   prints out "url is valid"
  
Since:
Validator 1.4
Version:
$Revision: 1713573 $
  • Field Details

    • serialVersionUID

      private static final long serialVersionUID
    • ALLOW_ALL_SCHEMES

      public static final long ALLOW_ALL_SCHEMES
      Allows all validly formatted schemes to pass validation instead of supplying a set of valid schemes.
    • ALLOW_2_SLASHES

      public static final long ALLOW_2_SLASHES
      Allow two slashes in the path component of the URL.
    • NO_FRAGMENTS

      public static final long NO_FRAGMENTS
      Enabling this option disallows any URL fragments.
    • ALLOW_LOCAL_URLS

      public static final long ALLOW_LOCAL_URLS
      Allow local URLs, such as http://localhost/ or http://machine/. This enables a broad-brush check; for complex local machine name validation requirements, create your validator with a RegexValidator instead (see UrlValidator(RegexValidator, long)).
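      The effect of this flag can be sketched as follows. The example assumes commons-validator is on the classpath; the class LocalUrlDemo and its helper methods are made up for illustration and are not part of the library.

```java
import org.apache.commons.validator.routines.UrlValidator;

// Hypothetical demo class; only UrlValidator comes from the library.
class LocalUrlDemo {

    // Validates with the default options, where local names are not accepted.
    static boolean strict(String url) {
        return new UrlValidator().isValid(url);
    }

    // Validates with ALLOW_LOCAL_URLS switched on.
    static boolean lenient(String url) {
        return new UrlValidator(UrlValidator.ALLOW_LOCAL_URLS).isValid(url);
    }

    public static void main(String[] args) {
        String url = "http://localhost/";
        System.out.println(url + " strict=" + strict(url) + " lenient=" + lenient(url));
    }
}
```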
    • URL_REGEX

      private static final String URL_REGEX
      This expression is derived from the BNF for URIs (RFC 2396).
    • URL_PATTERN

      private static final Pattern URL_PATTERN
    • PARSE_URL_SCHEME

      private static final int PARSE_URL_SCHEME
      Scheme/protocol (e.g. http:, ftp:, file:).
    • PARSE_URL_AUTHORITY

      private static final int PARSE_URL_AUTHORITY
      Includes hostname/ip and port number.
    • PARSE_URL_PATH

      private static final int PARSE_URL_PATH
    • PARSE_URL_QUERY

      private static final int PARSE_URL_QUERY
    • PARSE_URL_FRAGMENT

      private static final int PARSE_URL_FRAGMENT
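      The PARSE_URL_* constants index the capture groups of the URL pattern. A minimal sketch of that idea, using only the JDK, is shown here. The expression is the URI-splitting regex from RFC 2396, Appendix B; that URL_REGEX is identical to it, and the group numbers assigned to the constants, are assumptions. The class UrlSplitSketch is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: names mirror the private fields above, values are assumed.
class UrlSplitSketch {

    // URI-splitting expression from RFC 2396, Appendix B.
    static final Pattern URL_PATTERN = Pattern.compile(
            "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");

    // Capture-group index of each component (assumed values).
    static final int PARSE_URL_SCHEME = 2;
    static final int PARSE_URL_AUTHORITY = 4;
    static final int PARSE_URL_PATH = 5;
    static final int PARSE_URL_QUERY = 7;
    static final int PARSE_URL_FRAGMENT = 9;

    // Returns {scheme, authority, path, query, fragment}, or null if nothing matches.
    // Absent optional parts are null; an absent path is the empty string.
    static String[] split(String url) {
        Matcher m = URL_PATTERN.matcher(url);
        if (!m.matches()) {
            return null;
        }
        return new String[] {
            m.group(PARSE_URL_SCHEME),
            m.group(PARSE_URL_AUTHORITY),
            m.group(PARSE_URL_PATH),
            m.group(PARSE_URL_QUERY),
            m.group(PARSE_URL_FRAGMENT)
        };
    }

    public static void main(String[] args) {
        for (String part : split("http://example.com:8080/docs/index.html?lang=en#top")) {
            System.out.println(part);
        }
    }
}
```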
    • SCHEME_REGEX

      private static final String SCHEME_REGEX
      Protocol scheme (e.g. http, ftp, https).
    • SCHEME_PATTERN

      private static final Pattern SCHEME_PATTERN
    • AUTHORITY_CHARS_REGEX

      private static final String AUTHORITY_CHARS_REGEX
    • IPV6_REGEX

      private static final String IPV6_REGEX
    • USERINFO_CHARS_REGEX

      private static final String USERINFO_CHARS_REGEX
    • USERINFO_FIELD_REGEX

      private static final String USERINFO_FIELD_REGEX
    • AUTHORITY_REGEX

      private static final String AUTHORITY_REGEX
    • AUTHORITY_PATTERN

      private static final Pattern AUTHORITY_PATTERN
    • PARSE_AUTHORITY_IPV6

      private static final int PARSE_AUTHORITY_IPV6
    • PARSE_AUTHORITY_HOST_IP

      private static final int PARSE_AUTHORITY_HOST_IP
    • PARSE_AUTHORITY_EXTRA

      private static final int PARSE_AUTHORITY_EXTRA
      Should always be empty. The code currently allows spaces.
    • PATH_REGEX

      private static final String PATH_REGEX
    • PATH_PATTERN

      private static final Pattern PATH_PATTERN
    • QUERY_REGEX

      private static final String QUERY_REGEX
    • QUERY_PATTERN

      private static final Pattern QUERY_PATTERN
    • options

      private final long options
      Holds the set of current validation options.
    • allowedSchemes

      private final Set<String> allowedSchemes
      The set of schemes that are allowed to be in a URL.
    • authorityValidator

      private final RegexValidator authorityValidator
      Regular expressions used to manually validate authorities if IANA domain name validation isn't desired.
    • DEFAULT_SCHEMES

      private static final String[] DEFAULT_SCHEMES
      If no schemes are provided, default to this set.
    • DEFAULT_URL_VALIDATOR

      private static final UrlValidator DEFAULT_URL_VALIDATOR
      Singleton instance of this class with default schemes and options.
  • Constructor Details

    • UrlValidator

      public UrlValidator()
      Create a UrlValidator with default properties.
    • UrlValidator

      public UrlValidator(String[] schemes)
      Behavior of validation is modified by passing in a set of scheme strings:
      Parameters:
      schemes - Pass in one or more URL schemes to consider valid. Passing in null defaults to "http", "https", and "ftp" being valid. If a non-null schemes array is specified, it must list every valid scheme. Setting the ALLOW_ALL_SCHEMES option causes the contents of schemes to be ignored.
    • UrlValidator

      public UrlValidator(long options)
      Initialize a UrlValidator with the given validation options.
      Parameters:
      options - The options should be set using the public constants declared in this class. To set multiple options you simply add them together. For example, ALLOW_2_SLASHES + NO_FRAGMENTS enables both of those options.
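      Combining options by addition might look like the sketch below. It assumes commons-validator is on the classpath; the class OptionsDemo and the sample URLs are illustrative only.

```java
import org.apache.commons.validator.routines.UrlValidator;

// Hypothetical demo class; only UrlValidator comes from the library.
class OptionsDemo {

    // Both options enabled: double slashes are accepted, fragments are rejected.
    static final UrlValidator VALIDATOR =
            new UrlValidator(UrlValidator.ALLOW_2_SLASHES + UrlValidator.NO_FRAGMENTS);

    public static void main(String[] args) {
        System.out.println(VALIDATOR.isValid("http://www.apache.org/a//b"));
        System.out.println(VALIDATOR.isValid("http://www.apache.org/index.html#top"));
    }
}
```

      Assuming each constant occupies its own bit, a bitwise OR (ALLOW_2_SLASHES | NO_FRAGMENTS) yields the same value as the sum and stays correct even if a flag is listed twice.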
    • UrlValidator

      public UrlValidator(String[] schemes, long options)
      Behavior of validation is modified by passing in options:
      Parameters:
      schemes - The set of valid schemes. Ignored if the ALLOW_ALL_SCHEMES option is set.
      options - The options should be set using the public constants declared in this class. To set multiple options you simply add them together. For example, ALLOW_2_SLASHES + NO_FRAGMENTS enables both of those options.
    • UrlValidator

      public UrlValidator(RegexValidator authorityValidator, long options)
      Initialize a UrlValidator with the given validation options.
      Parameters:
      authorityValidator - Regular expression validator used to validate the authority part. This allows the user to override the standard set of domains.
      options - Validation options. Set using the public constants of this class. To set multiple options, simply add them together:

      ALLOW_2_SLASHES + NO_FRAGMENTS

      enables both of those options.
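      A custom authority validator could be supplied along these lines. The sketch assumes commons-validator is on the classpath; the class AuthorityRegexDemo, the two patterns, and the sample host names are illustrative choices, not defaults of the library.

```java
import org.apache.commons.validator.routines.RegexValidator;
import org.apache.commons.validator.routines.UrlValidator;

// Hypothetical demo class; RegexValidator and UrlValidator come from the library.
class AuthorityRegexDemo {

    // Authorities to accept in addition to regular domain names (illustrative patterns).
    static final RegexValidator LOCAL_AUTHORITIES =
            new RegexValidator(new String[] {"localhost", ".*\\.my-testing"});

    // No option flags set; only the authority check is customized.
    static final UrlValidator VALIDATOR = new UrlValidator(LOCAL_AUTHORITIES, 0L);

    public static void main(String[] args) {
        System.out.println(VALIDATOR.isValid("http://localhost/test/index.html"));
        System.out.println(VALIDATOR.isValid("http://first.my-testing/test/index.html"));
        System.out.println(VALIDATOR.isValid("http://broke.hostname/test/index.html"));
    }
}
```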
    • UrlValidator

      public UrlValidator(String[] schemes, RegexValidator authorityValidator, long options)
      Customizable constructor. Validation behavior is modified by passing in options.
      Parameters:
      schemes - the set of valid schemes. Ignored if the ALLOW_ALL_SCHEMES option is set.
      authorityValidator - Regular expression validator used to validate the authority part.
      options - Validation options. Set using the public constants of this class. To set multiple options, simply add them together:

      ALLOW_2_SLASHES + NO_FRAGMENTS

      enables both of those options.
  • Method Details

    • getInstance

      public static UrlValidator getInstance()
      Returns the singleton instance of this class with default schemes and options.
      Returns:
      singleton instance with default schemes and options
    • isValid

      public boolean isValid(String value)

      Checks if a field has a valid url address.

      Note that the method calls isValidAuthority(String), which checks that the domain is valid.
      Parameters:
      value - The value validation is being performed on. A null value is considered invalid.
      Returns:
      true if the url is valid.
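      A short usage sketch with the shared default instance follows. It assumes commons-validator is on the classpath; the class IsValidDemo and its check helper are hypothetical.

```java
import org.apache.commons.validator.routines.UrlValidator;

// Hypothetical demo class; only UrlValidator comes from the library.
class IsValidDemo {

    // Validates against the default schemes and options; null is treated as invalid.
    static boolean check(String url) {
        return UrlValidator.getInstance().isValid(url);
    }

    public static void main(String[] args) {
        String[] samples = {"http://www.apache.org/", "not a url", null};
        for (String sample : samples) {
            System.out.println(sample + " -> " + check(sample));
        }
    }
}
```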
    • isValidScheme

      protected boolean isValidScheme(String scheme)
      Validate scheme. If schemes[] was initialized to a non-null value, then only those schemes are allowed. Otherwise the default schemes are "http", "https", and "ftp". Matching is case-insensitive.
      Parameters:
      scheme - The scheme to validate. A null value is considered invalid.
      Returns:
      true if valid.
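      The check described above can be approximated with the JDK alone. This is a sketch under assumptions, not the library's code: the scheme pattern, the lower-casing strategy, and the class SchemeCheckSketch are all illustrative.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch of a case-insensitive scheme check.
class SchemeCheckSketch {

    // Assumed scheme shape: a letter followed by letters, digits, '+', '-' or '.'.
    static final Pattern SCHEME_PATTERN = Pattern.compile("^\\p{Alpha}[\\p{Alnum}+\\-.]*");

    // Allowed schemes, stored in lower case.
    final Set<String> allowedSchemes = new HashSet<>();

    SchemeCheckSketch(String... schemes) {
        for (String scheme : schemes) {
            allowedSchemes.add(scheme.toLowerCase(Locale.ENGLISH));
        }
    }

    // Null, malformed, and unlisted schemes are all rejected.
    boolean isValidScheme(String scheme) {
        if (scheme == null) {
            return false;
        }
        if (!SCHEME_PATTERN.matcher(scheme).matches()) {
            return false;
        }
        return allowedSchemes.contains(scheme.toLowerCase(Locale.ENGLISH));
    }

    public static void main(String[] args) {
        SchemeCheckSketch sketch = new SchemeCheckSketch("http", "https");
        System.out.println(sketch.isValidScheme("HTTP"));
        System.out.println(sketch.isValidScheme("ftp"));
    }
}
```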
    • isValidAuthority

      protected boolean isValidAuthority(String authority)
      Returns true if the authority is properly formatted. An authority is the combination of hostname and port. A null authority value is considered invalid. Note: this implementation validates the domain unless a RegexValidator was provided. If a RegexValidator was supplied and it matches, the authority is regarded as valid with no further checks; otherwise the method checks against the AUTHORITY_PATTERN and the DomainValidator (which honors ALLOW_LOCAL_URLS).
      Parameters:
      authority - Authority value to validate; allows IDN.
      Returns:
      true if authority (hostname and port) is valid.
    • isValidPath

      protected boolean isValidPath(String path)
      Returns true if the path is valid. A null value is considered invalid.
      Parameters:
      path - Path value to validate.
      Returns:
      true if path is valid.
    • isValidQuery

      protected boolean isValidQuery(String query)
      Returns true if the query is null or it's a properly formatted query string.
      Parameters:
      query - Query value to validate.
      Returns:
      true if query is valid.
    • isValidFragment

      protected boolean isValidFragment(String fragment)
      Returns true if the given fragment is null or fragments are allowed.
      Parameters:
      fragment - Fragment value to validate.
      Returns:
      true if fragment is valid.
    • countToken

      protected int countToken(String token, String target)
      Returns the number of times the token appears in the target.
      Parameters:
      token - Token value to be counted.
      target - Target value to count tokens in.
      Returns:
      the number of tokens.
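      One way to count token occurrences with the JDK is sketched here. Whether the library counts overlapping occurrences is an assumption; this sketch does, so "///" contains "//" twice. The class TokenCountSketch is hypothetical.

```java
// Hypothetical stand-alone sketch of counting a token inside a target string.
class TokenCountSketch {

    // Counts occurrences of token in target, including overlapping ones.
    static int countToken(String token, String target) {
        if (token.isEmpty()) {
            return 0; // an empty token would otherwise match at every position
        }
        int count = 0;
        int index = target.indexOf(token);
        while (index != -1) {
            count++;
            // Advance by one character so overlapping matches are found too.
            index = target.indexOf(token, index + 1);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countToken("//", "/a//b"));
        System.out.println(countToken("//", "///"));
    }
}
```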
    • isOn

      private boolean isOn(long flag)
      Tests whether the given flag is on. If the flag is not a power of 2 (e.g. 3) this tests whether the combination of flags is on.
      Parameters:
      flag - Flag value to check.
      Returns:
      whether the specified flag value is on.
    • isOff

      private boolean isOff(long flag)
      Tests whether the given flag is off. If the flag is not a power of 2 (e.g. 3) this tests whether the combination of flags is off.
      Parameters:
      flag - Flag value to check.
      Returns:
      whether the specified flag value is off.
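      Bit-flag tests of this kind are commonly written with a bitwise AND. The sketch below uses made-up flags (FLAG_A, FLAG_B, FLAG_C) and a hypothetical class FlagSketch. How a composite flag is treated is an assumption: here it counts as on when any of its bits is set, and as off only when none are.

```java
// Hypothetical stand-alone sketch of testing option bits.
class FlagSketch {

    // Made-up flags; each occupies its own bit.
    static final long FLAG_A = 1L << 0;
    static final long FLAG_B = 1L << 1;
    static final long FLAG_C = 1L << 2;

    final long options;

    FlagSketch(long options) {
        this.options = options;
    }

    // On if at least one bit of flag is set in options.
    boolean isOn(long flag) {
        return (options & flag) != 0;
    }

    // Off if no bit of flag is set in options.
    boolean isOff(long flag) {
        return (options & flag) == 0;
    }

    public static void main(String[] args) {
        FlagSketch sketch = new FlagSketch(FLAG_A + FLAG_C);
        System.out.println(sketch.isOn(FLAG_A));
        System.out.println(sketch.isOff(FLAG_B));
    }
}
```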
    • matchURL

      Matcher matchURL(String value)