What is a URI?

A Uniform Resource Identifier (URI) is a compact string of characters for identifying an abstract or physical resource [RFC2396]. URIs provide a simple and extensible means for identifying a resource. A URI can be further classified as a locator, a name, or both. The term "Uniform Resource Locator" (URL) refers to the subset of URIs that identify resources via a representation of their primary access mechanism (for example, their network location), rather than by name or by some other attribute of the resource. The term "Uniform Resource Name" (URN) refers to the subset of URIs that are required to remain globally unique and persistent even when the resource ceases to exist or becomes unavailable. Refer to RFC 2396 for the complete specification.

Examples of URIs:

http://www.polyu.edu.hk

http://proxy.polyu.edu.hk:8181

ftp://ftp.isi.edu/in-notes/rfc2234.txt

ftp://yourusername:yourpassword@the.site.youwant/the/path/of/yourfile

mailto:entchsun@polyu.edu.hk

telnet://hkpu10.polyu.edu.hk

news:comp.infosystems.www.servers.unix

gopher://spinaltap.micro.umn.edu/00/Weather/California/Los%20Angeles
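As a rough illustration of how such strings break down into components, the sketch below runs a few of the examples above through urlsplit from Python's standard urllib.parse module. Note that urlsplit follows later URI specifications rather than RFC 2396 exactly, and describe is a hypothetical helper name introduced only for this example.

```python
from urllib.parse import urlsplit

EXAMPLES = [
    "http://proxy.polyu.edu.hk:8181",
    "ftp://ftp.isi.edu/in-notes/rfc2234.txt",
    "ftp://yourusername:yourpassword@the.site.youwant/the/path/of/yourfile",
    "mailto:entchsun@polyu.edu.hk",
    "news:comp.infosystems.www.servers.unix",
]


def describe(uri):
    """Return the main components of a URI as a plain dictionary."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "user": parts.username,   # None when there is no userinfo
        "host": parts.hostname,   # None for URIs without an authority
        "port": parts.port,       # None when no port is given
        "path": parts.path,
    }


if __name__ == "__main__":
    for uri in EXAMPLES:
        print(uri, "->", describe(uri))
```

The mailto and news examples have no "//" after the scheme, so they carry no host; everything after the colon ends up in the path field.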


Collected BNF [RFC2234] for URI

URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
absoluteURI   = scheme ":" ( hier_part | opaque_part )
relativeURI   = ( net_path | abs_path | rel_path ) [ "?" query ]
hier_part     = ( net_path | abs_path ) [ "?" query ]
opaque_part   = uric_no_slash *uric
uric_no_slash = unreserved | escaped | ";" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","
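The top-level rules can be read as a decision procedure: strip the optional fragment, look for a leading scheme, then check whether the text after the colon begins with a slash (hier_part) or not (opaque_part). The following is a minimal sketch of that reading. It assumes the input is already a syntactically valid URI-reference and does not check individual characters; the name classify and its return shape are illustrative and not part of RFC 2396.

```python
import re

# scheme ":" at the very start of the string, per the scheme rule.
SCHEME_PREFIX = re.compile(r"([A-Za-z][A-Za-z0-9+.\-]*):")


def classify(uri_reference):
    """Return (kind, scheme, rest, fragment) for a URI-reference.

    kind is "hierarchical" or "opaque" for an absoluteURI,
    and "relative" when no scheme is present.
    """
    # URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
    body, _, fragment = uri_reference.partition("#")

    match = SCHEME_PREFIX.match(body)
    if match is None:
        return "relative", None, body, fragment

    # absoluteURI = scheme ":" ( hier_part | opaque_part )
    rest = body[match.end():]
    kind = "hierarchical" if rest.startswith("/") else "opaque"
    return kind, match.group(1), rest, fragment


if __name__ == "__main__":
    for ref in ("http://www.polyu.edu.hk",
                "mailto:entchsun@polyu.edu.hk",
                "../rfc2234.txt#top"):
        print(ref, "->", classify(ref))
```

Because uric_no_slash excludes "/", an opaque_part can never start with a slash, which is what makes the single startswith test sufficient here.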

net_path    = "//" authority [ abs_path ]
abs_path    = "/" path_segments
rel_path    = rel_segment [ abs_path ]
rel_segment = 1*( unreserved | escaped | ";" | "@" | "&" | "=" | "+" | "$" | "," )
scheme      = alpha *( alpha | digit | "+" | "-" | "." )
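The three path rules differ only in how they begin: a net_path starts with "//", an abs_path with a single "/", and a rel_path with anything else. A small sketch of that distinction follows, assuming the scheme, query, and fragment have already been removed and the input is non-empty. The function name classify_path is hypothetical.

```python
def classify_path(text):
    """Return (rule, authority, path) for a scheme-less, query-less path string."""
    if text.startswith("//"):
        # net_path = "//" authority [ abs_path ]
        authority, slash, tail = text[2:].partition("/")
        return "net_path", authority, slash + tail
    if text.startswith("/"):
        # abs_path = "/" path_segments
        return "abs_path", None, text
    # rel_path = rel_segment [ abs_path ]
    return "rel_path", None, text


if __name__ == "__main__":
    for sample in ("//ftp.isi.edu/in-notes/rfc2234.txt",
                   "//www.polyu.edu.hk",
                   "/in-notes/rfc2234.txt",
                   "rfc2234.txt"):
        print(sample, "->", classify_path(sample))
```

The order of the tests matters: "//" must be checked before "/", since every net_path also begins with a single slash.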

authority   = server | reg_name
reg_name    = 1*( unreserved | escaped | "$" | "," | ";" | ":" | "@" | "&" | "=" | "+" )
server      = [ [ userinfo "@" ] hostport ]
userinfo    = *( unreserved | escaped | ";" | ":" | "&" | "=" | "+" | "$" | "," )
hostport    = host [ ":" port ]
host        = hostname | IPv4address
hostname    = *( domainlabel "." ) toplabel [ "." ]
domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
toplabel    = alpha | alpha *( alphanum | "-" ) alphanum
IPv4address = 1*digit "." 1*digit "." 1*digit "." 1*digit
port        = *digit
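The server rules translate fairly directly into regular expressions. The sketch below is one possible transcription, not a reference implementation: parse_server is a hypothetical name, the userinfo characters are not validated, and the empty server that the grammar permits is rejected here for simplicity.

```python
import re

# domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
DOMAINLABEL = r"[A-Za-z0-9](?:[A-Za-z0-9\-]*[A-Za-z0-9])?"
# toplabel = alpha | alpha *( alphanum | "-" ) alphanum
TOPLABEL = r"[A-Za-z](?:[A-Za-z0-9\-]*[A-Za-z0-9])?"
# hostname = *( domainlabel "." ) toplabel [ "." ]
HOSTNAME = re.compile(rf"(?:{DOMAINLABEL}\.)*{TOPLABEL}\.?")
# IPv4address = 1*digit "." 1*digit "." 1*digit "." 1*digit
IPV4ADDRESS = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+")
# port = *digit
PORT = re.compile(r"[0-9]*")


def parse_server(server):
    """Split a server string into (userinfo, host, port).

    userinfo and port are None when absent. Raises ValueError when the
    host or port does not match the grammar.
    """
    userinfo, at, hostport = server.rpartition("@")
    if not at:
        userinfo = None

    host, colon, port = hostport.partition(":")
    if not (HOSTNAME.fullmatch(host) or IPV4ADDRESS.fullmatch(host)):
        raise ValueError(f"not a hostname or IPv4address: {host!r}")
    if not PORT.fullmatch(port):
        raise ValueError(f"port must consist of digits only: {port!r}")

    return userinfo, host, (port if colon else None)


if __name__ == "__main__":
    print(parse_server("proxy.polyu.edu.hk:8181"))
    print(parse_server("yourusername:yourpassword@the.site.youwant"))
```

Note that the IPv4address rule only requires four dot-separated runs of digits, so a string such as 999.1.1.1 is accepted by the grammar even though it is not a usable address.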

path          = [ abs_path | opaque_part ]
path_segments = segment *( "/" segment )
segment       = *pchar *( ";" param )
param         = *pchar
pchar         = unreserved | escaped | ":" | "@" | "&" | "=" | "+" | "$" | ","
query         = *uric
fragment      = *uric
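According to these rules, each segment of a path may carry its own semicolon-separated parameters. A short sketch of splitting an abs_path along those lines is shown below. The function name split_path_segments is an assumption made for this example, and escaped characters are left undecoded.

```python
def split_path_segments(abs_path):
    """Split an abs_path into a list of (segment, [param, ...]) pairs."""
    if not abs_path.startswith("/"):
        raise ValueError("abs_path must begin with '/'")

    result = []
    # path_segments = segment *( "/" segment )
    for segment in abs_path[1:].split("/"):
        # segment = *pchar *( ";" param )
        name, *params = segment.split(";")
        result.append((name, params))
    return result


if __name__ == "__main__":
    print(split_path_segments("/00/Weather/California/Los%20Angeles"))
    print(split_path_segments("/docs;version=2;lang=en/index"))
```

Splitting on the literal ";" and "/" is safe because pchar contains neither character; a semicolon or slash that is meant as data has to appear in escaped form.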

uric       = reserved | unreserved | escaped
reserved   = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","
unreserved = alphanum | mark
mark       = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
escaped    = "%" hex hex
hex        = digit | "A" | "B" | "C" | "D" | "E" | "F" | "a" | "b" | "c" | "d" | "e" | "f"
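The unreserved and escaped rules suggest a simple encoding procedure: characters in the unreserved set pass through unchanged, and every other octet is written as "%" followed by two hex digits. The sketch below illustrates this under one notable assumption, namely that text is converted to octets using UTF-8, which the grammar itself does not prescribe. The names escape and unescape, and the keep parameter, are hypothetical.

```python
import re
import string

# mark = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
MARK = "-_.!~*'()"
# unreserved = alphanum | mark
UNRESERVED = set(string.ascii_letters + string.digits + MARK)


def escape(text, keep=""):
    """Escape every octet of text that is not unreserved or listed in keep.

    Assumption: text is encoded as UTF-8 before escaping.
    """
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if byte < 128 and (ch in UNRESERVED or ch in keep):
            out.append(ch)
        else:
            # escaped = "%" hex hex
            out.append(f"%{byte:02X}")
    return "".join(out)


def unescape(text):
    """Replace each "%" hex hex triplet with the octet it denotes."""
    raw = re.sub(
        rb"%([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        text.encode("utf-8"),
    )
    return raw.decode("utf-8")


if __name__ == "__main__":
    print(escape("Los Angeles"))        # space is not unreserved
    print(escape("a/b", keep="/"))      # keep "/" as a path delimiter
    print(unescape("Los%20Angeles"))
```

The keep parameter exists because reserved characters such as "/" must stay literal when they act as delimiters and be escaped only when they appear as data.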

alphanum = alpha | digit
alpha    = lowalpha | upalpha
lowalpha = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
upalpha  = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"
digit    = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"