URI类的作用主要是处理地址字符串,将URI分成对应的片段保存到segments,路由类也主要是通过segments数组来获取上下文中的URI请求信息 。
1. URI类是如何将地址字符串解析成对应片段?
答:URI类首先将URL字符串解析成URI字符串,URI字符串的格式则是我们已经非常熟悉的CI路由地址(查询字符串,SCRIPT_NAME,以及SCRIPT_NAME目录名不会出现在uri字符串中),然后再将URI字符串中的参数解析出来存储到segments数组中,这样就讲url解析成对应的片段了。
2. 解析后的对应片段保存到变量中是怎样的数据结构?
答:解析出来的片段从下标1开始依次保存到segments数组中。
1.__construct()构造函数
URI类在初始化的时候就会对地址进行解析,构造函数会根据不同的环境调用对应的解析函数,并保存解析结果。 cli模式:调用_parse_argv()进行解析 根据uri_protocol这个配置属性决定使用哪个解析函数 默认REQUEST_URI,使用_parse_request_uri解析函数 QUERY_STRING,使用_parse_query_string解析函数 PATH_INFO或其他参数,都使用_parse_request_uri解析函数 这些解析函数会将地址解析成uri字符串,再由_set_uri_string函数将uri字符串解析成对应片段
public function __construct()
{
$this->config =& load_class('Config', 'core');
// If query strings are enabled, we don't need to parse any segments.
// However, they don't make sense under CLI.
if (is_cli() OR $this->config->item('enable_query_strings') !== TRUE)
{
$this->_permitted_uri_chars = $this->config->item('permitted_uri_chars');
// If it's a CLI request, ignore the configuration
if (is_cli())
{
$uri = $this->_parse_argv();
}
else
{
$protocol = $this->config->item('uri_protocol');
empty($protocol) && $protocol = 'REQUEST_URI';
switch ($protocol)
{
case 'AUTO': // For BC purposes only
case 'REQUEST_URI':
$uri = $this->_parse_request_uri();
break;
case 'QUERY_STRING':
$uri = $this->_parse_query_string();
break;
case 'PATH_INFO':
default:
$uri = isset($_SERVER[$protocol])
? $_SERVER[$protocol]
: $this->_parse_request_uri();
break;
}
}
$this->_set_uri_string($uri);
}
log_message('info', 'URI Class Initialized');
}
2.命令行模式下_parse_argv()解析函数
这个uri字符串解析数组,是从$_SERVER[‘argv’]中获取参数
/**
* Parse CLI arguments
*
* Take each command line argument and assume it is a URI segment.
*
* @return string
*/
protected function _parse_argv()
{
$args = array_slice($_SERVER['argv'], 1);
return $args ? implode('/', $args) : '';
}
3._parse_request_uri()解析函数
从$_SERVER[‘REQUEST_URI’]中获取参数
/**
* Parse REQUEST_URI
*
* Will parse REQUEST_URI and automatically detect the URI from it,
* while fixing the query string if necessary.
*
* @return string
*/
protected function _parse_request_uri()
{
if ( ! isset($_SERVER['REQUEST_URI'], $_SERVER['SCRIPT_NAME']))
{
return '';
}
// parse_url() returns false if no host is present, but the path or query string
// contains a colon followed by a number
$uri = parse_url('http://dummy'.$_SERVER['REQUEST_URI']);
$query = isset($uri['query']) ? $uri['query'] : '';
$uri = isset($uri['path']) ? $uri['path'] : '';
if (isset($_SERVER['SCRIPT_NAME'][0]))
{
if (strpos($uri, $_SERVER['SCRIPT_NAME']) === 0)
{
$uri = (string) substr($uri, strlen($_SERVER['SCRIPT_NAME']));
}
elseif (strpos($uri, dirname($_SERVER['SCRIPT_NAME'])) === 0)
{
$uri = (string) substr($uri, strlen(dirname($_SERVER['SCRIPT_NAME'])));
}
}
// This section ensures that even on servers that require the URI to be in the query string (Nginx) a correct
// URI is found, and also fixes the QUERY_STRING server var and $_GET array.
if (trim($uri, '/') === '' && strncmp($query, '/', 1) === 0)
{
$query = explode('?', $query, 2);
$uri = $query[0];
$_SERVER['QUERY_STRING'] = isset($query[1]) ? $query[1] : '';
}
else
{
$_SERVER['QUERY_STRING'] = $query;
}
parse_str($_SERVER['QUERY_STRING'], $_GET);
if ($uri === '/' OR $uri === '')
{
return '/';
}
// Do some final cleaning of the URI and return it
return $this->_remove_relative_directory($uri);
}
4._parse_query_string()解析函数
根据$_SERVER[‘QUERY_STRING’]的参数解析出字符串
/**
* Parse QUERY_STRING
*
* Will parse QUERY_STRING and automatically detect the URI from it.
*
* @return string
*/
protected function _parse_query_string()
{
$uri = isset($_SERVER['QUERY_STRING']) ? $_SERVER['QUERY_STRING'] : @getenv('QUERY_STRING');
if (trim($uri, '/') === '')
{
return '';
}
elseif (strncmp($uri, '/', 1) === 0)
{
$uri = explode('?', $uri, 2);
$_SERVER['QUERY_STRING'] = isset($uri[1]) ? $uri[1] : '';
$uri = $uri[0];
}
parse_str($_SERVER['QUERY_STRING'], $_GET);
return $this->_remove_relative_directory($uri);
}
5._remove_relative_directory()——去掉多余斜杠和相对路径符号
/**
* Remove relative directory (../) and multi slashes (///)
*
* Do some final cleaning of the URI and return it, currently only used in self::_parse_request_uri()
*
* @param string $uri
* @return string
*/
protected function _remove_relative_directory($uri)
{
$uris = array();
$tok = strtok($uri, '/');
while ($tok !== FALSE)
{
if (( ! empty($tok) OR $tok === '0') && $tok !== '..')
{
$uris[] = $tok;
}
$tok = strtok('/');
}
return implode('/', $uris);
}
6.uri解析成对应分段——_set_uri_string()函数
前面解读的几个函数已经将url解析成uri字符串了,但是我们最终需要的是Uri字符串所对应的参数,这样才能根据uri参数路由到正确的位置,set_uri_string()函数的功能便是将uri字符串解析成对应分段
/**
* Set URI String
*
* @param string $str
* @return void
*/
protected function _set_uri_string($str)
{
// Filter out control characters and trim slashes
$this->uri_string = trim(remove_invisible_characters($str, FALSE), '/');
if ($this->uri_string !== '')
{
// Remove the URL suffix, if present
if (($suffix = (string) $this->config->item('url_suffix')) !== '')
{
$slen = strlen($suffix);
if (substr($this->uri_string, -$slen) === $suffix)
{
$this->uri_string = substr($this->uri_string, 0, -$slen);
}
}
$this->segments[0] = NULL;
// Populate the segments array
foreach (explode('/', trim($this->uri_string, '/')) as $val)
{
$val = trim($val);
// Filter segments for security
$this->filter_uri($val);
if ($val !== '')
{
$this->segments[] = $val;
}
}
unset($this->segments[0]);
}
}
7.合法性保障——filter_uri函数
/**
* Filter URI
*
* Filters segments for malicious characters.
*
* @param string $str
* @return void
*/
public function filter_uri(&$str)
{
if ( ! empty($str) && ! empty($this->_permitted_uri_chars) && ! preg_match('/^['.$this->_permitted_uri_chars.']+$/i'.(UTF8_ENABLED ? 'u' : ''), $str))
{
show_error('The URI you submitted has disallowed characters.', 400);
}
}