Source code of eGroupWare 1.2.106-2

/felamimail/inc/ -> class.htmlfilter.inc.php (summary)

(no description)

Size: 994 lines (31 kb)
Included or required: 0 times
Referenced: 0 times
Requires: 0 files

Defines 1 class

htmlfilter:: (10 methods):
  spew()
  tagprint()
  casenormalize()
  skipspace()
  findnxstr()
  findnxreg()
  getnxtag()
  deent()
  fixatts()
  sanitize()


Class: htmlfilter  - X-Ref

htmlfilter.inc
---------------
This set of functions allows you to filter html in order to remove
any malicious tags from it. Useful in cases when you need to filter
user input for any cross-site-scripting attempts.

Copyright (c) 2002 by Duke University

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
02111-1307, USA.

spew($message)   X-Ref
See http://www.mricon.com/html/phpfilter.html

This is a debugging function used throughout the code. To enable
debugging you have to specify a global variable called "debug" before
calling sanitize() and set it to true.

Note: The debugging calls slow execution slightly even when $debug
is set to false. If you wish to get rid of all debugging calls,
run the following command:

fgrep -v 'spew("' htmlfilter.inc > htmlfilter.inc.new

htmlfilter.inc.new will contain no debugging calls.

param: $message  A string with the message to output.
return: void.
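
The pattern described above can be sketched roughly as follows. The helper name debug_spew is hypothetical and the body is an assumption, since the method body is not part of this listing; only the use of a global called "debug" comes from the description.

```php
<?php
// Hypothetical stand-in for spew(); not the class's actual code.
function debug_spew(string $message): void
{
    // Print only when the caller has set the global "debug" to true.
    if (!empty($GLOBALS['debug'])) {
        echo $message, "\n";
    }
}

$GLOBALS['debug'] = true;
debug_spew("tag found");    // printed
$GLOBALS['debug'] = false;
debug_spew("tag found");    // ignored
```

Even in the disabled case the function call and the global lookup still happen, which is the small overhead the note refers to.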

tagprint($tagname, $attary, $tagtype)   X-Ref
This function returns the final tag out of the tag name, an array
of attributes, and the type of the tag. This function is called by
sanitize internally.

param: $tagname  the name of the tag.
param: $attary   the array of attributes and their values
param: $tagtype  The type of the tag (see in comments).
return: a string with the final tag representation.

casenormalize(&$val)   X-Ref
A small helper function to use with array_walk. Modifies a by-ref
value and makes it lowercase.

param: $val a value passed by-ref.
return: void since it modifies a by-ref value.
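
As an illustration of the by-reference helper pattern, here is a minimal sketch. The function name lowercase_in_place is hypothetical; only the combination of array_walk and a by-ref parameter comes from the description above.

```php
<?php
// Hypothetical helper mirroring the described behaviour:
// it lowercases the value it receives by reference and returns nothing.
function lowercase_in_place(&$val): void
{
    $val = strtolower($val);
}

$tags = ["IMG", "Br", "hr"];
// array_walk hands each element to the callback by reference,
// so the array is modified in place.
array_walk($tags, 'lowercase_in_place');
```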

skipspace($body, $offset)   X-Ref
This function skips any whitespace from the current position within
a string and to the next non-whitespace value.

param: $body   the string
param: $offset the offset within the string where we should start
return: the location within the $body where the next non-whitespace
character is located.
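
One possible way to implement this kind of skip, shown as a sketch rather than the class's own code. The name skip_whitespace and the choice of whitespace characters are assumptions.

```php
<?php
// Hypothetical sketch: return the offset of the first non-whitespace
// character at or after $offset.
function skip_whitespace(string $body, int $offset): int
{
    // strspn() returns the length of the run made up only of the
    // listed characters, starting at $offset.
    return $offset + strspn($body, " \t\n\r", $offset);
}
```

If the rest of the string is whitespace, this sketch returns the length of the string.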

findnxstr($body, $offset, $needle)   X-Ref
This function looks for the next character within a string.  It's
really just a glorified "strpos", except it catches the failures
nicely.

param: $body   The string to look for needle in.
param: $offset Start looking from this position.
param: $needle The character/string to look for.
return: location of the next occurrence of the needle, or

findnxreg($body, $offset, $reg)   X-Ref
This function takes a PCRE-style regexp and tries to match it
within the string.

param: $body   The string to look for needle in.
param: $offset Start looking from here.
param: $reg    A PCRE-style regex to match.
return: Returns a false if no matches found, or an array

getnxtag($body, $offset)   X-Ref
This function looks for the next tag.

param: $body   String where to look for the next tag.
param: $offset Start looking from here.
return: false if no more tags exist in the body, or

deent($attvalue)   X-Ref
This function checks attribute values for entity-encoded values
and returns them translated into 8-bit strings so we can run
checks on them.

param: $attvalue A string to run entity check against.
return: Translated value.
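
The reason for decoding before checking is that an attacker can hide a dangerous value behind entities. The sketch below uses PHP's built-in html_entity_decode and decodes to UTF-8; both choices are assumptions, and the class's own translation (to 8-bit strings) may differ. The name decode_entities_for_check is hypothetical.

```php
<?php
// Hypothetical sketch: decode entity-encoded attribute values so that
// later pattern checks see the real characters.
function decode_entities_for_check(string $attvalue): string
{
    return html_entity_decode($attvalue, ENT_QUOTES | ENT_HTML401, 'UTF-8');
}

// "&#106;" is the letter "j", so the hidden scheme becomes visible.
$decoded = decode_entities_for_check('&#106;avascript:alert(1)');
```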

fixatts($tagname, $attary, $rm_attnames,$bad_attvals,$add_attr_to_tag)   X-Ref
This function runs various checks against the attributes.

param: $tagname         String with the name of the tag.
param: $attary          Array with all tag attributes.
param: $rm_attnames     See description for sanitize
param: $bad_attvals     See description for sanitize
param: $add_attr_to_tag See description for sanitize
return: Array with modified attributes.

sanitize($body, $tag_list, $rm_tags_with_content,$self_closing_tags,$force_tag_closing,$rm_attnames,$bad_attvals,$add_attr_to_tag)   X-Ref
This is the main function and the one you should actually be calling.
There are several variables you should be aware of and which need
special description.

$tag_list
----------
This is a simple one-dimensional array of strings, except for the
very first one. The first member should be either false or true.
In case it's FALSE, the following list will be considered a list of
tags that should be explicitly REMOVED from the body, and all
others that did not match the list will be allowed.  If the first
member is TRUE, then the list is the list of tags that should be
explicitly ALLOWED -- any tag not matching this list will be
discarded.

Examples:
$tag_list = Array(
    false,
    "blink",
    "link",
    "object",
    "meta",
    "marquee",
    "html"
);

This will allow all tags except for blink, link, object, meta, marquee,
and html.

$tag_list = Array(
    true,
    "b",
    "a",
    "i",
    "img",
    "strong",
    "em",
    "p"
);

This will remove all tags from the body except b, a, i, img, strong, em and
p.
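
The two modes can be summarised with a small decision function. This is only a sketch of the rule as described; the name tag_is_allowed is hypothetical and the class may implement the check differently.

```php
<?php
// Hypothetical helper: decide whether $tagname survives a $tag_list
// built as described above.
function tag_is_allowed(string $tagname, array $tag_list): bool
{
    // First member selects the mode: true = allow-list, false = remove-list.
    $allow_mode = array_shift($tag_list);
    $listed = in_array(strtolower($tagname), $tag_list, true);
    return $allow_mode ? $listed : !$listed;
}

$remove_list = [false, "blink", "link", "object", "meta", "marquee", "html"];
$allow_list  = [true, "b", "a", "i", "img", "strong", "em", "p"];
```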

$rm_tags_with_content
---------------------
This is a simple one-dimensional array of strings, which specifies the
tags to be removed with any and all content between the beginning and
the end of the tag.
Example:
$rm_tags_with_content = Array(
    "script",
    "style",
    "applet",
    "embed"
);

This will remove the following structure:
<script>
window.alert("Isn't cross-site-scripting fun?!");
</script>
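
To make the effect concrete, here is a deliberately simplified, regex-based sketch. It is not how the class works internally (the method list suggests it walks the body tag by tag), and the name strip_tags_with_content is hypothetical.

```php
<?php
// Hypothetical sketch: drop each listed tag together with everything
// between its opening and closing tag.
function strip_tags_with_content(string $body, array $tags): string
{
    foreach ($tags as $tag) {
        $name = preg_quote($tag, '#');
        // Non-greedy match from <tag ...> up to the nearest </tag>.
        $body = preg_replace('#<' . $name . '\b[^>]*>.*?</' . $name . '\s*>#is', '', $body);
    }
    return $body;
}

$clean = strip_tags_with_content(
    'before<script>window.alert("hi");</script>after',
    ["script", "style", "applet", "embed"]
);
```

A regex like this does not handle unclosed or nested tags, which is one reason a real filter parses the tags instead.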

$self_closing_tags
------------------
This is a simple one-dimensional array of strings, which specifies which
tags contain no content and should not be forcefully closed if this option
is turned on (see further).
Example:
$self_closing_tags = Array(
    "img",
    "br",
    "hr",
    "input"
);

$force_tag_closing
------------------
Set it to true to forcefully close any tags opened within the document.
This is good if you want to take care of people who like to screw up
the pages by leaving unclosed tags like <a>, <b>, <i>, etc.
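
The idea of forced closing, and why self-closing tags must be exempt from it, can be sketched with a simple stack. This is an illustration under assumptions, not the class's algorithm; the name close_open_tags and the tag-matching regex are hypothetical.

```php
<?php
// Hypothetical sketch: append closing tags for anything left open,
// ignoring tags that never take content.
function close_open_tags(string $html, array $self_closing_tags): string
{
    $open = [];
    preg_match_all('#<(/?)([a-z][a-z0-9]*)\b[^>]*>#i', $html, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $name = strtolower($m[2]);
        if (in_array($name, $self_closing_tags, true)) {
            continue;                       // e.g. <br>, <img>: nothing to close
        }
        if ($m[1] === '') {
            $open[] = $name;                // opening tag: remember it
        } else {
            // Closing tag: forget the most recent matching opener, if any.
            $pos = array_search($name, array_reverse($open, true), true);
            if ($pos !== false) {
                unset($open[$pos]);
            }
        }
    }
    // Close whatever is still open, innermost first.
    foreach (array_reverse($open) as $name) {
        $html .= "</$name>";
    }
    return $html;
}
```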

$rm_attnames
-------------
Now we come to parameters that are more obscure. This parameter is
a nested array which is used to specify which attributes should be
removed. It goes like so:

$rm_attnames = Array(
    "PCRE regex to match tag name" =>
        Array(
            "PCRE regex to match attribute name"
        )
);

Example:
$rm_attnames = Array(
    "|.*|" =>
        Array(
            "|target|i",
            "|^on.*|i"
        )
);

This will match all tags (.*), and specify that all attributes
named "target" and starting with "on" should be removed. This will take
care of the following problem:
<em onmouseover="window.alert('muahahahaha')">
The "onmouseover" will be removed.
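
A sketch of how such a nested array could be applied to one tag's attributes. The function name remove_attributes and the assumption that $attary maps attribute names to values are mine; fixatts() itself is not shown in this listing.

```php
<?php
// Hypothetical sketch: remove every attribute whose name matches one of
// the regexes registered for a matching tag name.
function remove_attributes(string $tagname, array $attary, array $rm_attnames): array
{
    foreach ($rm_attnames as $tag_regex => $att_regexes) {
        if (!preg_match($tag_regex, $tagname)) {
            continue;                       // rule does not apply to this tag
        }
        foreach ($att_regexes as $att_regex) {
            foreach (array_keys($attary) as $attname) {
                if (preg_match($att_regex, $attname)) {
                    unset($attary[$attname]);
                }
            }
        }
    }
    return $attary;
}

$rm_attnames = ["|.*|" => ["|target|i", "|^on.*|i"]];
$clean = remove_attributes(
    "em",
    ["onmouseover" => '"window.alert(1)"', "class" => '"x"'],
    $rm_attnames
);
```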

$bad_attvals
------------
This is where it gets ugly. This is a nested array with many levels.
It goes like so:

$bad_attvals = Array(
    "pcre regex to match tag name" =>
        Array(
            "pcre regex to match attribute name" =>
                Array(
                    Array(
                        "pcre regex to match attribute value"
                    ),
                    Array(
                        "pcre regex replace a match from above with"
                    )
                )
        )
);

An extensive example:

$bad_attvals = Array(
    "|.*|" =>
        Array(
            "/^src|background|href|action/i" =>
                Array(
                    Array(
                        "/^([\'\"])\s*\S+script\s*:.*([\'\"])/si"
                    ),
                    Array(
                        "\\1http://veryfunny.com/\\2"
                    )
                ),
            "/^style/i" =>
                Array(
                    Array(
                        "/expression/si",
                        "/url\(([\'\"])\s*https*:.*([\'\"])\)/si",
                        "/url\(([\'\"])\s*\S+script:.*([\'\"])\)/si"
                    ),
                    Array(
                        "idiocy",
                        "url(\\1http://veryfunny.com/\\2)",
                        "url(\\1http://veryfunny.com/\\2)"
                    )
                )
        )
);

This will take care of nearly all known cross-site scripting exploits,
plus some (see my filter sample at
http://www.mricon.com/html/phpfilter.html for a working version).
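
The pairing of a pattern array with a replacement array maps naturally onto preg_replace, which accepts arrays for both. The following sketch shows that idea only; the name scrub_attribute_values and the shape of $attary (attribute name => quoted value) are assumptions, not the class's actual code.

```php
<?php
// Hypothetical sketch: for each matching tag and attribute, run the
// value through the paired pattern/replacement arrays.
function scrub_attribute_values(string $tagname, array $attary, array $bad_attvals): array
{
    foreach ($bad_attvals as $tag_regex => $by_attname) {
        if (!preg_match($tag_regex, $tagname)) {
            continue;
        }
        foreach ($by_attname as $att_regex => [$patterns, $replacements]) {
            foreach ($attary as $attname => $attvalue) {
                if (preg_match($att_regex, $attname)) {
                    // Pattern N is replaced with replacement N.
                    $attary[$attname] = preg_replace($patterns, $replacements, $attvalue);
                }
            }
        }
    }
    return $attary;
}

$bad_attvals = [
    '|.*|' => [
        '/^src|background|href|action/i' => [
            ['/^([\'"])\s*\S+script\s*:.*([\'"])/si'],
            ['\\1http://veryfunny.com/\\2'],
        ],
    ],
];
$fixed = scrub_attribute_values("a", ["href" => '"javascript:alert(1)"'], $bad_attvals);
```

The backreferences \\1 and \\2 put the original opening and closing quotes back around the replacement URL.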

$add_attr_to_tag
----------------
This is a useful little feature which lets you add attributes to
certain tags. It is a nested array as well, but not at all like
the previous one. It goes like so:

$add_attr_to_tag = Array(
    "PCRE regex to match tag name" =>
        Array(
            "attribute name" => '"attribute value"'
        )
);

Note: don't forget quotes around attribute value.

Example:

$add_attr_to_tag = Array(
    "/^a$/si" =>
        Array(
            'target' => '"_new"'
        )
);

This will change all <a> tags and add target="_new" to them so all links
open in a new window.
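
A sketch of how the added attributes could be merged into a tag's existing ones. The name add_attributes is hypothetical, and letting the forced value override an existing attribute of the same name is an assumption about the intended behaviour.

```php
<?php
// Hypothetical sketch: merge the configured attributes into the
// attribute array of every tag whose name matches.
function add_attributes(string $tagname, array $attary, array $add_attr_to_tag): array
{
    foreach ($add_attr_to_tag as $tag_regex => $extra) {
        if (preg_match($tag_regex, $tagname)) {
            // With string keys, later values win in array_merge,
            // so the configured attributes replace existing ones.
            $attary = array_merge($attary, $extra);
        }
    }
    return $attary;
}

$add_attr_to_tag = ['/^a$/si' => ['target' => '"_new"']];
$atts = add_attributes('a', ['href' => '"http://example.com/"'], $add_attr_to_tag);
```

The values keep their own double quotes because, as noted above, the quotes are part of the stored attribute value.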



param: $body                 the string with HTML you wish to filter
param: $tag_list             see description above
param: $rm_tags_with_content see description above
param: $self_closing_tags    see description above
param: $force_tag_closing    see description above
param: $rm_attnames          see description above
param: $bad_attvals          see description above
param: $add_attr_to_tag      see description above
return: sanitized html safe to show on your pages.



Generated on: Sun Feb 25 17:20:01 2007 by Balluche using PHPXref 0.7