By Michael Coates
Expert Author
Article Date: 2010-10-21
Bleach is a great approach to preventing XSS or HTML injection in your Python application while still allowing rich HTML from untrusted sources. I had a great chat yesterday with James Socol, author of Bleach (http://pypi.python.org/pypi/bleach). Many of you are likely already aware of Bleach, but I wanted to get some more information out to everyone.
What is Bleach for?
Bleach can be used to safely allow an application to accept rich HTML content from an untrusted source (user, third party, etc.) and render this content within the page. Without Bleach, this would be a ripe area for XSS and HTML injection.
How does Bleach work?
Bleach accepts a minimal whitelist of HTML tags that are defined in the Bleach configuration. Any other tags provided within the data are HTML entity encoded to prevent malicious rendering within the page. As a result, only the whitelisted tags are rendered. As long as the whitelist is intelligently constructed (which it is by default), the rendered content is never able to perform malicious actions.
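To make the whitelist idea concrete, here is a minimal sketch built only on Python's standard library. It is not Bleach's implementation, and the tag list, the `AllowlistSanitizer` class and the `sanitize` helper are hypothetical names chosen for illustration. In a real application you would call Bleach itself (its entry point is `bleach.clean`) rather than roll your own sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist for illustration only; not Bleach's default list.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p"}


class AllowlistSanitizer(HTMLParser):
    """Keep whitelisted tags (minus attributes); entity-encode everything else."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            # Attributes are dropped, so event handlers such as onclick never survive.
            self.parts.append(f"<{tag}>")
        else:
            # Re-emit the original markup as inert, entity-encoded text.
            self.parts.append(escape(self.get_starttag_text()))

    def handle_startendtag(self, tag, attrs):
        # Treat self-closing tags (e.g. <br/>) like ordinary start tags.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")
        else:
            self.parts.append(escape(f"</{tag}>"))

    def handle_data(self, data):
        # Text content is always encoded before it reaches the page.
        self.parts.append(escape(data))


def sanitize(untrusted_html: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(untrusted_html)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    print(sanitize("<b>hi</b><script>alert(1)</script>"))
```

The whitelisted `<b>` tag passes through, while the `<script>` tag comes out as harmless text. This sketch silently drops comments and declarations and ignores many edge cases that a maintained library handles, which is a good reason to rely on Bleach for anything beyond a demonstration.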
When should Bleach be used?
When you want to allow rich HTML content within the body of a page and this content is coming from an untrusted source (e.g. a user or a third-party site).
When should Bleach not be used?
If you have no intention of allowing any rendered content from the user, then Bleach is the wrong approach. In those cases, just stick with the default output encoding provided by Django or Jinja.
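As a rough illustration of what plain output encoding does, the sketch below uses the standard library's `html.escape` as a stand-in for template auto-escaping. Django and Jinja ship their own escaping, so this only approximates the effect, and `render_comment` is a hypothetical helper.

```python
from html import escape


def render_comment(user_text: str) -> str:
    # Encode every HTML-significant character so nothing the user typed renders as markup.
    return f"<p>{escape(user_text)}</p>"


if __name__ == "__main__":
    print(render_comment("<b>hi</b> & bye"))
```

Here even a harmless `<b>` tag is displayed literally rather than rendered, which is exactly what you want when users are not supposed to supply markup at all.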