ContentPolicy



ContentPolicy is a feature of Searchbastard engines. It will be introduced in Searchbastard v1.0b12. Older versions simply ignore the tag.

ContentPolicy allows the engine to block certain images, stylesheets, scripts and other resources, based on their type and URL. Blocked resources are not even loaded, which can speed up page loading considerably. Wikipedia, for example, is much faster without all its stylesheets (10 files, more than 3000 lines of code).

Example:

<ContentPolicy reset="true" showLog="false" condition="option('cleanup')=='very_clean'">
   <Block type="script" />
   <Block type="stylesheet" />
   <Accept type="image" location="/buddyicons/.*rosell\.png$"/>
   <Block type="image" location="/buddyicons/"/>
   <Block type="image" origin="\.css$"/>
</ContentPolicy>

This blocks all external scripts and stylesheets, and some images, depending on the location of the image. The location attribute is a regular expression pattern. In the above example, all images from a URL containing the string /buddyicons/ will be blocked, unless the resource name ends with rosell.png. Also, all images that are requested from external stylesheets will be blocked (assuming that the filenames of the external stylesheets end with ".css", which need not be the case).


Here is how it generally works

Each time an external resource is about to be loaded, the content policy is asked whether it is OK to load it. The content policy is given various attributes of the resource, including its type, its location, and the location of the resource that requested the load (the origin) (see shouldLoad).

The rules are examined one at a time, in the order they are listed. A rule is either Block or Accept. As soon as a rule matches, no further rules are examined. If no rule matches, the resource is loaded. A rule matches when all of its attributes match (type, location and origin).
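
The matching procedure described above can be sketched in Python. This is only an illustration of the first-match-wins logic, not Searchbastard's actual code: the names Rule and should_load and the example.com URLs are hypothetical, and the sketch assumes that the patterns are applied as unanchored searches, as the /buddyicons/ example suggests.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    """Hypothetical stand-in for a <Block> or <Accept> element."""
    action: str                      # "Block" or "Accept"
    type: str                        # a content type, or "all"
    location: Optional[str] = None   # regex; None matches every location
    origin: Optional[str] = None     # regex; None matches every origin

    def matches(self, content_type, location, origin):
        # A rule matches only when all of its attributes match.
        if self.type != "all" and self.type != content_type:
            return False
        if self.location is not None and not re.search(self.location, location):
            return False
        if self.origin is not None and not re.search(self.origin, origin):
            return False
        return True


def should_load(rules, content_type, location, origin):
    """Return True if the resource may be loaded."""
    for rule in rules:
        if rule.matches(content_type, location, origin):
            # The first matching rule decides; later rules are ignored.
            return rule.action == "Accept"
    # No rule matched: the resource is loaded.
    return True


# The rules from the example at the top of this page, in the same order.
rules = [
    Rule("Block", "script"),
    Rule("Block", "stylesheet"),
    Rule("Accept", "image", location=r"/buddyicons/.*rosell\.png$"),
    Rule("Block", "image", location=r"/buddyicons/"),
    Rule("Block", "image", origin=r"\.css$"),
]

page = "http://example.com/page.html"  # hypothetical requesting document
print(should_load(rules, "image", "http://example.com/buddyicons/rosell.png", page))
print(should_load(rules, "image", "http://example.com/buddyicons/other.png", page))
```

In this sketch the Accept rule only has an effect because it is listed before the broader Block rule for /buddyicons/. With the two swapped, rosell.png would be blocked as well.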

Note: content policies usually apply to all documents in all windows. That is not a problem here: the ContentPolicy tag is limited to what goes on in the window/frame that the Searchbastard engine is loaded in. Also, it is removed as soon as the Macro has finished processing, so it will not affect anything that happens later in that window/frame either.


Rule attributes

type

Required. Must be one of these values: (all | other | script | image | stylesheet | object | subdocument | ping | xmlhttprequest | object_subrequest)

If you want the rule to apply to all content types, you can set it to "all". You can read more about the other options here

location

Optional. If not specified, it will match all locations.
A regular expression that is matched against the location of the resource.
Tip: When finding out what to block, it is useful to install an add-on that can show the resources loaded by a page. Adblock Plus, for example, is capable of this.

origin

Optional. If not specified, it will match all origins.
A regular expression that is matched against the location of the resource that requested the load. For example, to block all images that are requested from a stylesheet with the filename "main.css", you would write: <Block type="image" origin="main\.css$"/>


ContentPolicy attributes

reset

Optional (default is: false)
When true, previous rules set using the ContentPolicy tag will be removed. When false, the rules will be added to the ones previously set.

showLog

Optional (default is: false)
A debugging option. When true, an alert message is displayed telling you which resources were blocked/accepted, and which rule was responsible for each block/accept.
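
The effect of the reset attribute can be sketched as follows. This is a rough Python illustration, not the engine's code: the function name apply_content_policy and the tuple representation of rules are hypothetical, and it assumes that with reset="false" the new rules are appended after the existing ones, which the page does not state explicitly.

```python
def apply_content_policy(active_rules, new_rules, reset=False):
    """Return the rule list in effect after a ContentPolicy tag is processed."""
    if reset:
        # reset="true": rules from earlier ContentPolicy tags are discarded.
        return list(new_rules)
    # reset="false" (the default): new rules are added to the existing ones.
    # Assumption: they are placed after the earlier rules.
    return list(active_rules) + list(new_rules)


# Rules are written as (action, type) tuples for brevity.
earlier = [("Block", "script")]
added = [("Block", "stylesheet")]

print(apply_content_policy(earlier, added))
print(apply_content_policy(earlier, added, reset=True))
```

If the assumption about ordering holds, a rule added without reset cannot override an earlier rule that already matches the same resource, since the first matching rule decides.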


Notes

  • Blocking scripts may result in JavaScript errors in the Console, as scripts on the page may try to call functions defined in the blocked scripts. If no script in the page is needed, you can disallow JavaScript entirely by setting the disallow attribute of the <Get>/<Post> tag to "javascript". Otherwise, you can inject empty functions into the page using the <InjectScript> tag.
  • ContentPolicy must be set before a <Get>/<Post> in order to take effect

