
Modest Software, Inc.

Part 12 (Advanced Filter File Settings)
Previous Section Part 11b (Basic Filter File Parameters) Part 13 (Advanced Configuration Options I) Next Section

Configuration Options: settings here are made in the filter.txt file.
Conditional Searches
Filter for Unnecessarily Brief Messages
Anti-Obfuscation Filters: d00d-speak and HTML Filters

JSpamFilter also decodes messages that are "Base64 encoded". This encoding defeats other content filters, which rely on the message being sent in plain ASCII text.
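
As a minimal sketch of why decoding matters (this is an illustration in Python, not JSpamFilter's own code; the message text and variable names are hypothetical), a keyword search on the encoded body finds nothing, while the same search on the decoded body succeeds:

```python
import base64

# Hypothetical spam body; the phrases come from the examples in this manual.
plain_body = "Mortgage companies compete for your business"

# What is actually transmitted when the sender Base64-encodes the body.
encoded_body = base64.b64encode(plain_body.encode("ascii")).decode("ascii")

keyword = "mortgage companies"

# A filter that scans only the raw (encoded) text never sees the keyword.
found_in_raw = keyword in encoded_body.lower()

# Decoding first restores the text, so the keyword search succeeds.
decoded_body = base64.b64decode(encoded_body).decode("ascii")
found_in_decoded = keyword in decoded_body.lower()

print(found_in_raw, found_in_decoded)  # False True
```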

Conditional Searches:

JSpamFilter supports conditional keyword searching. There are three kinds of keywords in JSpamFilter: "root" keywords, "optional root" keywords indicated by a "|", and "optional" keywords indicated by a "+". Optional keywords are checked only if the preceding "root" keyword, or one of its "optional root" keywords, is found.

The following table summarizes the switches; examples of their use follow.

Symbol: (none)
  Name:   Nothing
  Syntax: [score] [word or phrase]
  Action: The value "[score]" is added to the message score when "[word or phrase]" is found.
  Notes:  The basic Filter File entry.

Symbol: |
  Name:   Pipe
  Syntax: |[score] [word or phrase]
  Action: Is combined with the immediately preceding term.
  Notes:  JSpamFilter searches for the term and/or the preceding term.

Symbol: +
  Name:   Plus Sign
  Syntax: +[score] [word or phrase]
  Action: The term is searched for only if the preceding root term, or one of its "|" terms, is found.

Symbol: -
  Name:   Minus Sign
  Syntax: -[score] [word or phrase]
  Action: The value "[score]" is subtracted from the message score when "[word or phrase]" is found.
  Notes:  Can be used as a "manual" form of white-listing. Drop-box white-listing is being developed.

Symbol: (space)
  Name:   space
  Syntax: " [word or phrase] "
  Action: By adding white space around a term (notice the space before and after "[word or phrase]"), only the exact term is scored. That is, the term will not be scored if it is found inside other terms.
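
The effect of the surrounding white space can be illustrated with a plain substring test. This is only a sketch in Python: the helper `contains` and the sample message are hypothetical, and case-insensitive matching is an assumption rather than documented JSpamFilter behavior.

```python
def contains(message: str, term: str) -> bool:
    """Plain substring test, ignoring case (an assumption for this sketch)."""
    return term.lower() in message.lower()

message = "Get a free email list today"

# Without surrounding spaces, "mail" is found inside "email".
print(contains(message, "mail"))     # True

# With surrounding spaces, "mail" no longer matches inside "email".
print(contains(message, " mail "))   # False

# The stand-alone word "email" still matches.
print(contains(message, " email "))  # True
```

Note that this naive test would miss a term at the very start or end of a message; how JSpamFilter treats that case is not stated here.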

Consider this filter file snippet:
20 mortgage companies
|20 mortgage broker
+30 compete for
+30 for your business

In this case, JSpamFilter will always search for "mortgage companies" and "mortgage broker". If either of those terms is found, then JSpamFilter will also search for the phrases "compete for" and "for your business". Every search phrase found scores the number of points indicated. Essentially, this filter file snippet is saying, "If you find 'mortgage companies' and/or 'mortgage broker', then also search for 'compete for' and 'for your business'."
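
The logic described above can be sketched in Python as follows. This is an illustration under stated assumptions, not JSpamFilter's implementation: the names `Group`, `parse`, and `score` are hypothetical, matching is assumed to be case-insensitive, each phrase is assumed to score at most once, and only the plain, "|", and "+" entry types are handled.

```python
from dataclasses import dataclass, field

@dataclass
class Group:
    roots: list = field(default_factory=list)      # root and "|" entries
    optionals: list = field(default_factory=list)  # "+" entries

def parse(lines):
    """Group each root entry with the "|" and "+" entries that follow it."""
    groups = []
    for line in lines:
        kind = line[0] if line[0] in "|+" else ""
        score_text, phrase = line[len(kind):].split(" ", 1)
        entry = (int(score_text), phrase.lower())
        if kind == "":
            groups.append(Group(roots=[entry]))
        elif kind == "|":
            groups[-1].roots.append(entry)
        else:
            groups[-1].optionals.append(entry)
    return groups

def score(message, groups):
    """Always search the roots; search the optionals only if a root is found."""
    text = message.lower()
    total = 0
    for group in groups:
        hits = [points for points, phrase in group.roots if phrase in text]
        total += sum(hits)
        if hits:
            total += sum(points for points, phrase in group.optionals
                         if phrase in text)
    return total

snippet = [
    "20 mortgage companies",
    "|20 mortgage broker",
    "+30 compete for",
    "+30 for your business",
]
groups = parse(snippet)

print(score("Mortgage companies compete for your business", groups))  # 80
print(score("We compete for your business", groups))                  # 0
```

In the second message neither root term appears, so the two "+" phrases are never searched for and contribute nothing.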

This technique can also be used to prevent unnecessary searches. If you have several search phrases that contain the word "email", then you can improve efficiency by only searching for phrases that contain "email" if the word "email" itself is found. The example filter.txt includes this:

5 email
+20 This email was sent to you because
+30 EMAIL ADDRESSES
+30 MILLION EMAIL
+30 FRESH EMAIL
+30 bulk email
+30 email marketing
+30 mass email
+30 cdrom
+30 cd-rom
+30 sent in compliance

So the 7 searches for phrases that contain "email" are skipped if the message does not contain "email" (i.e., "fresh email" cannot exist in the message if "email" does not). The list also includes three terms ("cdrom", "cd-rom", and "sent in compliance") that are often found in spam in conjunction with the word "email"; as "+" entries, they are likewise searched for only when "email" is found.

Filter for Unnecessarily Brief Messages
The following is added to the filter.txt file:
1 <a
+30# href="http:
1 <img
+30# src="http:
5# <!--


This is quite effective on the typical, and typically small, messages that include an excessive number of hyperlink references. It lets the JSpamFilter algorithm filter out messages that contain almost no filterable content.

Anti-Obfuscation Filters: d00d-speak and HTML Filters

These filters look for efforts by spammers to "obfuscate" the content of their messages by inserting irrelevant HTML tags, replacing letters with numbers, and other content-malformation exploits.

The Anti-Obfuscation filter and scoring process detects and thwarts attempts to intentionally obscure or conceal content. It detects keywords that the spammer has intentionally obfuscated and applies a multiplier to their score. It uses the following switches:

  • Note: all of the "$$" filters should be last in filter.txt; also, the entry for $$html should be first of them.

  • This switch converts a message from HTML and sets a multiplier for keywords found after HTML removal to detect keywords purposefully obscured by irrelevant HTML (the value of "20" is used as an example: it means that a multiplier of 2.0 is used against the base score for keywords that were obfuscated using HTML).

    20 $$html

  • Here the value "35" is used for the score applied to HTML messages that contain no text (images only).

    35 $$BlankMessage

  • Switch for messages where the subject line has been intentionally obfuscated (an example score of "35" is used).

    35 $$SubjectObfuscation

  • A switch that looks for randomized text. This is recommended for English-only mail servers. Here too the value is a multiplier; 20 means: 2.0x base score.

    20 $$RandomText

  • A new function that removes punctuation and then searches again for words that were overlooked. Again, the value is a multiplier; 20 means: 2.0x score of words de-obfuscated.

    20 $$PunctuationFilter

  • A filter that counters "d00dspeak": this looks for numbers used in place of letters and converts them back, e.g., "d3-0bfusc4t10n" becomes "de-obfuscation". Value is a multiplier; 20 means: 2.0x score of words de-obfuscated.

    20 $$d00dSpeak

  • *Note for clients using the Updater: please copy the following text and paste it into your filter.txt file (this corresponds to the switches described above). Make sure the commented lines are unbroken:

    // All "$$" filters should be last; the entry for $$html should be first of them

    // Convert message from HTML and set multiplier
    // (20 means: 2.0x base score for keywords that were obfuscated using HTML)

    20 $$html

    // Score for HTML messages that contain no text (images only)

    35 $$BlankMessage

    // Score for messages where the subject line has been intentionally obfuscated

    35 $$SubjectObfuscation

    // Looks for randomized text. Recommended for English-only mail servers.
    // Value is a multiplier; 20 means: 2.0x base score

    20 $$RandomText

    // Removes punctuation then searches again for words that were overlooked.
    // Value is a multiplier; 20 means: 2.0x score of words de-obfuscated

    20 $$PunctuationFilter

    // Converts numbers into letters for "d00dspeak" d3-0bfusc4t10n
    // Value is a multiplier; 20 means: 2.0x score of words de-obfuscated

    20 $$d00dSpeak
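
The multiplier arithmetic for the $$PunctuationFilter and $$d00dSpeak switches can be sketched in Python as follows. This is an illustration only, with several assumptions: the function names are hypothetical, the digit-to-letter table covers just the substitutions in the "d3-0bfusc4t10n" example (the table JSpamFilter actually uses is not documented here), and a keyword found only after de-obfuscation is assumed to score its base score times the multiplier.

```python
import string

# Digit-to-letter substitutions needed for the manual's example only.
LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a"})
NO_PUNCT = str.maketrans("", "", string.punctuation)

def undo_d00dspeak(text: str) -> str:
    """Replace digits with the letters they stand in for."""
    return text.translate(LEET)

def strip_punctuation(text: str) -> str:
    """Remove all ASCII punctuation characters."""
    return text.translate(NO_PUNCT)

def multiplier(setting: int) -> float:
    """A filter.txt value of 20 means a 2.0x multiplier."""
    return setting / 10

def score_keyword(message, keyword, base_score, transform, setting):
    text = message.lower()
    if keyword in text:
        return base_score                        # found in the clear
    if keyword in transform(text):
        return base_score * multiplier(setting)  # found only after de-obfuscation
    return 0

# "20 $$d00dSpeak": a 30-point keyword hidden with digits scores 60.
print(score_keyword("Try our d3-0bfusc4t10n tool", "de-obfuscation",
                    30, undo_d00dspeak, 20))     # 60.0

# "20 $$PunctuationFilter": a 20-point keyword split by dots scores 40.
print(score_keyword("Ask a m.o.r.t.g.a.g.e broker", "mortgage broker",
                    20, strip_punctuation, 20))  # 40.0

# A keyword that was never obfuscated keeps its base score.
print(score_keyword("Ask a mortgage broker", "mortgage broker",
                    20, strip_punctuation, 20))  # 20
```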




Copyright ModestSoftware 2002 - 2018