There is a plethora of spam-prevention modules for Drupal (e.g. Honeypot, reCAPTCHA, Mollom, Spamicide, MotherMayI, Spambot). They generally work well, but not all websites are the same. I run a website with a forum whose content is created by authenticated users. Account registration is open but protected, and so are the node forms, yet I still get very specific spam: mostly posts promoting some fake technical support service or advertising Ugg boots.
Nice try "AndrewBrown".
After a few years of this, I thought about using Rules to look for certain common words and phrases. A single-word comparison isn't too difficult, but using regular expressions in Rules isn't as easy as I had hoped. For starters, the regex option of the text comparison condition doesn't support flags, so a case-insensitive match is trickier: each letter has to be written as its own character class, such as [Hh][Ee][Ll][Pp].
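Typing those per-letter character classes by hand gets tedious and error-prone, so here is a minimal sketch of a helper that generates them from a plain phrase. It is written in Python purely for convenience and is not part of Rules or Drupal; the function name `bracket_pattern` is my own invention. Runs of whitespace in the phrase become `[[:space:]]*`, the same POSIX class used in the rule.

```python
import re


def bracket_pattern(phrase: str) -> str:
    """Build a flag-free, case-insensitive pattern such as [Hh][Ee][Ll][Pp].

    Hypothetical helper: each ASCII letter becomes a two-character class,
    and each run of whitespace becomes [[:space:]]*.
    """
    parts = []
    for ch in phrase:
        if ch.isspace():
            # Collapse consecutive whitespace into a single optional-space token.
            if not parts or parts[-1] != "[[:space:]]*":
                parts.append("[[:space:]]*")
        elif ch.isascii() and ch.isalpha():
            parts.append(f"[{ch.upper()}{ch.lower()}]")
        else:
            # Digits, punctuation, etc. are matched literally.
            parts.append(re.escape(ch))
    return "".join(parts)


if __name__ == "__main__":
    print(bracket_pattern("customer support"))
    # [Cc][Uu][Ss][Tt][Oo][Mm][Ee][Rr][[:space:]]*[Ss][Uu][Pp][Pp][Oo][Rr][Tt]
```

The output can be pasted into the "match" value of a text comparison condition. Note that Python's own `re` module does not understand `[[:space:]]`, so to try the generated pattern in Python you would first swap that token for `\s`.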
Here's the example that I came up with. It combines case-insensitive phrase matching with a search for phone numbers: several spam posts I observed contained phone numbers, and I really don't want users posting phone numbers in this website's forum anyway. The rule runs before a forum node is saved by anyone other than user 1. If any of the patterns matches the node body, the node is unpublished, the current user is blocked, an email is sent to a content moderator, and a message is displayed to the user. Replace the example domain and email values and adjust the patterns as needed in this Rules export:
{ "rules_forum_spam_filter" : {
    "LABEL" : "Forum SPAM Filter",
    "PLUGIN" : "reaction rule",
    "OWNER" : "rules",
    "TAGS" : [ "spam" ],
    "REQUIRES" : [ "rules" ],
    "ON" : { "node_presave--forum" : { "bundle" : "forum" } },
    "IF" : [
      { "OR" : [
          { "text_matches" : {
              "text" : [ "node:body:value" ],
              "match" : "[Cc][Uu][Ss][Tt][Oo][Mm][Ee][Rr][[:space:]]*[Ss][Uu][Pp][Pp][Oo][Rr][Tt]",
              "operation" : "regex"
            }
          },
          { "text_matches" : {
              "text" : [ "node:body:value" ],
              "match" : "[Cc][Uu][Ss][Tt][Oo][Mm][Ee][Rr][[:space:]]*[Ss][Ee][Rr][Vv][Ii][Cc][Ee]",
              "operation" : "regex"
            }
          },
          { "text_matches" : {
              "text" : [ "node:body:value" ],
              "match" : "[Hh][Ee][Ll][Pp][[:space:]]*[Ll][Ii][Nn][Ee]",
              "operation" : "regex"
            }
          },
          { "text_matches" : {
              "text" : [ "node:body:value" ],
              "match" : "[Gg][Mm][Aa][Ii][Ll][[:space:]]*[Hh][Ee][Ll][Pp]",
              "operation" : "regex"
            }
          },
          { "text_matches" : {
              "text" : [ "node:body:value" ],
              "match" : "(\\+0?1\\s)?\\(?\\d{3}\\)?[\\s.-]\\d{3}[\\s.-]\\d{4}",
              "operation" : "regex"
            }
          },
          { "text_matches" : {
              "text" : [ "node:body:value" ],
              "match" : "[Gg][Mm][Aa][Ii][Ll][[:space:]]*[Cc][Uu][Ss][Tt][Oo][Mm][Ee][Rr]",
              "operation" : "regex"
            }
          }
        ]
      },
      { "NOT data_is" : { "data" : [ "site:current-user:uid" ], "value" : "1" } }
    ],
    "DO" : [
      { "node_unpublish" : { "node" : [ "node" ] } },
      { "user_block" : { "account" : [ "site:current-user" ] } },
      { "mail" : {
          "to" : "[email protected]",
          "subject" : "Possible spam at example.com",
          "message" : "Please see node [node:nid] by author [site:current-user:uid].",
          "from" : "[email protected]",
          "language" : [ "" ]
        }
      },
      { "drupal_message" : {
          "message" : "The forum post that you submitted appears to be spam. It will be evaluated over the next few days to confirm. Your account has temporarily been suspended.",
          "type" : "error"
        }
      }
    ]
  }
}
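Since a false positive here blocks a real user, it may be worth trying the patterns against sample text before importing the rule. The following is a rough Python sketch of the rule's IF section, not anything Rules itself provides. Two things are assumptions on my part: the phrase patterns are rewritten with `re.IGNORECASE` and `\s` (Python's `re` has no `[[:space:]]` class), and the function name `is_flagged` is hypothetical. The phone-number pattern is copied from the export with the JSON escaping removed.

```python
import re

# Phrase patterns from the export, translated for Python (assumption:
# re.IGNORECASE plus \s stands in for the bracketed letters and [[:space:]]).
PHRASES = [
    r"customer\s*support",
    r"customer\s*service",
    r"help\s*line",
    r"gmail\s*help",
    r"gmail\s*customer",
]

# Phone-number pattern as it appears in the export, JSON escaping removed.
PHONE = r"(\+0?1\s)?\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}"

PATTERNS = [re.compile(p, re.IGNORECASE) for p in PHRASES] + [re.compile(PHONE)]


def is_flagged(body: str, uid: int) -> bool:
    """Approximate the rule's IF: any pattern matches AND the user is not uid 1."""
    if uid == 1:
        return False
    return any(p.search(body) for p in PATTERNS)


if __name__ == "__main__":
    samples = [
        ("Call GMAIL  Help at 555-123-4567", 42),
        ("Ring (555) 123-4567 today", 42),
        ("How do I theme a forum block?", 42),
        ("Our customer support line is open", 1),
    ]
    for body, uid in samples:
        print(uid, is_flagged(body, uid), body)
```

One thing this kind of check makes visible: the phone pattern requires a space, dot, or hyphen between digit groups, so a number written as 5551234567 or (555)123-4567 would not match it.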