Using Rules to block SPAM in Drupal 7

There is a plethora of spam-prevention modules for Drupal (e.g. Honeypot, reCAPTCHA, Mollom, Spamicide, MotherMayI, Spambot). They generally work well, but not all websites are the same. I run a website that has a forum with content created by authenticated users. Account registration is open but protected. The node forms are protected too, but I still get very specific spam, mostly posts about some fake technical support service or advertisements for Ugg boots.

[Image: example spam post on Drupal 7. Caption: Nice try, "AndrewBrown".]

After a few years I thought about using Rules to look for certain common words/phrases. A single-word comparison isn't too difficult, but using regular expressions in Rules isn't as easy as I had hoped. For starters, the Text Comparison option for regex doesn't support flags, so a case-insensitive match is a bit trickier.
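Typing out a pattern like `[Cc][Uu][Ss]...` by hand is error-prone, so here is a minimal sketch of a helper that builds one from a plain phrase. It is not part of Rules or Drupal; the function name `case_insensitive_pattern` is made up for illustration, and it assumes that words should be separated by optional whitespace (`[[:space:]]*`). PCRE also has an inline `(?i)` modifier that may be worth trying as a shorter alternative, though whether the Rules regex comparison accepts it is not verified here.

```python
import re


def case_insensitive_pattern(phrase: str) -> str:
    """Build a flag-free, case-insensitive regex for a phrase.

    Each letter becomes a two-character class such as [Cc]; words are
    joined with [[:space:]]* (a POSIX class understood by PCRE, which
    PHP uses), so "customer support" and "customersupport" both match.
    """
    words = []
    for word in phrase.split():
        pieces = []
        for ch in word:
            if ch.isalpha():
                pieces.append(f"[{ch.upper()}{ch.lower()}]")
            else:
                # Escape anything that might be a regex metacharacter.
                pieces.append(re.escape(ch))
        words.append("".join(pieces))
    return "[[:space:]]*".join(words)


if __name__ == "__main__":
    for phrase in ("customer support", "help line", "gmail help"):
        print(case_insensitive_pattern(phrase))
```

The output can be pasted into the "match" value of a text comparison condition. These letter-only patterns contain no backslashes, so they need no extra escaping in a Rules export.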

Here's the example that I came up with, including both case-insensitive phrase matching and a search for a phone number. Several posts I observed had phone numbers, and I really don't want users posting phone numbers in this website's forum. When any of the patterns matches and the current user is not uid 1, the node is unpublished, the user is blocked, an email is sent to a content moderator, and a message is displayed to the user. Replace the example domain/email values and adjust the search patterns as needed in this Rules export:

 

{ "rules_forum_spam_filter" : {
    "LABEL" : "Forum SPAM Filter",
    "PLUGIN" : "reaction rule",
    "OWNER" : "rules",
    "TAGS" : [ "spam" ],
    "REQUIRES" : [ "rules" ],
    "ON" : { "node_presave--forum" : { "bundle" : "forum" } },
    "IF" : [
      { "OR" : [
          { "text_matches" : {
              "text" : [ "node:body:value" ],
              "match" : "[Cc][Uu][Ss][Tt][Oo][Mm][Ee][Rr][[:space:]]*[Ss][Uu][Pp][Pp][Oo][Rr][Tt]",
              "operation" : "regex"
            }
          },
          { "text_matches" : {
              "text" : [ "node:body:value" ],
              "match" : "[Cc][Uu][Ss][Tt][Oo][Mm][Ee][Rr][[:space:]]*[Ss][Ee][Rr][Vv][Ii][Cc][Ee]",
              "operation" : "regex"
            }
          },
          { "text_matches" : {
              "text" : [ "node:body:value" ],
              "match" : "[Hh][Ee][Ll][Pp][[:space:]]*[Ll][Ii][Nn][Ee]",
              "operation" : "regex"
            }
          },
          { "text_matches" : {
              "text" : [ "node:body:value" ],
              "match" : "[Gg][Mm][Aa][Ii][Ll][[:space:]]*[Hh][Ee][Ll][Pp]",
              "operation" : "regex"
            }
          },
          { "text_matches" : {
              "text" : [ "node:body:value" ],
              "match" : "(\\+0?1\\s)?\\(?\\d{3}\\)?[\\s.-]\\d{3}[\\s.-]\\d{4}",
              "operation" : "regex"
            }
          },
          { "text_matches" : {
              "text" : [ "node:body:value" ],
              "match" : "[Gg][Mm][Aa][Ii][Ll][[:space:]]*[Cc][Uu][Ss][Tt][Oo][Mm][Ee][Rr]",
              "operation" : "regex"
            }
          }
        ]
      },
      { "NOT data_is" : { "data" : [ "site:current-user:uid" ], "value" : "1" } }
    ],
    "DO" : [
      { "node_unpublish" : { "node" : [ "node" ] } },
      { "user_block" : { "account" : [ "site:current-user" ] } },
      { "mail" : {
          "to" : "[email protected]",
          "subject" : "Possible spam at example.com",
          "message" : "Please see node [node:nid] by author [site:current-user:uid].",
          "from" : "[email protected]",
          "language" : [ "" ]
        }
      },
      { "drupal_message" : {
          "message" : "The forum post that you submitted appears to be spam.  It will be evaluated over the next few days to confirm.  Your account has temporarily been suspended.",
          "type" : "error"
        }
      }
    ]
  }
}
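Before importing the rule, it can help to try the phone-number pattern against a few sample strings. The sketch below does that in Python, used here only as a stand-in for the PCRE engine behind Rules, on the assumption that `\d`, `\s` and the other constructs in this particular pattern behave the same way in both. The pattern is the one from the export with JSON's doubled backslashes reduced to single ones; the sample strings and the function name `contains_phone_number` are invented for illustration.

```python
import re

# Phone-number pattern from the Rules export (JSON "\\" written as "\").
PHONE = re.compile(r"(\+0?1\s)?\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}")


def contains_phone_number(text: str) -> bool:
    """Return True if the pattern matches anywhere in the text."""
    return PHONE.search(text) is not None


if __name__ == "__main__":
    samples = [
        "Call (800) 555-1234 for help",   # parentheses and hyphen
        "+1 800.555.1234",                # country code and dots
        "8005551234",                     # no separators at all
        "Order number 2019-123-4567",     # not a phone number
    ]
    for sample in samples:
        print(contains_phone_number(sample), sample)
```

Two things stand out when running this: a number written without any separators is not caught, and a long hyphenated number such as an order ID is, because the pattern matches on its last ten digits. Whether either matters depends on what legitimate users post in the forum.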