kitchen_kink | Re: TRGP, diat the octopus

wrench ceil vinyl bellow cheater irreproachable shulman bismark auxiliary sanderson melanie euridyce
analyst beatify carpet vestigial mathematician guy chrysler cartilage pentane pm termite uk regretting ironic alluvium glory ingrown advise bellum deacon dashboard deportation intercom taught voluntarism ferret come nanette quart

Sometimes, ya just gotta open spam.

Hey hackers - anyone know why one gets these lists of random, rather erudite words in spam, now?


From:

oonh.livejournal.com

They're supposed to walk past spamtrap software without being noticed -- being long mixes of long, serious sounding words.

From:

pir.livejournal.com

Randomness in most forms can be good for getting past software designed to detect particular patterns, since we can't yet write software that can read and understand English and tell us whether a message is something we actually want to read.

The batch of random crap at the start/end of messages started in Usenet spam, but back then it was a block of random characters rather than random words (this defeated the "this identical message has been posted X thousand times" test). The detection software got better, and spammers escalated to lists of words, since those look like useful text (without the common spam keywords) and make the message look more like a real email.
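To make the "identical message" test concrete, here is a minimal sketch, assuming the detector simply hashes each message body and counts repeats. The names `fingerprint` and `with_random_block` are made up for illustration, and real Usenet tools were certainly more elaborate than this.

```python
import hashlib
import random
import string


def fingerprint(body: str) -> str:
    """Hash of the exact message body; identical posts share one fingerprint."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def with_random_block(body: str, rng: random.Random, length: int = 16) -> str:
    """Append a block of random characters, as the early Usenet spam did."""
    junk = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    return f"{body}\n{junk}"


ad = "Buy cheap widgets now"
rng = random.Random(0)  # seeded only so the demo is repeatable

plain = [ad for _ in range(3)]
padded = [with_random_block(ad, rng) for _ in range(3)]

# Count distinct fingerprints in each batch.
unique_plain = len({fingerprint(m) for m in plain})
unique_padded = len({fingerprint(m) for m in padded})
```

The three unpadded copies collapse to one fingerprint, while each padded copy hashes differently, so a pure exact-match counter never sees a repeat.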

From:

oonh.livejournal.com

I meant 'high entropy'.

From:

spike.livejournal.com

Everybody these days is using statistical filters to separate the spam from the 'ham'. The spammers now include "rare" words like these to spoof the statistical filters into deciding that the e-mail must be real since it contains so many un-spam-like words.

Arms race, anyone?

-Spike

PS. I'm waiting for them to start using this sort of thing.
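A rough sketch of why the padding might work, assuming a deliberately simple filter that combines a per-word spam probability for every word in the message (naive Bayes style). The probabilities in `WORD_SPAM_PROB` and the 0.4 score for unseen words are illustrative assumptions, not numbers taken from any real filter.

```python
import math

# Hypothetical per-word spam probabilities learned from training mail.
WORD_SPAM_PROB = {"viagra": 0.99, "mortgage": 0.95}
UNKNOWN_PROB = 0.4  # assumed score for words the filter has never seen


def spam_score(text: str) -> float:
    """Combine per-word probabilities into one score between 0 and 1."""
    log_odds = 0.0
    for word in text.lower().split():
        p = WORD_SPAM_PROB.get(word, UNKNOWN_PROB)
        # Summing log-odds is the naive Bayes product, without underflow.
        log_odds += math.log(p) - math.log(1.0 - p)
    return 1.0 / (1.0 + math.exp(-log_odds))


ad = "viagra mortgage"
padding = (
    "analyst beatify carpet vestigial mathematician guy chrysler cartilage "
    "pentane pm termite uk regretting ironic alluvium glory ingrown advise "
    "bellum deacon dashboard deportation intercom taught voluntarism ferret "
    "come nanette quart"
)

score_plain = spam_score(ad)
score_padded = spam_score(ad + " " + padding)
```

Each unseen word nudges the score slightly toward "not spam", and a long enough list outweighs the two damning keywords. A filter that only looks at the handful of most extreme words, or that has been trained on this kind of spam, would be much harder to fool this way.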

From:

dr-memory.livejournal.com

PS. I'm waiting for them to start using this sort of thing.

I've been seeing spam with markov-chain randomtext for at least 12 months now. :(

From:

spike.livejournal.com

Yeah- but did you see what I linked to? Guilty Looks Enter Tree Beers! One of my favorite stories...

From:

dr-memory.livejournal.com

Hah! Sorry, yes, I looked, but merely glanced, and assumed that it was dissociated-press output (or a Gertrude Stein play), and didn't stay long enough to see what it actually was. :)

From:

dda.livejournal.com

anyone know why one gets these lists of random, rather erudite words in spam, now?

I'm pretty sure it is an attempt to get around anti-spam software, since most everything in spam other than the ad itself is there for that reason. Most anti-spam software looks for keywords in the message but also looks at the rest of the message to try to prevent "false positives" (mail that gets blocked but isn't spam). Unlike medical tests, false positives are worse than false negatives; blocking that important mail from your client is worse than "you need another round of tests" from the doctor.

So the spammers both break up the keywords (or spell them in l33t-speak) and pack the message with important-sounding words (and names) to try to fool the software. And the battle gets escalated to the next level.
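The filter's side of that escalation can be sketched as a normalization pass that runs before keyword matching. This is only an illustration: the substitution table `LEET_MAP` and the function `normalize` are hypothetical, the digit "1" is left out because it can stand for either "l" or "i", and real obfuscation is far more varied.

```python
import re

# Hypothetical substitution table for common character swaps.
LEET_MAP = str.maketrans(
    {"3": "e", "4": "a", "0": "o", "5": "s", "7": "t", "@": "a", "$": "s"}
)

# Three or more single letters separated by one punctuation or space character.
SPLIT_WORD = re.compile(r"\b(?:[a-z][\W_]){2,}[a-z]\b")


def normalize(text: str) -> str:
    """Undo simple character swaps and rejoin keywords that were broken up."""
    text = text.lower().translate(LEET_MAP)
    # "v-i-a-g-r-a" -> "viagra": drop the separators inside the matched run.
    return SPLIT_WORD.sub(lambda m: re.sub(r"[\W_]", "", m.group()), text)
```

For example, `normalize("V-I-A-G-R-A")` and `normalize("m0rtg4ge")` come back as the plain keywords, while ordinary text passes through unchanged apart from lowercasing. The obvious cost is false joins: a legitimate run of single letters such as "a b c" would be glued together too.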

It will be interesting to see if the anti-spam law that just passed will be effective at all.

From:

amber-phoenix.livejournal.com

I confess a low to middling interest in the hows and whys of spam, but I have to love:

deportation intercom taught voluntarism
vestigial mathematician guy
and
regretting ironic alluvium glory

From:

dietrich.livejournal.com

See, that's why I love it, too. I'd no idea I'd start such a flurry of geek-speak!

From:

madbard.livejournal.com

This particular breed of spam is generally recognizable by its subject head, which is in the format "Re: XXX, YYY the ZZZ". (Despite their big vocabulary, they still appear to have problems with run-on sentences.)
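That subject format is regular enough to test for mechanically. A minimal sketch, assuming each of XXX, YYY and ZZZ is a single whitespace-free token (the pattern and the function name are made up for illustration):

```python
import re

# "Re: XXX, YYY the ZZZ", with each slot assumed to be one token.
SUBJECT_PATTERN = re.compile(r"^Re: \S+, \S+ the \S+$")


def looks_like_word_salad_subject(subject: str) -> bool:
    """Return True when the subject fits the 'Re: XXX, YYY the ZZZ' shape."""
    return SUBJECT_PATTERN.match(subject.strip()) is not None
```

The subject of this very post, "Re: TRGP, diat the octopus", matches, and something like "Re: lunch on Friday?" does not. A rule this narrow will also catch the occasional real reply that happens to have the same shape, so it is better used as one signal among several than as a verdict on its own.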

From:

darxus.livejournal.com

I have heard this stuff referred to as "Bayes poison". Some of the most fun anti-spam stuff happening now is Bayesian filters, which seem to have started getting popular as a result of Paul Graham: http://www.paulgraham.com/spam.html http://www.paulgraham.com/sofar.html http://www.paulgraham.com/better.html

This is the style of statistical analysis people have been mentioning.

Most amusingly, the filters best capable of catching all of these (without false positives) seem to be the Bayesian filters they're targeted at. I believe the best, currently, is spamprobe (which does multi-word tokens, and parses most of the header).
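For anyone wondering what "multi-word tokens" means in practice, here is a minimal sketch that emits adjacent word pairs alongside the single words. It is not spamprobe's actual tokenizer, just an illustration of the idea, and `tokens_with_pairs` is a made-up name.

```python
from typing import List


def tokens_with_pairs(text: str) -> List[str]:
    """Return single-word tokens followed by adjacent two-word tokens."""
    words = text.lower().split()
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + pairs


tokens = tokens_with_pairs("vestigial mathematician guy")
# tokens == ["vestigial", "mathematician", "guy",
#            "vestigial mathematician", "mathematician guy"]
```

One plausible reason this helps: pairs from real correspondence tend to recur across a person's good mail, while pairs from a random word list are almost never seen twice, so the padding gives the filter little "good mail" evidence at the pair level.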

This stuff has been one of my obsessions lately, so if you have more questions feel free to ask. I've been running spamassassin and spamprobe together with some damn impressive results: http://www.chaosreigns.com/sa_sp/

Spamprobe has been perfect for a while. Spamassassin misses some spam (I run it with its own Bayesian abilities disabled, because they're inferior to SP's and I don't believe it's worth maintaining two Bayesian databases), but I like having two filters for extra reliability. My goal is to never have to read my spam folder, and to read only the folders that contain good mail and the mail SA and SP disagree on.
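The folder scheme described above boils down to a three-way decision on two verdicts. A minimal sketch, assuming each filter's result has already been reduced to a yes/no flag (the `Folder` names and the `route` function are hypothetical, not part of either tool):

```python
from enum import Enum


class Folder(Enum):
    GOOD = "good"      # both filters say it is not spam
    SPAM = "spam"      # both filters say it is spam; never read
    REVIEW = "review"  # the filters disagree; worth a human look


def route(sa_says_spam: bool, sp_says_spam: bool) -> Folder:
    """Pick a folder from the SpamAssassin and SpamProbe verdicts."""
    if sa_says_spam and sp_says_spam:
        return Folder.SPAM
    if not sa_says_spam and not sp_says_spam:
        return Folder.GOOD
    return Folder.REVIEW
```

The point of the design is that only mail both filters condemn goes unread, so a single filter's mistake lands in the review folder instead of being lost.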

Somebody really needs to write a tutorial on Bayesian statistics targeted at people writing spam filters.

From:

unknownrockstar.livejournal.com

I never open spam for fear of viruses. also on principle, even if I received some unsolicited commercial email for something I was interested in (assuming it was from someone I don't know) I would delete it. if I want to remortgage my house, enlarge my penis or buy viagra online, I'll use google. I don't open anything with an attachment even from people I know if they haven't told me in advance they are sending me something with an attachment. I thought that was some magnetic poetry you'd composed. I made my own magnetic poetry set. it's fucking huge, although the words are tiny. of course it includes words like "flerbimblewudget" which I invented.
luv James.

Page generated Feb. 23rd, 2026 03:45 pm

Oh look, it's Dietrich

Re: TRGP, diat the octopus

Re: TRGP, diat the octopus

no subject

no subject

no subject

no subject

no subject

no subject

no subject

More than you probably ever wanted to know about spam fighting.

whee!

Re: whee!

no subject

no subject

spam spam spam spam

Profile

2026

Most Popular Tags

Page Summary

Style Credit

Expand Cut Tags