kitchen_kink: (Default)
[personal profile] kitchen_kink
wrench ceil vinyl bellow cheater irreproachable shulman bismark auxiliary sanderson melanie euridyce
analyst beatify carpet vestigial mathematician guy chrysler cartilage pentane pm termite uk regretting ironic alluvium glory ingrown advise bellum deacon dashboard deportation intercom taught voluntarism ferret come nanette quart


Sometimes, ya just gotta open spam.

Hey hackers - anyone know why one gets these lists of random, rather erudite words in spam, now?

Date: 2003-12-30 09:05 am (UTC)
From: [identity profile] oonh.livejournal.com
They're supposed to walk past spamtrap software without being noticed -- being long mixes of long, serious-sounding words.

Date: 2003-12-30 09:11 am (UTC)
From: [identity profile] pir.livejournal.com
Randomness in most forms can be good for getting past software designed to detect particular patterns, since we can't yet write software that is able to read and understand English and tell us whether a message is something we actually want to read.

The batch of random crap at the start/end of messages started in Usenet spam, but it used to be a block of random characters rather than random words (this defeated the "this identical message has been posted X thousand times" test). The detection software got better, and spammers escalated to lists of words, since those look like useful text (without the common spam keywords) and make the message look more like a real email.
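
To make the "identical message" test concrete, here is a minimal Python sketch. It assumes the test is an exact fingerprint comparison (a SHA-256 digest here); the actual Usenet tools may well have worked differently, and the names `fingerprint` and `add_random_block` are made up for illustration.

```python
import hashlib
import random
import string


def fingerprint(message: str) -> str:
    """Return a digest that matches only for byte-identical messages."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


def add_random_block(message: str, rng: random.Random, length: int = 12) -> str:
    """Append a block of random letters to the end of the message."""
    block = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    return f"{message}\n{block}"


ad = "Buy cheap widgets now!"
rng = random.Random(0)  # seeded only so the example is repeatable

plain_copies = [ad for _ in range(1000)]
padded_copies = [add_random_block(ad, rng) for _ in range(1000)]

# Count how many distinct fingerprints each batch produces.
unique_plain = len({fingerprint(m) for m in plain_copies})
unique_padded = len({fingerprint(m) for m in padded_copies})

print(unique_plain)   # every plain copy shares one fingerprint
print(unique_padded)  # nearly every padded copy looks like a new message
```

A counter keyed on the fingerprint sees the plain batch as one message posted a thousand times, and sees the padded batch as roughly a thousand unrelated messages.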

Date: 2003-12-30 09:38 am (UTC)
From: [identity profile] oonh.livejournal.com
I meant 'high entropy'.

Date: 2003-12-30 09:08 am (UTC)
From: [identity profile] spike.livejournal.com
Everybody these days is using statistical filters to separate the spam from the 'ham'. The spammers now include "rare" words like these to spoof the statistical filters into deciding that the e-mail must be real since it contains so many un-spam-like words.
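
A rough sketch of why the padding can work, assuming a plain naive Bayes score with add-one smoothing and equal priors. This is not the exact scoring of any real filter, and the tiny training sets and function names are invented for the example.

```python
import math
from collections import Counter


def train(messages):
    """Count how many messages each lowercase token appears in."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts


def spam_probability(text, spam_counts, ham_counts, n_spam, n_ham):
    """Naive Bayes estimate of P(spam | tokens) with add-one smoothing."""
    log_odds = 0.0
    for token in set(text.lower().split()):
        p_token_spam = (spam_counts[token] + 1) / (n_spam + 2)
        p_token_ham = (ham_counts[token] + 1) / (n_ham + 2)
        log_odds += math.log(p_token_spam / p_token_ham)
    return 1 / (1 + math.exp(-log_odds))


# Toy training data (made up for illustration).
spam = ["buy viagra online now", "cheap mortgage buy now", "viagra cheap online"]
ham = [
    "the mathematician gave a talk on alluvium",
    "the analyst updated the dashboard",
    "carpet delivery is on tuesday",
]
spam_counts, ham_counts = train(spam), train(ham)

ad = "buy cheap viagra online"
padding = "mathematician alluvium analyst dashboard carpet talk delivery tuesday"

clean_score = spam_probability(ad, spam_counts, ham_counts, len(spam), len(ham))
padded_score = spam_probability(
    ad + " " + padding, spam_counts, ham_counts, len(spam), len(ham)
)

print(round(clean_score, 3), round(padded_score, 3))
```

Each padding word the filter has seen only in good mail pulls the score toward 'ham'. A word the filter has never seen at all counts the same on both sides in this sketch, so it neither helps nor hurts.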

Arms race, anyone?

-Spike

PS. I'm waiting for them to start using this sort of thing.

Date: 2003-12-30 09:30 am (UTC)
From: [identity profile] dr-memory.livejournal.com
PS. I'm waiting for them to start using this sort of thing.

I've been seeing spam with Markov-chain random text for at least 12 months now. :(

Date: 2003-12-30 10:12 am (UTC)
From: [identity profile] spike.livejournal.com
Yeah- but did you see what I linked to? Guilty Looks Enter Tree Beers! One of my favorite stories...

Date: 2003-12-30 10:24 am (UTC)
From: [identity profile] dr-memory.livejournal.com
Hah! Sorry, yes, I looked, but merely glanced, and assumed that it was Dissociated Press output (or a Gertrude Stein play), and didn't stay long enough to see what it actually was. :)

From: [identity profile] dda.livejournal.com
anyone know why one gets these lists of random, rather erudite words in spam, now?

I'm pretty sure it is an attempt to get around anti-spam software, since most everything in spam other than the ad itself is there for that reason. Most anti-spam software looks for keywords in the message but also looks at the rest of the message to try to prevent "false positives" (mail that gets blocked but isn't spam). Unlike medical tests, false positives are worse than false negatives; blocking that important mail from your client is worse than "you need another round of tests" from the doctor.

So the spammers both break up the keywords (or spell them in l33t-speak) and pack the message with important-sounding words (and names) to try to fool the software. And the battle gets escalated to the next level.

It will be interesting to see if the anti-spam law that just passed will be effective at all.

whee!

Date: 2003-12-30 10:21 am (UTC)
From: [identity profile] amber-phoenix.livejournal.com
I confess a low to middling interest in the hows and whys of spam, but I have to love:

deportation intercom taught voluntarism
vestigial mathematician guy
and
regretting ironic alluvium glory

Re: whee!

Date: 2003-12-30 11:14 am (UTC)
From: [identity profile] dietrich.livejournal.com
See, that's why I love it, too. I'd no idea I'd start such a flurry of geek-speak!

Date: 2003-12-30 10:46 am (UTC)
From: [identity profile] madbard.livejournal.com
This particular breed of spam is generally recognizable by its subject head, which is in the format "Re: XXX, YYY the ZZZ". (Despite their big vocabulary, they still appear to have problems with run-on sentences.)

Date: 2003-12-30 07:22 pm (UTC)
From: [identity profile] darxus.livejournal.com
I have heard this stuff referred to as "bayes poison". Some of the most fun anti-spam stuff happening now is bayesian filters, which seem to have started getting popular as a result of Paul Graham: http://www.paulgraham.com/spam.html http://www.paulgraham.com/sofar.html http://www.paulgraham.com/better.html

This is the style of statistical analysis people have been mentioning.

Most amusingly, the filters best capable of catching all of these (without false positives) seem to be the bayesian filters they're targeted at. I believe the best, currently, is spamprobe (which does multi-word tokens, and parses most of the header).
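
For anyone wondering what "multi-word tokens" means in practice, here is a small Python sketch of the general idea. It is not spamprobe's actual tokenizer; the function name and the two-word limit are assumptions made for the example.

```python
from typing import List


def tokens_with_phrases(text: str, max_words: int = 2) -> List[str]:
    """Return single words plus every run of adjacent words up to max_words."""
    words = text.lower().split()
    tokens = []
    for size in range(1, max_words + 1):
        for start in range(len(words) - size + 1):
            tokens.append(" ".join(words[start:start + size]))
    return tokens


print(tokens_with_phrases("vestigial mathematician guy"))
# ['vestigial', 'mathematician', 'guy', 'vestigial mathematician', 'mathematician guy']
```

One plausible reason this helps against word-list padding: a word like "mathematician" may show up in your good mail, but a pair like "vestigial mathematician" almost never does, so the pairs give the filter little evidence that the message is real.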

This stuff has been one of my obsessions lately, so if you have more questions feel free to ask. I've been running spamassassin and spamprobe together with some damn impressive results: http://www.chaosreigns.com/sa_sp/

Spamprobe has been perfect for a while, spamassassin misses some spam (with its own bayesian abilities disabled because they're inferior to SP's and I don't believe it's worth maintaining 2 bayesian databases), but I like having two for extra reliability. My goal is to never have to read my spam folder, and only read the folders which contain good mail, and the ones containing the mail SA and SP disagree on.
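
The folder logic described above boils down to something like the following Python sketch. The two boolean inputs stand in for the verdicts of spamassassin and spamprobe; how those verdicts are obtained is not shown, and the `Folder` names are made up.

```python
from enum import Enum


class Folder(Enum):
    INBOX = "inbox"    # both filters say the mail is good
    SPAM = "spam"      # both filters say the mail is spam
    REVIEW = "review"  # the filters disagree, so a human looks


def route(sa_says_spam: bool, sp_says_spam: bool) -> Folder:
    """Pick a folder from the two filters' independent verdicts."""
    if sa_says_spam and sp_says_spam:
        return Folder.SPAM
    if not sa_says_spam and not sp_says_spam:
        return Folder.INBOX
    return Folder.REVIEW


print(route(True, True))    # Folder.SPAM
print(route(False, True))   # Folder.REVIEW
print(route(False, False))  # Folder.INBOX
```

The spam folder only ever receives mail that both filters flagged, which is what makes it reasonable to stop reading it.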

Somebody really needs to write a tutorial on bayesian statistics targeted at people writing spam filters.

spam spam spam spam

Date: 2003-12-30 09:05 pm (UTC)
From: [identity profile] unknownrockstar.livejournal.com
I never open spam for fear of viruses. also on principle, even if I received some unsolicited commercial email for something I was interested in (assuming it was from someone I don't know) I would delete it. if I want to remortgage my house, enlarge my penis or buy viagra online, I'll use google. I don't open anything with an attachment even from people I know if they haven't told me in advance they are sending me something with an attachment. I thought that was some magnetic poetry you'd composed. I made my own magnetic poetry set. it's fucking huge, although the words are tiny. of course it includes words like "flerbimblewudget" which I invented.
luv James.
