<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Spam Statistics</title>
	<atom:link href="http://www.sunclipse.org/?feed=rss2&#038;p=193" rel="self" type="application/rss+xml" />
	<link>http://www.sunclipse.org/?p=193</link>
	<description>Now living at http://scienceblogs.com/sunclipse/</description>
	<pubDate>Sat, 11 Sep 2010 03:03:11 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5</generator>
		<item>
		<title>By: Blake Stacey</title>
		<link>http://www.sunclipse.org/?p=193#comment-2972</link>
		<dc:creator>Blake Stacey</dc:creator>
		<pubDate>Thu, 12 Jul 2007 17:20:13 +0000</pubDate>
		<guid isPermaLink="false">http://www.sunclipse.org/?p=193#comment-2972</guid>
		<description>Next observation:  the &lt;a href="http://en.wikipedia.org/wiki/Autocorrelation" rel="nofollow"&gt;autocorrelation&lt;/a&gt; of the spam count decreases fairly steadily, but the autocorrelation of the "ham" count drops precipitously over about 50 days and then decreases much more slowly &#8212; almost a "hockey stick" kind of curve.  If you leave out the part with the "new service" (which kicks in at day 578 of 630), then the ham autocorrelation drops smoothly.</description>
		<content:encoded><![CDATA[<p>Next observation:  the <a href="http://en.wikipedia.org/wiki/Autocorrelation" rel="nofollow">autocorrelation</a> of the spam count decreases fairly steadily, but the autocorrelation of the &#8220;ham&#8221; count drops precipitously over about 50 days and then decreases much more slowly &mdash; almost a &#8220;hockey stick&#8221; kind of curve.  If you leave out the part with the &#8220;new service&#8221; (which kicks in at day 578 of 630), then the ham autocorrelation drops smoothly.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Blake Stacey</title>
		<link>http://www.sunclipse.org/?p=193#comment-2961</link>
		<dc:creator>Blake Stacey</dc:creator>
		<pubDate>Thu, 12 Jul 2007 13:28:35 +0000</pubDate>
		<guid isPermaLink="false">http://www.sunclipse.org/?p=193#comment-2961</guid>
		<description>Aha &#8212; "a major site started running a type of content through it that wasn’t comments, it’s 95% legit rather than the other way around, so the ham spiked because of that."  Thanks.

I'll post again if the Fourier transform, autocorrelation, and other analyses show anything interesting.</description>
		<content:encoded><![CDATA[<p>Aha &mdash; &#8220;a major site started running a type of content through it that wasn’t comments, it’s 95% legit rather than the other way around, so the ham spiked because of that.&#8221;  Thanks.</p>
<p>I&#8217;ll post again if the Fourier transform, autocorrelation, and other analyses show anything interesting.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt</title>
		<link>http://www.sunclipse.org/?p=193#comment-2942</link>
		<dc:creator>Matt</dc:creator>
		<pubDate>Thu, 12 Jul 2007 03:12:47 +0000</pubDate>
		<guid isPermaLink="false">http://www.sunclipse.org/?p=193#comment-2942</guid>
		<description>I just explained the ham spike in the comments on the Akismet blog.</description>
		<content:encoded><![CDATA[<p>I just explained the ham spike in the comments on the Akismet blog.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Blake Stacey</title>
		<link>http://www.sunclipse.org/?p=193#comment-2937</link>
		<dc:creator>Blake Stacey</dc:creator>
		<pubDate>Wed, 11 Jul 2007 22:56:50 +0000</pubDate>
		<guid isPermaLink="false">http://www.sunclipse.org/?p=193#comment-2937</guid>
		<description>Hmmm, upon cursory inspection &#8212; &lt;i&gt;i.e.,&lt;/i&gt; open up &lt;a href="http://www.octave.org/" rel="nofollow"&gt;Octave&lt;/a&gt; and try the first things I think of &#8212; the spam spectrum seems to have a peak at a period of about 5 days, while the "ham" spectrum has a peak at a period of 7 days.  I might be able to tell more, but I have to leave the office now to head to a birthday dinner.

Ciao / chow!</description>
		<content:encoded><![CDATA[<p>Hmmm, upon cursory inspection &mdash; <i>i.e.,</i> open up <a href="http://www.octave.org/" rel="nofollow">Octave</a> and try the first things I think of &mdash; the spam spectrum seems to have a peak at a period of about 5 days, while the &#8220;ham&#8221; spectrum has a peak at a period of 7 days.  I might be able to tell more, but I have to leave the office now to head to a birthday dinner.</p>
<p>Ciao / chow!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Blake Stacey</title>
		<link>http://www.sunclipse.org/?p=193#comment-2936</link>
		<dc:creator>Blake Stacey</dc:creator>
		<pubDate>Wed, 11 Jul 2007 22:31:16 +0000</pubDate>
		<guid isPermaLink="false">http://www.sunclipse.org/?p=193#comment-2936</guid>
		<description>Data &lt;i&gt;and&lt;/i&gt; a Firefox extension &#8212; oh, my!</description>
		<content:encoded><![CDATA[<p>Data <i>and</i> a Firefox extension &mdash; oh, my!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: The algorithm</title>
		<link>http://www.sunclipse.org/?p=193#comment-2935</link>
		<dc:creator>The algorithm</dc:creator>
		<pubDate>Wed, 11 Jul 2007 22:25:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.sunclipse.org/?p=193#comment-2935</guid>
		<description>The chart data as XML is &lt;a href="http://akismet.com/stats/chart-data.php" rel="nofollow"&gt;here&lt;/a&gt;.

(The almost orgasmic &lt;a href="https://addons.mozilla.org/en-US/firefox/addon/1843" rel="nofollow"&gt;Firebug&lt;/a&gt; extension lets you see all the network traffic from a page, and lots of other goodies. Orgasmicity may vary.)

I wonder what the jump is, too.  A new spamming technique is definitely possible, and I can't come up with a slam-dunk alternative.

At first I thought Akismet got a bunch of new users suddenly (Slashdot or something), but then spam would also spike and stay higher.  If, when FFT'd, the "new ham" closely follows the weekday/weekend patterns of past ham rather than those of spam, then maybe it truly is ham and Akismet lowered its false positive rate, although that's strange because it looks like the ham &lt;i&gt;tripled&lt;/i&gt; overnight.  Maybe some service with its own filtering started using Akismet on its "probably ham" messages?  Or, quite possibly, some chart data is flat wrong or they changed how they counted something, though I can't figure what that would be.</description>
		<content:encoded><![CDATA[<p>The chart data as XML is <a href="http://akismet.com/stats/chart-data.php" rel="nofollow">here</a>.</p>
<p>(The almost orgasmic <a href="https://addons.mozilla.org/en-US/firefox/addon/1843" rel="nofollow">Firebug</a> extension lets you see all the network traffic from a page, and lots of other goodies. Orgasmicity may vary.)</p>
<p>I wonder what the jump is, too.  A new spamming technique is definitely possible, and I can&#8217;t come up with a slam-dunk alternative.</p>
<p>At first I thought Akismet got a bunch of new users suddenly (Slashdot or something), but then spam would also spike and stay higher.  If, when FFT&#8217;d, the &#8220;new ham&#8221; closely follows the weekday/weekend patterns of past ham rather than those of spam, then maybe it truly is ham and Akismet lowered its false positive rate, although that&#8217;s strange because it looks like the ham <i>tripled</i> overnight.  Maybe some service with its own filtering started using Akismet on its &#8220;probably ham&#8221; messages?  Or, quite possibly, some chart data is flat wrong or they changed how they counted something, though I can&#8217;t figure what that would be.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
