<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Sequentially assigned IDs and giving away competitive intelligence</title>
	<atom:link href="http://www.brianp.net/2009/10/06/sequentially-assigned-ids-and-giving-away-competitive-intelligence/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.brianp.net/2009/10/06/sequentially-assigned-ids-and-giving-away-competitive-intelligence/</link>
	<description>Occasional Writings</description>
	<lastBuildDate>Sun, 28 Feb 2010 03:21:42 -0800</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: <img src='http://www.brianp.net/wp-content/plugins/rpx/images/openid.png'/> khan</title>
		<link>http://www.brianp.net/2009/10/06/sequentially-assigned-ids-and-giving-away-competitive-intelligence/comment-page-1/#comment-15</link>
		<dc:creator><img src='http://www.brianp.net/wp-content/plugins/rpx/images/openid.png'/> khan</dc:creator>
		<pubDate>Sun, 22 Nov 2009 12:01:39 +0000</pubDate>
		<guid isPermaLink="false">http://www.brianp.net/?p=192#comment-15</guid>
		<description>Brian, what flaws, if any, exist in using a hash function to generate the public IDs? With a hash function like Keccak (http://keccak.noekeon.org/), you could limit the output key to 64 bits (more compact than a UUID), use a private integer as the input key (which a database can generate uniquely, efficiently and conveniently), and add a salt (a Keccak feature) to thwart brute-force attacks (computing the hash of a series of integers to see which one produces a given hash).

Collisions appear to be a significant issue, but your user-creation script can handle them (perhaps enforced by the db with a UNIQUE constraint on the ID_HASH column): when a collision is encountered, the penalty is to call the db increment trigger, move to the next integer, and hope you don&#039;t get another collision.

The principal downside I can see is that it&#039;s not O(1), but the authors claim &quot;13 cycles per byte&quot;, which is reasonably efficient for modern hardware and the number of bytes being hashed in this application.

Another hash function that could do the trick is Skein: http://www.skein-hash.info/about.</description>
		<content:encoded><![CDATA[<p>Brian, what flaws, if any, exist in using a hash function to generate the public IDs? With a hash function like Keccak (<a href="http://keccak.noekeon.org/" rel="nofollow">http://keccak.noekeon.org/</a>), you could limit the output key to 64 bits (more compact than a UUID), use a private integer as the input key (which a database can generate uniquely, efficiently and conveniently), and add a salt (a Keccak feature) to thwart brute-force attacks (computing the hash of a series of integers to see which one produces a given hash).</p>
<p>Collisions appear to be a significant issue, but your user-creation script can handle them (perhaps enforced by the db with a UNIQUE constraint on the ID_HASH column): when a collision is encountered, the penalty is to call the db increment trigger, move to the next integer, and hope you don&#8217;t get another collision.</p>
<p>The principal downside I can see is that it&#8217;s not O(1), but the authors claim &#8220;13 cycles per byte&#8221;, which is reasonably efficient for modern hardware and the number of bytes being hashed in this application.</p>
<p>Another hash function that could do the trick is Skein: <a href="http://www.skein-hash.info/about" rel="nofollow">http://www.skein-hash.info/about</a>.</p>
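<p>A minimal sketch of the scheme described above, under stated assumptions: HMAC with SHA3-256 (the standardized Keccak variant in Python's <code>hashlib</code>) stands in for salted Keccak, the secret key plays the role of the salt, and an in-memory set stands in for the UNIQUE constraint on the ID_HASH column. The names <code>SECRET_KEY</code>, <code>public_id</code> and <code>assign_public_id</code> are hypothetical.</p>

```python
import hashlib
import hmac

# Assumption: a server-side secret that is never published (the "salt").
SECRET_KEY = b"hypothetical-server-side-secret"


def public_id(private_id: int, key: bytes = SECRET_KEY) -> int:
    """Map a sequential private integer to a 64-bit public ID."""
    msg = private_id.to_bytes(8, "big")
    digest = hmac.new(key, msg, hashlib.sha3_256).digest()
    # Keep only the first 8 bytes (64 bits) of the 256-bit digest.
    return int.from_bytes(digest[:8], "big")


def assign_public_id(next_private_id: int, used: set) -> tuple:
    """Return (private_id, public_id), skipping integers whose hash collides.

    `used` stands in for a UNIQUE constraint on an ID_HASH column.
    """
    private_id = next_private_id
    while True:
        candidate = public_id(private_id)
        if candidate not in used:
            used.add(candidate)
            return private_id, candidate
        private_id += 1  # collision: move on to the next integer
```

<p>Without the key, someone who knows only the hash function can no longer hash 1, 2, 3, and so on to find which integer produced a given public ID.</p>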
]]></content:encoded>
	</item>
	<item>
		<title>By: <img src='http://www.brianp.net/wp-content/plugins/rpx/images/openid.png'/> Ron</title>
		<link>http://www.brianp.net/2009/10/06/sequentially-assigned-ids-and-giving-away-competitive-intelligence/comment-page-1/#comment-10</link>
		<dc:creator><img src='http://www.brianp.net/wp-content/plugins/rpx/images/openid.png'/> Ron</dc:creator>
		<pubDate>Tue, 06 Oct 2009 18:17:38 +0000</pubDate>
		<guid isPermaLink="false">http://www.brianp.net/?p=192#comment-10</guid>
		<description>Even without signing up every day, with a reasonably small sample, you can get a pretty good estimate of the user base by applying the &lt;a href=&quot;http://en.wikipedia.org/wiki/German_tank_problem&quot; rel=&quot;nofollow&quot;&gt;German Tank Problem&lt;/a&gt;.  Not really a problem when you actually want people to know how many users you have, but you probably don&#039;t want to tell competitors how many advertisers you have (by using an advertiser id in the URL or query arg).

I like to combine randomness with an autoincremented value: a timestamp.  Use the most entropic bits (the microseconds) for the high order bits of the id and the less entropic bits (seconds, mins, hours, ...) for the low order bits (or simply reverse the timestamp bit order altogether).  Then toss on some random bits at the end.</description>
		<content:encoded><![CDATA[<p>Even without signing up every day, with a reasonably small sample, you can get a pretty good estimate of the user base by applying the <a href="http://en.wikipedia.org/wiki/German_tank_problem" rel="nofollow">German Tank Problem</a>.  Not really a problem when you actually want people to know how many users you have, but you probably don&#8217;t want to tell competitors how many advertisers you have (by using an advertiser id in the URL or query arg).</p>
<p>I like to combine randomness with an autoincremented value: a timestamp.  Use the most entropic bits (the microseconds) for the high order bits of the id and the less entropic bits (seconds, mins, hours, &#8230;) for the low order bits (or simply reverse the timestamp bit order altogether).  Then toss on some random bits at the end.</p>
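<p>A rough sketch of the bit-reversal variant, not a definitive implementation. The field widths (56 timestamp bits, 8 random bits, 64 bits in total) and the names <code>reverse_bits</code> and <code>make_id</code> are assumptions chosen for illustration.</p>

```python
import secrets
import time

TIMESTAMP_BITS = 56  # assumption: microseconds since the epoch fit in 56 bits
RANDOM_BITS = 8      # assumption: number of random bits appended at the end


def reverse_bits(value: int, width: int) -> int:
    """Reverse the lowest `width` bits of `value`."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def make_id(timestamp_us=None) -> int:
    """Build an ID from a bit-reversed microsecond timestamp plus random bits."""
    if timestamp_us is None:
        timestamp_us = time.time_ns() // 1_000
    # After reversal, the fast-changing microsecond bits become the high-order bits.
    scrambled = reverse_bits(timestamp_us, TIMESTAMP_BITS)
    return (scrambled << RANDOM_BITS) | secrets.randbits(RANDOM_BITS)
```

<p>Two IDs generated a microsecond apart differ in their highest-order bits, so sorting the IDs numerically no longer sorts them by creation time.</p>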
]]></content:encoded>
	</item>
</channel>
</rss>
