You could open up the dataset by assigning every word an id, and giving an anony...

dfranke · on May 27, 2009

You could probably extract a good deal of information from this by examining word frequency and sentence structure. Or at least, the attempt would be too much for me to resist.
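To make the idea concrete, here is a minimal sketch of what that attack surface might look like, assuming the dataset is released with each distinct word swapped for an opaque integer id. The function names (`anonymize`, `id_frequencies`), the tokenizer regex, and the sample sentences are all hypothetical, made up for illustration; nothing here reflects how the actual dataset would be prepared.

```python
import random
import re
from collections import Counter

TOKEN = re.compile(r"[a-z']+")  # assumed tokenizer: lowercase words only


def anonymize(texts, seed=0):
    """Replace each distinct word with an opaque integer id (hypothetical scheme)."""
    tokenized = [TOKEN.findall(t.lower()) for t in texts]
    vocab = sorted({w for doc in tokenized for w in doc})
    ids = list(range(len(vocab)))
    random.Random(seed).shuffle(ids)  # ids carry no alphabetical hint
    word_to_id = dict(zip(vocab, ids))
    encoded = [[word_to_id[w] for w in doc] for doc in tokenized]
    return encoded, word_to_id


def id_frequencies(encoded):
    """Count how often each id occurs; this survives anonymization intact."""
    return Counter(i for doc in encoded for i in doc)


texts = ["the cat sat on the mat", "the dog sat"]
encoded, word_to_id = anonymize(texts)
freqs = id_frequencies(encoded)
top_id, top_count = freqs.most_common(1)[0]
print(top_id, top_count)  # the most frequent id is a likely stand-in for "the"
```

The frequency ranking is the same before and after the id substitution, so very common words are the easiest to guess; ids that appear only a handful of times give far less to work with.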

ivankirigin · on May 27, 2009

You'd lose enough to make the risk of competition all but gone. The most meaningful sentences might have a word used 3 times in a 1000-word corpus. You just can't glean meaning from so little context.

I'd love to try too.

dfranke · on May 27, 2009

I've applied twice to YC, so my own applications would provide a Rosetta stone for a lot of important words. But that would be cheating.