Logswan – Fast Web log analyzer using probabilistic data structures (github.com/fcambus)
35 points by mulander on Oct 1, 2015 | 3 comments


If you're interested in the probabilistic approach, this is how it works: https://en.wikipedia.org/wiki/HyperLogLog

"The basis of the HyperLogLog algorithm is the observation that the cardinality of a multiset of uniformly-distributed random numbers can be estimated by calculating the maximum number of leading zeros in the binary representation of each number in the set. If the maximum number of leading zeros observed is n, an estimate for the number of distinct elements in the set is 2^n."
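
To make the quoted observation concrete, here is a rough Python sketch of the single-estimator idea only. It is not Logswan's code. The function names, the use of SHA-1 as the hash, and the 32-bit width are all assumptions picked for illustration. Real HyperLogLog also splits the hashes across many registers and combines them with a bias-corrected harmonic mean, because a single maximum like this is very noisy.

```python
import hashlib

HASH_BITS = 32  # assumed hash width for this sketch


def hash32(item):
    """Map a string to a roughly uniform 32-bit integer (SHA-1 is an arbitrary choice)."""
    digest = hashlib.sha1(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def leading_zeros(value, bits=HASH_BITS):
    """Count leading zero bits of value when written as a bits-wide binary number."""
    return bits - value.bit_length()


def estimate_cardinality(items):
    """Estimate distinct items as 2^n, where n is the largest leading-zero count seen."""
    max_zeros = 0
    for item in items:
        max_zeros = max(max_zeros, leading_zeros(hash32(item)))
    return 2 ** max_zeros


if __name__ == "__main__":
    # Hypothetical input: 5000 distinct IP-like strings, each seen twice.
    ips = ["10.0.%d.%d" % (i // 256, i % 256) for i in range(5000)] * 2
    print(estimate_cardinality(ips))
```

Duplicates hash to the same value, so they never change the maximum. That is why the memory needed stays constant no matter how many log lines are read. The result is always a power of two and can be off by a large factor, which is exactly what the register averaging in the full algorithm is there to fix.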


If anyone involved in the project is reading this: the DNS entry for "www.logswan.org", which is linked from the GitHub page, does not exist.


Thanks for reporting this.

Indeed, there was no site configured on logswan.org when this was posted to HN. I made the required changes, but because resolvers cache negative answers, it will still return NXDOMAIN for some users until those cached entries expire.



