
I was a one-time Xmonad user, but switched to i3 recently. The enormous complexity of configuring Xmonad is what got me to switch, honestly. Perhaps if I A) needed enormous configurability or B) knew Haskell well I would have stuck around, but i3's config is much simpler to manage. i3 is also slightly prettier, but that's orthogonal :)


I recommend using https://github.com/Prismatic/om-tools, if only for the defcomponent macro: https://github.com/Prismatic/om-tools#defcomponent

This abstracts out all the reify stuff and makes for far cleaner component definitions.


As a dev working in Austin, I'd say that's about right, especially when you consider the cost-of-living difference: http://www.wolframalpha.com/input/?i=cost+of+living+washingt...


Agree. That number seems to be in the sweet spot.


The thing that made me start wearing a helmet was the realization that, even if I were the safest bicyclist there's ever been, there's no way to stop the environment from acting against me!

I managed to go over my handlebars while riding down a quiet side street (with no hands, like a badass :/ ), when a gust of wind blew me sideways and caused me to overcorrect and crash.

The Lord protected me on that one and gave me a chance to be a bit more proactive :)


As far as functional languages go, it's a nice compromise between the absolute purity of something like Haskell and the day-to-day practicalities of building scalable things quickly (core.async, nice Java integration, runs on the JVM).

As far as looking like a parenthesis layer cake, that's part of the Lisp "tradition", really. One can look at it as the price you pay for true homoiconicity and powerful macros.


"the absolute purity of something like Haskell"

After working with Haskell for a while, "absolute" there starts to seem like the wrong word: there is room in "programming language space" for "more pure than Haskell", and there have been some attempts to fill it. I've not drawn any conclusions yet as to whether anything in that space is practical, but it's certainly interesting.


Speaking as someone who's lived through the opposite experience (grew up seeing the wrong thing and not believing, then found the right thing and believed), I think it comes from the repeated dissonance between reading/being taught one thing and then not seeing that thing actually lived out. Turns out that when you actually see people practicing what they preach on a large scale, things can turn out quite differently!

I can only imagine people don't try to effect change for the same reasons they don't try to effect change in politics at large, namely that the issue seems so large that they can't see themselves making any headway against a seemingly advancing tide.


This is a great solution!

What would be the reason to do it this way if you don't already have the FSA?

If you just need to hash a small-ish set of known items and keep memory usage low, I'd think starting with the table would make more sense. Wouldn't the FSA approach require more memory than the table, since it has to maintain all the intermediate nodes?


> Wouldn't the FSA approach require more memory than the table, since it has to maintain all the intermediate nodes?

What do you mean by 'a table'? If you want a set without false positives/negatives or a perfect hash function, you have to represent the possible elements in some way. FSAs are compact when you store a set of sequences with a certain amount of redundancy, in the form of subsequences that occur in multiple strings. In such cases, FSAs are far more compact than a flat table of strings.

Of course, in an actual compact implementation, you would not implement states and transitions using classes. E.g. Dictomaton stores states as an array of transitions (which are unboxed), which is far more compact (see the numbers on the Github page for some comparisons with comparable data structures).
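
To make the comparison concrete, here's a minimal Python sketch of the counting trick behind automaton-based perfect hashing. It uses a plain trie, so it shows the indexing idea but not the compaction you get from minimizing shared suffixes; the names are illustrative, not Dictomaton's API:

    class Node:
        def __init__(self):
            self.children = {}    # label -> Node
            self.is_word = False
            self.count = 0        # number of words reachable from here

    def build(words):
        root = Node()
        for w in words:
            node = root
            for ch in w:
                node = node.children.setdefault(ch, Node())
            node.is_word = True

        def annotate(node):
            node.count = int(node.is_word) + sum(annotate(c) for c in node.children.values())
            return node.count

        annotate(root)
        return root

    def perfect_hash(root, word):
        # Returns word's index in the sorted word list, or None if absent.
        index, node = 0, root
        for ch in word:
            if node.is_word:          # a stored word that is a proper prefix sorts first
                index += 1
            if ch not in node.children:
                return None
            for label in sorted(node.children):
                if label == ch:
                    break
                index += node.children[label].count
            node = node.children[ch]
        return index if node.is_word else None

    words = ['cat', 'car', 'cart', 'dog']
    root = build(words)
    assert [perfect_hash(root, w) for w in sorted(words)] == [0, 1, 2, 3]

A minimized, array-packed automaton stores the same counts on far fewer states, which is where the memory win over a flat table comes from.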


"If we didn't do it, someone else would."

This sounds like the justification of every grey-area business idea I've ever heard.



Am I the only person who thinks that things like this are totally unnecessary? Is learning/reading regular expressions really that difficult for most people?

Here's the subset of regular expression syntax that has gotten me through nearly every regex I've ever needed to write. As a plus, it has no dependencies!

* - zero or more of the preceding character/group

+ - one or more of the preceding character/group

? - zero or one of the preceding character/group

$ - end of line

^ - beginning of line

. - any single character

\ - escape the following character (for a literal '$' or '.', for example)

[<some characters>] - one of the given characters

[a-zA-Z0-9] - character classes can contain ranges! (this one matches letters and digits)

(<something>) - capturing group (anything that matches inside it will be accessible in the match object)

<thing1>|<thing2> - either the first thing, or the second thing (or the third, or the fourth...)

This isn't a complete, or even precise, definition, but knowing those things will get you to the point where you can read and write expressions like this:

^(-|+)?[0-9]*\.[0-9]+$

which matches things like -.2, 0.123, +0.1, etc. (floating point numbers, basically). This likely has bugs, since I haven't tested it ;)


> which matches things like -.2, 0.123, +0.1, etc. (floating point numbers, basically). This likely has bugs, since I haven't tested it ;)

The fact you can't come up with a regex and be sure it's correct answers your first two questions.
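
To make it concrete, Python (like most engines) rejects that pattern outright:

    import re

    # The pattern upthread won't even compile: after '|', the '+' is a
    # quantifier with nothing to repeat.
    try:
        re.compile(r'^(-|+)?[0-9]*\.[0-9]+$')
    except re.error as e:
        print(e)    # e.g. "nothing to repeat"

    # Escaped (or moved into a character class), it's fine:
    re.compile(r'^(-|\+)?[0-9]*\.[0-9]+$')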

Regexes are incredibly useful, but the syntax is just a bunch of noise. This library turns it into self-documenting code, which is incredibly valuable. Novices might prefer showing off their regex-fu, but after you've spent a good chunk of your life reading and writing code, you value clear code more.


> This library turns it into self-documenting code, which is incredibly valuable.

The problem is that for anyone who knows how regexps work, this library provides an obfuscation layer. I personally prefer the free-spacing regexp support[1] found in a number of popular regexp engines. It lets you write a complicated regexp in small and/or indented chunks across multiple lines, including comments. This provides as much or more documentation than this "self-documenting" form and allows the full power of the regexp engine (which big regexps often seem to need). I've occasionally had to work out some pretty gnarly regexps, but when written in two-column code/comments form they end up being pretty understandable and maintainable.
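
Here's roughly what that looks like in Python, where re.VERBOSE is the free-spacing flag:

    import re

    # The float pattern from upthread, free-spacing style: whitespace in
    # the pattern is ignored and comments are allowed.
    float_re = re.compile(r"""
        ^
        [-+]?        # optional sign
        [0-9]*       # integer part (may be empty, as in '.5')
        \.           # decimal point
        [0-9]+       # fractional part
        $
    """, re.VERBOSE)

    assert float_re.match('-.2')
    assert not float_re.match('12')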

[1] Example from the Ruby docs. IIRC, Python and a few other engines have similar support: http://www.ruby-doc.org/core-1.9.3/Regexp.html#label-Free-Sp...


Under the hood, what's being generated is still a regex though. How can you be sure ANY code is correct, aside from testing it?


True, but your argument can apply to any sort of abstraction (e.g. using C instead of assembly or even assembly over 1's and 0's).

He's arguing that the nicer syntax is a valuable abstraction over regex's overly complicated syntax, not that the syntax will be 100% the same as regex 100% of the time.


> How can you be sure ANY code is correct, aside from testing it?

Apparently, we can't. So you really want to improve your odds of writing correct code on the first attempt by making the language work with you rather than against you.

Replacing a bunch of punctuation with code that reads like prose is one way.


Well, it's part of Ruby's philosophy: provide clean, readable and understandable code.

I'm a huge fan of regular expressions and I love to discover new ones, try to improve my own, etc. But that does not mean there isn't a place for a human-readable regex syntax, and I will definitely give it a try.


> Am I the only person who thinks that things like this are totally unnecessary?

Not "unnecessary"; "useless" and "kidding" are more appropriate words, IMHO.

Where are named captures? Oh, you can't. Well, ordinary $1 $2 $3? Oh, you can't capture at all! How do you use backrefs? Oh, you can't? Wait, why the heck do you even need regexps then?

There are problems with complex regexes, yes. But this silly toy is not the solution. For solutions, look at Perl6's "regexps", Perl6's Grammars (example down the thread), Perl5's Regexp::Grammars, and look at how Haskell does it with parser combinators. They are all light-years ahead.
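
For a taste of the combinator style, here's a hand-rolled Python sketch (nothing like a real combinator library, but it shows how small parsers compose into bigger ones and return structured results):

    # Parsers are just functions: text -> (value, rest) on success, None on failure.
    def char(c):
        return lambda s: (c, s[1:]) if s[:1] == c else None

    def alt(*parsers):
        def parse(s):
            for p in parsers:
                r = p(s)
                if r is not None:
                    return r
            return None
        return parse

    def many1(p):
        def parse(s):
            vals, r = [], p(s)
            while r is not None:
                vals.append(r[0])
                s = r[1]
                r = p(s)
            return (vals, s) if vals else None
        return parse

    digit = alt(*(char(d) for d in '0123456789'))
    number = many1(digit)

    print(number('123abc'))   # (['1', '2', '3'], 'abc')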


Ouch, thanks for the harsh words.

But yes, 1.2 will likely have named captures and backrefs. It's a regex underneath, so $1 etc. come along once you turn it into a regex and do the matching.


OK then. But look at the state-of-the-art solutions I mentioned ;-)

Perl5's R::Grammars is doing it the same way you do — it generates one big scary regex. But this lib is nowhere close to what that one can do.

To give you an idea: it will provide a fully parsed struct like {proto: 'http', domain: {all: 'www.example.com', subdomains: ['www', 'example'], tld: 'com', ...}}


You can do this with regex captures.


It's not $1="this", $2="that"; it's a tree.

So, no, you cannot. And there are many more cool things you can't do with flat captures.
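
A rough Python sketch of the difference (the URL pattern here is made up): flat captures give you one string per group, while a grammar hands back nested structure you'd otherwise assemble by hand:

    import re

    url_re = re.compile(r'(?P<proto>https?)://(?P<host>[\w.-]+)')
    m = url_re.match('http://www.example.com')

    # Flat named captures: one string per group, no nesting.
    print(m.groupdict())   # {'proto': 'http', 'host': 'www.example.com'}

    # A grammar gives back a tree; with plain captures you build it yourself:
    tree = {
        'proto': m['proto'],
        'domain': {
            'all': m['host'],
            'parts': m['host'].split('.'),   # ['www', 'example', 'com']
        },
    }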


You know what? I agree with you that regexes are not that hard, but the main problem from my POV is the lack of a standard. Each programming language implements them in a similar but ever-so-slightly different syntax.

If we could come up with some sort of high-level API for regexes we might end up in a better situation, given that this would make them effectively language agnostic.

And you could always fall back to your platform's native option if the API is missing some features (given that these sorts of things generally end up taking a least-common-denominator approach).


Not true. Regexes have been around for a long time and there are a few standards -- POSIX basic, POSIX extended, and PCRE. Any language/tool worth its salt these days will follow one of these three implementations (usually it's PCRE as it's the most complete). It's usually older tools like awk, vim, find, etc. that have their own quirks -- they were created before these standards existed and so they generally follow extended regex but then introduce their own syntax for certain concepts (in vim, for instance, \<...\> does the same thing as \b...\b in PCRE).
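
A quick Python illustration of the word-boundary behavior both syntaxes express:

    import re

    # \b matches at a word/non-word boundary, so 'cat' inside 'catalog'
    # or 'concat' doesn't count.
    print(re.findall(r'\bcat\b', 'cat catalog concat cat.'))   # ['cat', 'cat']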


Didn't you just make the previous commenter's point, that regex implementations vary? Even within those three there are variations, in the flags and in how they implement word boundaries (either exclusion of word characters or inclusion of certain spacing chars).


I think the case that there are too many implementations is overstated. Almost everyone can get away with knowing only PCRE these days. Most programmers I've known don't even know that they're using PCRE; they just know they're using "regexes".


I haven't had the incentive to learn regexes well enough to understand why I have to escape random things in different languages and environments. And occasionally something will fail to work for all my cases and I have to fall back on a non-regex way of matching.

I've generally found that my practice is basically "try regex; if it doesn't work after ten minutes of effort, abandon regex completely".


Library owner here!

> Am I the only person who thinks that things like this are totally unnecessary?

No, it's been made clear to me that my library is totally unnecessary by quite a few developers. I like it though.

> ^(-|+)?[0-9]*\.[0-9]+$

exp.float, and you only have to define it once.


"Totally unnecessary"? The fact that solving problems with regexes is a common joke answers this question handily enough ("Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.")

I only use regex occasionally and, as such, typically need to refresh each time I go back to it.

Obviously you should choose the level of abstraction that fits your needs, but I would love something like this if I did any Ruby.


It's just an alternate syntax, trading terseness for readability. Debating whether it's necessary or not is bikeshedding.


> Am I the only person who thinks that things like this are totally unnecessary? Is learning/reading regular expressions really that difficult for most people?

No, you're not the only one. I have a feeling that things like that massive regular expression meant to match RFC822 email addresses are part of the reason for the focus on this kind of "easy to use, human" regular expression interface. The problem is that the email-matching regular expression is an extreme case: it matches a whole bunch of things that MTAs/MUAs need to know about, but that most people, who are accepting email addresses in web forms, don't need to know about, like "email comments", the real-name fields, bang paths, etc. 99.999% of email addresses are going to match something simple like ^\S+@[\w.-]+\.[\w-]+$, and you're going to pass the address off to the MTA to parse, validate, and bounce anyway. Most people should be more concerned with having decent bounce-processing support than with perfecting their email-matching regular expressions.
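
In Python, that pragmatic check is a couple of lines (a sketch; the real validation happens when the MTA tries to deliver):

    import re

    email_re = re.compile(r'^\S+@[\w.-]+\.[\w-]+$')

    assert email_re.match('fendrak@example.com')
    assert email_re.match('a+b@mail.example.co.uk')
    assert not email_re.match('not an email')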


For me, the biggest advantage of this sort of "library" approach is that instead of only being able to specify regexes using literals, you can treat them like other values in the language: you can name subregexes, you can create functions that combine or modify regexes, etc.

I'm not sure the "builder pattern" in the OP does this well though...
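
For example, in Python you can get a lot of that by composing plain pattern strings (a sketch; all names are made up):

    import re

    # Name subpatterns once, then compose them like any other values.
    SIGN  = r'[-+]?'
    INT   = r'[0-9]+'
    FLOAT = rf'{SIGN}[0-9]*\.[0-9]+'

    def anchored(pattern):
        # Combinator: wrap any subpattern so it must match a whole string.
        return re.compile(rf'^(?:{pattern})$')

    assert anchored(FLOAT).match('-0.5')
    assert anchored(INT).match('42')
    assert not anchored(FLOAT).match('abc')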


Thoughts on doing it better? The DSL is just a wrapper for the objects.


I may as well make one suggestion along these lines (implemented in a similar JS library I whipped up not too long ago: https://github.com/wyantb/js-regex ).

Macros! https://github.com/wyantb/js-regex#macros

Given some named macro, whenever the user wants to use it, they use addMacro (or some analogous method).


Not to mention that if you "learn" regular expressions with this library, you'll become an expert at this library and remain useless with actual regular expressions, which is just one more example of something that can leave you unable to learn and use other languages effectively.

Regular expressions are a pain, but I strongly believe that you shouldn't use abstractions unless you have at least a functional understanding of what you're abstracting.


For one thing, I agree it's totally unnecessary. I also think that people who aren't familiar with regexes should stay away from them as far as possible, or at least first get some understanding of how regexes are implemented; learning about finite automata helps a lot with understanding them.


I was prepared to think this was totally unnecessary, but ended up thinking it's actually pretty nice.

