
Maybe you'd like to check out FunSQL.jl, my library for compositional construction of SQL queries. It also follows an algebraic approach and covers many analytical features of SQL, including aggregates/window functions, recursive queries, and correlated subqueries/lateral joins. One thing where it differs from dplyr and similar packages is how it separates aggregation from grouping (by modeling GROUP BY with a universal aggregate function).


This is awesome! I'll add a link to it on PRQL.

I guess the biggest difference between FunSQL (and similarly dbplyr) and PRQL is that the former needs a Julia (or R) runtime to run.

I really respect the library and am keen to see how it develops.


FunSQL.jl requires Julia to run (obviously as it is a Julia library) but it produces standard SQL so Julia in this case is just an implementation language.

I have re-implemented parts of FunSQL in Python and OCaml (the one I have ended up using) and have added a concrete syntax similar to what you have in PRQL.

    from employees
    define
      salary + payroll_tax as gross_salary,
      gross_salary + benefits_cost as gross_cost
    where gross_cost > 0 and country = 'usa'
    group by title, country
    select
      title,
      country,
      avg(salary) as average_salary,
      sum(salary) as sum_salary,
      avg(gross_salary) as average_gross_salary,
      sum(gross_salary) as sum_gross_salary,
      avg(gross_cost) as average_gross_cost,
      sum(gross_cost) as sum_gross_cost,
      count() as count
    order by sum_gross_cost
    where count > 200
    limit 20
But, in my mind, the biggest difference between PRQL and FunSQL is the way FunSQL treats relations with `GROUP BY`: as just another kind of namespace, which lets you defer specifying aggregates. A basic example:

    from users as u
    join (from comments group by user_id) as c on c.user_id = u.id
    select
      u.username,
      c.count() as comment_count,
      c.max(created_date) as comment_last_created_date
The `c` subrelation is grouped by `user_id` but it doesn't specify any aggregates - they are specified in the `select` below so you have all selection logic co-located in a single place.

I think this approach is very powerful as it allows you to build reusable query fragments in isolation but then combine them into a single query which fully specifies what's being selected.
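
To make the deferred-aggregate idea concrete, here is a toy Python sketch. It is not FunSQL's implementation, and the SQL that FunSQL itself generates may look different. `Grouped`, `render` and `users_with_comment_stats` are hypothetical names made up for this illustration. The point is only that the grouped fragment is built without any aggregates, and the caller decides later which ones it needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grouped:
    """Hypothetical: a relation grouped by some keys, aggregates not chosen yet."""
    table: str
    keys: tuple

    def render(self, aggregates):
        """Render SQL once the caller says which aggregates it needs.

        `aggregates` maps an output alias to an SQL aggregate expression.
        """
        cols = list(self.keys) + [
            f"{expr} AS {alias}" for alias, expr in aggregates.items()
        ]
        return (
            f"SELECT {', '.join(cols)} FROM {self.table} "
            f"GROUP BY {', '.join(self.keys)}"
        )


def users_with_comment_stats(grouped, alias, aggregates):
    """Hypothetical: join `users` to a grouped fragment, picking aggregates here."""
    sub = grouped.render(aggregates)
    picked = ", ".join(f"{alias}.{name}" for name in aggregates)
    return (
        f"SELECT u.username, {picked} FROM users AS u "
        f"JOIN ({sub}) AS {alias} ON {alias}.user_id = u.id"
    )


# The reusable fragment: grouped by user_id, no aggregates specified.
comments_by_user = Grouped("comments", ("user_id",))

# The aggregates are named only where the final selection is made.
sql = users_with_comment_stats(
    comments_by_user,
    "c",
    {
        "comment_count": "COUNT(*)",
        "comment_last_created_date": "MAX(created_date)",
    },
)
```

The same `comments_by_user` fragment could be reused by another query that asks for a different set of aggregates.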


Writing an alternative syntax is straightforward. Perhaps prototype PRQL using xi's excellent FunSQL backend? This way it's working out of the gate. Once syntax+semantics are pinned, writing another backend in the language of your choice would then be easier. Getting the backend correct is non-trivial work, and xi has done this already. Besides, we need a sandbox syntax anyway, so it might be fun to collaborate.


This is pretty close to how my Julia library [0] for composable construction of SQL queries works:

    From(foo) |>
    Where(Get.value .< 10) |>
    Join(From(bar) |> As(:bar), on = Get.bar.id .== Get.bar_id) |>
    Where(Get.bar.other_value .> 20) |>
    Select(Get.group, Get.value) |>
    Group(Get.group) |>
    Where(Agg.sum(Get.value) .> 100) |>
    Order(Get.group)
There is no HAVING and you can use any tabular operators in any order. Aggregates are also separated from grouping and can be used in any context after Group is applied.

[0] https://github.com/MechanicalRabbit/FunSQL.jl


Are you serious? Regular Russian troops have been fighting and dying in Ukraine for weeks now.

http://www.nato.int/cps/en/natolive/news_112103.htm

http://uk.reuters.com/article/2014/08/28/uk-ukraine-crisis-r...


A Russian native speaker does not equal a Russian. About 2/3 of the population of Donetsk region identify themselves as ethnic Ukrainians. More than half of those name Russian as their native language.


Crimea was occupied by Russian forces in February. Military hostilities in mainland Ukraine started in April, when Sloviansk was captured by Russian ex-FSB officer Igor Girkin and his gang of Russian ultranationalists.


> Maybe the TV host didn't challenge him because his comments are already so backward and idiotic as to not need further emphasis.

I'll give you a better explanation: the TV host didn't challenge him because it never happened. The "expert" (another journalist, in fact) never suggested to "physically eliminate about 1.5 million of civilians of Donetsk and Luhansk regions that are not able to fit in Ukrainian Nation" or anything close to it.


Hello, Kirill. :)

The expert is a journalist, from a newspaper «Тиждень». [1]

Here are the quotes:

Донбасс – это не просто депрессивный регион. Там дикое количество ненужных людей. Я абсолютно осознанно об этом говорю. В Донецкой области примерно 4 миллиона жителей. И не менее 1,5 миллионов лишних. Нам не надо понимать Донбасс. Нам надо понимать украинский национальный интерес. А Донбасс нужно использовать как ресурс. [...] В отношении Донбасса: я не знаю рецепта, как это сделать быстро. Однако наиглавнейшее, что нужно сделать: есть люди, которых необходимо просто убить.

Which means that there is an "excess of 1.5 millions of people" in Donetsk Region, that "[people of] Donetsk Region mustn't be understood [by the people from the rest of Ukraine], and Donetsk Region [and its people] must be used as a resource instead" and "I don't know how to solve that problem [to remove excessive civilians], but the main thing is that some people must be physically eliminated".

So:

1. There are 1.5 millions of civilians of Donetsk region that are excessive.

2. People of Donetsk region mustn't be understood by the rest of Ukraine. Which literally means that they do not fit in Ukrainian Nation, they are not part of it.

3. He doesn't know the recipe how to remove the excessive civilians, but some people must be physically eliminated.

In the context of the whole TV show he is talking about elimination of excessive civilians. One could argue whether he considers it possible to physically eliminate 1.5 million of them, or only part of them and drive the others away by force or by economic means, etc.

> The "expert" (another journalist, in fact) never suggested to "physically eliminate about 1.5 million of civilians of Donetsk and Luhansk regions that are not able to fit in Ukrainian Nation" or anything close to it.

So, even though I have already admitted that I was too emotional when I mentioned this TV show, I consider myself to have provided a reasonable translation of his words.

P.S.

I've actually used pyyaml parser. :)

Edit: here is the relevant part of the show for you to check: [2]

[1] http://tyzhden.ua/Author/76/Publications/

[2] https://www.youtube.com/watch?v=mhYyj5l9Lx0


> Which means that there is an "excess of 1.5 millions of people" in Donetsk Region, that "[people of] Donetsk Region mustn't be understood [by the people from the rest of Ukraine], and Donetsk Region [and its people] must be used as a resource instead" and "I don't know how to solve that problem [to remove excessive civilians], but the main thing is that some people must be physically eliminated".

Wow, this is truly creative editing. None of your insertions are implied by the context, and the quotes you picked are several minutes apart. In particular, he talks about 1.5 million people lacking meaningful job prospects as one of the causes of the unrest (which is true). A few minutes later, when he talks about killing people, nowhere does he imply millions of civilians; in fact, it's obvious he means armed militants.


I've probably been too emotional about this interview, sorry about that. You are right, he can be interpreted differently.

I have people that I know on both sides of Ukraine and everything that is going on creeps me out completely.


Good to see that you can take facts, pipy -- and thanks, xi, for doing a better job of discussing this than I did.

I -- and probably most people -- have been in the situation where we lost contact with reality, because we have read too much spin, even if the subjects are serious problems.

It is not easy. Good luck pipy. (And good luck to Russia and Ukraine -- they both deserve a break after the last century.)


> Good luck pipy. (And good luck to Russia and Ukraine [...])

Thank you. And you and your compatriots too.


Compatriots?

If you mean the Swedes, they should be OK (except for the grief they create themselves). Sweden is probably not on Putin's top 10 list... :-) :-(

Or do you mean Emacs/Perl/JavaScript guys? :-)


> EDIT: What's the downvote for? A lack of citations?

Maybe because of them? The article about Timoshenko is written by a serious antisemitic nutjob.


Could you please quote the antisemitic parts?


It mentions the Jewish lobby and how congress is eager to receive their funds in two places, but I find it hard to equate that with 'anti-semitic nutjob'.

The Jewish lobby is real enough that they warrant their own wikipedia page: http://en.wikipedia.org/wiki/Israel_lobby_in_the_United_Stat...


Not all Jews are Zionists and most American Zionists are not Jewish. Stop referring to a "Jewish lobby" when you mean an "Israel lobby" or "Zionist lobby." The term "Jewish lobby" is misleading, easily misunderstood, and mildly antisemitic.


Orange juice?

...The Florida industry’s aggressive marketing of oranges and orange juice is a key feature of orange juice history, as it slowly developed demand for the product. In 1907 oranges became the first perishable fruit “ever” to be advertised. As crops expanded quickly, marketing became crucial to avoid overproduction. The growth of farmer cooperatives came largely out of a need to market the products. The Florida Citrus Exchange was organized in 1910 to market fresh citrus and also to do research on processing citrus. It created advertising programs and “built national and international sales organizations.”

http://shkrobius.livejournal.com/312073.html


Armin's implementation of `cached_property` is not entirely correct. Well, it works, but the branch where `value` is not `missing` is never executed: the object's `__dict__` takes precedence over the descriptor as long as the descriptor does not define a `__set__` method.

Here is an implementation of `cached_property` I use:

    class cached_property(object):

        def __init__(self, fget):
            self.fget = fget
            self.__name__ = fget.__name__
            self.__module__ = fget.__module__
            self.__doc__ = fget.__doc__

        def __get__(self, obj, objtype=None):
            if obj is None:
                return self
            value = self.fget(obj)
            # For a non-data descriptor (`__set__` is not defined),
            # `__dict__` takes precedence.
            obj.__dict__[self.__name__] = value
            return value
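
A quick way to see the precedence rule in action is to count how often the wrapped function runs. This is a minimal sketch: the descriptor is the one above, trimmed to the two attributes it needs, and `Report`, `total` and `calls` are hypothetical names used only for the demonstration.

```python
class cached_property(object):
    # Non-data descriptor: defines `__get__` but no `__set__`.

    def __init__(self, fget):
        self.fget = fget
        self.__name__ = fget.__name__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = self.fget(obj)
        # Store under the same name so the instance `__dict__` shadows us.
        obj.__dict__[self.__name__] = value
        return value


class Report(object):  # hypothetical example class

    def __init__(self):
        self.calls = 0

    @cached_property
    def total(self):
        self.calls += 1  # count how many times the body actually runs
        return 42


report = Report()
first = report.total   # descriptor runs and stores 42 in report.__dict__
second = report.total  # plain instance attribute lookup, descriptor bypassed
```

After both accesses `report.calls` is 1: the second lookup never reaches `__get__` because the value now lives in `report.__dict__`.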


Don't do this. Your implementation does not work if someone invokes the descriptor's `__get__` by hand, which is not uncommon. My implementation takes the shortcut but still does the correct thing if you keep a reference to the property object.
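
A rough sketch of the difference, reconstructed from the description in this thread rather than copied from any real library: checking the instance `__dict__` inside `__get__` keeps the cache working even when the descriptor is called by hand. `_missing`, `Report`, `total` and `calls` are names assumed for the illustration.

```python
_missing = object()  # sentinel, so a cached None still counts as cached


class cached_property(object):

    def __init__(self, fget):
        self.fget = fget
        self.__name__ = fget.__name__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        # This branch matters only when `__get__` is invoked directly;
        # normal attribute access finds the value in `__dict__` first.
        value = obj.__dict__.get(self.__name__, _missing)
        if value is _missing:
            value = self.fget(obj)
            obj.__dict__[self.__name__] = value
        return value


class Report(object):  # hypothetical example class

    def __init__(self):
        self.calls = 0

    @cached_property
    def total(self):
        self.calls += 1  # count how many times the body actually runs
        return 42


report = Report()
prop = Report.total               # class access returns the descriptor itself
by_hand_1 = prop.__get__(report)  # computes the value and caches it
by_hand_2 = prop.__get__(report)  # served from report.__dict__
normal = report.total             # ordinary lookup, also the cached value
```

Here `report.calls` stays at 1. With a version that skips the `__dict__` check, each direct `__get__` call would run `total` again.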


The original paper, which is quite informative and easy to read: http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fj...

