Paraphrased from the repo's readme: Suppose you have a program that consumes user input. Users often give bad input, so the program needs to validate it before acting on it. One way to validate is to call a function (e.g., `check_input`) on the user input; if it doesn't raise an error, the input is safe for consumption by the rest of the program. The repo author considers this approach risky because the programmer can inadvertently omit or bypass `check_input` and the program still compiles and runs without complaint.
The repo presents an alternative validation approach, which is to parse the user input into a data type (or, not quite equivalently, into a class). The parsing process serves as validation. Consumer functions are written such that they only accept the parsed data type. Therefore it is now impossible for the programmer to inadvertently omit or bypass validation of user input.
The library is a set of convenience functions for actually writing these parsing / validation functions.
So in short: instead of representing user input (e.g. an email address) as a string, which you can forget to validate, the idea here is to create a dedicated data type for it, and use the validation step to create said data type.
The rest of your program then works with this data type instead of the string, and this way you will get a type error whenever you accidentally use unvalidated data.
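To make that concrete, here is a minimal TypeScript sketch of the pattern. The branded `Email` type and the `parseEmail` function are invented for illustration; this is not the repo's actual API.

```typescript
// The only way to obtain an Email is to go through parseEmail, so any
// consumer that accepts Email cannot receive unvalidated input.
type Email = string & { readonly __brand: "Email" };

function parseEmail(input: string): Email {
  // Deliberately simplistic check, just for illustration.
  if (!/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(input)) {
    throw new Error(`Not a valid email address: ${input}`);
  }
  return input as Email;
}

// Consumers only accept the parsed type, not raw strings.
function sendWelcomeMail(to: Email): void {
  console.log(`Sending welcome mail to ${to}`);
}

sendWelcomeMail(parseEmail("user@example.com")); // ok
// sendWelcomeMail("not validated");             // compile error
```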
A nice idea that goes in a similar direction is to expand on this and create more types for different levels of trust. E.g. you could have the data types ValidatedEmail, VerifiedEmail, and TrustedEmail and define precisely how one becomes the other. This way your type system will already tell you what is valid and what is not, and you can't accidentally mix them up.
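A hedged sketch of what those trust levels could look like, again with branded types; the names, checks, and transition rules are all invented for illustration:

```typescript
// Three distinct trust levels. The types are incompatible with each other,
// so a value can only move between them through the functions below.
type ValidatedEmail = string & { readonly __level: "Validated" };
type VerifiedEmail  = string & { readonly __level: "Verified" };
type TrustedEmail   = string & { readonly __level: "Trusted" };

function validate(input: string): ValidatedEmail {
  if (!input.includes("@")) throw new Error("malformed address");
  return input as ValidatedEmail;
}

// Only a ValidatedEmail can become a VerifiedEmail, e.g. after the user
// clicks a confirmation link. The double cast is the one trusted spot
// where the promotion is allowed to happen.
function markVerified(email: ValidatedEmail): VerifiedEmail {
  return email as unknown as VerifiedEmail;
}

// Only a VerifiedEmail can be promoted further, e.g. by an admin.
function markTrusted(email: VerifiedEmail): TrustedEmail {
  return email as unknown as TrustedEmail;
}

// markTrusted(validate("a@b.c")); // compile error: must be verified first
```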
You can also generalize this idea further by noticing that you can encode all kinds of life-cycle information in your type system. As you transform some data in a sequence of steps, you can use types to document and enforce that the steps are always executed in order.
In this example, the user input validation step is f(String) -> ValidatedEmail, and the process of verifying it is f(ValidatedEmail) -> VerifiedEmail. But the same principle can apply to e.g. an append() operation typed f(List[T], T) -> NonEmptyList[T], where you can write code accepting NonEmptyList to save yourself an emptiness check. Or take a multi-step algorithm that gets a list of users, filters them by some criterion, sorts the list, and sends those users e-mails. Type-wise, it's a flow of Users -> EligibleUsers -> SortedEligibleUsers -> ContactedEligibleUsers.
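The NonEmptyList case happens to be expressible in plain TypeScript with tuple types; a small sketch of my own, not from the thread's repo:

```typescript
// A list with at least one element: a tuple of a head plus a rest array.
type NonEmptyList<T> = [T, ...T[]];

function append<T>(list: T[], item: T): NonEmptyList<T> {
  const result: T[] = [...list, item];
  // The assertion is safe: the result always contains at least `item`.
  return result as NonEmptyList<T>;
}

// head needs no emptiness check: the type guarantees a first element.
function head<T>(list: NonEmptyList<T>): T {
  return list[0];
}

console.log(head(append([], 42))); // 42
// head([]);                        // compile error: [] is not NonEmptyList
```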
And then, why should types be singular anyway? You should be able to tag data with properties, and then filter on or transform a subset of those tags. This is an area of theory I'm not familiar with yet, but I imagine you should be able to do things like the sketch below.
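Here is a rough approximation of what I mean, abusing TypeScript intersection types as composable tags. All tag names and helpers are invented; a real system would track properties like "sorted" far more precisely than this:

```typescript
// Tags as properties that can be attached independently and stack up.
type Eligible = { readonly __eligible: true };
type Sorted   = { readonly __sorted: true };

interface User { name: string; email: string; signedUpDays: number }

// Filtering attaches the Eligible tag to the element type...
function filterEligible(users: User[]): (User & Eligible)[] {
  return users.filter(u => u.signedUpDays > 30) as (User & Eligible)[];
}

// ...and sorting adds Sorted on top, without losing Eligible.
function sortByName<T extends User>(users: T[]): (T & Sorted)[] {
  return [...users].sort((a, b) => a.name.localeCompare(b.name)) as (T & Sorted)[];
}

// Accepts only data carrying both tags, however they were acquired.
function contact(users: (User & Eligible & Sorted)[]): void {
  users.forEach(u => console.log(`mailing ${u.email}`));
}

contact(sortByName(filterEligible([])));  // ok: both tags present
// contact(filterEligible([]));           // compile error: missing Sorted
```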
I'm sure there's a programming language that does that, and there are probably lots of reasons it doesn't work in practice. I'd love to know about them, as I haven't encountered anything like it in the wild, except bits and pieces of compiler code that can sometimes propagate such information "in the background" for optimization and correctness checking.
The closest I've seen is Idris, which lets you pass around data along with proofs about that data. The proofs can be generated at either runtime or compile time (the usual distinction between the kinds of things you can do at runtime vs compile time is blurred in all sorts of delightful ways in Idris).
I really like this idea. It feels like weapons-grade TypeScript and could solve an entire class of logical errors. If someone has a good method they're already using to encode strict typing like this, I'd love to check it out.
This site on F# used to show how to do the same in C# (which would be more applicable to TypeScript?), but coding via types like this is usually best done in a language designed to make it easy. AFAIK neither TypeScript nor C++, C#, or Java were designed for that purpose, whereas Haskell and F# were.
That doesn't mean you can't do it in those other languages, only that it's tedious compared to languages that are designed for it.
To keep building on this, I think the word "parsing" is just the tip of the iceberg. Parsing is one way to port data across a type boundary, where the source and destination types are optimized for different use cases (e.g. serialization vs. type-safe representation). Since the semantic Venn diagrams of any two types might have areas of non-overlap, parse-don't-validate means establishing clear boundaries in your program where those translations happen, then defining the types on either side of the boundary to rule out the possibility of nonsense states elsewhere throughout the program. The idea of nonsense states is closely related and discussed more here[0] and here[1].
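A standard illustration of ruling out nonsense states (my own example, not taken from [0] or [1]): replace independent fields whose combinations can contradict each other with a discriminated union.

```typescript
// A bag of optional fields where half the combinations are meaningless...
interface LoadingStateLoose {
  loading: boolean;
  data?: string;
  error?: string; // nothing stops { loading: true, data: "...", error: "..." }
}

// ...versus a discriminated union that makes nonsense unrepresentable.
type LoadingState =
  | { kind: "loading" }
  | { kind: "loaded"; data: string }
  | { kind: "failed"; error: string };

function render(state: LoadingState): string {
  switch (state.kind) {
    case "loading": return "spinner";
    case "loaded":  return state.data;
    case "failed":  return `error: ${state.error}`;
  }
}
```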
Yes. It also reminds me of when I created a domain in Postgres (https://www.postgresql.org/docs/9.5/sql-createdomain.html) with a constraint written in PL/Perl to ensure that the data going in is always valid (e.g. check digits in IDs, a bit like in credit card numbers).