Dependent Haskell Is the Future (2018)

The problem is that dependent types (and the very idea of “correctness by construction”) suffer from fundamental, serious limitations — both in theory and in practice — that the author does not seem to be aware of. He also does not seem to be aware of all the other approaches to software verification and correctness (sound static analysis, model checking, concolic testing) that have so far shown greater promise; formal methods research focuses on them rather than on dependent types, the approach that scales least and suffers from the most limitations. In fact, the last time I spoke with a dependent-types researcher, he told me that they’re now trying to weaken the soundness of their type systems (a sort of gradual dependent types) to try to fight the problems those types suffer from. I’m not saying dependent types don’t have some uses, or that they won’t change enough to eventually play a central role in software correctness, but thinking that they’re the future of software correctness (in their current form) shows a complete lack of familiarity with this field.
> The problem is that dependent types (and the very idea of “correctness by construction”) suffer from fundamental, serious limitations

It would be great if you mentioned them and briefly described why they are limiting.

> model checking

Is extremely complex. Proving complex programs using constraint solving would be too expensive, if possible at all. I would argue that proving things manually and using SMT for the trivial cases, as F* does, would be much more productive than doing the whole proof with any sort of constraint solver (if that would ever work for a sufficiently big program).

The main problem with constraint solvers is that they don’t scale well. When you have proved lemmas for a small block of code, you simply reuse them for higher-order proofs. Constraints accumulate (though not linearly) as the program grows, and so does its state.

> It would be great if you mentioned them and briefly described why they are limiting.

I’ve done so here: https://news.ycombinator.com/item?id=20714845

> Is extremely complex. Proving complex programs using constraint solving would be too expensive, if possible at all.

The same is true for deductive proofs, of which dependent types are a particular instance, only the problem is far worse. It is true that there are cases where deductive proofs are more feasible than model checking, but the converse is true in a much wider portion of cases. That’s why deductive proofs are rarely the first choice, and they’re used selectively, usually only after other methods have failed. Overall, deductive proof, although it certainly has its place in certain circumstances, is the least scalable verification method we have. This means that it’s the last method you want to use by default, let alone bake it into your type system.

> Constraints accumulate (though not linearly) as the program grows, and so does its state.

This is true regardless of the verification method used. See my blog post here: https://pron.github.io/posts/correctness-and-complexity

> When you have proved lemmas for a small block of code, you simply reuse them for higher-order proofs.

No, you can’t “simply” reuse results. Here’s an example (given in the post in Java, my preferred language, but as this is a post about Haskell, let me show it in Haskell):

foo :: (Integer -> Bool) -> Integer -> Integer
foo p x
  | x <= 2 || odd x = 0
  | otherwise       = head [i | i <- [2 .. x-2], p i, p (x-i)]

bar :: Integer -> Bool
bar x = null [1 | i <- [x-1, x-2 .. 2], s <- [x, x-i .. 0], s == 0]

These two subroutines are very simple, clearly terminating, and you can easily prove almost any property of interest about each of them in isolation. But now I claim that their composition `foo bar` never crashes (on `head`); can you prove me right or wrong? Feel free to "simply" reuse any proof about any of them.
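
For concreteness, here is roughly how the composition behaves on small inputs (a sketch that assumes the definitions above; the exact value returned depends on the order in which `foo` enumerates candidates):

main :: IO ()
main = do
  print (foo bar 9)   -- 0: odd input, handled by the first guard
  print (foo bar 4)   -- some i with bar i and bar (4 - i), e.g. 2 (4 = 2 + 2)
  print (foo bar 28)  -- a prime i with 28 - i also prime, e.g. 5 (28 = 5 + 23)
  -- `foo bar x` can only hit `head []` for an even x > 2 that is not the sum
  -- of two primes, i.e. only if Goldbach's conjecture is false.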

The idea that properties compose and that you can cheaply reuse knowledge about components to prove stuff about their composition is just mathematically false. There are a couple of results I mention in my blog posts that show that. We know that if you have a composition of components X_1 ∘ … ∘ X_n then the cost of verifying some property of their composition is not only not polynomial in n, it’s not even exponential in n (it’s not any computable function of n, I think). I.e. it is a mathematical result that you cannot, generally, hide the complexity of components to make verification of their composition scale with their number.

> Dependent types rely on deductive proofs for verification, and deductive proofs (the “proof theory” part of a logic) are the least scalable verification method as they are least amenable to automation.

What? You can generate proofs for the trivial cases using various algorithms like constraint solving.

Things like the sledgehammer tactic, the omega tactic, and the whole F* language simply prove you wrong (by construction 😉)

You only need to prove manually the complex cases, where your constraint solver does not manage to generate the proof.

> But now I claim that their composition `foo bar` never crashes (on `head`)

If you proved that foo does not crash on head for any predicate P and any integer X, and that bar terminates, then foo bar terminates. What’s the problem?

> The idea that properties compose and that you can cheaply reuse knowledge about components to prove stuff about their composition is just mathematically false. There are a couple of results I mention in my blog posts that show that.

Could you refer to the proof of that, as well as give the definition of “cheaply”, and especially the evidence that a composition of lemmas and theorems would be more expensive than constraint-solver-based methods?

I didn’t state that proving is easy or cheap, just that manual proving, with constraint solvers used for the trivial cases, is more scalable and makes more sense for sufficiently big programs.

> What? You can generate proofs for the trivial cases using various algorithms like constraint solving. Things like the sledgehammer tactic, the omega tactic, and the whole F* language simply prove you wrong

I don’t understand this simple proof. How does allowing some automation prove that it’s not the method least amenable to automation?

> If you proved that foo does not crash on head for any predicate P and any integer X, and that bar terminates, then foo bar terminates. What’s the problem?

Because that would make the verification problem even harder (possibly to the point of undecidability). And this approach simply doesn’t make sense. Clearly, the important correctness properties of your program arise from the composition of all components of your program (or you wouldn’t need them). If the same correctness property arises from each component in isolation, then you wouldn’t need all others. Most correctness properties are just not compositional (inductive), and the belief that a non-compositional property can be proven by combining results about the components at a cost that scales with their number (polynomially, exponentially — take your pick) is mathematically false.

> Could you refer to the proof of that, as well as give the definition of “cheaply”, and especially the evidence that a composition of lemmas and theorems would be more expensive than constraint-solver-based methods?

I never mentioned “constraint-solver-based methods” (and I’m not even sure what you’re referring to [1]). Once you need to prove something, the method is irrelevant, and the theoretical limitations kick in. Showing that one method is easier or harder for instances encountered in practice requires empirical research, so I can’t say whether they’re harder or easier, just that so far deductive proofs are the least scalable in practice. Finally, I can say that we cannot in general decompose correctness to make it easier. Some relevant results are in the papers mentioned in my post in the sections “Correctness does not decompose” and “Language abstractions cannot generally be exploited”.

[1]: Perhaps you’re referring to SMT solvers and the like, but they are rarely used in isolation. They are used as components in deductive proofs, model checkers and concolic tests. As usual, they are more effective when used in the latter two than in the first.

> Because that would make the verification problem even harder (possibly to the point of undecidability). And this approach simply doesn’t make sense.

What? If foo terminates and bar terminates, then foo bar clearly terminates. How does this make no sense?

> Most correctness properties are just not compositional (inductive), and the belief that a non-compositional property can be proven by combining results about the components at a cost that scales with their number (polynomially, exponentially — take your pick) is mathematically false.

Ok, let’s check your blog and the paper you mention

> A Parametric Analysis of the State-Explosion Problem in Model Checking

What? You state that proving doesn’t scale, referring to constraint-based solutions? Sure, they don’t scale.

What does this foo/bar example have to do with this, though? foo does not terminate (due to the exception), and neither does foo bar. How does this not compose?

> I don’t understand this simple proof. How does allowing some automation prove that it’s not the method least amenable to automation?

Because you can automate anything you can prove with an SMT solver or a model checker?

> How does this make no sense?

Because most correctness properties aren’t compositional.

> What? You state that proving doesn’t scale, referring to constraint-based solutions? Sure, they don’t scale.

No, you’ve misunderstood the paper, and missed the part of the post where I provide the necessary context. Like a lot of work in computational complexity theory, they’re referring to the model checking problem, i.e. the problem of determining whether a program satisfies some property, not to any particular verification algorithm. Their results apply to all methods (their proofs don’t rely on any particular verification method but show hardness results).

Also, it’s funny that you say “they don’t scale.” You’re right: none of the formal methods we have scales nearly as well as we’d like, to the point where we can say “they don’t scale” [1]. Of them, the one that scales the least is deductive proof. That’s the main reason it’s the formal method that is used the least.

> constraint-based solutions

I don’t know what you mean by that (SMT?), and I don’t recall mentioning those.

> Because you can automate anything you can prove with an SMT solver or a model checker?

Yes, model checkers are completely automatic.

SMT is a class of algorithms and techniques used in various verification methods — deductive proofs, model checking, static analysis, concolic testing — not a verification method in itself, so I’m not sure what you mean exactly. Sometimes they’re used to prove some logical proposition directly, but they’re not used as a complete software verification tool, because they’re rather limited on their own.

[1]: When applied to the code directly. We have ways of scaling formal methods by verifying high level specifications.

Aside: Sledgehammering doesn’t currently work in Curry-Howard-based provers. (This may change in the future; it’s an active research topic.)

People are missing the point of your post, which is a bit of a shame. The point is that `foo p n` returns an integer when `n` is the sum of two positive integers that both satisfy `p`. `bar n` holds when `n` is prime. Thus `foo bar` always returns an integer iff Goldbach’s conjecture is true. Your point is that no one knows how to prove Goldbach’s conjecture, therefore no formal method can prove that `foo bar` never crashes.

If the dependent typing proponents in this thread responded to your point rather than picking holes in your usage of the words “composition” and “terminate” then they might make some interesting counterpoints. For example, they might say

1. “We don’t expect dependent types to be able to prove everything (of course), we only expect them to be approximately as good as any other proof method for almost any property of interest”, or

2. “What do you mean “you can easily prove almost any property of interest about each of them in isolation”? Does “property of interest” have a specific definition? How about `bar (factorial (10101010))`? Is that `True`? Is it not a “property of interest”? Is it not “in isolation”? How about “there exists n such that `bar m == False` for all m > n”? How would you “easily” prove that?”

Personally I’m not a dependent typing proponent so 1 is just my wild speculation. I am interested in your response to 2 though.

1. I mentioned why dependent types are not as good, in my opinion, as other verification methods. They are not flexible in the choice of verification method, and the method they’re tied to is the worst default a programmer could want.

2. It’s not a very precise statement, I mean that you can fully describe and understand what it is they do. After all, when you verify a program, you want to check certain properties that are of interest to you (because they’re requirements). So you can ask and answer questions like what it is that they do, are there values of p for which foo always returns false? Values for which it always returns true? Values for which it sometimes returns true and sometimes false? And yes, your question, too, “there exists n such that `bar m == False` for all m > n?” The answer is no, and it was proven by Euclid: https://en.wikipedia.org/wiki/Euclid%27s_theorem

In other words, if two different programmers were writing foo and bar (as is often the case in software), each would have thought — without knowing of the other component — that pretty much any property they want to state about their component could be rather easily proven.

This is a very confusing thread for me.

I think Idris — while not exactly mainstream — has shown that it’s fine to treat type checking and types as a continuum. There’s nothing about dependent types that forces anybody to prove Goldbach’s conjecture or anything of the sort.

You, the programmer, get to choose just how rigorously you want your values to be specified, and if you really want you can ask for proof of the Riemann hypothesis, or you can ask for a… String.

For me, that’s the great thing about dependent types — you, the programmer, get to choose how rigorous you want to be.

> has shown that it’s fine to treat type checking and types as a continuum

It has not shown that at all. That you can encode arbitrary propositions with dependent types has been known for many decades. That doing so is a feasible, affordable way to verify software correctness has yet to be shown by anyone, despite decades of attempts.

> You, the programmer, get to choose just how rigorously you want your values to be specified

But that’s the problem. Specifying your software and verifying it are two different things. The last thing you want is to limit the rigor of your specification to the rigor of your verification, especially if your verification is done with deductive proofs, as that’s the most costly, least effective verification method. You want developers to choose the rigor of their specification and separately choose the rigor of verification. The main problem with dependent types is that they don’t let you do that, and so force you to specify only stuff you can affordably prove, which is usually the simple stuff.

> It has not shown that at all. That you can encode arbitrary propositions with dependent types has been known for many decades. That doing so is a feasible, affordable way to verify software correctness has yet to be shown by anyone, despite decades of attempts.

Well, feasible is definitely a way I would characterize dependently typed deductive proofs. CompCert was a large achievement and produced a formally verified compiler back-end. Therefore it is definitely feasible to design software that is correct by construction using these methods.

I think it’s pretty reasonable to say that it’s not yet affordable to verify all compiler back-ends this way. However, I don’t think the future where it is affordable is far off. Folks in the Lean community are making headway and learning from CompCert about how to model the semantics of C-like programs using new advances in type theory that will make it easier to verify such systems.

Therefore I wouldn’t discount Dependently Typed Haskell so completely. Verification methods work in a continuum and I use a few depending on the needs of the project. I often use TLA+ and sometimes I use Lean. One day I’d like to see synthesis get better for large-scale projects. That would be super cool.

> CompCert was a large achievement and produced a formally verified compiler back-end. Therefore it is definitely feasible to design software that is correct by construction using these methods.

CompCert is a very small program (1/5 of jQuery) and it’s taken world-experts years of work. Its verification also wasn’t really done using dependent types, but in Coq, a traditional proof assistant whose theory happens to be based on dependent types, but is much more similar in practice to, say, Isabelle/HOL than to Idris/dependent Haskell.

> However, I don’t think the future where it is affordable is far off. Folks in the Lean community are making headway and learning from CompCert about how to model the semantics of C-like programs using new advances in type theory that will make it easier to verify such systems.

It’s been said to be not far off for 40 years now. Yet what can be verified with deductive proofs has remained, for decades, 2-3 orders of magnitude behind “ordinary” software, and the gap isn’t closing.

> Therefore I wouldn’t discount Dependently Typed Haskell so completely.

I’m not. I’m just saying that it’s currently less promising than the main thrust of formal methods research (which focuses on automated methods much more than on deductive proofs) and that it is certainly nowhere near a point where it can be said to be “the future.” I’m also pointing out that there are currently approaches that subsume dependent types — have nearly all of their advantages and virtually none of their disadvantages (contract systems).

Regarding 2 I think you’re missing my point. Firstly, the first “property of interest” was rendered incorrectly (and also stated incorrectly). It was supposed to be

> How about `bar (factorial (10^10^10^10 + 1))`? Is that `True`?

Can we easily prove or disprove that property of interest?

Secondly, I know the second result is true. I’m asking how you would easily prove it. How do you easily show that “forall m. bar m == True” is equivalent to Euclid’s theorem?

> Can we easily prove or disprove that property of interest?

Maybe we could answer that affordably, albeit with some probability, using fast probabilistic primality tests (I don’t know how high they can practically go). But can we easily prove what the result of sorting a particular list with 10^10^10 elements is? Maybe not, but I don’t think it’s as much a property of interest as “does this program ever crash.” In other words, for every algorithm with some non-trivial complexity, the question of what the result would be for some very large input is hard to answer, but I don’t think that is a common property of interest.
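
As a rough illustration of the kind of probabilistic test meant here (a sketch, not from the original discussion: a Fermat test with a few fixed small bases; a real implementation would use Miller-Rabin with random bases):

-- Modular exponentiation by repeated squaring.
powMod :: Integer -> Integer -> Integer -> Integer
powMod _ 0 _ = 1
powMod b e m
  | even e    = (h * h) `mod` m
  | otherwise = (b `mod` m * h * h) `mod` m
  where h = powMod b (e `div` 2) m

-- Fermat test: a prime n satisfies a^(n-1) ≡ 1 (mod n) for every base 1 <= a < n.
-- Composites usually fail this for some base (Carmichael numbers are the exception).
probablyPrime :: Integer -> Bool
probablyPrime n
  | n < 2     = False
  | n < 4     = True
  | even n    = False
  | otherwise = all (\a -> powMod a (n - 1) n == 1) [a | a <- [2, 3, 5, 7, 11], a < n]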

But if I get your meaning, you are correct that sometimes, if our focus is on an algorithm rather than on a working system, we forget that all systems can, at best, be shown correct only with some probability (because the hardware only does what it’s told with probability), and at some point, there are algorithmic bugs that can occur in practice with a probability that’s lower than that of a hardware glitch.

> How do you easily show that “forall m. bar m == True” is equivalent to Euclid’s theorem?

By proving that bar is, indeed, a primality test, which is not hard. I.e. you prove that bar n = TRUE iff ∀ m ∈ ℕ . 1 < m < n ⇒ n % m ≠ 0. This is usually proven by induction.
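
(A sketch of how that specification can at least be stated executably, in plain Haskell with QuickCheck rather than as a proof; `isPrimeSpec` and the property name are mine, and `bar` is the definition from the earlier example:)

import Test.QuickCheck

-- Reference specification: no m with 1 < m < n divides n.
isPrimeSpec :: Integer -> Bool
isPrimeSpec n = null [m | m <- [2 .. n - 1], n `mod` m == 0]

-- The lemma one would prove by induction, here merely stated as a testable property.
prop_barMatchesSpec :: Integer -> Property
prop_barMatchesSpec n = n >= 0 ==> bar n == isPrimeSpec n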

How is `foo` “clearly terminating”? With `p = const False` it’s undefined for any even x >= 3.
Hence it diverges, i.e. does not terminate.

You’ve even stated this yourself:

> But now I claim that their composition `foo bar` never crashes (on `head`)

It does, because foo does.

> Hence it diverges, i.e. does not terminate.

By focusing on FP nomenclature (whether an exception is regarded as “divergence” or not depends on your language — note that I said “terminates”; the point is that the trace for each execution is finite) you’re missing the issue, which exists regardless of language.

> It does, because foo does.

`foo bar` never crashes (unless I made some silly bug).

Now I’m genuinely confused: do you consider a program that diverge (as in reaches some exceptional state) not as a “crash” but as a form of termination? If so then I don’t understand what your original point was; if you could prove that both functions always return some value then you can also say that composing them will always return some value.
@pron: I’m not an expert in rewriting systems and type theory, but in the literature I have studied a partial function is said to diverge when it’s applied to some particular value that won’t return / produce a normal form (i.e. non-terminating), either because it will recur ad infinitum or because the applied value doesn’t map to the function’s codomain. One could argue that in a well typed program the latter shouldn’t happen, but then one would require a type system with dependent types to type-check programs with functions like head : [a] -> a 🙂
An exception is just a mechanism that some PLs provide to indicate that the program reached such exceptional state where, for whatever reason, some assumption failed and there is no meaningful reduction to compute.
The very notions of a “partial function” and a “normal form” are language-dependent. And yes, certainly in those languages, throwing an exception is called divergence. But partial functions are not an essential concept of computation. A computation can terminate or not, and the name divergence in that context applies to that. In any event, I was careful to speak of termination, the computational concept, rather than divergence, which can have a formalism-specific meaning.
> Partial function is a standard mathematical term, not a programming language (PL) one

Like many things in mathematics, the term has been borrowed and changed. Look, for example at “function”. Do you think “function” in programming (even in Haskell) means the exact same thing as in mathematics?

In general, partial functions mean that for some values, a function is undefined. But in a PL nothing is undefined. In some cases there would be an exception; in others the subroutine would loop forever. But a language like Haskell, because it wants its “functions” to resemble mathematical functions, declares that those perfectly normal computational behaviors correspond to “undefined”. In imperative languages, where subroutines don’t try to correspond to (partial) functions, throwing an exception does not correspond to undefined.
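
A minimal Haskell illustration of the distinction (both are “undefined”/bottom in Haskell’s denotational reading, but computationally only one of them fails to terminate):

crashes :: Integer
crashes = head []   -- terminates, in an exceptional (non-value) state

loops :: Integer
loops = loops       -- genuinely never terminates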

The point is @pera said in https://news.ycombinator.com/item?id=20717160 “Now I’m genuinely confused”. That seems to be down to confusion in terminology.

@pron then makes an apparently incorrect claim which I dispute to try to help clarify. That’s clarification, not nitpicking (I intended it that way, anyway). @pron’s point may be valid, but if we don’t have common meanings for words, this will go nowhere.

The confusion is about a word that I didn’t use. I said that both subroutines “terminate”. That is a word with a precise meaning in computation — every possible trace is finite. Someone started arguing about whether or not they “diverge”, and I said that that’s language-dependent (https://news.ycombinator.com/item?id=20723623). Both foo and bar always terminate, and that is both precise and a common meaning.
> The confusion is about a word that I didn’t use.

which word?

> I said that both subroutines “terminate”. That is a word with a precise meaning in computation

I’d say so. But here’s a clip from previous:

> zopa 23 hours ago :- How is `foo` “clearly terminating”? With `p = const False` it’s undefined for any even x >= 3.

-and you say-

> pron 23 hours ago :- It would terminate with an exception.

So you’re claiming that ‘terminating with an exception’ is termination “with a precise meaning in computation”. But that’s a bloody strange definition I have never come across.

In fact it’s so weird I had trouble finding a definition because it seems so fundamentally bizarre that nobody I’ve ever met has ever considered throwing an exception (or similar) due to precondition violation to be termination. But here goes:

1943: In a paper, Stephen Kleene states that “In setting up a complete algorithmic theory, what we do is describe a procedure … which procedure necessarily terminates and in such manner that from the outcome we can read a definite answer, ‘Yes’ or ‘No,’ to the question, ‘Is the predicate value true?’.”

That from https://en.wikipedia.org/wiki/Halting_problem

So yes termination is defined and no it’s not what you claim it is. No wonder people are getting confused.

as for… (edit: deleted this as poster accidentally linked to wrong comment)

> But that’s a bloody strange definition I have never come across.

I don’t know what else I can say. Exceptions are not a computational concept, they’re a language concept. The computation terminates in some well-defined state. That the language declares that that well-defined state is not considered a value is a language-specific thing.

> So yes termination is defined and no it’s not what you claim it is.

It is exactly what I defined, as your quote shows. Termination and decision are not the same thing. Imagine that the decider could terminate with, Yes, No, or Error. It would still not be a decision procedure, but it is very much termination. And your very quote says exactly that when it says “terminates and in such manner that…”? If termination already means that it is in such manner, why is the qualification necessary?

Your definition of termination (to mean anything other than reaching a terminal state after a finite number of steps) is based on the fact that in a particular formalism, the lambda calculus, termination yields a normal form. Now, the lambda calculus doesn’t have exceptions, and so every lambda term either reduces to a normal form or the reduction never terminates, but a language like Haskell adds a third possibility. Because Haskell draws inspiration from the lambda calculus, it’s easier, in the discussion of Haskell, to map exceptions — which don’t yield a normal form but do terminate at a definite computational state — to the same notion of divergence that, in the lambda calculus, is synonymous with non-termination, but in Haskell it isn’t. So I wouldn’t be surprised if some people took it further, and used termination to be synonymous with “produces a normal form” in Haskell. But, again, this is not how the notion of termination is commonly defined in computation (many notions of computation don’t even have the concept of a normal form or of a value). In Java, for example, where the denotation of a subroutine isn’t similar to a partial function but is, rather, a predicate transformer, an exception and a return value are equally valid results.

But yeah, I agree that selling Haskell as maths does tend to confuse, and so people can mistakenly think that Haskell terminology is canonical mathematical or computer-science terminology. And yeah, it is sometimes frustrating to speak with those who are only familiar with the mathematics of functional programming and not with the mathematics of imperative programming, or with the theory of computation in general, and so don’t even know which terms are overloaded and which are canonical.

All of this, however, is so irrelevant to the point of my `foo bar` example (namely, that knowing things about components does not mean you can feasibly know things about their composition) that it’s ridiculous.

> Yep, and that reply was posted 10 minutes before you referred to it.

OMG, I linked to that one because I had just written it and it was right in front of me. Here is the same answer in an older comment: https://news.ycombinator.com/item?id=20715970

So not only are you determined to focus on something that is entirely beside the point because you’ve completely missed the point, you are also wrong in the side-track you’ve chosen.

I believe these terms (including “exceptional state”, but not “exception”) are commonly used in formal languages, combinatorics, rewriting systems, mathematical logic and so on. They are also used in Haskell because it was partially based on System F, which comes from the simply typed lambda calculus. I understand your frustration and please believe me that it was not my intention to turn this thread into a pedantic show.

But now coming back to your original point and example: I don’t think anyone from the Haskell and ML-family languages camp is claiming one can automatically deduce every possible decidable property from any function with their type system, or at least not yet 🙂

Yes, they are commonly used in formal languages, and their precise meaning is language dependent, which is why I didn’t use partial or divergence but termination, which has a unique computational meaning.

I don’t think anyone claims that any property can be proven. I do think that some think that if a program can be decomposed into simple parts, whose “interesting” properties are provable in isolation, then you can feasibly prove interesting properties of their composition, which is not true. A proof of that was only given in the past 10-15 years, and is referenced in my post.

AFAICS you are fundamentally confused which is causing this mess. This is a good example:

> Imagine that the decider could terminate with, Yes, No, or Error.

That’s just not how it works. Mathematically a function, if it terminated, terminates with an answer. There is no error – the precondition of the function states what is acceptable and the function must come back with an answer (or possibly not terminate, but that’s another thing). That’s why the quote from Stephen Kleene[0] didn’t mention errors.

If I want to define a function computing real-number square root, I set as a precondition that negative numbers must never be passed to it. That’s it. That’s the precondition – “this and only this is acceptable to this bit of code” is what it means.

In any language an exception should happen if you violate that, i.e. pass in a negative number. That may show up as an exception. But you don’t want exceptions! Which is why you try to formally guarantee the precondition is met! So you are guaranteed an answer (leaving aside non-termination).

All this formal stuff (at least in my world) is about verifying that the code conforms to the precondition such that an error will never occur (edit: wrong, it’s more than verifying the precondition, but go with it for now). Also verifying that if the precondition is satisfied, termination will happen.

Put another way, taking the head of an empty list is meaningless, so you set the precondition (edit: of the function hd()) to be that always

is_not_empty(lst)

Yes, you may get an exception unless this is verified. But if you can prove that at the point of call lst is never empty, then hd() is guaranteed never to except.

If any program was entirely verified thus then no exceptions would be needed because the program would be correct (excluding resource exhaustion on our regrettably finite computers).

> Termination and decision are not the same thing.

Mathematically if it terminates, it decides. If it decided then it has terminated. Yes they are the same thing. That’s what the quote says. Termination via exception is not in that world: “in such manner that from the outcome we can read a definite answer”. Blowing up without deciding would mean termination (an exception in your terms) without providing yes/no. Useless.

I’m pretty weak in this area but I do know the basics, and it does not come across that you do.

> OMG, I linked to that one…

No probs, I edited out that.

[0] Heh, same Kleene after whom the Kleene star in regexes was named. Who knew.

> Mathematically a function, if it terminated, terminates with an answer.

There is no such thing as a mathematical function “terminating.” In classical mathematics, a function is a one-valued relation, a set of pairs, if you like. There is no algorithm and no computation involved. Termination is a computational term. Divergence is formalism-specific.

> diverge (as in reaches some exceptional state)

The term “diverge,” when applied to an exception, is not a computational term but a term from a specific formalism (e.g.: Haskell). A computation is a sequence of states; if it ever reaches a terminal state (i.e., a state from which it cannot exit), it is said to terminate. You can apply the terms converge/diverge to termination/non-termination, but once you speak of exceptions, you’re using a language-specific terminology.

> if you could prove that both functions always return some value then you can also say that composing them will always return some value

Yes, but that would be too hard. `foo` doesn’t always return a value, but `foo bar` does. What can you prove about `foo` and `bar` separately that would make proving that easier, not harder?

>But now I claim that their composition `foo bar` never crashes (on `head`); can you prove me right or wrong?

Mu. But I can prove that your argument is not well-formed.

1. That is not composition, it is application. (This is not simply terminology; your using application is what allows you to reference bar inside foo.)

2. The actual composition of those functions (foo . bar) is not valid Haskell and would not typecheck.
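
For reference, the types involved (a sketch, using the `foo` and `bar` from the example above):

appliedOk :: Integer -> Integer
appliedOk = foo bar        -- application: bar is passed as foo's predicate argument

-- composedBad = foo . bar -- composition is rejected: (bar x) :: Bool,
--                         -- but foo expects an (Integer -> Bool)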

As far as I understand — please correct me if I’m wrong — F* uses dependent types as a foundation of maths (i.e. a logic) and embeds program logic via a HTT (= Hoare Type Theory). This is a very nice division of labour between programming and verification.
> The problem is that dependent types (and the very idea of “correctness by construction”) suffer from fundamental, serious limitations — both in theory and in practice — that the author does not seem to be aware of

I don’t want to argue with you on the theory side since I am not an expert, but I can assure you that Idris is a nice language to work with. I prefer Haskell for non-dependently-typed stuff, but otherwise I think it clearly shows that a practical dependently typed programming language for doing actual real-world work is possible and ergonomic.

EDIT: I also don’t think that verification is the main advantage of dependent types. Haskell’s type system is already unsound (f :: a -> b; f = unsafeCoerce, the universal proving function). I like it because it makes it easy to express more complicated constraints in an ergonomic and approachable way and therefore makes the compiler accept fewer programs. I see it as minimizing the chance of stupid bugs and giving the programmer more information about what’s going on by providing richer types.
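
(The “universal proving function” mentioned above, spelled out; merely defining it compiles, and under propositions-as-types it “proves” any b from any a, which is exactly the unsoundness being described:)

import Unsafe.Coerce (unsafeCoerce)

-- Inhabits every type a -> b, so as a proposition it "proves" anything.
anything :: a -> b
anything = unsafeCoerce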

Could you elaborate on the “fundamental, serious limitations” by any chance?
1. No type inference (unlike ML-style types).

2. Oracle problem: in practice you never have full specifications (let me be provocative: Facebook has no specification in principle, only its ever-changing code), and, especially in early phases, you rarely have specifications.

3. You pay the price for dependent types always (e.g. having to prove termination), even if you don’t use interesting specifications.

4. Like Java’s exception specifications, dependent types are ‘non-local’, meaning that a change in one part of the code may ripple through the entire code base; this stands in the way of quick code evolution, which is important — at least in early parts of large-scale software projects.

5. Despite dependent types being half a century old by now, and despite intensive effort, dependent types have not really been successfully generalised to other forms of computing (OO, stateful, concurrency, parallel, distributed, timed) — unlike competing approaches.

Summary: dependent types are wonderful as a foundation of mathematics, but have failed to live up to their promise as general-purpose languages.

1. Bidirectional typing can be easily generalized to dependent types. You can’t always infer the type, but an 80% solution is definitely good enough for inference.

2. Your types don’t have to be full specifications, and you can refine them later, no?

3. Idris, at least, doesn’t force you to make all your functions total, and functions are actually partial by default.

4. Again, in Idris, implementations are hidden by default across modules. Maybe there’s a nice middle-ground between hiding everything and exposing everything, but I don’t know what that would look like.

5. What are the competing approaches and how are they better at dealing with, for example, distributed computing?

I’m not very familiar with dependent types, but none of these seem like fundamental restrictions.

1. Bidirectional isn’t magic. If I program MergeSort or Quicksort and give it the signature

forall T. Array[T] -> (T -> T -> Bool) -> Array[T]

bidirectional inference will not infer what’s missing towards a full spec (a sketch of what is missing follows after this list).

2. Yes, you can refine later, but if you want to refine later (and in practice you almost always have the full spec — if it exists at all — only at the end), then why pay the cost of dependent types from the start? Once you have the full spec, verify with Hoare logic.

3, 4: I’m not sufficiently familiar with Idris to comment.

5. Alternative approaches, in addition to what “pron” mentioned, include program logics, e.g. TLA.
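
To make point 1 concrete, here is the promised sketch of what the ML-style signature leaves unstated, written as an ordinary Haskell predicate rather than a dependent type (the names are mine):

-- The polymorphic sort signature admits many wrong implementations; the full
-- specification has to be stated separately, and is neither inferred nor proven here.
sortSpec :: Ord a => ([a] -> [a]) -> [a] -> Bool
sortSpec mySort xs = ordered ys && isPermutationOf xs ys
  where
    ys = mySort xs
    ordered zs = and (zipWith (<=) zs (drop 1 zs))
    isPermutationOf as bs =
      length as == length bs && all (\a -> count a as == count a bs) as
    count a = length . filter (== a)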

1. Yes, but the situation regarding type inference is only a bit worse than the situation in non-dependently-typed languages. Of course the compiler can’t magically figure out what theorems you want to prove.

This is my point. Adding dependent types comes with costs that you will always have to pay, even if you don’t use the power of dependent types. If you verify with e.g. program logic in the few real-world cases where formal verification is financially viable, then you only pay the price when you do the verification.

Every single one of your claims is addressed by the programming language Idris [1], which has dependent types. This is not to say that Idris is the only dependently typed language which solves these issues, only that it exists as a counterexample to your claims.

1. Idris does not have type inference at the top level, but it does infer types within the body of functions. It can even disambiguate identically-named functions with different types, something Haskell cannot do.

Moreover, Idris (and other dependently typed languages) can perform implementation inference, something which is much more powerful and useful to the programmer than type inference. It lets you work interactively with the Idris compiler whereby you specify a type and Idris creates a skeleton of the function for you, filled with holes that you replace with expressions. You can ask Idris what the type of a hole is and it will helpfully tell you the types of all other identifiers in scope. You can even ask Idris to attempt to fill in the hole for you, something it can usually do when there is only one possible expression with that type (occurs extremely often in practice).

Watch Edwin Brady’s demo of Idris 2 [2] to see this form of interactive development in action. Note that although Idris 2 isn’t finished yet, a great deal of these examples (such as zip) work in Idris 1.3.2 (the current release).

2. Idris lets you specify your program gradually. When problems arise, you can refine your types and Idris will guide you through all of the refactoring you need to do to enforce your new invariants.

3. Idris does not force all of your functions to be total, only the ones you want to use in types. Idris checks the totality of all your functions and can produce compilation errors based on totality annotations attached to your functions. In practice, it is not very hard to make every function in your program total, apart from the main loop.

4. Idris has modules and gives you powerful tools (such as views) that allow you to define a stable API at the module boundary. This means you can mess around with any of the types of private functions to your heart’s content, without any visible change to the outside world. This is entirely the opposite of ‘non local’ Java exceptions.

5. Idris has extremely powerful tools for working with state and writing concurrent programs which preserve invariants that cannot otherwise be checked easily with other tools. As for OO, well, that’s a concept so nebulous as to be almost incoherent at this point. In practice, Idris’s interfaces can dispatch on as many parameters as you want, unlike most OO languages which only do single dispatch.

[1] https://www.idris-lang.org

[2] https://www.youtube.com/watch?v=DRq2NgeFcO0

I’m not sufficiently familiar with Idris to comment — I will learn Idris in the future, hopefully in December.

Be that as it may, I’m somewhat sceptical of implementation inference: after all, you need to give a spec, exactly the thing you typically do not have in large-scale software engineering. The spec emerges at the end, and often the program is the spec itself. Moreover, as far as I gather, implementation inference works by program synthesis, and unless Brady has had a massive breakthrough here, that’s an exponential search process, so it won’t scale much beyond inferring small programs like the zip function. (Still, I think implementation inference is a useful tool, but it’s not magic.)

Definitely give Idris a go. I’m working through the book Type-Driven Development with Idris [1] by Edwin Brady, the author of Idris, and I’m having a ton of fun with it. Idris is a lot more practical than I thought. It fixes some annoyances I had with Haskell, such as record naming (Idris lets you reuse record names in different types, it resolves ambiguity by type checking).

It’s also eagerly evaluated (unlike Haskell), with a very simple Lazy annotation you can tack on to a type when you want laziness. I think most people will find this much easier to reason about than Haskell’s laziness everywhere.

[1] https://www.manning.com/books/type-driven-development-with-i…

> […] Idris is a lot more practical than I thought.

This may sound like a stupid question (but is probably necessary in a thread about the convergence of proof systems and static typed functional programming languages):

“How can i implement something like unix’ wc?”

So just a program that counts lines (forget words and bytes for now) of a file. This means we need to support (at least in some kind of standard library) basic IO stuff like files, processes and environments. If the language talks about how “simple” (in terms of code/term length) it is to define a binary tree … the integers >= 0 (the famous Nat = Zero | Succ Nat) example, or beware the Fibonacci sequence – most people won’t be impressed.

Because that’s what is associated with “programming language” (as compared to “proof assistants”).

Sorry, that sounds a bit like a rant, but after 5 minutes of searching I couldn’t find an example of how to read a file in Idris. The closest I found was: http://docs.idris-lang.org/en/latest/effects/impleff.html but it did not tell me how to open a file …

I don’t know about Idris, but the basic idea would be to transform the file’s type from (a|b)* into (a* b)* a*, where b is a newline and a is any non-newline. Then your answer simply counts the first star of the transformed type (changing that star to a counter exponent as you form the type).

Yeah … but how does that look? A “naive” wc implementation (counting the character '\n') is <50 lines of code (if we have modern file IO in the standard library) in a practical PL.

But maybe there is too big a difference between the concept of “files” (as seen by OS developers, who view them as a byte-blob in a hierarchical name system) and language designers who want to represent a file as a collection of lines (and probably would like your command line interpreter not to allow calling “wc” on a file that contains non-UTF8-encoded data) – but this is still a pretty big step for our current computation environment (where I can call programs by name that are also files).

For many people (including me) a “practical” programming language needs an implementation that can generate (compile) an executable or interpret a file representation of its source. Otherwise its a proof assistant.

Maybe we have to merge the concept of data in terms (no pun indented) of bytes (types?) in-memory and collections of bits and bytes that have a lifetime longer than the program (if it’s not a daemon)?

I’m still a beginner at Idris, but here’s my attempt at a “naive” wc:

module Main

wordCount : String -> IO ()
wordCount file = let lineCount = length $ lines file
                     wordCount = length $ words file
                     charCount = length $ unpack file in
                 putStrLn $ "Lines: " ++ show lineCount
                         ++ " words: " ++ show wordCount
                         ++ " chars: " ++ show charCount

main : IO ()
main = do (_ :: filename :: []) <- getArgs | _ => putStrLn "Usage: ./wc filename"
          (Right file) <- readFile filename | (Left err) => printLn err
          wordCount file

It counts lines, words, and characters. It reports errors such as an incorrect number of arguments, as well as any error where the file cannot be read. Here are the stats it reports on its own source code:

$ ./wc wc.idr
Lines: 14 words: 86 chars: 581

Hope that helps.

Wow, thanks for the response – that actually looks like Haskell and if its performance is not that abysmal (let’s say slower than Python), that is pretty cool.

This actually looks quite reasonable – usage string included, I like it. Hats off to you, I will try to compile it to a binary now and do some benchmarking.

Is it really fair to call this an implementation of wc if it is just using a built in function for each case?
These might be reasonable functions to implement in a standard library, so having them, it makes sense to use them. I rather feared Idris does not have any IO abstraction for reading/writing files at all. Maybe I am conflating languages and libraries here, but they often go hand in hand.

My practicability expectations for languages implementing dependent types are pretty low.

If you implement it by manipulating the types directly, it’s a few operations with a loop (just for counting lines, mind you). Again, I’m not sure how this would be done in Agda, Idris, or Coq, and the system design is a bit different from other type systems.

But also, any discrete one-dimensional problem like wc is going to benefit from a type system with types that resemble regular expressions.

Agreeing with you: I’m only familiar with Idris insofar as I’ve watched talks on it, but my recollection is that its implementation inference is a search across pre-listed implementations. Edwin Brady says that that’s surprisingly useful. I believe him and look forward to trying it out some time soon, but it is not as magical as we might guess.
Type inference in Idris works well if the types of the terms in question are non-dependent. The moment you introduce complicated types, type inference suffers, which is expected, because even Idris cannot break the fundamental nature of logic.
This is my point.

Adding dependent types comes with costs that you will always have to pay, even if you don’t use the power of dependent types. If you verify with e.g. program logic in the few real-world cases where formal verification is financially viable, then you only pay the price when you do the verification.

1. Problems with type inference are introduced by polymorphism with implicit type variables and aren’t exclusive to dependent types; ‘f = show . read’ shouldn’t compile because of type ambiguity. What is so specific about dependent types in this regard?
Sure.

1. Dependent types, because they offer 100% certainty in the propositions they express (assuming they’re proven) suffer from the same problem all sound verification methods suffer from, and that is fundamental computational complexity costs of verifying something with absolute certainty. I summarized some of them here: https://pron.github.io/posts/correctness-and-complexity (there are some properties that are easy to verify called inductive or compositional properties — not surprisingly, the properties verified by simple type systems or Rust’s borrow checker fall in that category — but, unfortunately, most correctness properties aren’t compositional).

2. Dependent types rely on deductive proofs for verification, and deductive proofs (the “proof theory” part of a logic) are the least scalable verification method as they are least amenable to automation. That’s why model checkers (that provide proofs based on the “model theory” of the logic, hence the name model checkers) scale to larger programs/specifications. In addition, deductive proofs are much less forgiving of partial specifications. In other words, the contract (or type, in the case of dependent types) must express every property of the algorithm you rely on (unlike, say, concolic testing). This makes writing specifications very tedious. Unsurprisingly, deductive proofs are the least used formal verification technique in industry, and it’s usually used when all other options have failed or to tie together results obtained from model checkers.

Those were the theoretical limitations. Here are the practical ones:

3. Unlike contracts, type systems tie the specification method (types) to the verification method (type-checking, i.e. deductive proofs). This means that everything specified with a dependent type must be deductively proven, or it has to be specified in some other way (i.e. you also need contracts), or else you risk making even your simple types unsound (although I hear some people are working on relaxing that problem, and that’s what I mentioned in my previous comment). But properties differ greatly both in the confidence you need as well as the effort they require to achieve that confidence. Contracts give you the freedom to specify your program however you like, and then choose, for each property, which method you’d like to use to verify it — deductive proofs, model checking, static analysis, concolic tests, randomized tests, manual tests or even inspection — all without risking the soundness of the type system, on which the compiler relies to generate correct machine code. In short: dependent types are inflexible in that they force you to use a particular verification method, and that verification method happens to be the least scalable, most tedious one. (A small sketch of the contract idea follows after point 4.)

4. Because dependent types are so tied to the program, the program must be written with the specific properties you want to verify in mind. If, after writing the program, you want to prove other properties about it, you’ll probably need to rewrite it.
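
The small sketch promised in point 3: a minimal contract-style example in plain Haskell (not any particular contract system; the point is only that the specification is ordinary code, and how it gets checked, here a runtime assertion, could just as well be QuickCheck, static analysis, or a proof):

import Control.Exception (assert)

-- The precondition is stated once; swapping the checking strategy does not
-- require changing the type.  (GHC drops asserts under -O unless
-- -fno-ignore-asserts is given.)
headContract :: [a] -> a
headContract xs = assert (not (null xs)) (head xs)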

I don’t think human programmers have any hope of ever being able to maintain noncompositional program properties for nontrivial programs. A model checker may indeed be able to prove some program property without the programmer having to understand why or how that property holds (in contrast to using dependent types) – but I submit that this is not actually much of an advantage; a human working on the codebase will still need a deductive-proof-like understanding.

> This means that everything specified with a dependent type must be deductively proven, or it has to be specified in some other way (i.e. you also need contracts), or else you risk making even your simple types unsound (although I hear some people are working on relaxing that problem, and that’s what I mentioned in my previous comment). But properties differ greatly both in the confidence you need as well as the effort they require to achieve that confidence. Contracts give you the freedom to specify your program however you like, and then choose, for each property, which method you’d like to use to verify it — deductive proofs, model checking, static analysis, concolic tests.

What practical distinction are you drawing here? Surely with any approach you either have a fully sound proof of a given proposition, or you have a potentially unsound proposition that you’ve tested at whatever level you feel appropriate. Like, if I have a SortedList type that can produce an assertion that, given n <= m, l[n] <= l[m], either the constructors of SortedList bake that in as a deductive proof, or they unsoundly assert it and I apply some level of static analysis and/or testing to them – how's that any different from what I'd do using any other technique?

> 4. Because dependent types are so tied to the program, the program must be written with the specific properties you want to verify in mind. If, after writing the program, you want to prove other properties about it, you’ll probably need to rewrite it.

I’m not entirely convinced that this is any harder than any other approach – you’d have to propagate the new properties you were interested in through the code, sure, which would likely involve changing the types that the relevant points in the program were annotated with. But isn’t it much the same process with a detached proof of the program property?

> I don’t think human programmers have any hope of ever being able to maintain noncompositional program properties for nontrivial programs

We do it all the time. Whatever it is that you want your program to do, and any interesting functional property of it, is a non-compositional property. I know that because if it were compositional, you wouldn’t need to write the program. A property P is compositional with respect to terms A and B and composition ∘, if P(A) and P(B) implies P(A∘B) (or more strongly, iff). Compositional and inductive are the same thing if A and B can be any language term. For example, type preservation is a compositional property for, say, ML, and memory safety is a compositional property for, say, Java.
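
A small illustration of that definition (a sketch): take P to be “total” (defined and terminating on every input) and ∘ to be function composition; then P is compositional, unlike the “never crashes” property of the `foo bar` example above.

-- If f and g are each total, f . g is total; nothing about their interaction
-- needs to be examined.
double :: Integer -> Integer
double = (* 2)

increment :: Integer -> Integer
increment = (+ 1)

incrementThenDouble :: Integer -> Integer
incrementThenDouble = double . increment   -- total because both parts are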

> a human working on the codebase will still need a deductive-proof-like understanding.

I think this is not true, either in principle or in practice: https://pron.github.io/posts/people-dont-write-programs I know I certainly don’t have anything close to a deductive understanding of the program I’m working on, but I do have such an understanding of the specific pieces of code I’m writing. This is insufficient for a useful formalization of a program, because correctness properties aren’t compositional, meaning that, in general, a sound specification of a program component would either be insufficient or unhelpful (i.e. no easier than the code itself) for the writing of other components. Sure, in some cases it can be, but those are usually the easy cases. We need help with the hard ones.

I totally agree that we try to isolate components and expose them with APIs, but 1. the specification here is imprecise and “soft”, and 2. as the results mentioned in my “correctness and complexity” post show, this does not necessarily make deduction easier (https://news.ycombinator.com/item?id=20714920).

But what you should be asking is this: as there haven’t been any major relevant breakthroughs in formal logic in the past couple of decades, nor in any relevant proof techniques (we still use deduction for safety properties), our deductive proof capabilities aren’t significantly higher now than they were 30 years ago. If it’s all a matter of discipline, then, how come we’re still not able to affordably (or at all, if the programs are not tiny) verify programs with deductive proofs?

> how’s that any different from what I’d do using any other technique?

I’m not sure I understand the question, but the last thing you want is a rich specification language that forces you to use a particular verification technique, and that’s exactly what dependent types do. You want to specify whatever helps you, and then you want to verify using whatever technique you need or can afford, to reach the appropriate level of cost/benefit for your requirements.

> But isn’t it much the same process with a detached proof of the program property?

To be honest, I don’t know that dependent types are harder than “detached” deductive proofs, but this is something that Neel Krishnaswami (who is a dependent types advocate) mentioned to me as his observation (he’s also the one who told me about attempts to build “gradual” dependent types to alleviate some of the other downsides I mentioned). This can also be an upside, because it may make writing some proofs easier, as you write the program with them in mind. Either way, I think deductive proofs are, at the moment, too far behind other verification techniques to be contrasting different kinds of deductive proofs. But this certainly strikes me as an engineering problem.

I also mentioned this in my other post, but the CompCert compiler is a highly non-trivial example of a program verified using dependent types, and which would not really be possible using automated methods. Of course, not very affordable, since it took years of work from people with expert knowledge. But you get what you pay for, although the price isn’t worth it for most real programs in industry.

Furthermore, you don’t necessarily need to have an intuition about why the program you are working on is correct to formalize it using dependent types. You just need an intuition about a method for CHECKING that the program is correct.

The four-color theorem was proved in Coq, and it required considering over a thousand cases. No one can keep a proof of this magnitude in their head, but that didn’t prevent it from being formalized in a proof assistant. What they did was they just formalized the method for checking and then used that. You can similarly formalize the kind of reasoning that a model checker (or any other automated method) does and then use it in your proof assistant.

> but the CompCert compiler is a highly non-trivial example of a program verified using dependent types, and which would not really be possible using automated methods.

First of all, it wasn’t really verified using dependent types; it was verified in Coq. Coq’s theory is, indeed, based on dependent types, but Coq is used more like a traditional proof assistant (HOL or TLA+) than a dependently typed programming language like Agda or Idris.

Second, CompCert, while certainly non-trivial, is also about 1/5 the size of jQuery, which means it’s at least one to two orders-of-magnitude away from the average business application.

> You can similarly formalize the kind of reasoning that a model checker (or any other automated method) does and then use it in your proof assistant.

Yes. In the few cases where deductive proofs are used in industry, they’re often used to tie together results from model checking.

> We do it all the time. Whatever it is that you want your program to do, and any interesting functional property of it, is a non-compositional property. I know that because if it were compositional, you wouldn’t need to write the program.

I don’t think I understand what you’re saying here?

Say I want to write a program that tells me the duplicated members of a list. My program might call helper routines like, “stable sort the list,” and then “collect adjacent duplicates.”

I might want to prove my program is “correct”, e.g., a statement like, “the output is a list that contains exactly those elements of the input list whose duplicity exceeds 1”. To do this, I can first reason about the helper functions in isolation, then compose these results about the helper functions to establish my desired property.
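For concreteness, a minimal Haskell sketch of that decomposition (the helper names are mine and purely illustrative):

    import Data.List (sort, group)

    -- Stable sort, then collect adjacent duplicates.
    stableSort :: Ord a => [a] -> [a]
    stableSort = sort                                  -- Data.List.sort is stable

    collectAdjacentDuplicates :: Eq a => [a] -> [a]
    collectAdjacentDuplicates = map head . filter ((> 1) . length) . group

    duplicates :: Ord a => [a] -> [a]
    duplicates = collectAdjacentDuplicates . stableSort
    -- e.g. duplicates [3,1,3,2,1] == [1,3]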

Why is this not an interesting property, or a program that I don’t need to write?

> Why is this not an interesting property, or a program that I don’t need to write?

It’s both of those things, but it’s not compositional, because “finds duplicate members in a list” is not a property of either of your components. If it were a property of one of them, you wouldn’t need the other.

All proofs proceed in stages, and “compose” known facts. But that’s not the meaning of compositional, and the fact that the property you want easily arises from the properties of the component is luck. I gave an example on this page where this is not the case (the `foo bar` example), and it is provably not the case in general.

Thanks. I think we may differ in our understanding of “compositional.”

My understanding of “compositional” reasoning is something like, an approach where we (1) establish some properties about our helper functions, and then (2) reason about larger functions using these properties, instead of the definitions of the helper functions.

It seems like you have a different notion of compositional, which I’m still not sure I can articulate in a clear way… something more akin to transitivity?

A property P is compositional with respect to terms A and B and composition ∘ if P(A) ∧ P(B) ⇔ P(A∘B); the equivalence is sometimes replaced with implication in one of the directions (this is a special case for a predicate, i.e. a boolean-valued function; a function F is compositional if F(A∘B) = F(A) ⋆ F(B)).

What you’re saying is that proofs can be composed. That is true, but unhelpful, as it doesn’t make proofs easy. All proofs work this way. Compositionality does make them easy (in fact, all properties that are easily proven by type inference are compositional; type safety itself is compositional). I’ve seen compositionality described somewhere as the opposite of emergence[1], which makes sense. A property is emergent in a composition if it is not a property of the components.

[1]: https://julesh.com/2017/04/22/on-compositionality/

Thanks, this seems like a perfectly coherent definition of compositional. I think my earlier confusion was a purely terminological matter.

Regarding “What you’re saying is that proofs can be composed. That is true, but unhelpful, as it doesn’t make proofs easy.” I agree that the ability to compose proofs does not make proof easy. But I would push back somewhat against the idea that this is unhelpful.

The ability to prove, in isolation, the relevant properties about some entity (e.g., a function, a hardware module, or what have you), and then to make use of these properties, without ever looking inside the entity again, to prove properties of some larger system, seems to me to be absolutely essential for scaling proofs to large systems.

This ability is of course completely common in tools like interactive proof assistants, but is less common in more automated procedures.

> The ability to prove, in isolation, the relevant properties about some entity (e.g., a function, a hardware module, or what have you), and then to make use of these properties, without ever looking inside the entity again, to prove properties of some larger system, seems to me to be absolutely essential for scaling proofs to large systems.

I understand the motivation, and of course you’re right that it would be “essential”, it’s just that it has been mathematically proven (see my post) that this is not the case. Results in computational complexity theory, that studies how hard it is to do things, show that “making use of these properties, without ever looking inside the entity again” does not, in fact, make verification easier. More precisely, the difficulty of proving a property of A_1 ∘ … ∘ A_n is not any function of n, because we cannot hide the inner details of the components in a way that would make the effort lower even though that is what we want to do and even though it is essential for success.

In practice, we have not been able to scale proofs to large systems, which means that maybe there is some truth to those mathematical results.

> This is insufficient for a useful formalization of a program, because correctness properties aren’t compositional, meaning that, in general, a sound specification of a program component would either be insufficient or unhelpful (i.e. no easier than the code itselg) for the writing of other components. Sure, in some cases it can be, but those are usually the easy cases. We need help with the hard ones.

Or we need to transform the hard cases into easy ones, which is what the whole dependent typing project is about: turning program correctness into just a matter of type safety, at which point it is just compositional. Maybe there are important cases that we can’t solve this way, but I’ve not encountered any yet.

> But what you should be asking is this: as there haven’t been any major relevant breakthroughs in formal logic in the past couple of decades, nor in any relevant proof techniques (we still use deduction for safety properties), our deductive proof capabilities aren’t significantly higher now than they were 30 years ago. If it’s all a matter of discipline, then, how come we’re still not able to affordably (or at all, if the programs are not tiny) verify programs with deductive proofs?

I think the article itself is a pretty direct answer to this, in as much as it outlines the programme of work that’s necessary to enable us to use these techniques in practical programs with more interesting propositions. I’d argue that we already do huge amounts of program verification with deductive proofs – that is, typechecking – many orders of magnitude more code is typechecked than has any other formal method applied – and the frontier of progress is how much we can represent important program properties within that. In terms of theoretical proof techniques, the dependent Haskell type system is indeed much the same as, say, the Java type system. But in terms of what properties are practical to encode and verify that way, the dependent Haskell type system is streets ahead. We’ve only had practical dependent-typed languages very recently – arguably we still don’t. And many of the techniques for encoding important properties in our type systems (whether dependent or not) are also relatively new (e.g. the monadic region papers are relatively recent).

> the last thing you want is a rich specification language that forces you to use a particular verification technique, and that’s exactly what dependent types do. You want to specify whatever helps you, and then you want to verify using whatever technique you need or can afford, to reach the appropriate level of cost/benefit for your requirements.

Assuming that we do always have an escape hatch, I don’t see that using types forces anything any more than any other approach to specification? If I want to prove a larger overall proposition using types then of course I have to express any smaller propositions that I depend on in the language of types (as would be the case with any proof technique – there’s no getting away from needing to express our premises and conclusions), but I still have the freedom to verify that smaller proposition using any technique or none.

> Turning program correctness into just a matter of type safety, at which point it is just compositional.

It isn’t. “Truth-preservation” in all deduction systems is also compositional (if the premises are true, so is the conclusion); that doesn’t make the propositions compositional with respect to program components (if I conclude that X > 3 and Y > 2, then I can also conclude that X + Y > 5; the conclusion preserves the truth of the premises, not the property itself, as neither X nor Y is greater than 5). If you have a property of a program that isn’t a property of a component, then the property is not compositional (note that type safety is: A∘B is type-safe iff A and B are; but type safety here is merely the inference step, similar to truth preservation). If it’s not compositional, it can be arbitrarily hard to prove.

> Maybe there are important cases that we can’t solve this way, but I’ve not encountered any yet.

If you’re able to affordably prove any correctness property of any real-world program, this knowledge is worth billions. I am not aware of anyone who’s ever made a similar claim.

> I’d argue that we already do huge amounts of program verification with deductive proofs – that is, typechecking – many orders of magnitude more code is typechecked than has any other formal method applied – and the frontier of progress is how much we can represent important program properties within that.

But the properties proven are easy. It’s like saying, I can jump 50cm, so jumping all the way to the moon is just a matter of degree. No. There are real, fundamental, theoretical barriers between proving compositional properties and non-compositional ones (or those that require one or two inferences and ones that require more); between proving properties with zero or one quantifier, and properties with two quantifiers.

Also, while you can be pleased with Haskell’s formal accomplishments, have you compared them to those of model checking in safety-critical real-time systems? Because once you do, you realize it doesn’t look so good in comparison.

And your comparison of Haskell and Java is particularly interesting. You say that Haskell’s type system is “streets ahead”. I agree (although I think this comes at a great cost). However, Haskell correctness or development cost is not “streets ahead” or even noticeably ahead. This shows that strengthening some deductive property is no guarantee of improving quality. Why? Because software correctness is very complicated. Its theory is complicated, and its practice is even more complicated.

> I don’t see that using types forces anything any more than any other approach to specification

I think dependent types are just redundant. You can get all or most of what they give you, without the inflexibility and the bad defaults, using other means.

But look, I’m personally not too fond of dependent types in particular and deductive proofs in general, and consider them something you use when all else fails, and you have the money to spare. Others can be as bearish on model checking or concolic testing, paths that I think are more promising and that I believe are seeing bigger research investment. But what is certain is that no one has any idea whether we will find a way to drastically increase software correctness in the next twenty years, let alone which method will get us there. The one thing you can be sure of is that anyone who tells you they do is simply unfamiliar with the different avenues explored or with many of the challenges involved (including social challenges).

> And your comparison of Haskell and Java is particularly interesting. You say that Haskell’s type systems is “streets ahead”. I agree (although I think this comes at a great cost). However, Haskell correctness or development cost is not “streets ahead” or even noticeably ahead.

I’d argue correctness is in fact streets ahead: orders of magnitude fewer defects (in terms of the code not doing what the programmer intended; the programmer not understanding what some other human intended is still as big a problem as ever) in practice. It’s interesting how little value that actually brings in ordinary commercial development, but the correctness techniques will be there when the industry is ready for them. Development cost is much harder to measure; I’d argue that some systems developed with Haskell-like type systems would simply not have been possible to make without them, but that’s the sort of question that’s very difficult to answer objectively.

> orders of magnitude fewer defects (in terms of the code not doing what the programmer intended; the programmer not understanding what some other human intended is still as big a problem as ever) in practice.

I don’t see that, and I’m not aware of any report that supports it.

> It’s interesting how little value that actually brings in ordinary commercial development, but the correctness techniques will be there when the industry is ready for them.

Except that in some cases it’s already true. Companies like Amazon and Microsoft are seeing great value in using TLA+ and other formal methods, and now management encourages engineers to use them. I just don’t see it for Haskell. Facebook is a good example, as they invest a lot in correctness research (both formal methods, like Infer, as well as inductive methods for automatic bug fixing and codegen[1]) and they also use some Haskell. They’re pushing hard on formal and inductive methods, but not so much on Haskell. I don’t think correctness is a problem that cannot be addressed, and I believe there are techniques that address it. I just don’t see that Haskell is one of them.

> Development cost is much harder to measure; I’d argue that some systems developed with Haskell-like type systems would simply not have been possible to make without them, but that’s the sort of question that’s very difficult to answer objectively.

You cannot claim that your technique is important and at the same time that its effects are hard to see. I addressed that here: https://news.ycombinator.com/item?id=20718734

[1]: https://youtu.be/9QvcUNBPw4A https://youtu.be/2WIwx02t4PQ

Sure, Haskell’s perfectly fine. But so are most of the popular languages, and those companies see more value in those.
First of all, I would agree with you that full formalization of program correctness using dependent types is generally extremely costly in terms of time and money, and that there are often other formal methods which might get you results that are “good enough” while requiring much less effort and expertise. However, I do think that there are some inaccuracies in some of the statements you are making about proof assistants based on dependent types.

One major benefit that proof assistants have over other formal methods is that they can be used to prove essentially arbitrary properties of programs, which is not true of any automated methods. Basically, if we as humans are able to convince ourselves that a program has a certain property, then we could also show this using a (dependently-typed) proof assistant.

(Yes, theoretically all proof systems have limitations due to Gödel’s incompleteness theorem, but this limitation applies just the same to ALL formal methods. Furthermore, you’re not that likely to run into it in practice unless you are doing metatheory, and even there many interesting statements CAN still be shown.)

Consider the formally verified C compiler CompCert, where the assembly output of the compiler has been proven to have the same behavior as the input source code. A proof like this simply cannot be done using automated methods currently, and requires human assistance. This kind of correctness proof is also highly non-compositional, since the correctness proof of a compiler does not mirror the structure of the compiler very closely. Hence this also invalidates your point about dependent types not being able to handle non-compositional properties very well. There are plenty of other examples, like the seL4 microkernel.

I also do not understand what bearing computational complexity results have on humans manually formalizing their informal intuitions about program correctness in a proof assistant. Humans do not work on arbitrary problems where they have no intuition at all about why a program is correct, hence referring to worst-case computational complexity results does not make much sense here. In fact, if you have no intuition about why a program works, then how did you manage to write it in the first place? You surely did not write random garbage in the hopes that you would get a working program.

You might say that there are programs you can check using a model checker, but which humans cannot reasonably keep in their heads or form an intuition about, and therefore these proofs cannot formalized in a proof assistant. But what you can do is implement a model checker in your proof assistant and then show that the model checker is correct. After you’ve done that, you can simply use the model checker to do your proofs for you. And of course it should be possible to formalize the model checker itself, since the people that wrote the model checker must have some intuition about why it works and produces the correct results.
In this sense, proof assistants subsume all other formal methods, since they can be “scripted” to include them.

Furthermore, it is generally not the case that you have to rewrite your program in case you want to prove some more properties about it. You may be thinking of a programming style where the type of a program encodes all the properties you are interested in, but you can avoid this style where necessary, by separating your program and its correctness proof.
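As a rough Haskell-flavored sketch of what that separation looks like (a sketch of my own, purely illustrative): keep the program’s type plain and state the correctness property separately, rather than encoding it in the type.

    import Data.List (sort)

    -- The program itself, with no properties baked into its type.
    sortInts :: [Int] -> [Int]
    sortInts = sort

    -- The property, stated separately from the program. In a proof assistant
    -- this would be a theorem about sortInts; here it is just a predicate
    -- that could be tested or, with more machinery, proved.
    prop_sorted :: [Int] -> Bool
    prop_sorted xs = isSorted (sortInts xs)
      where
        isSorted ys = and (zipWith (<=) ys (drop 1 ys))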

> One major benefit that proof assistants have over other formal methods is that they can be used to prove essentially arbitrary properties of programs, which is not true of any automated methods.

It is true of model checkers.

> There are plenty of other examples, like the seL4 microkernel.

Yep, there are plenty of such examples, I’d say 2 or 4, and they’re all less than 10KLOC, i.e. 1/5 of jQuery.

> I also do not understand what bearing computational complexity results have on humans manually formalizing their informal intuitions about program correctness in a proof assistant.

Because humans can’t outperform Turing machines.

> Humans do not work on arbitrary problems where they have no intuition at all about why a program is correct, hence referring to worst-case computational complexity results does not make much sense here.

For one, the results aren’t just worst-case. One of the most important ones is on parameterized complexity. Second, problems that are very hard in the worst case have rarely been proven easy in the “real world” case, unless the real world case was somehow categorized in a neat subset, and this isn’t the case here. It’s possible that we’re usually far from the worst case, but if so, automated methods could also exploit that, and still be better than manual ones in most cases.

> In fact, if you have no intuition about why a program works, then how did you manage to write it in the first place? You surely did not write random garbage in the hopes that you would get a working program.

This is a common assumption, but a wrong one: https://pron.github.io/posts/people-dont-write-programs

> In this sense, proof assistants subsume all other formal methods, since they can be “scripted” to include them.

The same can be said about all other formal methods. Indeed, various sound static analysis tools like Frama-C can optionally fall back to manual proofs when needed. In the end, though, in practice deductive proofs don’t scale as well as other methods. It’s a good tool to have in your toolbox, but you don’t want it as your default. Your model checker could just as easily ask for help from a proof assistant. So it’s not the case that any one method subsumes all others, but you could combine them. Which is why dependent types are problematic, as they make these combinations harder than, say, contract systems, which are flexible in the choice of verification method.

> You may be thinking of a programming style where the type of a program encodes all the properties you are interested in, but you can avoid this style where necessary, by separating your program and its correctness proof.

True, but that’s not the style discussed here, of dependent Haskell or Idris.

I agree that verifying programs using dependent types is impractical for large programs. I think that the reasons are mostly practical rather than theoretical, however. There’s too little benefit for too great a cost if you want essentially 100% correctness guarantees for every single part of a large, existing program.

Humans regularly do “outperform Turing machines.” Most of the complicated mathematical proofs that have been produced by humans cannot currently be produced automatically by an algorithm in reasonable time. Pick almost any non-trivial mathematical theorem, and you would be hard pressed to write a program that could prove it in a reasonable amount of time given just a list of the axioms. In theory, humans might not be able to outperform TMs, but in practice they regularly do, as attested to by the large number of mathematical proofs that we have produced by hand.

Model checkers also outperform humans for some tasks, but then again this is because human intuition has been used to write the model checker in the first place. My point is that using theoretical computer science results to talk about the difficulty of finding mathematical proofs in real-world cases is a bit of a non-sequitur. Yes, the Linux kernel would be practically impossible to verify completely using semi-automated tools, because it would take too much time and effort. But is this because of some theoretical result about the model checking problem? I don’t think so.

One difference between having a model checker call a proof assistant and using a proof assistant to implement a verified model checker is that if you write a verified model checker implementation, you can be almost 100% certain that it’s bug-free, because it creates proof terms that can be independently checked by the (ideally small and simple) proof kernel of the proof assistant. You don’t have to trust that the model checker has been implemented correctly. This difference does not matter that much in practice, however, since model checker implementations probably receive a lot of testing and are quite trustworthy.

You’re right that I am mostly talking about Coq, and that a different style tends to be used in Haskell/Idris. Note that dependent Haskell is actually logically inconsistent for backwards-compatibility reasons anyway, so there’s not much to lose there by cutting a few corners for pragmatic reasons. It is not necessary to verify every single part of the program. Dependent types in Haskell gives you the freedom to do more. You can choose not to use that freedom where it doesn’t make sense.

> Most of the complicated mathematical proofs that have been produced by humans, cannot currently be automatically produced using an algorithm in reasonable time.

That’s true, but humans have not done that in a time frame that’s even remotely reasonable for verifying programs. Automatic verification methods outperform humans by orders of magnitude, on average, as far as program verification is concerned. My point is that if it is true that real-world programs are far from the worst case — and note that we do not currently have much evidence to support that belief, because we have not been able to regularly and affordably verify programs with deductive proofs — then automated tools will be able to exploit that as well (once we know what “that” is, assuming “that” exists at all).

> My point is that using theoretical computer science results to talk about the difficulty of finding mathematical proofs in real-world cases is a bit of a non-sequitur.

Why? That’s the entire point of the field of computational complexity. But don’t think I’m pessimistic. I just think software verification is an extremely complex subject, and anyone who says they can certainly tell now which approach will “win” simply does not know what they’re talking about.

As Fred Brooks wrote, realizing the limitations and understanding the magnitude of the challenge is not pessimistic. Physicists don’t feel that the speed of light is a non-sequitur. Quite the opposite: understanding the limitations and challenges allows us to focus our efforts where they’ll do the most good, rather than bang our heads against the wall just because that’s what we happen to know how to do.

> Dependent types in Haskell gives you the freedom to do more. You can choose not to use that freedom where it doesn’t make sense.

Given that only 0.01% of programmers use Haskell in production I don’t really care what approach Haskell uses. But as someone who takes verification seriously, I think contract systems give you the freedom to use deductive proofs — just as dependent types do — as well as other methods, which currently have proven much more practical and useful in the real world.

What I meant by non-sequitur is that I agree with your premises (that certain problems are computationally intractable or undecidable, and this is good to know), and I agree with your conclusion (full verification using deductive methods/dependent types is unfeasible for large software projects). However, I do not agree that the conclusion is true because of the premises.

That is, I don’t think one can say that the reason why human-aided verification is unfeasible for some projects (say, the entire Linux kernel) is because of theoretical limitations. Yes, verifying the Linux kernel would take too much time, but not because of any specific theorem in theoretical computer science.

The reason why I am emphasizing this point is that I believe that referring to theoretical results in discussions like this makes it seems as though there is some kind of fundamental limitation that prevents people from verifying the Linux kernel in a proof assistant. Even though it is quite a jump to conclude that from theoretical results, especially when human intuition is involved.

I would say the reason that people haven’t verified a program the size of the Linux kernel is that it’s just too much work. Just like if I were to try to write the Linux kernel from scratch on my own, it would take me years and years. But not because of a mathematical theorem. Some things are just a lot of work without there being a deep reason behind it.

We can go much further with deductive verification combined with automated verification compared to automated verification alone, precisely because human intuition can be used to guide the automated tools.

EDIT: To give another example, I would have no problem with the following reasoning:

A SAT solver that is efficient for arbitrary SAT problems cannot be implemented, because SAT is NP-complete (assuming P =/= NP).

However, I would have a problem with the following line of reasoning:

Intel CPUs cannot be fully verified by people because SAT is NP-complete.

> That is, I don’t think one can say that the reason why human-aided verification is unfeasible for some projects

That’s not exactly what I said, or at least not what I think. I think that full, 100% confidence verification (the method is irrelevant) has some serious challenges as suggested by those theorems, and that it is, therefore, unjustified to believe that there would be a clear solution any time soon. One way to circumvent those challenges is, for example, to reduce the level of confidence.

> Yes, verifying the Linux kernel would take too much time, but not because of any specific theorem in theoretical computer science.

I’m not saying it definitely is, but I don’t think anyone can say it isn’t, either (well, technically, a theorem won’t likely apply to a specific piece of code, except as a sample from some distribution).

> The reason why I am emphasizing this point is that I believe that referring to theoretical results in discussions like this makes it seems as though there is some kind of fundamental limitation that prevents people from verifying the Linux kernel in a proof assistant.

I think that the fundamental limitation (about the general case) suggests great challenges. And again, the specific method doesn’t matter as long as it’s sound. A proof assistant and a model checker are both bound by those results, as both are sound.

> Even though it is quite a jump to conclude that from theoretical results, especially when human intuition is involved.

I’m not sure human intuition is very relevant. We’ve had human intuition and proof assistants for the past 40 years, and still we haven’t been able to prove software beyond a very small size. So I think that at the very least we should consider the possibility that if we can jump two feet up, then even though it seems like jumping all the way to the moon is a matter of degree, there just could be some fundamental obstacles in the way. I’m not saying we can’t overcome them, but I’m saying we can’t assume they’re not there.

> But not because of a mathematical theorem. Some things are just a lot of work without there being a deep reason behind it.

OK, but people have written the Linux kernel, and that’s a lot of work, too. Don’t you think that if a mathematical theorem says that, in the worst case, A is much harder than B, and that in practice, A is, indeed, much harder than B, then maybe theory has something to do with it?

> We can go much further with deductive verification combined with automated verification compared to automated verification alone, precisely because human intuition can be used to guide the automated tools.

I think that’s speculation and that it’s unjustified by either theory or practice. It’s unjustified by theory because if human intuition can consistently prove successful (to the point of being an industry technique), then whatever method is used suggests some recognized easier-than-worst-case subset, and so automated tools could take advantage of that, too.

It’s unjustified by practice because you could have said the same thing thirty years ago, and here we are now, and automated tools are doing better on average (it’s kind of complicated to compare, because there are cases where either one is much better than the other, but overall, when we look at “verification throughput”, automated methods verify more software more cheaply). It is true that automated tools have some limitations that we haven’t been able to break through, but so do manual proofs.

You could indeed make the case that the intractability/undecidability-type theorems suggest that the problem of program verification is inherently difficult, and not something that can be solved in most cases using a trick. It requires some deep thinking in many cases, the same kind of thinking required to find complex mathematical proofs.

Currently humans are much better at proofs that require this kind of deep thinking. Automated tools are better for proofs that are based on a simple structure, like a large enumeration of cases (as in SAT solving and model checking, oversimplifying things a bit). This is the very reason they are automated: the reasoning performed by them is so simple/well-structured that we are able to write a program to do it for us.

However, many proofs fall outside of this easy-to-automate subset, and require human assistance. Human intuition does consistently do better than automated tools, but there is no single easily recognizable “trick” or “method” to it, which makes it hard to implement in a tool.

In my view, the reason that theoretical results do not apply very directly to human reasoning is precisely that human reasoning is “intuitive” and hard to capture in a program. We are not really Turing machines in any practical sense. We are more adept at proving things (perhaps creating automated tools to help us along the way) than any known algorithm.

I absolutely agree that automated tools make much more sense for your average industry application, but if you want to be able to use the full power of human intuition, instead of restricting yourself to program properties that are easy to prove automatically, then there is no way around manual proofs.

> Human intuition does consistently do better than automated tools, but there is no single easily recognizable “trick” or “method” to it, which makes it hard to implement in a tool.

Except in software verification, the facts on the ground are that it is the automated tools that do better than humans. We’re not talking hypotheticals or speculation here. If you’ve written software proofs you won’t find this particularly surprising. They’re tedious and frustrating, but rarely require much ingenuity. Attention to detail plays a far more important role [1]. They’re not at all like proving mathematical theorems. Software proofs are to math proofs what painting a hospital is to painting the Sistine Chapel.

> In my view, the reason that theoretical results do not apply very directly to human reasoning is precisely that human reasoning is “intuitive” and hard to capture in a program.

Perhaps, but so far humans do not seem to outperform the expectations. So you’re counting on a human ability that, to date, we’ve not seen. In fact, humans rarely if ever beat computational complexity expectations. SAT solvers do much better than humans at supposedly-NP-complete problems, for example, and so do various approximation algorithms. Why doesn’t our intuition help us outperform algorithms there?

And again, if intuition were the decisive factor, how come we’re doing such a terrible job? If algorithms beat us today, what kind of boost in intuition do you expect that will allow us to beat them tomorrow?

Look, formal proofs are great. They’re clearly the gold standard. So I encourage everyone here who thinks proofs are the future of software correctness to try them out to write correct large and complex programs. The problem is that they’ll fail, and either abandon formal methods altogether, or worse, waste their time proving what they can at the expense of verifying what they should.

> but if you want to be able to use the full power of human intuition, instead of restricting yourself to program properties that are easy to prove automatically, then there is no way around manual proofs.

Maybe. It’s certainly possible that one day we’ll figure out how to make manual proofs work reasonably well for software verification. It’s also possible that by then automated tools will get even better, especially considering that, as they’ve shown greater promise over the past few decades, that’s where most of the research is focused. Sure, there will always be some niches where tedious manual work is required, but I see no indication that that niche would grow beyond the size it occupies now, an important yet small part of formal verification.

[1]: https://brooker.co.za/blog/2014/03/08/model-checking.html

I wonder why clojure’s spec idea isn’t being discussed that much.

Having wrestled with scala’s type system for like a decade, it all seems so pointless and stupid when compared to spec. At the time I thought this scala type system stuff was so advanced and scientific. So embarrassing to ppl who push this stuff.

Because clojure’s spec is nothing new? Programmers in dynamic languages have been doing it forever; spec just gives you a framework to save on boilerplate. Also, spec’s runtime performance is atrocious.
> Because clojure’s spec is nothing new?

You’re absolutely right, but dependent types are nothing new, either. They’re just new to Haskell. So contract systems aren’t new, but they’re new to languages some people actually use (or rather, that a contract system is used by a non-vanishingly small set of programmers is new).

> Also, spec’s runtime performance is atrocious.

Have you tried writing deductive proofs with dependent types? I think the point was about using spec for development, not for runtime validation.

Can you give some concrete examples on older spec-like things in other languages? It’s not really clear what aspects of spec you refer to with this.

It’s easy to miss the details that give it a quality of its own, like the composability, property based testing support, and preservation of extensibility in APIs if you just have a quick look at spec.

> spec just gives you a framework to save on boilerplate.

> spec’s runtime performance is atrocious.

How is that, though? If you are reading stuff off of a database/API, you still have to do the same data validation even in Scala, for example.

I might be misunderstanding what you are saying, but what does runtime data validation specifically have to do with spec?

Every such thread we have pron disliking type theory vehemently. It just makes me chuckle now.
Not only do I not dislike type theory vehemently, I really like some of its uses (like linear types; don’t believe me? I praised them in a recent talk: https://youtu.be/r6P0_FDr53Q?t=553). I’m not too fond of deductive proofs as a software verification method because so far it manages to scale worse than other formal methods (that don’t scale that well, either). What I strongly dislike is:

1. Fetishization of certain formalisms, particularly combined with a rather limited understanding of said formalisms or formal systems in general.

2. Appeals to “the power of math” (mathiness) when discussing the extremely complex disciplines of engineering, especially when combined with a very limited knowledge of math.

3. Claims of the sort, “I like A more than B, hence I conclude that A is the best thing in the world”, especially when combined with little knowledge of C or D (or A and B, really).

And I don’t like cults.

It just so happens that those qualities are common among those who claim to particularly like type theory or functional programming (or, at least, among those who publicly talk or write about them). I don’t think it’s accidental, either. I think that some luminaries in that field encourage that kind of cult-of-magical-mathiness behavior, so this sort of thinking has become a part of the ethos of that community.

There is more cult than cult leader. The people who enjoy functional programming find it vastly enjoyable/better/whatever. I fully admit it’s hard to convey this and hard to demonstrate this empirically because the cost and scale of an adequate experiment would be prohibitive.

The biggest thing I don’t like about the model checkers and other strategies is:

1. No separation of proof and proof search. Deductive proofs are a good lingua franca. I want to mix and match techniques to derive the proofs and have them work together. Many of the things you like have potential to be repurposed as such “tactics”.

2. Composition of verified programs. Functional programmers really value composition, and the things you like tend to be used for “whole program verification”. At the very least, while one can verify individual parts, there is no mechanized way to combine those proofs.

I also think the evidence already points away from your scale argument. To put it simply, right now Agda is good for toy stuff, things you like are good for medium stuff, but only Coq and Isabelle/HOL are used for the biggest projects like CakeML and CompCert. Both of those are proof-theoretic tools.

> I fully admit it’s hard to convey this and hard to demonstrate this empirically because the cost and scale of an adequate experiment would be prohibitive.

Ah, but you see, here we have a problem. If you claim that your favorite technique has a big measurable impact on software development costs, then observing them shouldn’t be hard at all, even without careful empirical studies. There are many problems with free markets, but applying selection pressures that quickly pick adaptive techniques is something markets are actually good at. And if your claim is that the benefit is not easily measurable because it’s not so large and decisive, then why is your claim important at all?

And, BTW, I think the preference for a programming style is very personal. I learned and used FP before I learned OOP, and I like both (maybe for different things), but I can’t say I have such a strong preference for either. They’re both OK.

> but only Coq and Isabelle/HOL is used for the biggest projects like Cake ML and Comp Cert. Both of those are proof theoretic tools.

Coq and Isabelle have so far only been used to verify deep properties of tiny programs, up to ~10KLOC; that’s 1/5 of jQuery. Even then the effort was huge. Model checkers have been used on much larger programs, and, at least for small programs, they’re commonly and regularly used in industry (especially in safety-critical hard realtime systems).

> If you claim that your favorite technique has a big measurable impact on software development costs, then observing them shouldn’t be hard at all, even without careful empirical studies. There are many problems with free markets, but applying selection pressures that quickly pick adaptive techniques is something markets are actually good at.

I’m not convinced this is true in an environment where every problem is unique and failure rates are already high – humans are known to be very bad at working with low probabilities. Even if we’re talking about an order-of-magnitude difference, who could distinguish between a startup – or an internal new project – with a 1.5% success rate versus a 15% success rate?

That goes doubly when we’re talking about costs: a friend who’d worked for several investment banks observed that they all had roughly the same number of developers overall, but the distribution of those developers was radically different: bank X would have 5 programmers working in area A and 50 programmers working in area B, while bank Y would have the opposite, even though both were solving the same problem in area A and area B. Could you tell whether Google or Facebook has 500 or 5000 or 50000 developers from the outside? Does Uber have 50 or 500 or 5000? (I genuinely don’t know).

To my mind the most compelling case: WhatsApp apparently did succeed with an order of magnitude fewer developers than comparable companies – and the industry responded with a collective shrug. (My interpretation is that the industry is still so immature – or so fast-moving – that an order of magnitude cost saving still doesn’t matter; software produced at 10x what it “should” cost can still be amply productive/profitable, it’s easier to increase profits by increasing revenue than by cutting costs).

> who could distinguish between a startup – or an internal new project – with a 1.5% success rate versus a 15% success rate?

But if we can’t distinguish the results, do they really matter? After all, the whole point of changing how we work is to get a measurably better result. If it’s not better, why change? (of course, assuming there’s a positive effect at all). Something cannot be at once very important and unnoticeable.

> To my mind the most compelling case: WhatsApp apparently did succeed with an order of magnitude fewer developers than comparable companies – and the industry responded with a collective shrug.

A shrug? You can draw a straight line from that to the work I’m doing now in OpenJDK, at the very heart of the mainstream. I know that because WhatsApp’s technical success is partly how I sold the idea. Similar efforts are taking place in most other mainstream software platforms. We very much took notice, and we were very much convinced, and that’s why we’re doing what we’re doing. And the contact we’ve had with the world’s biggest software companies and software-dependent companies (including your own employer) about this work shows us that they’ve taken notice as well.

> But if we can’t distinguish the results, do they really matter? After all, the whole point of changing how we work is to get a measurably better result. If it’s not better, why change? (of course, assuming there’s a positive effect at all). Something cannot be at once very important and unnoticeable.

It can be at once very important and poorly understood. I’d draw an analogy to the state of mathematical analysis in the era of Riemann and Cauchy, where no rigorous basis for the techniques existed; skilled practitioners could and did produce correct theorems, while unskilled students applied the same techniques and produced nonsense. So even when there was ultimately an underlying rigorous reality (as we found out decades later), in practice being able to work effectively was far more a matter of individual flair and taste.

I feel like that’s where we are with software development practices at the moment. We all know there’s a huge variation in productivity, but to even measure them – much less explain the causal factors – is still a black art. (I’d think the same would be true of artists or research scientists – anyone whose work is fundamentally about original innovation. I’m sure there are techniques and approaches that make an order-of-magnitude difference in productivity, but no agreement about what they are)

Perhaps I’m just more incredulous. Many more people are just as sure that homeopathy works and that vaccines cause autism, so I’d just as soon assume it’s psychology at work. But I think what you’re saying is that we don’t yet have beneficial techniques just ideas that could lead to them. I can’t argue with that because I have no information one way or another. All I’m saying is, when someone does figure out how to develop software considerably better/faster, that achievement would be hard to hide.
I suspect that the effective techniques are known (or “known”) to some practitioners, but buried among a pile of ineffective techniques, with different people swearing by different techniques based on their experiences. Similar to how traditional medicine can be a mixture of genuinely important treatments for local medical circumstances and completely ineffective treatments that happened to be (spuriously) correlated with good outcomes.

You’re quite right about how it looks from the outside; the possibility that I’m just experiencing my own set of spurious correlations is real and worrying. But ultimately there’s only so far that I’m willing to overrule my own lived experience.

I wish I shared your confidence about future progress. There have been good and bad managers, and good and bad communicators, for decades, but we don’t seem much closer to a generally understood method that can be applied by everyone.

> The people who enjoy functional programming find it vastly enjoyable/better/whatever.

That’s a tautology…

Anyway…. I think I “get” enough of FP and I use parts of it. I certainly find it very useful. Types as well, very useful for me.

However, headlines like the current one and comments along the same hype line are just annoying. I sure like practical people discussing these things, but reality in this universe is that if you actually want to do something in the real world there is no abstraction that’s even close to complete. Everything leaks like a sieve. You compensate by choosing and mixing your tools and by adapting on the fly. Theory is a tool, nothing more. “I’m a functional programmer” – okay, I’m merely a “programmer”, and I choose whatever works best and don’t fuss about labels and purity.

The points you mention all work in parts, and disintegrate in other parts when they come into contact with a real problem plus its context.

These discussions are so… random and often emotional and committed because there is no context to measure against. It’s just words. If there is a concrete problem including the concrete context (the entire environment and business context) it’s much easier to agree on something. Without that, every commenter comes in with their own context vision in their brain, which depends on what they’ve been doing. In the end any non-trivial piece of engineering (software or not) is much more messy than any theory can account for.

Exactly what @pron said in the comment you replied to about the “appeals to ‘the power of math’”, akin to a belief in magical solutions. I find it extremely pretentious TBH, although I certainly value math very highly. But it’s a tool with severe limitations and not a magical instrument.

All this stuff – math, type systems, FP – appear to us so “pure” and as the perfect solution because we invented it and (also because of that) it is a great match for how our brain works: Which means disregarding 99% of reality (and our sensors are very limited to begin with). But whenever we try to apply a thus derived “pure” solution to the real world we find that the disregarded and filtered-out aspects start appearing in all corners. If the person looking at it also has the same bias they concentrate on only the part that was successfully modeled and think “this worked perfectly”, so in evaluating the solution you have the same filter again. But whenever you try to do something more complex and/or more complete with math it’s a major undertaking, taking many very good people a long time. So yes, math can be used to model more aspects, but that results in an explosion of complexity.

The closer you program on “math space” problems the more perfect the mathy solutions like FP and type systems will fit. The more the real world has to be dealt with the more the leaks in the abstractions will become apparent. You can try to deal with them with the “mathy” solutions to remain in that space, but that then becomes more and more complex, and at some point you end up with a “hack” to cut through the thicket.

I highly recommend learning the basics of (cell-level and below) biology and biochemistry instead of yet another programming language to see a very different system. Then it becomes apparent the differences between various programming tools are much less than the heated discussions make it seem. When you look with a magnifying glass, tiny differences look like mountains. FP, OOP, procedural programming, whatever – it’s all not all that different really, in comparison. It should be more apparent when you remember it all runs on the exact same kind of hardware architecture. Our computing hardware is not diverse at all really (quantum computing as a very early and very limited experiment being a noteworthy exception; as are mostly older analog computing systems). The layers on top are all closely bound by and to it.

There’s loads to respond to here, but let me just say

> Everything leaks like a sieve.

Not in my experience. I, for work, use 100% Haskell (well, and some SQL), mainly to write web apps. Not “math space” problems by any means. We use and have written many abstractions. None of them are particularly leaky. Some of them have implementations I don’t thoroughly understand and yet have no trouble using anyway; I know of no better way to demonstrate a lack of leaks than that.

> That’s a tautology..

It almost is, except for “vastly”. I’m saying there aren’t very many people who enjoy it but find it, in their experience, merely slightly better.

That programmers are becoming more interested in maths and the mathematics of program construction can only be a good thing. As someone clearly interested in maths, I am surprised you seek to attack them as “cults”. I just see enthusiasm, no different to how folks have enthused about object-oriented programming for decades.

As Brian Goetz (Java Chief Architect) has said, Java will never become Haskell, but it will continue to take ideas from it. So even Oracle is drumming up interest in functional programming. About time I say, industry has been neglecting ideas from functional programming for too long.

> I am surprised you seek to attack them as “cults”.

I don’t call people who actually study and understand math a cult. Those people don’t usually make outlandish claims that “if only people applied some math magic dust, writing correct programs would be easy,” because they know the really difficult challenges (after all, it is complexity theory, a mathematical theory of sorts, that tells us that verifying programs is very, very hard). But that’s not what I see in the FP community. I see mathiness, not math. I also see cultish adoration; the lambda calculus was discovered, not invented? Come on! It’s not only wrong, but it’s a kind of religious adoration. I call cults cults.

I don’t think you can dismiss a platonic view of mathematics as simple cultish adoration and wrong.

There’s many problems with the platonic world of ideals and a long history of arguments and counter arguments, but definitely nothing is settled and most working mathematicians have an intuitive feeling of platonism and that there is a ‘right’ solution to be found out there.

It has nothing to do with the Platonic view of mathematics, that I don’t try or wish to dismiss at all. Even if the real numbers are real and, say, complex numbers were discovered, the lambda calculus was still very much invented. The lambda calculus is a formal system, an artificial language, first described in a paper in which the author, Alonzo Church, talks about the arbitrariness and aesthetics involved in inventing a formal system, and ironically, specifically talks about the process of invention in that context: https://news.ycombinator.com/item?id=20699465

So claiming that the lambda calculus was invented is 1. factually wrong, 2. fosters a deep misunderstanding of what formal systems are, 3. encourages a mystical view of mathematics. So it is at once wrong and mystical, and so I find it cultish.

Note that when, say, discussing the use of complex analysis researchers don’t usually spend time on the claim that complex numbers were discovered. But some FP luminaries do, as part of their (perhaps unintended) mission to instill awe and admiration toward their favorite programming style at the expense of understanding.

oh I don’t think that’s fair. the view that mathsy things are discovered not invented is hugely widespread amongst many people who know what they’re talking about. I don’t know what I’m talking about, and I think such matters are invented not discovered. but I do think Haskell is a great language that will drastically improve the quality of maybe even any project that doesn’t need c-like efficiency. these concepts are independent.
Creating a formal system, an artificial language, is not a mathematical discovery and it has nothing to do with “math things.” There are interesting “math things” one can discover about Python, but no one would take you seriously if you say Python was discovered rather than invented, and no one who knows the history of logic would take seriously any claim that the lambda calculus was discovered. The person who invented the lambda calculus said it was invented in the very paper he presented it, so no, the people who say it was discovered do not know what they’re talking about, at least not on that matter.
Lambda calculus is a lot simpler than Python. It bears a striking similarity to Gentzen’s natural deduction rules and is essentially isomorphic to them. It is also very well studied and its properties are well known. Hence it makes an excellent basis for a programming language. The idea that the lambda calculus was discovered and not invented was recently entertained by Phil Wadler. Given his academic credentials, I think we can assume he knows what he is talking about.
Why can we assume that? Is he an expert on the history of logic? That he’s an expert in an adjacent field doesn’t make him at all an expert on that. If you listen to his talks and compare with the actual texts he mentions you’ll see that he’s wrong.

And the similarity is not “striking” at all. It’s by design. It is only striking if you don’t know the actual history of how these formal systems developed. A couple of years ago I composed an entire anthology on the subject: https://pron.github.io/computation-logic-algebra

Your points reminded me of this:

A naturalist account of the limited, and hence reasonable, effectiveness of mathematics in physics

Lee Smolin

https://arxiv.org/pdf/1506.03733

Also, the book on which the article is based, “The Singular Universe and the Reality of Time”, is a physics book, but I thought it has huge relevance to programming live systems (as opposed to e.g. compilers). The big difference between what we have to do when programming rather than when we are doing mathematics is dealing with time. (But that’s what makes it so interesting – and frustrating.)

As someone who uses mathematics for programming (just look at my blog: https://pron.github.io/) I am very much in favor. What I’m against is people who don’t understand math, and promote a magical view of it (not math but “mathiness”), as they don’t realize its strengths and challenges. That’s what I find annoying in the FP community.
Dependent Haskell is not sound (type-in-type, i.e. a type-level version of Russell’s paradox, and lack of totality checking). Its primary purpose is not theorem proving per se, but rather unifying the explosion of GHC extensions that are currently weak approximations of a full dependently typed system. More than anything it is precisely an exercise in whether dependent types are useful even if you aren’t getting 100% watertight assurances with them.

Sure, the article is a bit exuberant, but I think you’re being a bit harsh here. Yes, the field of formal verification, from my amateur practitioner’s point of view, is a mishmash of techniques right now, with no single technique whose ergonomics-to-power ratio dominates all the others. But as far as I can tell there is in fact a significant amount of research being done on dependent types. Many programming languages now have very limited dependent types. GADTs are one form that shows up in e.g. OCaml, Haskell, and Scala. Indeed Jane Street had a blog post about how GADTs were surprisingly useful for them (https://blog.janestreet.com/why-gadts-matter-for-performance…). And there are ongoing research projects around languages such as Agda and Coq.

Just like with formal specification systems like TLA+ and Alloy, where you don’t need to specify in exhaustive detail what you’re after to see significant value, you don’t have to prove comprehensive theorems to derive significant value from dependent types. An easy, lightweight win is to leave behind value-specific tokens when performing some sort of runtime check, rather than trying to prove the checked property (e.g. keeping track that yes, I ran the isPrime function on this number, rather than proving that the number is actually prime). You can then use these tokens to subsume pre/postcondition contract systems, since you can attach markers of arbitrary checks being made that don’t necessarily affect the value (e.g. keep track that I did check this user ID against a blacklist before trying to use it elsewhere).
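To make that concrete, here is a minimal sketch of the idea in plain Haskell (the module and the names CheckedPrime, checkPrime and getPrime are made up for illustration). Because the constructor is not exported, the only way to obtain the token is to actually run the check:

module CheckedPrime (CheckedPrime, checkPrime, getPrime) where

-- the constructor is hidden, so a CheckedPrime can only come from checkPrime
newtype CheckedPrime = CheckedPrime Integer

getPrime :: CheckedPrime -> Integer
getPrime (CheckedPrime n) = n

isPrime :: Integer -> Bool
isPrime n = n > 1 && all (\d -> n `mod` d /= 0) [2 .. isqrt n]
  where isqrt m = floor (sqrt (fromInteger m :: Double))

-- the check runs once, at runtime; code that takes a CheckedPrime
-- downstream needs neither further checks nor a proof
checkPrime :: Integer -> Maybe CheckedPrime
checkPrime n
  | isPrime n = Just (CheckedPrime n)
  | otherwise = Nothing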

The main problem with dependent types as far as I can tell is that only very recently have we had languages that try to write actual programs with full-spectrum dependent types appear. Idris is really the only example I can think of. Languages like Agda are first and foremost provers. I suspect the vast majority of Agda programs are never run after they’ve typechecked.

As a result we have little idea of what good practices with full-spectrum dependent types look like. I have suspicions it looks fairly different from the hacks that current statically typed languages try to use to approximate it. For example, I suspect some correct-by-construction approaches from ordinary statically typed FP languages quickly spiral into monstrosities with full-spectrum dependent types, which can be nicely sidestepped again with evidence of runtime checks instead of full proofs.

Even with Idris, we don’t have people using these systems in anger. We don’t know how much to expose of function implementations, how much a totality checker matters, how best to avoid proof infection (where because a low-level component demands a proper proof and that proof is used at runtime, you must deliver a valid proof and can’t use some sort of TrustMe escape hatch of a null value that has any type). There are ideas. Idris 2 introduces a type system that distinguishes between proofs that are never used at runtime and runtime values, allowing users to safely use the TrustMe escape hatch if necessary. But there isn’t a large community working with them while trying to ship business products.

That for me is the greatest promise of dependent types in Haskell. Getting more real world experience of programmers using dependent types to uncover ergonomic holes. The Haskell community in the grand scheme of languages is small. It is nonetheless orders of magnitude larger than the full-spectrum dependently typed community and, if Dependent Haskell lands, represents perhaps the first time that a dependently typed language is used to directly write, rather than verify, production systems.

>More than anything it is precisely an exercise in whether dependent types are useful even if you aren’t getting 100% watertight assurances with them.

They are. Anyone working with linear algebra can tell you that. Hell, arrays tracking bounds in types rather than in some runtime checks would be a huge thing.

> They are. Anyone working with linear algebra can tell you that. Hell, arrays tracking bounds in types rather than in some runtime checks would be a huge thing.

That they can be used in a useful way doesn’t mean you should have them. Other things (like contract systems) can do pretty much what dependent types do, but don’t suffer from as many disadvantages. A tank could get your kids to school, but that doesn’t mean it’s a good idea to get a tank if what you need is to get your kids to school.

>That they can be used in a useful way doesn’t mean you should have them.

It’s a good enough reason to want them as an option.

>Other things (like contract systems) can do pretty much what dependent types do

If you want a system at least as strong as dependent types, you need a system equivalent to dependent types or more generic. Then you can just as well drop the awkward system pretending not to be a type theory and use an actually working formalism.

As far as I can see, Hoare logic is a variant of propositional logic, meaning it is suitable for monomorphic code with static dispatch, but not much more. Extending it to handle more and more generic and more and more polymorphic code eventually transforms it into a variant of type theory (Hoare Type Theory). Might as well start from the side of pure type theory and extend it with linear types – the result should be equivalent.

> It’s a good enough reason to want them as an option.

I’m not so sure. If you have a good specification mechanism that offers a choice of verification method, why do you need dependent types in addition?

> If you want a system at least as strong as dependent types, you need a system equivalent to dependent types or more generic.

There are many factors by which to judge the utility of specification and verification tools. There is no specific metric of “strength” that determines the value of a formalism. As Alonzo Church wrote in the paper in which he invented the lambda calculus, formal systems are not more or less true or more or less good; rather, they can be more or less convenient for different uses, and more or less aesthetically appealing to different people.

In general, developers need a system that can be flexible. The difference in cost between verifying something with 99.9% confidence and with 100% confidence can be 10x. Sometimes you’re willing to pay that 10x for a slight increase in confidence, and sometimes you’re not. Often, the decision is different for different parts and different properties of a program, or for different properties of the same part of the program, or for the same property in different parts of the program.

> Then you can just as well drop the awkward system pretending not to be a type theory and use an actually working formalism.

You’ve studied all logics and all type systems and have concluded that all the type theories are always less awkward to all people than all “direct” program logics?

Also, no logic is pretending to be or not to be a type theory. Type theories are not the canonical representation of logic. That’s just not how formal systems work.

> Hoare logic is a variant of propositional logic, meaning it is suitable for monomorphic code with static dispatch, but not much more. Extending it to handle more and more generic and more and more polymorphic code eventually transforms it into a variant of type theory (Hoare Type Theory).

Hoare logic can be used for polymorphism, dynamic dispatch, higher-order subroutines, and anything else type theory can be used for. But I’m not claiming it’s superior to type theory. Type theory is fine.

> Might as well start from the side of pure type theory and extend it with linear types – the result should be equivalent.

No, because the problem with dependent types is not the theory, which is, indeed, equivalent to whatever logic you choose and vice versa. The problem is that they force you to use deductive proofs for verification. Not only do you not want your specification method to force a particular verification method on you (you want flexibility, because verification methods vary tremendously in cost), but deductive proofs make for the worst default verification method, as they are the most costly and least scalable (at least so far).

> If you have a good specification mechanism that offers a choice of verification method, why do you need dependent types in addition?

Because I don’t see it as a good enough method.

> There is no specific metric of “strength” that determines the value of a formalism.

There is, even though it is hard to judge: the number of cases it works for with the same amount of work invested, and the genericity of the obtained results.

>Hoare logic can be used

Consider a sorted array Arr with elements of type A and an integer index ranging from k to l. To express that it is sorted, one needs a way to say that forall n : Int, m : Int, k <= n < l, k <= m < l, n <= m => Arr(n) <= Arr(m)
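For concreteness, the same condition written as an executable Haskell predicate (e.g. for a runtime contract or a property-based test; the helper name isSorted is made up):

isSorted :: Ord a => [a] -> Bool
isSorted xs = and (zipWith (<=) xs (drop 1 xs))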

Consider an implementation of a relational merge-join operation. To express constraints on the input arrays, one needs to express that they have a key to join over and are sorted using a compatible comparator over that key, i.e. to introduce predicates referencing predicates in the code.

To me it looks like you have type theory except it moved from typing of variables to associated predicates. I really don’t see how it is better.

>The problem is that they force you to use deductive proofs for verification.

I really don’t see a problem with it. Granted, I work with academic/numerics software, not enterprise software, and there might be a difference in priorities… But from my side I don’t see a problem.

> Because I don’t see it as a good enough method.

You don’t see what as a good-enough method?

> There is, even though it is hard to judge: the number of cases it works for with same amount of work invested and genericity of obtained results.

Well, even deductive proof proponents (like Xavier Leroy) are forced to admit that of all known and practiced formal methods, deductive proofs score lowest of all on that metric.

> To me it looks like you have type theory except it moved from typing of variables to associated predicates.

Then it’s not a type theory. A type theory is one way of expressing logical propositions, and, say, first-order logic over set theory is another. You can express similar things in both. Why do you think that it means that one is really the other? If you could say in French the same thing as you can in English, does that mean French is really English?

> I really don’t see how it is better.

It’s not better! The problem with dependent types is not the theory, which is perfectly fine! The problem is the practice. You can describe a property using dependent types or, say, FOL in some contract system. But with dependent types you have to verify that property using deductive proofs, the “weakest” formal method by your metric. With a contract system, you can choose to verify the same proposition with a deductive proof, or you can use one of the “stronger” methods.

I’m sorry, but how can you consider anything but a deductive proof to be a proof? Inductive ‘proofs’ are not strict proofs; they are suggestions that need to be tested. Well, except mathematical induction, which is said to be a variant of deductive reasoning and can be perfectly modeled with dependent types.

this

> you can choose to verify the same proposition with a deductive proof, or you can use one of the “stronger” methods.

is nothing more than a way to say “you can choose to prove or to do some handwaving instead”.

> I’m sorry, but how can you consider anything but a deductive proof to be a proof?

Because if the logic is sound and something is true in the model, it cannot be disproven in the logic (and arguably engineers care more about the model, anyway). In other words, you can rely on model-theoretic proofs, which are not deductive (at least not in the logic itself).

Also, I don’t think that engineers are necessarily interested in proof, but rather in verification, i.e. knowing that the proposition is true with some probability. This is also more easily achieved in the model theory.

> is nothing more than a way to say “you can choose to prove or to do some handwaving instead”.

No, it isn’t. You can choose to model-check (a model-theoretic proof), or you can verify with high but less than absolute certainty. Neither of these is “some handwaving”, and both, BTW, are bigger topics of research in the formal methods community than deductive proofs (take a look at formal methods conferences). And don’t forget that because engineers are interested in the correctness of a system, not of an algorithm, there is always a probability that the system would fail, because there is always a probability that the hardware would fail. So below a certain probability, proving algorithm correctness is of no interest to engineers.

It also seems to me that you believe that the ability to write proofs is a choice. This is something that only people who have not used formal methods could believe. In practice, we could not write deductive proofs even if we really wanted to, because it is just too laborious. The largest software ever verified end-to-end using deductive proofs was 1/5 the size of jQuery, and it took world experts in verification over 10 person-years (it took them 20-30, but they say they can cut it down), and it required an extreme simplification of the algorithms used. So the reality is that we have no idea how to verify programs that aren’t tiny using deductive proofs, and that’s why most of formal methods research is looking in other directions.

If verifying programs with deductive proofs were an actually viable option, I would be very much in favor. I’m not, because it isn’t, and there are other verification methods that have so far shown more promise in practice. It’s not a matter of principle, but of practical feasibility.

>Also, I don’t think that engineers are necessarily interested in proof,

Depends on the area the engineer works in. Above some cost of failure you actually want a proper proof, because the inherent chance of error in incomplete induction becomes unacceptable.

>Because if the logic is sound and something is true in the model,

You can perfectly formalize this model using dependent types and use it there, can’t you?

> So the reality is that we have no idea how to verify programs that aren’t tiny

Yeah, I know, our brains are too small to properly verify everything, and we need to find ways to outsource as much as possible to machines.

Doesn’t mean we need to willfully embrace blind faith in incomplete induction.

> Depends on the area the engineer works in.

Sure, but let’s say that it’s a conservative estimate that 99% don’t need proofs. I acknowledged in the beginning that deductive proofs have a place in some niches.

> You can perfectly formalize this model using dependent types and use it there, don’t you?

The logic is already a formalization of the model. Yet finding a counterexample in the model directly is very often easier than finding a deductive proof in the logic. Some people got a Turing award for that discovery.

> Doesn’t mean we need to willfully embrace blind faith in incomplete induction.

First, formal methods are not “blind faith”. Second, we haven’t willfully given up on deductive proofs; it’s just that for decades we’ve tried to scale them and couldn’t, so out of necessity we found methods that have worked better in practice. I don’t know whether 200 years from now people will be able to make deductive proofs feasible, but why not use formal methods to make our software better and cheaper now?

In any event, all I want is for people to know (something formal methods research and practice already know) that software correctness is an extremely complex topic, with severe challenges that are both theoretical and practical, and anyone who suggests that there is a known specific solution to the problem, or that one of the many imperfect avenues we’re exploring is “the future”, doesn’t know what they’re talking about.

>Sure, but let’s say that it’s a conservative estimate that 99% don’t need proofs

99% don’t need formal verification; they are OK with unit tests. If one is concerned enough to make the effort to perform actual verification, they most likely want an actual proof.

>The logic is already a formalization of the model.

So?

>it’s just that for decades we’ve tried to scale them and couldn’t,

For decades people didn’t care for formal verification. The absolute majority of ‘engineers’ don’t see any use in it. You, on the other hand, see little use for formal verification with dependent types. Well, forgive me for drawing a parallel here.

I want some specific guarantees that are fairly easy to express and keep track of with dependent types, not some fancy offshoot of Hoare logic. Why would I care for a method clearly inferior for my use cases?

==

Let’s step back a bit and examine one specific case again. Consider the situation: one writes a generic merge-join function over sorted arrays. There, the input arrays must be sorted with the same predicate over the same, possibly virtual, key.

To express this in a specification of the input data one needs to:

– quantify over the types of both arrays

– quantify over the type of the key

– and all the functions able to produce a key of this type from elements of array 1

– and all the functions able to produce a key of this type from elements of array 2

– quantify over the ordering predicate over values of the key type

– express that the ordering holds for each pair of elements of the arrays — i.e. quantify over indices of arrays 1 and 2.

– for which one needs information on index bounds attached to the arrays

I can, with some effort, express it in the language of type theory. I’m very interested in how one can express it using, say, first-order logic that, you know, doesn’t allow for quantification over predicates.
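For scale, the non-dependent part of that signature is easy to write in plain Haskell (mergeJoin is a made-up name, and this sketch assumes keys are unique in both inputs); the sortedness precondition from the list above is exactly the part that plain Haskell types cannot state:

mergeJoin :: Ord k => (a -> k) -> (b -> k) -> [a] -> [b] -> [(a, b)]
mergeJoin keyA keyB = go
  where
    go [] _ = []
    go _ [] = []
    go xs@(x:xs') ys@(y:ys')
      | keyA x < keyB y = go xs' ys
      | keyA x > keyB y = go xs ys'
      | otherwise       = (x, y) : go xs' ys'  -- join on equal keys; assumes unique keys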

> 99% don’t need formal verification, they are OK with unit tests.

Formal verification can be cheaper than tests. It’s not just about a higher confidence, but also about achieving similar confidence more cheaply. That’s one of the reasons why deductive proofs are not so popular in formal methods research — they’re too inflexible.

> If one is actually that concerned with an effort to perform actual verification, they most likely want an actual proof.

But we don’t need to hypothesize — they don’t. Almost all formal verification in industry uses model checkers and sound static analysis. If you are interested in addressing the needs of certain small niches, that’s one thing, but the thrust of formal methods research is about helping as much software as possible, and so deductive proofs are not the focus, because they’re well behind other methods.

> So?

So, if you have a model checker, you don’t need to write a proof. The model-checker could automatically provide a model-theoretical proof for you, at the press of a button. That’s how, say, avionics software is verified. That’s how Intel’s chips are verified. People can’t afford to write proofs for that volume of software. People write proofs only when model checking fails, or to tie together model checking results. It’s no one’s first choice (unless they’re researchers or hobbyists).

> For decades people didn’t care for formal verification.

It doesn’t seem like you’re familiar with the field. There was a huge boost in formal methods in the 70s followed by a research winter when people overpromised and underdelivered. Then, about 20 years ago, there was a resurgence with different goals. In general, research now focuses on managing cost vs. benefit, reducing certain kinds of bugs, and finding ways to make verification more flexible, as we’ve so far been unable to make end-to-end verification scale.

> Why would I care for a method clearly inferior for my use cases?

I don’t understand why you think it’s inferior. It is absolutely not, and it is superior where it matters most. Expressiveness is the same, so specification is the same, but verification is flexible. You can use deductive proofs, or verification methods that have shown more promise.

> I’m very interested how one can express it using, say, first order logic that, you know, doesn’t allow for quantification over predicates.

What? Of course FOL allows quantification over predicates and functions if they are elements of the domain, as they are in say, set theory (in fact, there is an assumption at the base of modern logic, that any mathematical theory could be expressed in some first-order language). Or, you could use HOL, if you like. But again, the theory is not the problem here. It’s the chosen verification tool (and BTW, formal methods practitioners normally prefer to weaken the specification power if it reduces the cost of verification). Your example, in particular, is easily expressed in FO set theory, or many kinds of many-sorted FOL, or in HOL. For example,

∀ K, V . ∀ f ∈ K → V . …

is a trivial, perfectly normal, first-order logic over sets. In fact, dependent type theories like Lean’s strive to be interpretable in set theory. But really, the choice of theory is the least important issue with dependent types or software verification in general.

If you are, however, interested in theory (as I am), you can read my series on TLA+. It is a first-order logic that is strictly richer and more expressive than lambda-calculus-based dependent type theories (unless you deeply embed it in them, and describe your computations in monads): https://pron.github.io/tlaplus

>Almost all formal verification in industry uses model checkers and sound static analysis.

But does it happen because it is the only option or the best option? Also, encoding semantic specs is hard, so proper verification of the algorithmic side is usually not an option anyway.

>I don’t understand why you think it’s inferior.

Because it absolutely is. It forces me to use an additional tool when I don’t really need to with a good enough type system.

Ugh. Look, example: memory safety. For quite some time memory leaks needed to be tested for. We have a set of tools for that. Then memory-management disciplines in C++ arose. And then we have linear types in Rust. With Rust we don’t need a separate checker: everything needed to ensure memory safety is embedded into the language’s type system.

Similarly, with linear algebra, basic constraints can be expressed in the current GHC type system. However, promoting input values to the type level is awkward at best, so Haskell needs a small advancement here. Again, I don’t need a separate model checker here; a moderately advanced GHC type system works just fine.

> Of course FOL allows quantification over predicates

Wiki disagrees with you. Oh, well.

>If you are, however, interested in theory (as I am), you can read my series on TLA+.

As I said, at best I might be interested in Isabelle/HOL. And even that is doubtful; so far Agda and Coq appear more useful for what I’m interested in.

> But does it happen because it is the only option or the best option?

I don’t understand the difference. Model-checking scales better than deductive proofs and is far cheaper. If you have less than 5 years to write a 10 KLOC program or your program is larger, then deductive proofs won’t work. Does that mean it’s better or that it’s the only choice?

> Also, encoding semantics specs is hard, so proper verification of algorithmic side is usually not an option anyway.

This is true for all formal methods, but here, too, most of them beat deductive proofs. With deductive proofs you have to annotate every unit (or work much harder). Model-checking (and concolic tests) can handle partial specifications. This is yet another reason why formal methods research is not focusing its efforts on deductive proofs.

> Again, I don’t need a separate model checker here; a moderately advanced GHC type system works just fine.

It doesn’t, though. Memory safety is a compositional property, as is every kind of property that can be easily checked by deductive proofs (or almost compositional, requiring one or two inferences). Most correctness properties are not compositional. That deductive proofs (of propositions that aren’t compositional) don’t scale is not some hypothesis of mine. They’ve been tried for years, and they just don’t. The biggest programs ever verified with them are ~10KLOC and they took years of expert work.

> Wiki disagrees with you. Oh, well.

No, it doesn’t. But if you think it does, it means you don’t yet know what signatures, theories, structures, interpretations and models are — the very basics of all formal logics — so explaining that might take a while (in short, if propositions and functions are part of the structure, of course FOL can quantify over them; if it couldn’t, it wouldn’t have been the lingua franca of mathematics and the crowning achievement of formal logic). You can read my very short introduction to formal logic here: https://pron.github.io/posts/tlaplus_part2#what-is-a-logic . It’s relevant to any formal logic you may want to learn at some point, as they’re all more similar to one another than different, but it may be too terse if you don’t have any prior background.

> As I said, at best I might be interested in HOL Isabelle. And even that is doubtful, so far Agda and Coq appear more useful for what I’m interested in.

Sure. In general, Lean/Coq/HOL are more geared toward research — inventing new logics, formalizing “high” mathematics — and TLA+ is more geared toward industry use and verification of software and hardware, and because that’s what I need, that’s the one I use most. But it really doesn’t matter which you learn — they’re all much more similar to one another than different. Their most important differences are in user-friendliness and their tooling. I would recommend Lean over Coq and Agda, though; I think it subsumes both, and its documentation and tutorials are better, so it’s easier to learn. I enjoyed TLA+ and Lean more than Coq and HOL, and I haven’t tried Agda. But this could be a matter of personal preference. Again, they’re all more similar than different.

>It doesn’t, though.

read this: https://serokell.io/blog/dimensions-and-haskell-introduction , then look into the referenced libraries.

Have fun arguing with objective reality.

As for resource safety, Haskell is yet to follow Rust and introduce linear types, but many cases are covered by the bracket pattern (https://wiki.haskell.org/Bracket_pattern), which can be enforced at the type level using some tricks with the ST monad. BTW, it is very similar to the RAII discipline suggested by many C++ guidelines, but here it can be enforced.
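For reference, a minimal sketch of that bracket pattern using Control.Exception.bracket (readFirstLine is a made-up example); the handle is released even if the body throws:

import Control.Exception (bracket)
import System.IO (IOMode (ReadMode), hClose, hGetLine, openFile)

readFirstLine :: FilePath -> IO String
readFirstLine path =
  -- acquire, use, release: hClose runs even if hGetLine throws
  bracket (openFile path ReadMode) hClose hGetLine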

Again, have fun arguing with objective reality.

>This is true for all formal methods, but here, too, most of them beat deductive proofs.

Sure, feel free to feel superior about that. The problem is, again, that a type system can be embedded into the programming language and allows for quite a few guarantees even without full-blown dependent types. Java has static typing for a reason.

Sure, to enforce some properties using the type system we need an uncomfortably deep dive into type theory or awkwardly polymorphic code. However, a lot of properties can be enforced at relatively low cost, and Haskell is a long-running research project on what properties are practical to enforce this way.

So why do I need to use proof assistants for properties that can be enforced via a practical type system? I really don’t see a reason.

> read this: https://serokell.io/blog/dimensions-and-haskell-introduction , then look into the referenced libraries.

Dude, I actually work with a dependent-type proof assistant (Lean), with an HOL proof assistant (Why3), and with a FOL/ZFC proof assistant (TLA+’s TLAPS), as well as with a model checker for TLA+ (TLC); I’ve also worked with a model checker for Java in the past. I have read several books on the subject, and have written two (both available online: https://pron.github.io/tlaplus, and https://pron.github.io/computation-logic-algebra). I’ve also trained engineers in formal methods.

> Have fun arguing with objective reality.

You’ve read a few blog posts on one aspect of one formal method, and seem to have not even heard of the majority of software correctness research. You certainly don’t seem to have practiced formal methods in industry even using one method, let alone more than one. You don’t even seem to know what first-order logic is, what higher-order logic is, and I’m sure that if I asked you the difference between HOL and Martin-Löf type theory, or between classical and constructive mathematics, you wouldn’t even know what I’m talking about; ditto if I asked you what it means for a type theory to have a set-theoretical interpretation (as Lean’s, and maybe Agda’s — I don’t know — type theories do). You’ve read a few blog posts about a small part of one of the most complex, refractory topics in all of computer science and software engineering. You don’t know what the objective reality is.

That you can prove some stuff with dependent types has been true for at least 30 years now. That other approaches, of which you seem to know nothing, scale much better in practice to the point that industry rarely uses deductive proof for anything but small, niche problems and uses model checkers all the time to verify some of the software your life depends on is just a fact. If you don’t know what the alternatives are, how can you argue with what I’m saying?

> Sure, feel free to feel superior about that.

I don’t feel superior. It seems that, unlike you, I actually use deductive proofs for verification, as well as model checkers, because, unlike you, I actually practice formal verification. But, knowing the tools, I know what works and when. I’m not researching one approach or another; I use which one currently works best, and the reality is pretty clear.

> The problem is, again, type system can be embedded into programming language

Contract systems are also embedded into programming languages. You seem to know one thing (know of, seems more accurate) and argue it’s the best, when you have no idea what other things are out there.

> and allow for quite a bit of guarantees even without full blown dependent types. Java has static typing for a reason.

Sure, that’s because static types are awesome! There are some things — simple, compositional properties — for which nothing else works better. Type systems beat the alternatives for these simple properties and, at least currently, lose to the alternatives when deeper properties are involved. Correctness is complicated, and different properties require different tools. That is why, when deep properties are involved, I prefer separating the specification method from the verification method. Contract systems are embedded into programming languages and work alongside simple types. They then allow you to choose the best verification tool for the job: you can use deductive proofs, like dependent types do, or other verification methods when those work better (which is the majority of cases). Dependent types tie you down to the verification method that is the most restrictive and least scalable.

> So, why I need to use proof assistants for properties that can be enforced via practical type system? I really don’t see a reason.

You don’t. The two are equivalent. I.e., they are equally the least scalable verification method we know. But contract systems, unlike dependent types, allow you to choose: you can use deductive proofs or other, cheaper, more scalable approaches. You need them because deductive proofs — the only verification approach that dependent types give you — just don’t scale. But I don’t need to convince you. I encourage you to learn Idris/HOL/Lean/TLA+/Coq — whichever one you like (better yet, learn more than one) — and try writing deductive proofs for large programs. Try different kinds of programs, too: sequential, interactive, concurrent and distributed. Knock yourself out. Perhaps when you speak from experience you’ll have a better grasp of what objective reality is. In the process, you will learn a lot, so I truly wish you the best of luck!

But take one piece of advice: don’t become dogmatic about something until you’ve actually looked around. If you argue that X is better than Y and Z because X is all you know, you just sound like a cultist. Learn first, argue later.

>If you argue that X is better than Y and Z because X is all you know, you just sound like a cultist. Learn first, argue later.

Apparently, you didn’t look in the mirror. Oh, well. Thank you for proving my prejudices right with this post =).

I’ve tried all the methods discussed, and my point-of-view is based on my personal experience and years of involvement with formal methods. You’ve tried none, and your point-of-view is based on… a couple of blog posts you’ve read?
Wow. Did not anticipate such a long reply thread. Man it looks like I really should write up a comparison of my (amateur) experiences with TLA+ (TLC + TLAPS), Dafny, Idris, and Liquid Haskell.

Also permeakra, regardless of whether you agree with the points that pron is making here, if you ever find yourself interested in TLA+, I would highly recommend his four part TLA+ series (potentially as an intermediate step after e.g. Lamport’s video course). It’s absolutely fabulous.

> if you ever find yourself interested in TLA+

Unlikely. Though I might be interested in Isabelle/HOL eventually.

Maybe a sidetrack, but I do think CS majors will find it extremely useful to use deductive proofs when creating libraries.

Think of the implications: when you are looking for a sorting library, you could have one that includes a proof that it works and maybe even a proof of its runtime cost. As a developer you wouldn’t need to write proofs at all, as you would not deem it worthwhile. My guess is that the whole popularity contest we go through when choosing a library will lessen considerably (how many stars does this repo have? When was it last updated?).

I’m excited for the future both as a computer scientist and developer.

I mean, regardless, having dependent types in Haskell is going to come in handy in industry. It’s obviously not the case that dependent types are strictly too costly to ever use.
I think the title is tongue-in-cheek. Clearly the future is Python and JavaScript.
> Why Dependent Haskell Is the Future of Software Development

I find this hard to believe, considering probably less than 1-2% of developers in the world have any clue what this article is about.

That is mostly the case for innovative concepts at some point before they get adopted. At some point some years ago you could write a lengthy and passionate article about React or TypeScript and discuss why it would be so advantageous, and most front-end developers would feel lost.

Good ideas aren’t necessarily easy to comprehend. In order for them to work for a wider audience, someone needs to properly package, document, and maybe simplify them.

I agree. But the headline was not referring to a concept (Dependent Types) but specifically to “Dependent Haskell” as the future of software development. Making it sound like we all are going to be Haskell programmers in the future.
I think I can rephrase my opinion for either the language or the concept and still agree with it. I believe Haskell has continuously become more developer-friendly and may at some point reach a threshold of friendliness and use-case support that allows for wider adoption.

I recently started looking at Haskell a couple of weeks ago after years of hiatus. The IDE support has gone from almost non-existent and/or bug-ridden to almost as good as Java’s, and the new package manager has so far been pleasant to use. Although I’m using it for small side projects, so it might not be a fair comparison to my day-job use case.

This is probably because it’s about cutting edge language design. The feature the article is about (dependent types) might well be the future of software development and could end up in every strongly typed language, but because it’s new and Haskell’s functional syntax looks weird compared to C-based languages, it makes the headline sound a bit hyperbolic.
I wouldn’t disagree that dependent types are great for certain specialised problem domains, but they’ve already been around for ages and didn’t yet revolutionise the world of software design. I think the sentiment that this is unlikely to make a big splash is totally on point. Languages have to balance ease of use with features in order to become very popular. Nothing as purebred as Haskell is likely to become the next Python or Visual Basic. Real workhorses are dirty, messy machines. Of course this new feature might turn out to be popular for Haskell users.
>but they’ve already been around for ages and didn’t yet revolutionise the world of software design.

I think it’s mostly because languages with dependent types are awfully clunky to use. Perhaps dependent typing is even a dead end, but I wouldn’t count out other ideas such as refinement typing just yet. I think it will take a lot of work and failed attempts (such as typestate in early Rust) to get fancy type theory into mainstream languages (just like it took a while for Java to get lambdas).

Case in point: static analysis tools for C++ are getting more capable and popular recently.

Formal methods in general are extremely clunky to use. Just see how much hate Rust gets for making it extremely difficult to make an intrusive linked list. And memory safety is only one kind of correctness assertion.

So I’m gonna be a pessimist here and say: nope, not gonna be the future, because programmers are lazy and “ship it now” is better than “prove the bugs don’t exist”.

Not all programmers build WordPress plugins.

Some programmers write software for aircraft. The “ship it now” maxim doesn’t exist in that context.

No one is writing aircraft software in Haskell. They need good control over memory usage.
That isn’t a counter-example unless you’re taking his post literally. In which case I claim that Python is a system programming language because one time somebody somewhere wrote a driver in it.
Well… for 95-99%[1] of all software being written, “ship it now” beats “prove that bugs don’t exist”. Better formal methods could really help that 1-5%, but they’re not going to impact the 95% very much.

[1] All statistics made up on the spot. I think it’s in the right ballpark, though.

The C++ static analysis tools are all wildly unsound. This is a core design principle behind all real-world deployable static analysis systems, whether they are based on abstract interpretation or model checking or symbolic execution, or anything else.

Static analysis is a powerful technique. But sound formal methods have been failing to make a splash for nearly five decades.

Honestly, I’d be happy with a language that can propagate the result of a conditional statement into compile-time constraints inside the branches, and also can create executables without jumping through 20 hoops.
The thing about Haskell is you shouldn’t consider it a finished product. Yes, there are people using it and their experience is invaluable, but it’s also a huge research project into what a programming language could look like. It isn’t mainstream ready yet, but you can see the future come into view.
On the other hand, even those messy real work horses have been adopting functional programming features at a rapid pace over the past decade or two. And many of those features originate from Haskell.
> and didn’t yet revolutionise the world of software design

Well, they have. There’s a reason why the companies who have adopted these technologies have adopted them.

The future is here, it’s just not evenly distributed.

But why dependent types and not one of the many other formal methods that have been around for decades? Especially given the trend towards compilation speed being a relevant design choice (e.g., golang).

Sound global formal verification largely sucks. Type checking is about as far as we’ve ever gotten using it for widespread real world applications, and most type checkers in major languages are only “sound-y”.

I remember going to a happy hour at a large consulting company I worked for and 95% of people had never heard of Haskell much less knew anything about it. So needless to say I was easily pulled away from that company when the next person I interviewed with had written production Haskell code.
I find this to be odd. Haskell seems to be the one language where recruiters lead with “we program in Haskell here” before saying what the company does.

Do people really choose companies based on the language they are using?

“We Program in COBOL!”

(Walks away slowly avoiding eye contact)

A bit less facs- fascatio- fesceeshus- silly, companies that aren’t smart enough to invest in good tools probably aren’t smart enough to invest in good people. You can overdo the tech worship, but you can totally underdo it and cost yourself a bomb.

At some point, money is no longer an issue (or falls into diminishing returns), and I’d even take some amount of a pay cut if it meant working in a language and on projects that were mentally stimulating
But even this is odd to me.

Programming seems to be the easy part. The hard (and stimulating part) is everything around that. If somebody says that they work in Haskell but doesn’t say what they are working on then I’ll just be confused.

To me, it is like selling a company based on where you put the curly braces (a slight exaggeration here).

Haskell is less of a programming language and more of a religion, so people choosing companies based on their religion makes sense.
> Haskell, at its core, is simple: it is just a polymorphic lambda calculus with lazy evaluation plus algebraic data types and type classes.

This can potentially unseat the current reigning champ of Haskell-related statements which is “A monad is just a monoid in the category of endofunctors, what’s the problem?”

I agree that the comment is humorous, but I think by “simple” the author means “that is mostly the full set of things that is happening” rather than “easy to grok”.
To be fair, at least that’s computer science speak rather than category theory speak.
If the author really wants Dependent Haskell to be the future, maybe he should use Python and JavaScript speak to spread the concepts.

A world with little handles attached to it slips too fast out of the curious minds.

The title notwithstanding, this article is definitely aimed at Haskellers, or at least people who have more than rudimentary knowledge of Haskell and type systems.

Anyway, it’s rather difficult to talk about advanced type systems in terms of languages that have no type system at all 🙂

What body of knowledge defines “type classes”? I ask out of relative ignorance; I kinda saw it as a Haskell thing.
The term “type class” is original to Haskell, I think, but the concept (constrained parametric polymorphism) is fairly standard programming language theory.
In both cases, the sentence is not unreasonable if you understand all the words it contains (and I’m not suggesting you specifically don’t understand those words).

Those sentences are both very dense in information, but they’re disguised as sentences intended to simplify/summarise an idea.

There is a persistent myth that “A Haskell program that compiles is correct by definition”

Indeed it is “correct” in the sense that it has a very high likelihood (maybe 90% ?) of doing what the programmer intended (Unlike say C or C++ where the probability drops to about 50% for beginners)

However, that is a poor definition of “correct”.
While it’s true that a lot of huge consequences have occurred due to stupid errors like buffer overflows (unintended bugs), there are equally disastrous bugs in code that does exactly what the programmer intended – i.e. the programmer had no idea the code could fail.

The biggest challenge in software development is not “How can I avoid silly mistakes?” but rather “How do I capture this extremely complicated real-world semantics in code?”, and languages can’t really solve that.

You’re building a strawman. I’ve never heard of that myth; the saying is more like “if a Haskell program compiles, it usually works”. And you can say the same thing about a couple of other statically typed, functional programming languages.

> “How do I capture this extremely complicated real world semantics in code?” and languages can’t really solve that

Actually languages can help a lot.

This isn’t just about “silly mistakes”. Compilers are _theorem provers_. Quite literally there’s an equivalence between type theory and mathematical logic. I hope you’re not saying that mathematical logic doesn’t help in modeling real-world semantics.
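A tiny, standard illustration of that correspondence (swapPair is a made-up name): read through Curry-Howard, a value of the type below is a proof of the proposition (A ∧ B) → (B ∧ A).

swapPair :: (a, b) -> (b, a)
swapPair (x, y) = (y, x)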

So yes, via modelling types in an expressive language you can at least have proof that the changes you make are consistent with the world view you had before those changes.

Of course people accustomed to less expressive languages like Java won’t believe this until they see it. But you should really look into Haskell to see the difference. And dependent typing brings this to the next level.

I used Ada for a few years back when I was working on satellite programs. It’s the closest thing to a “If it compiles, it probably works” language that I’ve ever seen.
I really wish you could use all of the AdaCore stuff for free in commercial closed-source software and just pay for IDE support and maybe some specific libraries.
The point of the type system is that you can express your intent through an invariant or property at one place (i.e. the “head” function can specify that it should never be used on an empty list) and propagate that outwards so that it’s enforced at all times.

This is more effective than trying to remember to always check the head function in every single place you ever try to get the first element of a list.
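A hedged sketch of what that looks like with Data.List.NonEmpty from base (process and firstSquared are made-up names): the emptiness check happens once, at the boundary, and the type carries the invariant everywhere else.

import Data.List.NonEmpty (NonEmpty)
import qualified Data.List.NonEmpty as NE

-- check emptiness once, at the boundary
process :: [Int] -> Maybe Int
process xs = firstSquared <$> NE.nonEmpty xs

-- downstream code needs no further checks: the type guarantees a head exists
firstSquared :: NonEmpty Int -> Int
firstSquared ys = NE.head ys ^ 2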

Yes, you can still write a compiling but wrong program. But it will likely be pretty close to what you actually want it to do, and it’s easier to notice the behavior is not what you expected in one place, enforce that behavior via types, and see where you went wrong through compiler errors.

The craziest thing about this is that programmers will often think “well, this tech can’t prevent every possible kind of bug, therefore it’s useless and we may as well use Python/Ruby/JS/whatever”.

Programming usually happens in the context of a business, and the technology used has implications in a cost/benefit analysis.

Being able to express invariants at the type level is cheaper for a business at scale.

Programmers who think types let them express their invariants are often unaware that expressing those invariants can make their program a mess: so much so that they have to consider possibly mutually incompatible language extensions, or reduce the modularity of their program to the point where it becomes a pain to compile and hard to maintain, and might require a complete redesign at some point when a minor functional requirement is added…

If you consider this huge overhead in development cost, maybe time is better spent in less intrusive ways of improving software reliability.

Sure, but it’s about striking a balance between the type (specification) and the program (implementation). The myth mentioned is just something people say to beginners in Haskell but no experienced programmer truly believes – reason being again this balance.
When you can express so much in your type system that the implementation can be mostly automated, you have shifted the semantic mess one level up but have not really made any substantial progress. Still, it’s undeniable that type systems are perhaps the only kind of formal method that really made it to the mainstream software developer, because they can be useful. What I find frustrating is that no one is really guiding programmers in finding good patterns for achieving a healthy balance, especially for code that needs to be maintained and adjusted to new use cases.
I’m not sure they have become mainstream. Java and the like have Generics, but I think that’s about where “mainstream” ends. (And even Generics can be problematic).
I think many programmers are skeptical of advanced type systems because the OP’s straw man is what gets advertised at the water cooler. I’ve seen it myself in professional settings.

My impression is that advanced type systems interest people because they’re cool moreso than because they’re practical tools for reducing errors.

The kind of discussion I see on this post generally reinforces my impression.

That hasn’t been my experience so far, but I’d be interested to read in more concrete terms exactly what level of type-level invariant expression begins to create overly-rigid software.

FWIW, I’m not suggesting moving everything possible into types. I’m suggesting types are better than not having them, just like tests are better than not having them.

A software artifact that has a very specific type usually also needs to be very specific in terms of its required preconditions. Which puts a significant burden on the user of that artifact. And now that user probably requires its own additional preconditions to be able to conform. And so on.

I don’t know, maybe there is a good way of structuring the software to avoid this effect (I’m not extremely invested in type systems), but maybe it’s just a lot of additional pain on top — too much perceived pain for people who can barely make their software “seems to work”.

I can confirm this 100%, having worked on an old (19-year-old) Java code-base full of unnecessary patterns & abstractions. Even with really great refactoring help from IntelliJ IDEA, making meaningful changes is very, very time consuming.
I’ve spent weeks on it.
The complexity was creeping in over the years and is almost impossible to get out again. But it has types!!!11!

And if I then look at the actual data transformations behind all this code: It would be a 2 day exercise in Clojure to re-build this from scratch.

Java has probably the most notoriously limited typesystem in actual use. Judging type systems by Java is like judging electric cars by the Sinclair C5. Try an ML-family language sometime.
It will have the same intrinsic problems. It doesn’t matter that it is Java.
Nonsense. How can you claim that without having seen or tried it?
but that is not the point of Haskell’s type system, since it doesn’t even begin to try to reach that goal.

firstly, the most available `head` function crashes on half the constructors.

more fundamentally, just try using any file or network io method. you just won’t have a clue how it can fail, because it isn’t even documented.

there’s many methods where the best you can do is work out what exception is being used to wrap a C errno and read the appropriate Linux man page.

at least C documents its functions that can return errors, and usually encodes them in the type. in Haskell, it’s just, “if you’re dumb enough to use IO (and what alternative have you got), anything can crash all the time. deal with it”.

Haskell is great right up to that point, but that’s a critical point.

(I’ve tried to find ways to deal with this, but most people seem to refer to something else when discussing how horrible exceptions are, so I wonder if I’m insane)

You’re not insane at all. A substantial proportion of the Haskell community (I feel like it’s a majority but it’s hard to be sure) have a strong distaste for partial functions (e.g. head) and functions that throw exceptions (i.e. where you can’t see from the type signature what the possible failure modes are). I’m afraid that momentum means that the base libraries change slowly, but you’re a long way from alone.

I suggest you come to https://www.reddit.com/r/haskell/ if you’d like more discussion about it. We often have threads wishing for partial functions to be removed from base!

You might like the paper “Ghosts of Departed Proofs” which shows a great technique for dealing with preconditions in Haskell.
I’ll surely read it, but based on the abstract, it’s a design pattern for library writers, not a sanity pattern for library users.

it won’t let me know how or if the function you wrote will throw an exception, or what kind of exception it will throw, unless it’s documented in the type.

but my objection is that many times, the possibility of failure is only present in my generic, language/API-agnostic knowledge that a file might not be readable or a ship might drop an anchor on a cable. having thought about it, I then have to read the source code to understand what exception might get thrown, so I can prevent my program from crashing.

(I would prefer it to continue as best it can in the case that some API is unavailable, which is obviously a use-case-dependent decision)
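The best workaround I know of is to turn the exception into a value at the call site with Control.Exception.try, but that only helps once you already know which exception type to expect. A sketch, with readConfig as a made-up name:

import Control.Exception (IOException, try)

-- the possibility of failure now shows up in the type of the result
readConfig :: FilePath -> IO (Either IOException String)
readConfig path = try (readFile path)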

My general approach to programming is that every language at the moment is bad in serious, fatal ways, and that we are all just experimenting and practicing so that 50 years from now we might have a decent approach. Haskell is horrible in some ways, excellent in others, and this paper seems to offer a promising technique.
> there are equally disastrous bugs caused by intent – i.e. the programmer had no idea the code could fail.

At least some of these sorts of bugs (depending on how you define “fail”) are captured even in Haskell today by things like Maybe. In fact, if Haskell were total (like Idris) you wouldn’t even be able to write a function that searches a list for an element with this type:

find :: (a -> Bool) -> [a] -> a

Because there’s no way for you to magic an a out of nowhere in the empty-list case (whereas laziness lets you use bottom). You could, however, write a buggy find over known non-empty lists, because you could always return one of the known elements; types aren’t a panacea.
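For comparison, the total version just makes the empty case visible in the type (findMaybe is a made-up name; Data.List.find in base has the same shape):

findMaybe :: (a -> Bool) -> [a] -> Maybe a
findMaybe _ [] = Nothing
findMaybe p (x:xs)
  | p x       = Just x
  | otherwise = findMaybe p xs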

Dependent types go further and let you guarantee things like “zip only compiles for two lists of exactly the same length” which can be helpful in some circumstances (specifically in Haskell we use things like zip [0..] too often to want to change the function literally called zip, but you could imagine another name, perhaps zipExact as in the Safe library).

I suspect you probably already know this and meant a different kind of failure, such as mathematically invalid (which you could probably encode via a sufficiently powerful type system), or a simple misunderstanding of the requirements (which you—as the misunderstander—obviously can’t).

> whereas laziness lets you use bottom

in principle, only if you can prove you don’t use the result. there’s nothing wrong with not calculating a value that doesn’t exist.

I don’t have much experience with dependent types, but I don’t see how a compiler could ever tell if zip took two lists of the same length. Don’t dependent types happen at run-time? It seems like you would have to solve the halting problem for them to work at compile time.
It takes a while to grok but for example constant expressions would have type “List Int 7” but a list value from I/O might have type “List Int n” where n is a natural number that also exists at runtime. Now zip would be “forall a, b, n. List a n -> List b n -> List (a, b) n”. Right away you could do a self-zip with your I/O list because the compiler knows that n == n but if you want to zip two different such lists it gets more interesting.

Now you have n, m, and xs : List Int n, ys : List Char m, and you want to do zip xs ys but m doesn’t “unify” with n so the type is wrong. So you have to do an equality check at runtime which gives you a proof of equality or inequality. In the equal case you have a value of type “n == m” and then you can use a language feature to rewrite the type of xs (or ys) according to this equality.

This was written on a phone kind of hastily but maybe it makes some sense.
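To make the “equality check that gives you a proof” step concrete, here is a hedged Haskell sketch (SNat is an illustrative hand-rolled singleton type; the propositional equality type :~: and its constructor Refl are the real ones from Data.Type.Equality):

{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators #-}

import Data.Type.Equality

data Nat = Z | S Nat

-- A singleton: a runtime value that mirrors a type-level Nat.
data SNat (n :: Nat) where
  SZ :: SNat 'Z
  SS :: SNat n -> SNat ('S n)

-- The runtime equality check: either a proof that n and m are the same
-- type-level number, or Nothing.
sameLength :: SNat n -> SNat m -> Maybe (n :~: m)
sameLength SZ     SZ     = Just Refl
sameLength (SS n) (SS m) = case sameLength n m of
  Just Refl -> Just Refl
  Nothing   -> Nothing
sameLength _      _      = Nothing

Pattern matching on Just Refl brings n ~ m into scope, so in that branch the compiler accepts zipping two vectors whose lengths were both unknown at compile time.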

I’ve been learning the dependently typed functional language Idris over the past few weeks. The type-safe zip over Vects (linked lists with a length in the type) is a canonical example of compile-time proof that they must be the same length:

zip : Vect n a -> Vect n b -> Vect n (a, b)

This type specifies that both Vects must be length n. If they are not equal in length, the code will not compile. In fact, this type is so narrowly specified that Idris can infer the entire implementation of the function from it.

> Don’t dependent types happen at run-time?

Dependent types in Idris occur at compile time and are erased as part of compilation. At runtime, those Vects do not carry around any length information.

> It seems like you would have to solve the halting problem for them to work at compile time.

Idris addresses this by having a totality checker. Every function in Idris is marked as either total (known to halt, barring any bugs in the totality checker) or partial (possibly not halting). Only total functions are allowed in types. For example, the function append:

append : Vect m a -> Vect n a -> Vect (m + n) a

This type uses addition as a type-level function. Since (+) for natural numbers (which m and n are) is proven to be total when Idris compiles the library where it’s defined, Idris allows it to be used as a type-level function.

So how does the totality checker work? Wouldn’t that purport to solve the halting problem? No, the totality checker in Idris isn’t magic, it’s conservative about what it considers to be total. In order for it to declare a function total, it must identify at least one argument that progresses towards a base case at every step. It can do this even with mutually recursive functions.

In practice, if you write an interactive program in Idris, you can make every function total except for main, which presumably contains an infinite loop that checks for input forever.
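For comparison, the append example above can be sketched in today’s Haskell too, with a closed type family standing in for Idris’s type-level (+) (again, Nat, Vec, and Plus are illustrative hand-rolled definitions, not a library API):

{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies #-}

data Nat = Z | S Nat

data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Type-level addition, written as a closed type family.
type family Plus (m :: Nat) (n :: Nat) :: Nat where
  Plus 'Z     n = n
  Plus ('S m) n = 'S (Plus m n)

-- The length of the result is computed from the lengths of the inputs.
append :: Vec m a -> Vec n a -> Vec (Plus m n) a
append VNil         ys = ys
append (VCons x xs) ys = VCons x (append xs ys)

Unlike Idris, GHC has no general totality checker, but type family instances are subject to a conservative termination check (which UndecidableInstances switches off), which plays a loosely similar role for type-level functions.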

> This type specifies that both Vects must be length n. If they are not equal in length, the code will not compile. In fact, this type is so narrowly specified that Idris can infer the entire implementation of the function from it.

It’s more accurate to say that Idris can guess an implementation of the function, but it’s not the only possible implementation with that type. There’s nothing in the type to stop the function from reversing one or both of the lists before the zipping stage, for example. More generally, it could randomize the order of elements in one or both of the supplied lists.

That’s true, though it has no reason to do those other things. In general it tries to do the least amount of work possible to accomplish the goal.

It’s interactive, though, and it’s extremely fun to use. It gives you the feeling that you’re solving a puzzle, and like a good puzzle game, it can do the obvious bits for you.

Yes, but the “least amount of work”, as far as Idris is concerned, has no necessary relationship to what the programmer intended when they defined the type. It guesses the simplest thing that fits, but that’s far from a guarantee of correctness.

It is a serious shortcoming that even when you pay the complexity cost of dependent types, you still can’t specify the intent in this case. I mean, the semantics of “zip” can’t be specified with a type. You can define a “sorted list” or “ordered list” type, but you cannot enforce maintenance of the natural, arbitrary order of list elements. In this sense dependent types are both too complex to reason about and simultaneously not powerful enough to fully specify the function.

One solution to ensuring that the order is preserved (that I’ve actually used in production Haskell code where this was a real risk) is to use a heterogeneous list instead of a Vec:

rzipWith :: (forall a. f a -> g a -> h a) -> Rec f as -> Rec g as -> Rec h as

With this type I don’t think you’d be able to rearrange the elements, but you do have to pay the cost of using Const everywhere, as in you can specialize approximately my function as follows (handwavy, won’t compile):

zipWith :: (a -> b -> c) -> Vec n a -> Vec n b -> Vec n c
zipWith f as bs = recToVec (rzipWith f' as' bs') where
  as' = vecToRec (fmap Const as)
  bs' = vecToRec (fmap Const bs)
  f' (Const a) (Const b) = Const (f a b)

vecToRec :: Vec n (f a) -> Rec f (Replicate n a)
recToVec :: Rec (Const a) as -> Vec (Length as) a

Rec is defined in Vinyl: https://hackage.haskell.org/package/vinyl-0.11.0/docs/Data-V…

> has no necessary relationship to what the programmer intended when they defined the type

Right, for instance I wouldn’t be surprised if “the least amount of work” here amounted to reversing both lists as you zip.

> The biggest challenge to software development is not the “How can I avoid silly mistakes?” but rather “How do I capture this extremely complicated real world semantics in code?” and languages can’t really solve that

You’re completely correct in the first half of that but the languages really can help there. Dependent types can be a lot more expressive and capture finer details of your semantics than what you’re probably used to.

A simple example is the `pop` function of a list. In Python, you can put a string into any list. With Java, you can enforce the type of element that goes in a list. In a language with dependent types, you can ensure at compile time that `pop` can only be called on a non-empty list and returns a list of `length - 1`.
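A minimal sketch of that pop in Haskell-with-GADTs notation (Vec and pop are illustrative names, not from a library), where the precondition lives in the argument type and the length change lives in the result type:

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat

data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- pop can only be applied to a vector whose length index is a successor
-- ('S n), i.e. a provably non-empty vector, and the result is one shorter.
pop :: Vec ('S n) a -> (a, Vec n a)
pop (VCons x xs) = (x, xs)

There is no VNil case to forget: passing an empty vector is rejected by the compiler, not at runtime.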

I’d say the difference between Haskell and C is orders of magnitude, not just the difference between 50% (probably a lot lower) and 90% (which would require you to specify the entire contract; at some point you may end up reimplementing business logic inside the type signature, at which point you could have bugs there too and you haven’t won that much. You still need to use it right). In dependently typed languages, it’s not unheard of to have type signatures longer than the implementation itself.

There is of course no silver bullet, but expressive type systems can protect you against unintended behavior and significantly reduce the number of unit tests you’d need to write.

BTW, buffer overflows etc. are a bit orthogonal to this, as they’re related to memory safety, not type safety.

Can a dependently typed list be used in the common case where the length of the list is not known at compile time?
Yes. Just like with generics, you can parametrize them; for example, in Idris you have the type `Vect n a`, a list of elements of type `a` with length `n`. An append function could look like this:

app : Vect n a -> Vect m a -> Vect (n + m) a
app Nil ys = ys
app (x :: xs) ys = x :: app xs ys

Source: https://www.idris-lang.org/example/

Yep – we can even do this today in Haskell without -XDependentTypes 🙂 using things like singletons
I sometimes phrase it as “yeah, your program is correct because it’s type-checked, but who has checked that your types are correct”?

With dependent types some of the logic will be coded in types, and we would need to check those.

Indeed, that’s why Dan North, Liz Keogh, et al have been banging the requirements capture drum for the last ten/twenty years.

Doesn’t mean it might not be useful.

At least now we are checking that two ways of saying what we want a program to do match up. It’s less likely that we get both the implementation and the specification (in the form of types) wrong. Whenever the program fails to type-check, it will make you think about both bugs in the type and bugs in the code, even if the latter is more likely.
Types can’t encode the specification fully, or they require a lot of work to encode a specification (so most people will skip this work).

The simplest example is very common banking and e-commerce logic which is basically a series of checks in the form:

if <X> is consistent with <Y>.

That’s the same logic as not writing tests because your tests might have bugs in them.
> “How do I capture this extremely complicated real world semantics in code?” and languages can’t really solve that

This isn’t true. It’s possible to write a symbolic differentiation system using, primarily, Scala’s type system. In an elegant fashion.

Similarly, Julia represents math better than, say, Python or Java. A vector type should not need to know about inner products of vectors, because the mathematical definition of a vector space does not.

This will have large effects on how easy it is to decompose your code into neat little pieces.

> but rather “How do I capture this extremely complicated real world semantics in code?”

I’m not sure that’s even it. It’s “I don’t really understand the requirements until I try to implement them and therefore my first 6 attempts are hopelessly wrong”.

The reason dynamic languages are so popular, I think, is that they make it so fast to iterate through the first 6 attempts, whereas statically typed languages make that 50% slower, and that generates a huge feeling of helplessness and frustration: you keep doing all this work to satisfy the compiler and then it’s thrown away.

Of course one can argue that all these people are stupid and should just understand the requirements in the first place, but it’s a separate point: that is reality – people find it hard to understand requirements.

But it is one reason I like more “incremental” languages like Groovy since I can iterate really fast in dynamic mode and then tighten up all the screws afterwards by converting it to statically typed code.

It’s probably “A Haskell program that compiles does not crash if you don’t use a specific set of functions”.

That could probably be true for a lot of languages, but for Haskell the set is much smaller.

Dependent types can let you achieve some degree of semantic correctness above that, either by refinement types or by proof. But it’s not generally applied because there are a lot of things we just can’t prove yet.

The point is that Haskell allows you to push more business logic to the proposition (type) level. Nobody actually thinks Haskell programs can’t include mistakes.

Dependent types aim to address exactly “How do I capture this extremely complicated real world semantics in code?”. Except now you have the option to capture some or all of those semantics in types instead of terms.

> The biggest challenge to software development is not the “How can I avoid silly mistakes?” but rather “How do I capture this extremely complicated real world semantics in code?” and languages can’t really solve that.

Probably a tangent but I feel like this is only partly true. DSLs are specifically built to do that for example. The languages can meet us half-way so to speak.

Really good point here:

> How do I capture this extremely complicated real world semantics in code?

This is why I keep using Python after having dabbled in other more interesting languages: the probability of writing correct code for me and the team I support is fairly high.

Yea…my first language that I got really comfortable in was Python (used Matlab, Assembly, C, Basic…etc in college). I’ve since then dabbled in: C++, Ada, Rust, Nim, D, Haskell, OCaml, F#, C#, Java, Clojure, Scala, Kotlin, Common Lisp, Racket, PicoLisp, Groovy, Powershell, Bash, TCL, Julia, APL, Forth, J, Rebol, VBA…the list goes on. I’ve only ever gotten shallow in these languages, but have always found something missing (I’m sure others have said the same about Python, but I wanted to add my anecdotal experience).

Out of all of those, I’ve pretty much stayed with Python as it keeps letting me get my job done with minimal code and maximum readability. I’ve been almost as efficient in Julia and have gradually picked up Powershell’s quirks and find them all good for scripting tasks where you don’t create too many pages of code.

Would I migrate with a large codebase? It depends on the industry of course, but if it is common line-of-business applications I’d be hard pressed to find something else. I was really hoping Kotlin could help here with its REPL/Scratchpad and being on the JVM, but I didn’t end up being too impressed (it might take more time).

I don’t want to contribute to a never-ending static-versus-dynamic argument. I guess I will say that Haskell, the JVM, and .NET just seem to have too much ceremony around them. I’m sure it’s very powerful, but there is a lot to learn that distracts from just writing code.

What about F*? It has dependent types, and is already in use in production in some places (there is a project making a proven TLS implementation with it, and it is quite advanced IIRC).

Also, it is somewhat based on F#, which is based on OCaml, so I would bet that performance-wise it can beat Haskell.

https://www.fstar-lang.org/

I would also say OCaml and F# style is also less foreign than Haskell to most newcomers. It’s almost like (type-hinted) python + match expressions + pipes.
I guess that the author didn’t consider them because neither of them is lazy by default; at the beginning he mentioned laziness as a major Haskell advantage.
There are a lot of exciting new things on the medium-term horizon for Haskell: -XDependentTypes, -XLinearTypes, and a new pluggable concurrent (latency-optimized) GC.
How come none of the mainstream static type systems support dependent types? On paper they seem like such a neat idea for defining constraints and preventing bugs. I heard Scala and Haskell have optional/partial support for them, so I suppose it takes a pretty sophisticated type system to begin with?
To implement dependent types, your type system must go much deeper into the theorem prover territory than is normal in mainstream imperative programming languages (whose type systems are often more or less ad-hoc rather than based on rigorous theory). Furthermore, dependent types break the fundamental property that the relationship between types and terms is one-directional: types constrain terms but not the other way around.

As one example, C++ has had non-type template parameters (types parameterized by values) for a long time, but only for compile-time constants. AFAIK there have been no formal proposals to introduce full-blown dependent types, rather the focus has been on improving the facilities for compile-time term-level computation.

People are still figuring out how to make things “easy” with them. The theory says you can prove cool stuff, but the syntax to do that while making it approachable is still something people work on. But that’s an outsider’s perspective.
The problem is that mainstream static type systems have two massive holes in them: nulls and pervasive mutability.

Knowing that a list has length n at compile time isn’t that useful if you’re expecting the program to change the list’s length at runtime. Or maybe there was never a list in the first place because it was null.

Dependent types are only recently becoming well known and popular.

The type systems in mainstream languages were not built with the kind of flexibility required by dependent types.

I spent a long winter understanding dependent types followed by a short summer forgetting everything.

Anyway, I’m here for the koolaid.

I’m curious which language/approach you took. I recently picked Idris with the book Type-Driven Development with Idris and so far I’m enjoying the ride (I have some experience in functional programming but not in Haskell).
Err… the problem with this is that the size of a list/array is sometimes determined at runtime. This means that if you use dependent types, you can’t determine whether a program is correctly typed until you run the code.
Let’s say we’re creating an array of strings separated by newlines obtained from reading a file. Call this arr[n] where n is determined at runtime.

Now let’s have String index(arr:int[], i:(Fin(arr):int)), where Fin(arr) is a dependent type that is an integer between zero and the length of the array. As the code is compiling, we have this:

int i = 24951;
String str = index(f, i);

The file exists in the future, after the compilation. If the future file has at least 24951+1 lines, then the compiler can verify that i < n and that i is correctly typed. But if the file is short, then the type of i is incorrect. So at compile time the type of i is indeterminate.

You get a compile-time error saying f isn’t known to have enough entries, and you resolve it by adding a decidable check that produces a proof that it does. (The check at runtime is just >.)
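Here is a hedged Haskell sketch of that workflow (every name here, Vec, Fin, SomeVec, fromList, toFin, index, is illustrative): the runtime-determined length is hidden behind an existential wrapper, and indexing requires a runtime check that either yields a well-typed, in-bounds index or fails.

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat

data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Fin n: a number statically known to be smaller than n.
data Fin (n :: Nat) where
  FZ :: Fin ('S n)
  FS :: Fin n -> Fin ('S n)

-- The length of data read from a file is unknown at compile time,
-- so it is hidden behind an existential wrapper.
data SomeVec a where
  SomeVec :: Vec n a -> SomeVec a

fromList :: [a] -> SomeVec a
fromList []       = SomeVec VNil
fromList (x : xs) = case fromList xs of
  SomeVec v -> SomeVec (VCons x v)

-- The decidable runtime check: either a proof-carrying index or Nothing.
toFin :: Int -> Vec n a -> Maybe (Fin n)
toFin _ VNil         = Nothing
toFin 0 (VCons _ _)  = Just FZ
toFin i (VCons _ xs) = FS <$> toFin (i - 1) xs

-- Total indexing: the type rules out out-of-bounds access, so there is
-- no bounds check (and no way to crash) here.
index :: Fin n -> Vec n a -> a
index FZ     (VCons x _)  = x
index (FS i) (VCons _ xs) = index i xs

If toFin comes back Nothing, the “file was too short” case has to be handled explicitly; there is no code path on which an out-of-bounds index ever reaches index.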
The solution is to just accept that types might change during runtime. Give up the assumption that types are fixed at compilation and that the compiler can catch all errors.

Computer programs exist inseparably from the operating system. The type of a circle might change to that of an ellipse. The size of a file can only be known at runtime. The only way the compiler can catch all errors is if no object is mutable and everything except for closed variables is allocated on the stack. And both of those preconditions are insane.

This is still superior to dynamically typed languages because it tells you exactly what the type problem is and where.

That’s not true.
Even if you don’t know the size (n) of something at compile time, you can always make sure, in one of the branches of your program, that some other number m is equal to n, even if they’re both unknown.
This the compiler can tell.

For instance, to produce an element x of type Fin n from an integer (n known only at runtime), the compiler can make sure that you produced a value x : Maybe (Fin n).
There are then only two possible branches in your program at this point, and the compiler enforces it (the Just (Fin n) case and the Nothing case).

Lisp is like a religion. Once you get it, you cannot help yourself spreading the Word.
> Haskell, at its core, is simple: it is just a polymorphic lambda calculus with lazy evaluation plus algebraic data types and type classes.

No, it’s not. If you can say this sentence without laughing at it, it is because you have developed a deep understanding of all the core concepts, meanings, and semantics, and most probably you now know how to exploit them for their appropriate usage to solve even more complex problems.

It’s simple as opposed to complex, not simple as opposed to difficult.
Three different areas of computer science, highly related to mathematics AND interrelated. Don’t you think these are not just difficult but also complex, since all their rules combine and compose?
I would put it like this: Haskell derives from (relatively) simple axioms/concepts that have a rich and complex interplay.
And isn’t such rich, complex interplay exactly what makes things complex? “Interplay” reads as a synonym for “complex” in that sentence. Any complex system could be broken down into simple axioms and some concepts and still be complex.

By the way, I got severely downvoted for an opinion on what is complex, and you were the only one open to talking. Thank you for making HN a bit better somehow.

“Any complex system could be broken down into simple axioms and some concepts.”

This is the crux; I don’t think that’s, generally speaking, true. I don’t think Java and C++ could be broken down into a few simple axioms and concepts in the way Haskell can.

Think design by committee, where you often end up with hundreds of concepts and behaviours.

Happy to talk.

Java can do it for sure (at least it was originally thought to be this way). With C++ I think you really found the exception that proves the rule, since it was built by adding features on top of features without end or direction.
I can agree that there are systems which behave with more elegance than others, with fewer exceptions and principles, more regular moving parts, and reciprocal rules or consistent approaches; for example, Java in its origins was sort of simple. Now it is more of a mess than ever before.
Yes, this quote got me cringing; these things are definitely not ‘simple’. The post’s title is also presumptuous. Haskell is a brilliantly elegant language because it captures abstractions at a high level (i.e. category theory). But this is also its weakness: these abstractions are inherently hard to understand because, well, they are so abstract. When writing most software in the real world we need to be pragmatic to be able to recruit people to actually write that software. So will Haskell take over all software engineering? I don’t think so. Fortunately we can, and already are, learning a lot from ideas introduced by Haskell.
They are simple, but not easy to learn.

Lambda calculus is simple. Incredibly simple. But trying to use lambda calculus to do anything with it is hard.
