Cheap and Secure Web Hosting Provider : See Now

# [Solved]: Safe way to explicitly define new types instead of using Algebraic data types for my functional language

, ,
Problem Detail:

Question:

As I'm working on a Hindley-Milner typed lambda calculus, in order to make it usable I need to add some types such as list and pairs. The way I currently do it is, I have an unsafe keyword that lets me explicitly set the type of a global function that tells the type checker to just trust the explicitly given type and ignore the term's type. Then I use Church encoding to encode my types, so for instance, I can create my Pair type by encoding it in a (a -> b -> x) -> x function like this:

unsafe pair : a -> b -> (Pair a b) = \a.\b.\f.(f a b) unsafe pairRead : (a -> b -> x) -> (Pair a b) -> x = \f.\p.(p f) 

Then I can define fst and snd like:

fst = (pairRead \a.\b.a) snd = (pairRead \a.\b.b) 

It seems that, since lambda calculus is Turing complete and I'm just wrapping lambda types with explicitly given types, I should be able to emulate any Algebraic data types, and more, in case ADTs have any limitations (seems like they do since there are also GATDs, which I guess I should also be able to emulate).

The obvious problem is that this is not type safe, which defeats the whole purpose of type checking, so I wonder if there is already a way to do this in a safe manner instead of attempting to recreate the wheel. I could also just implement ADTs but, if there's a way to make it safe, I'd like to use this better since it's closer to what I have already implemented, it's simpler (I won't need to also implement pattern matching) and it seems to be a superset of ADTs.

What I've done so far:

I'm pretty sure if I provide a way to encode an existing type into a custom type while keeping all the type variables, and a way to go back to the original type, then I will be safe.

For instance, I could add a keyword using that uses this syntax:

using [CustomTypeName] [FullListOfTypeVariablesInOriginalType] = [OriginalType] {     [FunctionsDefinitions] } 

Inside [FunctionsDefinitions] I'd define my functions and I'd have access to two other new keywords $to and $from, which would have the types of functions that go from the original type to the custom type, and the other way around, respectively, but would behave as an identity function during evaluation. So for instance, I would define my pair functions like this: (note that I'm including the x in the Pair definition):

using Pair a b x = (a -> b -> x) -> x {     pair = \a.\b.($to \f.(f a b)) pairRead = \f.\p.(($from p) f) } 

In this case, $to and $from would be equivalent to these functions (defined using the unsafe keyword I mentioned above):

unsafe $to : ((a -> b -> x) -> x) -> Pair a b x = \x.x unsafe$from : Pair a b x -> ((a -> b -> x) -> x) = \x.x 

So this should be safe since $from will always map back to the original type because the Pair type includes every type variable in the original type, but is still not good. Since I'm including x in the definition of Pair, my Pair type will include the return type of whatever function I pass to pairRead. So if at any point I read a pair p using pairRead with a function that returns a number, from that point on I could only read pair p using functions that return numbers, since the x in Pair would have become a number type. For instance, a function swap = \p.(pair (snd p) (fst p)) would require both types in the pair to be the same. So I'd like to be able to ditch the x in the original type to get Pair a b. So the question would be, when can a variable be safely ditched? I think this would be variables that are not bound to an outer type. In my definition of pair above, the type of \f.(f a b) is really$\forall x (a \rightarrow b \rightarrow x) \rightarrow x$with$a$and$b$bound to the outer type, so if I want to encode this type, I only need to store the types of$a$and$b$to be able to recreate the original type. I think what my type checker needs to do is basically (in the case of Pair), when it sees a $to keyword, first type check its term, then unify its type with (a -> b -> x) -> x (using a fresh copy of x) and finally make sure that no occurrence of x has been unified with any type that exists in the current type context. If that's all good, then return the type Pair a b with a and b taken from the resulting type.

But then again, I'm no mathematician and I don't have a proof of correctness, and somebody may have already came up with a better way to do this.

You're on the right track: people have come up with the same way to do this. The general concept is known as abstract types.

With the Church encoding, the type of a pair of elements of types $a$ and $b$ is polymorphic: it has the type $\forall x, \mathtt{Pair} \, a \, b \, x$ where $x$ is the type of the destructor's continuation. The type family you're making abstract is $\mathtt{Pair} \, a \, b$. You aren't "ditching" the $x$ — you're including it in the type.

Once you've defined a type alias called Pair and some functions whose type refers to this alias, you can "forget" or "hide" the fact that this is an alias. With the alias hidden, values of type Pair can only be constructed and destructed via the functions defined in the "module" or "package" where the alias is not hidden.

To show that this can be included in a type system without breaking anything, make the aliasing explicit: define

unsafe_pair : ∀a, ∀b, Pair a b → ∀x, ((a → b → x) → x) = identity unsafe_pairRead : ∀a, ∀b, (∀x, ((a → b → x) → x)) → Pair a b = identity 

In the place where the Pair alias is transparent, use those constants. You aren't using any kind of abstraction, so you're still using existing type system. In the place where the Pair alias is hidden, forbid the use of those constants. You've effectively created an abstract type without touching the type system.

Zero type theory lets you get started with abstract types, but type theory comes up if you want to model the concept of hiding. A module that declares an abstract type is saying "there's this type, and it has those properties (functions), but I'm not saying exactly what it is" — mathematically, this matches the concept of existential quantification, and indeed abstract types can be modeled with an existential quantifier.

pair_module = ∃ Pair, ((Pair a b → ∀x, ((a → b → x) → x)),                        (∀a, ∀b, (∀x, ((a → b → x) → x)) → Pair a b)) 

It may be difficult to get an intuition of what's going on here, because the pair abstract data type essentially allows any values to be constructed. I encourage you to work out types with constraints, for example an abstract type of pairs where the first element is less than the second, so the conversion function unsafe_increasing_pair is identical to unsafe_pair but the constructor for the IncreasingPair type compares its arguments and swaps them if necessary.

If you want to extend the basic concept to allow other types to depend on abstract types, then the type theory becomes more complex. You enter the realm of modules and functors, about which there is a rich literature, especially based on ML-like languages.

Here are some historical papers that I think are worth reading for you because you're thinking among the same lines. They're fairly readable even without much of a mathematical background since they didn't have much mathematical theory to build on.

• James H. Morris, Jr. Protection in programming languages. Commun. ACM, 16(1):15– 21, 1973. doi:10.1145/361932.361937 — PDF
First read his POPL '73 paper Types are not set, if you can get a copy.
• Barbara Liskov and Stephen Zilles. An Approach to Abstraction. Computation Structures Group Memo 88, MIT (project MAC), Sept. 1973. PDF
And for a broader retrospective A History of CLU.
• David MacQueen. Modules for Standard ML. In LFP '84: Proceedings of the 1984 ACM Symposium on LISP and functional programming, p. 198–207. doi:10.1145/800055.802036PDF
• John C. Mitchell and Gordon D. Plotkin. Abstract Types Have Existential Type. ACM Transactions on Programming Languages and Systems, Vol. 10, No. 3, July 1988, pp. 470–502. doi:10.1145/44501.45065PDF

As usual, Benjamin Pierce's Types and Programming Languages (TAPL) and the second book ATTAPL have relevant material: chapter 24 of TAPL about existential types and their use to model abstract types, and primarily Robert Harper's chapter on modules in ATTAPL.