Instead of giving you an example that anybody actually uses, I'm going to tell you about a cool idea I've been reading about that hasn't gotten much actual use.
The basic idea is to use a generalization of pattern matching. Languages like ML and Haskell support pattern matching, but in rather limited ways. Crucially, patterns are not first-class citizens of the language. (For Haskell, at least, there are some libraries to remedy this, but I don't know how effective they are.)
So how can we generalize pattern matching to help you solve your book problem? Normal patterns allow you to match data types in the form C x₁ x₂... where C is some constructor and x₁ x₂... are either matchable symbols or arbitrary patterns. An example of a pattern would be Chapter (Cons (Section content) rest). We differentiate between the matchable symbols and the constructors on case: lowercase means matchable, uppercase means constructor. This is somewhat limited: you cannot easily write code that is generic over the constructor at the head of the pattern. You could write a function that counts the sections in a chapter, but you could not write a function that counts the sections in anything.
So let's relax the restriction that patterns have to be headed by a constructor. We can now have patterns in the form x y. These are static patterns: you can match data against them without evaluating the pattern. With this, we can imagine writing a function to count sections generically:
count_sections x = 0 -- If this is some terminal, it cannot be a section
count_sections (Section content) = 1 + count_sections content
count_sections (x rest) = count_sections rest
This goes through the entire data type you passed in and counts all the sections it sees. It assumes sections can be nested. This will let you count the sections in a Book or a Chapter or a Series or whatever you want.
So, this is generic over the data you pass in. However, if you wanted a function to count Chapters or Sentences or what have you, you would be forced to write it. This calls for another extension to pattern matching: dynamic patterns. Patterns are now in the form x ŷ where x is a variable and ŷ is a matchable symbol. Constructors are still uppercase, so Section is a constructor and not a variable.
A variable in a pattern can be instantiated with another pattern. So now we can write a truly generic count function:
So now if you want to count chapters in your book, you would just invoke count Chapter book. If you want to count sections in your chapter? Easy: count Section chapter.
You can also use patterns and constructors for polymorphism by overloading functions on different constructors. One interesting idea is to allow adding cases to functions after their definition. This way you could have an existing toString function and then, when you've defined a book, add a new case to it:
toString += | Book title content -> "Book: " ++ title
This way you can have a toString function that works on any type of data.
All my examples are obviously in pseudocode. (And hey, it looks nothing like Python! The whole "runnable pseudocode mantra annoys me.) I haven't covered all the edge-cases, and I haven't even begun talking about a type system for this mess. Happily, there's somebody who has, and wrote a book about it (that's where I got all these ideas): Pattern Calculus by Barry Jay[1].
I'm also not sure whether this is the best possible approach. However, I think it's certainly very neat. If you like this idea, the book is definitely worth a look.
I started reading this comment and immediately knew you were talking about Pattern Calculus. Awesome to see someone else interested in the idea.
I live a couple of blocks over from UTS and went to a programming language event there. I only found out about pattern calculus after talking to Barry Jay. Turns out UTS students have been working on an implementation, called Bondi:
Hmmm, this might be kind of similar to what the Demeter system actually did. With Demeter, you would provide some sort of machine readable descriptions for your data structures. Once you implemented this description, you could ask Demeter to fetch you all sections of a book, or all sentences of it, or what have you. If you were to then change the representation of the book, you would't have to change any of the client code--you'd only have to change the machine-readable description of the data structure. Then when client code asked Demeter to fetch all the sentences, everything would still work fine.
Maybe the best solution is to just resuscitate Demeter! Though apparently nobody actually used it either.
This is the first time I've encountered this and what you wrote packs a lot of ideas in a small space so forgive me if I have misunderstood.
What you write seems like an even more powerful version of the Active Patterns in F#, which are already really powerful. Active Patterns are the closest thing to First class patterns in a production language * .
Haskell has a similar thing in views. But I think another concept, unique to and playing better to its strengths, while attacking this level of problem are Generalized Algebraic Datatypes. I mentions GADTs because it deals specifically with "This is somewhat limited: you cannot easily write code that is generic over the constructor at the head of the pattern. " Like I said, I've never heard of the pattern calculus - is there any problem it solves that could not be solved with GADTs with a similar level of effort?
* certainly biased but I see active pattern use more than I see extractors or views
I'm afraid that I'm not too familiar with either active patterns or GADTs at the moment. However, given the understanding that I do have, I think they are both unrelated to the basic dynamic pattern matching that I was talking about.
As far as I can tell, active patterns only affect the constructor. They let a constructor like Even to run an arbitrary function when it's matched. This lets you customize what a constructor actually means, which seems very useful, but is orthogonal to dynamic patterns. Dynamic patterns affect the patterns and not the constructors, where a pattern is the expression that is actually matched. That is, given a function f (Foo a b) the pattern is Foo a b. Active patterns would let me customize how Foo works, but the full pattern (Foo a b) still looks and acts the same.
Given this meaning of "pattern", I think active patterns do not constitute first-class patterns. You can't take a pattern and pass it into a function or make up a pattern of variables that can be any pattern. With first-class patterns, you should be able to take the pattern (Foo a b) and pass it into a function as a pattern. So a function like
match pattern (pattern x̂) = x
would make sense in a system with first-class patterns. Here, a pattern is an argument to a function and then used to match against the next argument. Hopefully this clarifies the particular thing that dynamic patterns add to the language.
GADTs are also unrelated. For one, GADTs only affect the type of a constructor and not its behavior. I did not talk about types in my post at all--everything I talked about would make sense in a dynamically typed language that had pattern matching. It just so happens that dynamically typed languages tend not to have pattern matching like this, so it's associated with statically typed languages.
Also, GADTs, like active patterns, only affect the constructor. They do not change the actual patterns used to match against values (except by restricting possible types of the matched values).
So really both active patterns and GADTs are somewhat orthogonal to dynamic patterns. When I said a code that is generic over the head of the pattern, I meant a function like:
doFoo (a b) = b
where doFoo will match both Foo x and Bar x and Baz x and extract x in every case. This function is generic over the constructor in the sense that it matches regardless of what the constructor actually is--the constructor becomes just another matchable term.
This sort of function, which just uses static patterns, is already something that a GADT cannot do. Dynamic patterns are even more flexible. My match function above lets you take an arbitrary pattern like (Foo a b) and match (Foo a b x) against some value to get x. You're passing a pattern into a function which then uses that pattern to match against its next argument. This feature has nothing to do with the type of the constructor.
I hope this cleared things up for you. However, it's almost 4 in the morning and I'm not entirely coherent. If you're still confused, or if I made some glaring errors, feel free to email me about this when I'm more awake :P.
Excellent! Thank you, yes I am clearer, the concept is even more awesome than I thought .
>I think active patterns do not constitute first-class patterns.
I agree, but like I said, they are the closest thing in active use, they extend the type of structures one can deconstruct to encapsulated ones while being more flexible in implementation. They fall in the class of techniques which extend and make matching more flexible, other examples would be multimethods and predicate dispatch. So very much orthogonal to approaches in pattern calculi but I think in a subspace in terms of matching.
>GADTs are also unrelated. For one, GADTs only affect the type of a constructor
Yes they are unrelated that's why I focused on expressitivity, and you are right, I misconstrued what you meant by generalizes on constructors. But everything has a cost - runtime, learning, cognitive overhead - what I was curious of is: how much does the pattern calculus buy you? GADTs are another concept that allow one to elegantly tackle problems where it looks like the pattern calculus would help. I was just looking for an example where something like GADTs fumbles in comparison.
It's clearer to me now that the pattern calculus is definitely more flexible in terms of match semantics but I can't think of any situation in practice where this would give an added advantage.
Also, Is this typically done via term rewriting and is there anything on the decidability of this?
I think the main advantage is what I pointed out before--being able write functions that work on a wide variety of data types. I think my count example is something that could not be done with GADTS:
Keep in mind that this is pseudocode, so some of the semantics may not be perfect. The basic idea is this: the first argument count takes is a pattern, like a constructor. The second argument is matched against the dynamic pattern (constructor x̂). This pattern has one matchable variable x̂ and one free variable constructor. This free variable is bound in the previous argument to the function.
So this lets you use the count function to count any sort of sub-pattern in its argument. Let's imagine you have some hierarchy like (Book (Cons (Chapter stuff) (Cons (Chapter stuff) nil))). You could use the count function to count the number of chapters like this:
count Chapter book
Basically, you're parametrizing your count function on the pattern you want to count. So count can work on any data type and any pattern. If you wanted to count sections, you could:
count Section book
Now lets pretend that sections have numbers. Instead of (Section content) we have (Section number content). You can now count something like how many sections have the number 5:
count (Section 5) book
I think you could generalize this even further, but it would be more complicated.
I'm not sure how this is typically implemented, largely because it typically isn't :P. I think there's essentially one implementation in a language called bondi[1] but I don't know how it is implemented.
As for decidability, I'm not sure. The entire pattern calculus--that is, an extension of lambda calculus supporting this sort of pattern matching--is Turing-complete. I don't know if just the pattern matching part is undecidable though.
As I said in my original post, there is a nice book on this called Pattern Calculus. I think you can read it online[2], which is nice because it turns out to be fairly expensive on Amazon.
Thanks for an excellent write up to the idea. That was very clear.
I am very intrigued and was looking at purchasing that book to learn more - but then I saw the price. I'll have to scrounge a copy from a library somewhere. I thought on demand printing was going to reduce the costs of books on the long tail!
Oh yeah, it's more expensive than I thought :/. Happily, I was lent it by a friend.
You might have some luck getting it as an ebook from the library. My university's library seems to only have it as an "electronic resource", and it's a pretty big library.
I like the book, but a good part of it is just background. It goes through a bunch of variations on the lambda calculus in building up to the "pattern calculus". If you're already familiar with the basics, you might be better off just reading some papers on it. (Hopefully there are some you can read for free, but I'm not sure.)
TBH, looks like the only hard bit here is fiting static typing into your system. I believe the dynamic languages with pattern matching facilities (erlang, echeme) should already be able to pass any symbol they want for the constructor name.
Yes, they can use any symbol. But I don't think they can match against a constructor (that is, write a pattern like (x y) where x can match any constructor).
Also, I don't think they can have dynamic patterns. That is, you can't take part of a pattern, pass it into another pattern and use the resulting combination of patterns to match against some value.
These two things are the interesting generalization of pattern matching that I was talking about.
The basic idea is to use a generalization of pattern matching. Languages like ML and Haskell support pattern matching, but in rather limited ways. Crucially, patterns are not first-class citizens of the language. (For Haskell, at least, there are some libraries to remedy this, but I don't know how effective they are.)
So how can we generalize pattern matching to help you solve your book problem? Normal patterns allow you to match data types in the form C x₁ x₂... where C is some constructor and x₁ x₂... are either matchable symbols or arbitrary patterns. An example of a pattern would be Chapter (Cons (Section content) rest). We differentiate between the matchable symbols and the constructors on case: lowercase means matchable, uppercase means constructor. This is somewhat limited: you cannot easily write code that is generic over the constructor at the head of the pattern. You could write a function that counts the sections in a chapter, but you could not write a function that counts the sections in anything.
So let's relax the restriction that patterns have to be headed by a constructor. We can now have patterns in the form x y. These are static patterns: you can match data against them without evaluating the pattern. With this, we can imagine writing a function to count sections generically:
This goes through the entire data type you passed in and counts all the sections it sees. It assumes sections can be nested. This will let you count the sections in a Book or a Chapter or a Series or whatever you want.So, this is generic over the data you pass in. However, if you wanted a function to count Chapters or Sentences or what have you, you would be forced to write it. This calls for another extension to pattern matching: dynamic patterns. Patterns are now in the form x ŷ where x is a variable and ŷ is a matchable symbol. Constructors are still uppercase, so Section is a constructor and not a variable.
A variable in a pattern can be instantiated with another pattern. So now we can write a truly generic count function:
So now if you want to count chapters in your book, you would just invoke count Chapter book. If you want to count sections in your chapter? Easy: count Section chapter.You can also use patterns and constructors for polymorphism by overloading functions on different constructors. One interesting idea is to allow adding cases to functions after their definition. This way you could have an existing toString function and then, when you've defined a book, add a new case to it:
This way you can have a toString function that works on any type of data.All my examples are obviously in pseudocode. (And hey, it looks nothing like Python! The whole "runnable pseudocode mantra annoys me.) I haven't covered all the edge-cases, and I haven't even begun talking about a type system for this mess. Happily, there's somebody who has, and wrote a book about it (that's where I got all these ideas): Pattern Calculus by Barry Jay[1].
[1]: http://www.amazon.com/Pattern-Calculus-Computing-Functions-S...
I'm also not sure whether this is the best possible approach. However, I think it's certainly very neat. If you like this idea, the book is definitely worth a look.