I came across this idea while reading https://arxiv.org/pdf/1608.04112v4.pdf and playing with an activity-focused natural language grammar.
What I was first thinking about was how to measure the probability of a sentence in natural language. Then I thought about measuring the probability of a set of sentences, and realized this could be used for problem solving.
The problem is that you need a general way to define a set of sentences. This could be done through a constraint DSL, which defines constraints for a language described by a meta grammar.
Examples of constraints:
- A floating-point number between 0 and 1
- A positive integer
- Any item from a given list
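As a rough sketch of what such constraints could look like in code (a Python sketch with hypothetical names, not an existing DSL), each constraint can both describe and sample a value:

```python
import random
from dataclasses import dataclass

@dataclass
class FloatRange:
    """A floating-point number in [low, high), e.g. between 0 and 1."""
    low: float
    high: float
    def sample(self):
        return random.uniform(self.low, self.high)

@dataclass
class PositiveInt:
    """A positive integer up to some bound."""
    max_value: int = 1_000_000
    def sample(self):
        return random.randint(1, self.max_value)

@dataclass
class Choice:
    """Any item from a given list."""
    items: list
    def sample(self):
        return random.choice(self.items)

print(FloatRange(0.0, 1.0).sample(), PositiveInt().sample(), Choice(["red", "green"]).sample())
```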
For example:
0 doc = ["(" .$:"x" ", " .$:"y" ")"]
This parses a single line containing a 2D coordinate, e.g. (2.4, 5.2).
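To make that concrete, here is a minimal hand-rolled equivalent in Python (a plain regex stand-in, not the actual meta grammar machinery) that parses such a line and returns its number slots by name:

```python
import re

COORD = re.compile(r"\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)")

def parse_doc(line):
    """Parse a line like "(2.4, 5.2)" into its named number slots."""
    m = COORD.fullmatch(line.strip())
    if m is None:
        raise ValueError(f"not a 2D coordinate: {line!r}")
    return {"x": float(m.group(1)), "y": float(m.group(2))}

print(parse_doc("(2.4, 5.2)"))  # {'x': 2.4, 'y': 5.2}
```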
The self syntax of the meta grammar can be used to parse the grammar above itself. From this you can learn that the rule doc contains a number x and a number y. So, you can create a document that lists all “variable slots” of the language, and this document can be modified manually or automatically to control the constraints of the generated text.
It could look something like this:
doc.x: [0, 1)
doc.y: (-1, 1)
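A small sketch of how such a constraint document could be read, assuming only the interval notation shown above, where a square bracket includes the endpoint and a parenthesis excludes it:

```python
import re

LINE = re.compile(r"(\S+)\s*:\s*([\[\(])\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*([\]\)])")

def parse_constraints(text):
    """Map slot names like "doc.x" to (low, high, low_inclusive, high_inclusive)."""
    constraints = {}
    for line in text.strip().splitlines():
        m = LINE.fullmatch(line.strip())
        if m is None:
            raise ValueError(f"bad constraint line: {line!r}")
        slot, lo_bracket, lo, hi, hi_bracket = m.groups()
        constraints[slot] = (float(lo), float(hi), lo_bracket == "[", hi_bracket == "]")
    return constraints

print(parse_constraints("doc.x: [0, 1)\ndoc.y: (-1, 1)"))
# {'doc.x': (0.0, 1.0, True, False), 'doc.y': (-1.0, 1.0, False, False)}
```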
A generator takes the meta grammar and the constraints as input, and outputs random text that satisfies both the grammar and the constraints.
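Here is a minimal sketch of such a generator, hard-coded for the doc rule above rather than driven by the meta grammar, using the constraint representation from the previous snippet (and ignoring the open/closed endpoint distinction for simplicity):

```python
import random

def generate(constraints):
    """Sample one text that fits the doc rule and the given number constraints."""
    x = random.uniform(*constraints["doc.x"][:2])
    y = random.uniform(*constraints["doc.y"][:2])
    return f"({x:.2f}, {y:.2f})"

print(generate({"doc.x": (0.0, 1.0, True, False), "doc.y": (-1.0, 1.0, False, False)}))
# e.g. "(0.47, -0.33)"
```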
Machine learning
One idea is to combine this with machine learning techniques, for example GANs (Generative Adversarial Networks).
(choose constraints likely to generate text with bad predictions)
      |                                                     ^
      v                                                     |
constraints + meta grammar                                  |
      |                                                     |
      v                                                     |
generator -> text -> test -> reward ------------------------+
               |                 |
               v                 v
           networks -> predicted reward -> update networks
- Network 1 predicts the expected reward using the text as input. To parse the text, it uses the meta grammar.
- Network 2 updates the numbers in the constraints, trying to find combinations that lead to bad predictions from Network 1.
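A rough sketch of this loop, with simple stand-ins for the two networks (a running-average reward predictor for Network 1 and a hill-climbing constraint proposer for Network 2), and with constraints as plain (low, high) pairs; the test function, names, and update rules here are all assumptions chosen just to make the loop runnable:

```python
import random

def generate(constraints):
    """Sample one doc text satisfying the given (low, high) constraints."""
    x = random.uniform(*constraints["doc.x"])
    y = random.uniform(*constraints["doc.y"])
    return f"({x:.2f}, {y:.2f})"

def test(text):
    """Stand-in task: reward 1.0 if the point lies inside the unit circle, else 0.0."""
    x, y = (float(v) for v in text.strip("()").split(","))
    return 1.0 if x * x + y * y <= 1.0 else 0.0

class RewardPredictor:
    """Network 1 stand-in: predicts the expected reward for a text."""
    def __init__(self):
        self.mean = 0.5
    def predict(self, text):
        return self.mean          # a real network would parse the text via the meta grammar
    def update(self, reward, lr=0.1):
        self.mean += lr * (reward - self.mean)

class ConstraintProposer:
    """Network 2 stand-in: mutates constraints, keeping ones that fooled Network 1."""
    def __init__(self):
        self.best = {"doc.x": (0.0, 1.0), "doc.y": (-1.0, 1.0)}
        self.best_error = 0.0
    def propose(self):
        # Randomly shift the endpoints of the best interval found so far.
        return {k: tuple(sorted((lo + random.uniform(-0.2, 0.2),
                                 hi + random.uniform(-0.2, 0.2))))
                for k, (lo, hi) in self.best.items()}
    def feedback(self, constraints, error):
        if error > self.best_error:
            self.best, self.best_error = constraints, error

predictor, proposer = RewardPredictor(), ConstraintProposer()
for step in range(200):
    constraints = proposer.propose()
    text = generate(constraints)
    reward = test(text)
    error = abs(reward - predictor.predict(text))  # how badly Network 1 predicted
    predictor.update(reward)                       # Network 1 learns from the true reward
    proposer.feedback(constraints, error)          # Network 2 chases large prediction errors
```

In a real version both stand-ins would be neural networks trained by gradient descent, but the shape of the loop is the same: the generator turns constraints into text, the test scores it, Network 1 learns to predict the score, and Network 2 steers the constraints toward text that Network 1 predicts badly.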