> [And:] From my position as a bemused observer, your deliberations over the
relative complexity of Formal Grammars for generating strings have no
intellectual connection to PAS and its status as a fundamental ingredient
of human language. I am fairly certain that you have failed to understand
what claims I was making (as outlined in (1--3) above) and misunderstood
them to be claims about Formal Grammars for generating strings.
…The point being that *we* should now look at the grammars/languages in order to understand them properly, and forget all *that* for the time being; my sincere thanks to Logan for doing so :)
>> [Patrik:] With the grammar S ‒> aS, where a is in S, more plausible object sentences are:
> (((boy (accusative (topic)
> ((((house (inessive (accusative (topic)
> [Logan:] So far, I am getting confused by your unbalanced parentheses. For that
last example, did you mean something like
(boy (accusative (topic (house (inessive (accusative (topic)))))))
(((((((boy) accusative) topic) house) inessive) accusative) topic)
Sorry, my parentheses were all screwed up. Your last example has correct bracketing, but it should be two separate sentences. With no further notational conventions, utterances have to be fragmented for a linear presentation, where certain heading parts must be repeated in order to keep it all unambiguous. However, no repetition is necessary in a graph presentation, the same as for paaS.
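To make the corrected bracketing concrete, here is a minimal Python sketch that brackets each fragment separately, in the same left-nested style as your second example. The function name `left_nested` is made up for the illustration; it is not part of any FL1 definition.

```python
def left_nested(words):
    """Bracket a word list so that each later word wraps everything before it."""
    if not words:
        return ""
    bracketed = f"({words[0]})"
    for word in words[1:]:
        bracketed = f"({bracketed} {word})"
    return bracketed


# Two separate sentences; "accusative" and "topic" are repeated in the second.
print(left_nested(["boy", "accusative", "topic"]))
print(left_nested(["house", "inessive", "accusative", "topic"]))
```

This prints (((boy) accusative) topic) and ((((house) inessive) accusative) topic), which shows the repetition of the heading parts that the linear presentation forces.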
> And if not, could you explain your parenthesization convention, and
how it relates to the FL1 grammar?
You can show which part is in which, as in X is in Y vs. Y is in X. I actually added it to make it easier to read, but I suppose it was just more confusing. Maybe we're just not going to need it…
> Namely, the grammar S ‒> aS, where *a is in a role of S*, both the boy and the house are part of the object, which is one of the roles of (i.e. under) the topic. With this definition we can ditch nominative and accusative altogether and use subject and object roles instead:
> (((assault (verb (topic)
> (((boy (object (topic)
> ((((house (interior (object (topic)
> (((girl (subject (topic)
> I think this is looking a lot neater, check out:
> - assault is in a role of verb
> - boy is in a role of object
> - house is in a role of interior (in-location); and importantly, interior is in a role of the object (it follows logically that if you kick something which is included in the interior of a house, you kick the interior of that house; however the house is not a role of the object.)
> - girl is in a role of subject.
> - all are roles of topic. So the topic has many roles, which are syntactic roles (role1).
> Looking at it this way makes the ambiguity in the syntax-semantics
interpretation rule very, very clear. If we say that the meaning of an
'a' followed by the remainder of an 'S' is given by the assertion that
the referent of 'a' occupies *some* role of the denotation of 'S',
then two important questions remain unanswered:
> 1. Which role is being occupied? The denotation of 'S' could have any
number of open slots for different kinds of additional participants to
be inserted, and the grammar so far gives no way of distinguishing
*which one* is meant in any particular case.
The answer is basically really simple, but you might have to suggest a different wording because I'm having trouble making a distinction between two different types of genitive. X has "the role X" which is a role of/under/within Y. For instance, the topic (Y) can include a subject (X). The role which the subject has is that of the subject, and this role belongs to Y as a part of its structure.
In indexing you could say that X is a hyponym of Y, while Y is the hypernym of X. So there's a genitive both ways, but the hierarchy is unambiguously defined.
Another way to put it is that the function of X is to specify Y, and there may be many others.
> 2. What exactly is the denotation of an embedded 'S' that gets
projected up to the next level? I.e., 'a' occupies some role *of
what*? There are only a few possible options (since a semantic graph
only has so many types of features to pick from).
a) The denotation of an 'S' is a proposition with a truth value. That
must be the case at the top level (at least for declarative sentences),
but not for embedded structures, since it just doesn't make sense for
anything to occupy a role of a truth value. So, we know already that
there must be a difference in the interpretation rules for a top-level
structure and an embedded structure.
I think you'd normally look for truth values of the complex structures as a whole, but of course it could be possible to look at the details in a similar manner. If you take for example the phrase "girl kicks a cat" you could say that it's all false, or that the girl subject is true, the kick action is true, but the cat object is false - it's actually a chair. If you're thinking about the fragmented FL1 utterances, I guess you might want to say that each string of aS has its own truth value, but looking at the graph you'll be able to assess complex propositions.
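As a rough illustration of looking at the details rather than the whole, here is a small Python sketch. It assumes, purely for the example, that both the claim and the actual situation can be written as a mapping from roles to fillers; `role_truth` is a hypothetical helper, not part of FL1.

```python
def role_truth(claim, facts):
    """Map each claimed role to True/False by comparing its filler with the facts."""
    return {role: facts.get(role) == filler for role, filler in claim.items()}


# "Girl kicks a cat", when what she actually kicks is a chair.
claim = {"subject": "girl", "verb": "kick", "object": "cat"}
facts = {"subject": "girl", "verb": "kick", "object": "chair"}

per_role = role_truth(claim, facts)
print(per_role)                # truth value of each fragment
print(all(per_role.values()))  # truth value of the proposition as a whole
```

The per-role values correspond to the fragmented view, and the final conjunction corresponds to judging the complex proposition as a whole.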
> b) The denotation of an S is one of the arguments in the internal
semantic structure of the S (or a wrapper around such that provides
formal machinery for specifying how it links into higher structures).
That's again a trick question because remember that we can look at the sentences in a linear presentation, or think of them as a graph, like we would do with paaS. While both will ultimately give the same result, the intermediate processes may look different. In an aS structure there's always only one argument - or predicate - for an S, but in the graph there are an unlimited number of predicates and arguments within a proposition.
> c) The denotation of an S is a compound predicate or lambda
expression with open argument places to be filled.
Probably not this. Remember, they are endocentric compounds. It's basic linguistics, not CS.
> Options (b) and (c) could be collapsed into one, given the
parenthetical about wrapping arguments/entities in some larger
expression that would make them equivalent to compound predicates with
open argument slots. Since option (a) is nonsensical, the meaning of
an embedded S *must* be some version of (b) and/or (c). So now we can
ask: exactly how do you determine *which* possible compound predicate,
or which precise set of open argument positions, is exposed by an
embedded 'S' to be available for the 'a' in the next level up to fill?
> If the answers to these questions are "'S' exposes every slot you can
think of" and "'a' occupies whatever role you think makes sense", then
you have Gil's minimal IMA language.
I'm now taking a look at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.117.3628&rep=rep1&type=pdf
I could use some of it as a source, but it's not directly related. Unlike me, Gil theorises about language evolution. I think his work perhaps builds a bridge between a monocategorial bonobo-related language, which - some function words added - gradually builds up to something like Riau Malay.
I'm working with minimalistic formal grammars, and I find that agglutinative is simpler than isolating. In the article above Gil is by no means constructing a comprehensive IMA grammar. I think what he's arguing for is that nouns and verbs can form a common category of content words. I've used the same argument, so I can't say I disagree, but I think it's wisest to keep Gil out of this. I don't think FL has anything in particular to do with Riau.
FL1 is lexically monocategorial, but the aS structure has two syntactic categories which I will explain below.
> This seems to me to be precisely the *opposite* of your prior
formulation. In this case, it makes sense that the denotation of every
'S' is an entity, and every 'a' is a predicate with a well-defined set
of open argument slots that the denotation of 'S' could fill. This
still presents the same kinds of ambiguity as above, though- we have
no way of knowing *which* argument slot is intended (which role the
entity described by 'S' is intended to be in), nor which entity is
intended to be projected to the next level up. *Unless* all of that
knowledge is lexically encoded in the definition of every 'a'; but
that seems very unwieldy, and impractical for a usable human language.
In fact, there was a Speculative Grammarian article on that very
subject: http://specgram.com/SoLP/07.fasnacht.vov.html .
I ruled out such a possibility when I started ("must be fully expressive and learnable").
The rules of FL1 may allow many different ways to use the language, some of which may be ambiguous, others unambiguous. I think this is true about any language. For instance, if you make a logical proposition about two different cats, but do nothing to indicate which one is which, then it's either ambiguous or erroneous, depending on the view.
We should take it for granted that FL1 can be used ambiguously or erroneously, and narrow our question as to whether it can be used unambiguously. So let's only look for a method that can translate any given logical expression with the same precision. Absolute precision in FL1 is impracticable, because you can add specifiers to make an expression more precise, and you can add n specifiers to further specify each specifier. That means that the more specific you are, the more empty gaps there will be in the semantic structure. I call this vagueness as opposed to ambiguity. It's comparable to a requirement of having n specifying predicates for each argument in PL. So, instead of examining the absolute precision of a statement in FL1, I suggest we anchor the required precision to any given valid logical expression.
> This does bring up the possibility, however, that some of these
semantic selection rules *can* be lexically encoded. But if you manage
to resolve all of the ambiguities through lexically-specified rules
(which is what you seem to be going for with the special words for
"topic", "accusative", "inessive", etc.), then your super-simple
*syntax* isn't really all that impressive anymore.
Exactly, and that's why I abandoned the formulation you're talking about here. To recap, we're looking for just one formulation of FL1 that we can agree is unambiguous in parallel with any logical language - but also one that is not as tedious as the one you described.
Now that I've gained more understanding of the semantics, I suggest we go back to my previous method of generating FL1 sentences from FL2 trees. I think Daniel and I agreed previously that FL2 is unambiguous (it's a DCFG), and as you're not protesting what I said about the n to the power of n nodes, I take it you agree as well.
So, again, let's use a prepositional FL2 phrase and paste it into http://ironcreek.net/phpsyntaxtree/ and this time: don't uncheck color! We're going to use it. The English sentence is: "Girl kicks a boy because milk was stolen from the fridge." We have the intermediate sentence in FL2 (don't use this one!):
[event [subject girl] [object boy] [reason [object [origin [location kitchen] fridge] milk] steal] kick]
and the same expression with a different word order which gives a tree which is nicer to read from RIGHT to left (use this and uncheck auto subscript):
[event [reason [object [origin [location kitchen] fridge] milk] steal] [object boy] [subject girl] kick]
So you see it gives a neat parse tree with red content words and blue grammatical words. I'm still not 100 per cent sure about the correct semantic formula for FL1, but I know that the FL1 structure must be exactly as unambiguous as the corresponding FL2 sentence. If that's not enough, we can add further specifiers manually (verb?), but I wouldn't do that without a good reason because we might get ourselves confused with the semantics, as happened to me before. I think the wisest thing is to look at the tree and, *knowing* that it is unambiguous, try to understand *why*.
The FL1 sub sentences are (L-R, bottom-up):
kitchen location origin object reason event
fridge origin object reason event
milk object reason event
steal reason event
boy object event
girl subject event
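The step from the FL2 bracket string to these sub sentences can be sketched in Python. This is only my reading of the procedure, not a definitive algorithm: I assume the first word after each `[` is the structural label, every other word is a content word, and each content word is followed by the labels of its enclosing brackets from the innermost outwards. The names `tokenize` and `fl1_subsentences` are made up for the illustration.

```python
import re


def tokenize(text):
    """Split a bracket string into '[', ']' and plain words."""
    return re.findall(r"\[|\]|[^\[\]\s]+", text)


def fl1_subsentences(fl2):
    """Return one 'word label label ...' line per content word, left to right."""
    open_labels = []      # labels of the brackets currently open, outermost first
    lines = []
    expect_label = False  # True right after '[': the next word is a label
    for token in tokenize(fl2):
        if token == "[":
            expect_label = True
        elif token == "]":
            open_labels.pop()
        elif expect_label:
            open_labels.append(token)
            expect_label = False
        else:
            # Content word: follow it with its enclosing labels, innermost first.
            lines.append(" ".join([token] + open_labels[::-1]))
    return lines


FL2 = ("[event [reason [object [origin [location kitchen] fridge] milk] steal] "
       "[object boy] [subject girl] kick]")

for line in fl1_subsentences(FL2):
    print(line)
```

Under these assumptions the sketch reproduces the six lines above in the same order. It also yields `kick event` for the head word of the top-level bracket, which is not in the list above.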
So each a (red) in aS has content semantics, and each S (blue) in aS has structural semantics. This way we can distinguish between two kinds of subjects, for instance, one of which is a syntactic role, and one that is an actual thing we are talking about. Consider the English sentence "An object steals a subject from the event":
[event [object subject] [subject object] [from event] steal]
or, if you like:
[event [object [from event] subject] [subject object] steal]
Of course we could try to establish two distinct lexical-semantic categories for a distinction between content words (e.g. subject) vs. meta-words (e.g. nominative), but as it happens, there are two syntactic elements, a and S, so we can define the semantics for the grammar in such a way that we can use the same word for both, and yet have different contextual semantics for them. (Such as: "a is an a in the syntactic structure of S"; while the meaning of each word is defined in the dictionary.)
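Here is a small Python sketch of how position alone can separate the two uses of the same word. It assumes, as in the coloured tree drawing, that the first word after each opening bracket is structural (S, blue) and every other word is content (a, red). The function name `tag_words` is hypothetical.

```python
import re


def tag_words(fl2):
    """Tag each word as 'S' (structural label) or 'a' (content) by position only."""
    tagged = []
    after_open = False  # True right after '[': the next word is a label
    for token in re.findall(r"\[|\]|[^\[\]\s]+", fl2):
        if token == "[":
            after_open = True
        elif token == "]":
            after_open = False
        else:
            tagged.append((token, "S" if after_open else "a"))
            after_open = False
    return tagged


sentence = "[event [object subject] [subject object] [from event] steal]"
for word, tag in tag_words(sentence):
    print(word, tag)
```

In this example "subject", "object" and "event" each appear once tagged S and once tagged a, without any second lexical category being needed.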
> In that case, you
haven't made an exceptionally simple language- you've just stuffed all
of the complexity into the dictionary instead of the syntax rules. And
if you find that there are a relatively small number of classes of
lexical elements that tend to share similar structure in their
selection rules, then it no longer really makes sense to call it a
monocategorial language- it then has multiple practical parts of
speech distinguished by their selection rules, and it has an
effectively more complicated syntax restricted by the ways in which it
"makes sense" to combine words with different selection classes.
But as you can see above, this is not true for S ‒> aS; and that's the reason I suggested you define values likewise for paa and S in paaS. I mean, be careful not to say something that will make your grammar look bad, actually… ;)
>> In other words paaS has one infinite branch. This is modest in comparison to FL2 which allows n items on level 2, each of which have n subitems on level 3 (i.e. n to the power of n). On level 4 each of the n subitems have n subitems of their own etc. (i.e. n to the power of n to the power of n to the power of…).
> Yes. It's just a straight linear list, with no internal branching or
non-trivial recursion at all.
Complex recursion is not trivial when you work with complex sentences. Take the surface structure "Girl kicks boy" and add an argument for each constituent, e.g. "Girl from town, kicks in school, boy from city". Then you add an argument for each, e.g. "girl in hallway", "town in South", "kicks at 12", "school by lake", "boy in shirt", "city in North" etc. etc. etc. PaaS lacks the expressive power that FL2 has, so I'd rather consider it in parallel with FL1.
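To get a feel for how quickly such a branching structure grows, here is a quick Python tabulation. It assumes, for simplicity, that every item takes exactly n subitems, which is my simplification since real sentences will be uneven. With the root counted as level 1, level k then holds n^(k-1) items. The function name `items_per_level` is made up for the example.

```python
def items_per_level(n, depth):
    """Items on levels 1..depth of a tree in which every item has n subitems."""
    return [n ** level for level in range(depth)]


counts = items_per_level(3, 4)
print(counts)       # items on levels 1 to 4 when n = 3
print(sum(counts))  # total number of items down to level 4
```

A single-branch structure, by contrast, adds only a constant number of items per level however deep it goes.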
>> So we're talking about two formal grammars: FL2 which generates endlessly complex trees, and paaS which only generates a very limited tree. That's why I suggest it is best interpreted as a linear FL1 type language with the grammar S -> aaaS, where the proper parse tree can be drawn based on a morphological analysis, rather than a nonlinear FL2 language for which complex parse trees are generated by conventional formal means. How would you address this?
> I would say that a monocategorial reinterpretation of S -> paaS as S
-> aaaS ignores the wildly different lexical properties of predicate
words vs. argument words which justify putting them in different
classes, and in doing so ends up allowing an infinite number of
sentences that, while maintaining the restriction that their word
count be a multiple of three, cannot be assigned any meaning by the
language's interpretation rules. Not just "no sensible meaning" but
"no meaning whatsoever"- sentences for which no semantic graph can be
constructed. It thus makes no sense to claim that that is, in fact,
the grammar of this language.
I did, however, suggest an ad hoc marking for each syntactic function (predicate, Davidsonian argument and ordinary argument), whereby you can have aaaS. If not, I would suggest abcS, but of course all three grammars may be considered, abbS being an intermediate form.
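The three candidate grammars can be compared with a small Python recognizer. This is a sketch under my own assumptions: each word has already been reduced to a single category letter (by marking or morphological analysis), and each grammar has an empty base case, so that a sentence is zero or more copies of the pattern. The function name `accepts` is hypothetical.

```python
def accepts(pattern, categories):
    """True if `categories` is zero or more back-to-back copies of `pattern`."""
    size = len(pattern)
    if len(categories) % size != 0:
        return False
    return all(categories[i:i + size] == pattern
               for i in range(0, len(categories), size))


# S -> aaaS: one category, so only the multiple-of-three length matters.
print(accepts("aaa", "aaaaaa"))
print(accepts("aaa", "aaaa"))

# S -> abcS and S -> abbS: the order of categories matters as well.
print(accepts("abc", "abcabc"))
print(accepts("abc", "acbabc"))
print(accepts("abb", "abbabb"))
```

With a single category the check collapses to counting words, whereas abcS and abbS push the distinction between the functions into the category assignment.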
> *Unless*, of course, you re-interpret 'a's not as separate
argument-words, but as morphological inflections on a 'p', which
should not be reflected at the level of syntax. In which case, the
multiple-of-three restriction is no longer valid, and the full syntax
of the language becomes simply S -> aS | 0; i.e., any list of words
will do. The *interpretation rules* for the syntax of this language
would be, however, *very* different from the interpretation rules for
True, and a morphological parsing would be required, as for the agglutinative S ‒> a.