TODO list for v6.pm, Pugs-Compiler-Perl6 - precedence problem: TimToady: S03 says .<> is tighter than $ - v6.pm translates $$foo to ${ $foo->{qw(bar)} } S03.pod:As with Perl 5, however, C<$$foo[bar]> parses as C<( $($foo) )[bar]> - Rule and Pod grammar from Pugs::Compiler::Rule still need work to compile properly (simplified header, unnecessary use of Data::Bind, wrong compile-time capture count) - 'my A $x' parses to 'my A; $x' ('A' is a Type), reported by ajs - rewrite Term::substitution() using token - this is giving a wrong result: perl -e 'use v6-alpha' - ' grammar E { regex ab { a*b } }; "accaaab"~~ //; $/.perl.say ' - merge in compile-time objects - anon classes, roles - merge http://nothingmuch.woobling.org/MO - merge PIL-Run runtime (lazy lists, junctions) - fix perl5 calling conventions ('use CGI;', etc) - semicolons in array slice [1;2] - 'sub', 'method' are functions - replace 'Regex' matching with 'Token' - implement hash and array - implement statement and operation classes in the grammar; implement macros (see lrep for an existing implementation). - block parsing: t/ref.t: if $string.ref eq Str { say "ok 1 # TODO" } for map {},@x { .say } pre-release - jun/2006 - cache compiled YAPP and Rule code, use Module::Compile instead of Cache::Cache ? - write POD - v6.pm - Pugs::Compiler::Perl6->compile - Pugs::Grammar::Perl6->parse - add an option to emit ast Possible Backend Modules - Moose, Moose::Meta - Class::Multimethods::Pure (older TODO - needs a revision) Priorities - Things that are not giving error messages: (1 if 1) # 'if' is an op - I think this is now legal sub sub xxx ... # 'multi' and 'sub' are parsed the same way - workarounds: - listop; -- say; - $obj.op() -- when 'op' is a declared operator (not plain bareword) - prefix/infix:<&> vs. &code - /rule/ is a term - merge Pod.pm into Perl6.pm - check that these special cases are covered: moose=>1 moose: moose:{antler()} - transform hash collisions into alternations example: short name of prefix:<&> (Prefix.pm) vs. sigil in &name (Var.pm) - missing syntax elsif TODO - implement ordered testing of categories <%a|%b|%c> depending on the parser state - parse the op table from http://svn.perl.org/parrot/trunk/languages/perl6/lib/grammar_optok.pge - use the grammar from http://svn.perl.org/parrot/trunk/languages/perl6/lib/grammar_rules.pge - macro - dynamic grammar - unify syntax in Operator.pm: operators with the same precedence, fixity, and associativity subroutines sub/multi - Implement compile-time dynamic 'add_token' (lexical at run-time) - Implement double-quoted-string split on variables (interpolation) - Implement /rule/ and '/' - see expect... in Expression.pm TimToady: is the lexer the right place to make the '<'/'' distinction? (it is not user-modifyable) fglock: that depends on what you count as part of the lexer. The bottom-up parser knows when it's looking for a <%infix> vs <%prefix> vs <%postfix>, so only those tokens are active that would be valid at that spot. (I oversimplify the hash stuff slightly there.) It's more like: <%infix> vs <%prefix|%term|%circumfix> vs <%postfix|%postcircumfix> assuming we adopt the new <%a|%b|%c> notation to combine longest-token processing of multiple hashes. fglock: for speed one could cache all the hash keys for all the hashes in a trie or some similar structure. Just have to be careful that longest key wins regardless of hash, and in case of tie first hash wins. 'course you have to recalculate if any of the hashes is modified... can probably treat alphanumeric sub names specially so that you don't have to recalculate on every sub declaration. if you assume that no "foo" prefix operator or term can match if the next char is alphanumeric. maybe just run the prescanned identifier down a different trie than the non-alpha ops. actually, if you know the length then the ident one doesn't need a tree. Just a hash would work. since you know its length already. TimToady: what if both postcircumfix and infix are expected? then the op is chosen based on if there is whitespace or not? like in %ENV vs. %ENV <... <%postcircumfix|%infix> is what you look for before whitespace, and <%infix> after. that's why we completely outlawed whitespace before postfix. hmm, that doesn't quite work. I think at postfix location you actually look for <%postfix|%postcircumfix>|<%infix> becuase you don't want the %infix participating in longest token there. $x<=2 is an error, but $x <= 2 is okay. or looking at it in terms of whitespace, if you don't get any match on a postfix, then you can pretend there was whitespace even if there wasn't, and try %infix. I think I'll need to do some tests ... - how about /rule/ vs. division? is it just that rule is a term and division is an op? yeah, that's just simple term vs op expectation. just as in P5. It's really only the postfix category that's new to P6 TimToady: is specifying an 'end token' parameter the way to go in interfacing the rule-based statement parser to the bottom-up parser? "an" end token is probably limiting it unnecessarily. A set of end tokens is more like it. But a match can match a set of tokens, so passing in an end match is probably sufficient. The real problem with lvalue substr is that Perl 5 fakes it without a real COW engine underneath. are the things inside brackets parsed using subrules? like in (@a;@b); - so that the 'end' is not matched in the wrong place As a basic rule of thumb, if you have brackets, you probably want a subrule though it's possible to do with op prec if you fiddle with infinities in the relative precedence inside and outside. but it's difficult to detect the (@a;@b] error without knowing the "right" terminator. is sub 'circumfix:( ]' a valid spec? It's sort of like the difference between which things you need to use recursive regex on vs which things you can match with a flat regex (in P5 terms). syntactically, yes, that's valid. You're nuts if you do it though. And you'll likely get a warning anyway if you define two circumfixes with the same left side. s/two/a second/ but there's nothing says a circumfix can't be - Implement [op]@list - Implement expressions inside names - like: prefix:{'+'} - Make the tokenizer match eagerly (faster) - Implement the "magic hash" dispatcher TimToady on #perl6 - xxx:<+> has to be considered just a funny looking name. It's the grammar's responsibility (somehow) to pull in any existing xxx and newly created xxx and combine them into any rules or %hash that references them. supposing a grammatical category shows up in %xxx, then we need two ways to deal with it. first, if we want one category to hide another, you can get away with a mixin style of rule { %xxx | %yyy | %zzz } but there are some syntactic categories that have to be magically combined like compile-time roles: rule { %xxx_or_yyy_or_zzz } that is, the longest-token rule is applies in parallel across all the categories simultaneously. that's why the magic hash was invented (or more accurately, is scheduled to be invented :) - the tokenizer should get tokens lazily ? - is 'space-{' is found, is sent to the opp - if the opp is expecting an operator, it means end-of-expression TimToady in #perl6 - space + block is a top-level block only where an operator is expected, and you're not in brackets. where a term is expected, it's just a closure argument. (or a hash composer) - Specify/generate AST P::C::R BUGS - A Match doesn't stringify if there is a capture - needs (update: implemented in :ratchet) - cleanup Pugs::AST::Expression (no longer used?)