Discussion:
[perl.git] branch blead, updated. v5.17.3-203-g7d1328b
(too old to reply)
Father Chrysostomos
2012-08-31 01:18:33 UTC
Permalink
In perl.git, the branch blead has been updated

<http://perl5.git.perl.org/perl.git/commitdiff/7d1328bb7c26d556809b1aed184cec377b18f20c?hp=9aa00c80e0d30d8ecc84072be5c253a368aa5653>

- Log -----------------------------------------------------------------
commit 7d1328bb7c26d556809b1aed184cec377b18f20c
Author: Father Chrysostomos <***@cpan.org>
Date: Thu Aug 30 18:01:27 2012 -0700

[perl #114410] Reset utf8 pos cache on get

If a scalar is gmagical, then the string buffer could change without
the utf8 pos cache being updated.

So it should respond to get-magic, not just set-magic. Actually add-
ing get-magic to the utf8 magic vtable would cause all scalars with
this magic to be flagged gmagical. Instead, in magic_get, we can call
magic_setutf8.

M mg.c
M t/op/utf8cache.t

commit 4785469e43ed59eb07455c31ce1079ada2c9f91f
Author: Father Chrysostomos <***@cpan.org>
Date: Thu Aug 30 16:42:30 2012 -0700

utf8cache.t: Skip only the XS-dependent test

M t/op/utf8cache.t

commit f12ade25b656371aa9c4ec20f48e785f031b811b
Author: Father Chrysostomos <***@cpan.org>
Date: Thu Aug 30 16:40:48 2012 -0700

test.pl: Add skip_without_dynamic_extension

M t/test.pl

commit 6b00f562eddf90e215cb117d990dd4595e072f29
Author: Father Chrysostomos <***@cpan.org>
Date: Thu Aug 30 16:09:58 2012 -0700

Break s//3}->{3/e

This should never have worked:

%_=(_,"Just another ");
$_="Perl hacker,\n";
s//_}->{_/e;print

M t/base/lex.t
M toke.c

commit 9c74ccc90967bf358f43629a26ebd70e7f15f83a
Author: Father Chrysostomos <***@cpan.org>
Date: Thu Aug 30 15:57:18 2012 -0700

Fix two minor s//.../e parsing bugs

It may be an odd place to allow comments, but s//"" # hello/e has\
always worked, *unless* there happens to be a null before the first #.

scan_subst in toke.c wraps the replacement text in do { ... } when the
/e flag is present.

It was adding a line break before the final } if the replacement text
contained #, because otherwise the } would be commented out.

But to find the # it was using strchr, which stops at the first null.
So eval "s//'\0'#/e" would fail.

It makes little sense to me to check whether the replacement contains
# before adding the line break. It would be faster just to add the
line break without checking.

But then I discovered this bug:

s//"#" . <<END/e;
foo
END
__END__
Can't find string terminator "END" anywhere before EOF at - line 1.

So now I have two bugs to fix.

The easiest solution seems to be to omit the line break and make the
comment parser skip the } at the end of a s///e replacement.

M t/base/lex.t
M toke.c

commit 5c49e90fd624f3ab1cdb1f1d8e4f0525d7881b99
Author: Father Chrysostomos <***@cpan.org>
Date: Thu Aug 30 13:34:14 2012 -0700

toke.c: PL_in_eval purge

Many uses of PL_in_eval in toke.c are redundant.

PL_in_eval indicates not that we are parsing a string eval, but that
we are being called from an eval, whether stringy on not. Even if
PL_in_eval were only for string eval, it would still not indicate that
we are parsing a string eval, because of eval 'require'.

This commit removes redundant uses of it (making things theoretically
slightly faster).

M toke.c

commit 19bbc0d7f761c11d620948dac95143548cf2fa7b
Author: Father Chrysostomos <***@cpan.org>
Date: Wed Aug 29 22:35:27 2012 -0700

toke.c:scan_heredoc: comments, comments

M toke.c

commit 4efe39d21e072e88e12e308ed1f068461f8ef778
Author: Father Chrysostomos <***@cpan.org>
Date: Wed Aug 29 22:07:18 2012 -0700

toke.c:scan_heredoc: Merge similar code

The code for looking in outer lexing scopes was mostly identical to
the code for looking in PL_linestr.

M toke.c

commit 64d1236453b904858b3d262dfdb71aac934623d0
Author: Father Chrysostomos <***@cpan.org>
Date: Wed Aug 29 20:43:05 2012 -0700

toke.c:scan_heredoc: Remove incorrect part of comment

I missed this in 60f40a3895 when I stopped abusing IVX and NVX.

M toke.c

commit 6ac5e9b91adf56444c756d04c84fe7c503b4d46c
Author: Father Chrysostomos <***@cpan.org>
Date: Wed Aug 29 20:41:09 2012 -0700

toke.c:scan_heredoc: Merge two adjacent #ifdefs

M toke.c

commit a2c844182cbce22d3a164097c0d85c0375664d85
Author: Father Chrysostomos <***@cpan.org>
Date: Wed Aug 29 20:39:55 2012 -0700

toke.c:scan_heredoc: Remove unnecessary assignment

Updating PL_bufend after lex_next_chunk is not necessary, as
lex_next_chunk itself does it.

M toke.c

commit 074b1c594a0c131c2fee2e237282c7fc3bc00586
Author: Father Chrysostomos <***@cpan.org>
Date: Wed Aug 29 20:37:44 2012 -0700

toke.c:scan_heredoc: less pointer fiddling; one less SV

The loop for reading lines of input to find the end of a here-doc has
always checked to see whether the cursor (s) was at the end of the
current buffer:

while (s >= PL_bufend) { /* multiple line string? */

(Actually, when it was added in perl 3.000, it was in scanstr and
that loop was not specific to here-docs, but also applied to multi-
line strings.)

The code inside the loop ends up fiddling with s by setting it explic-
itly to the end of the buffer or the end of the here-doc marker, minus
one to make sure it does not coincide with the end of the buffer.

This doesn’t make any sense, and it makes the rest of this function
more complicated.

Because the loop used to be outside the else block, it was also
reached for a here-doc inside a string eval, but the code for that
ensured the condition for the while loop was never true.

Since the while loop set s to one less than it needed to be set to,
in order to break out of it, it had to have s++ just after the loop.
That s++ was reached also by the eval code, which, consequently, had
to adjust its value of s.

That adjustment actually took place farther up in the function, where
the herewas SV was assigned to. (herewas contains the text after the
here-doc marker to the end of the line.) The beginning of herewas
would point to the last character of the here-doc marker inside an
eval, so that subtracting SvCUR(herewas) from the buffer end would
result in an adjusted pointer.

herewas is currently not actually used, except for the length. Until
recently, the text inside it would be copied back into PL_linestr to
recreate where the lexer needed to continue (because PL_linestr was
being clobbered). That no longer happens.

So we can get rid of herewas altogether. Since it is in an else
block, the stream-based parser does not need to fiddle pointers to
exit the loop. It can just break explicitly. So the s++ can also
go, requiring changes (and simplifications) to the eval code. The
comment about it being a multiline string is irrelevant and can go,
too. It dates from when that line was actually in scanstr and applied
to quoted strings containing line breaks.

M toke.c

commit 932d0cf189ceadef2911d6249bb485588aa52c95
Author: Father Chrysostomos <***@cpan.org>
Date: Wed Aug 29 17:58:33 2012 -0700

toke.c:S_scan_heredoc: put the croaking code in one spot

M toke.c

commit f20c3a2bbc5915fc17f30c1fc6e9e811fa7ccafa
Author: Father Chrysostomos <***@cpan.org>
Date: Wed Aug 29 13:10:01 2012 -0700

Make eval "s//<<END/e" slightly faster

The code that peeks into an outer linestr buffer to find the heredoc
body has to modify that buffer and remove the heredoc body from it.

It copies the text after the quote-like operator up to the end of the
line into a new SV, concatenates the text after the heredoc body into
a new SV, and then copies it back to linestr right after the quote-
like operator.

So, in this example:

eval "s//<<END/e; # jiggles\nfoo\nEND\ndie;"

It ends up copying this:

"; # jiggles\ndie;\n;"

into this at the position shown:

eval "s//<<END/e; # jiggles\nfoo\nEND\ndie;\n;"
^

There is no need for two copies. And there is no need to copy the
rest of the line where the heredoc marker is.

M toke.c

commit 7d66e4bd750064b354bdc26d4e38be5956b64d22
Author: Father Chrysostomos <***@cpan.org>
Date: Wed Aug 29 12:49:56 2012 -0700

lex.t: Mangle obscenity (albeit euphemistic)

It is harder to hack on perl with someone looking over one’s shoulder
when there are comments like this, even when it is euphemistic in its
use of voiced dental stops instead of the voiceless kind.

M t/base/lex.t

commit 76f9939ee09e325216d1b8afce3e3d7183c5eeb8
Author: Father Chrysostomos <***@cpan.org>
Date: Wed Aug 29 12:47:32 2012 -0700

Fix here-doc body extraction in eval 's//<<END/'

Outside of string eval, this:

s//<<END/e; print "a
END
b\n";

prints this:

a
b

But when we have string eval involved,

eval 's//<<END/e; print "a
END
b\n"';

we get this:

a

b

Amazing!

The buggy code in question goes back to commit 0244c3a403.

Since PL_linestr only contains the contents of the replacement
("<<END"), it peeks into the outer lexing scope’s linestr buffer, mod-
ifying it in place to remove the here-doc body, by copying everything
after the here-doc back to the spot where the body begins.

It was off by one, however, and left an extra line break.

When the code in question is reached, the variables are set as follows:

bufptr = "; print \"a"... (just after the s///)
s = "\nb\\n\"" (newline after the heredoc terminator)

The herewas variable already contains everything after the quote-
like operator containing the <<heredoc marker to the end of the line
including the \n ("; print \"a\n").

But then we concatenate everything from s onwards. So we end up with
the \n before the here-doc body and the \n from after the here-doc
terminator juxtaposed.

So after using s to extract the re-eval string, we increment s so it
points afer the final newline.

M t/base/lex.t
M toke.c

commit 3328ab5af72319f76fe9be3910a8e07d38b14de2
Author: Father Chrysostomos <***@cpan.org>
Date: Wed Aug 29 12:35:49 2012 -0700

Finish fixing here-docs in re-evals

This commit fixes here-docs in single-line re-evals in files (as
opposed to evals) and here-docs in single-line quote-like operators
inside re-evals.

In both cases, the here-doc parser has to look into an outer
lexing scope to find the here-doc body. And in both cases it
was stomping on PL_linestr (the current line buffer) while
PL_sublex_info.re_eval_start was pointing to an offset in that buffer.
(re_eval_start is used to construct the string to include in the
regexp’s stringification once the lexer reaches the end of the
re-eval.)

Fixing this entails moving re_eval_start and re_eval_str to
PL_parser->lex_shared, making the pre-localised values visible.
This is so that the code that peeks into an outer linestr buffer to
steal the here-doc body can set up re_eval_str in the right scope.
(re_eval_str is used to store the re-eval text when the here-
oc parser has no choice but to modify linestr; see also commit
db4442662555874019.)

It also entails making the stream-based parser (i.e., that reads from
an input stream) leave PL_linestr alone, instead of clobbering it and
then reconstructing part of it afterwards.

M parser.h
M perl.h
M t/base/lex.t
M toke.c

commit a7922135a13574b7c0cd4fa4acc00114f1197ab9
Author: Father Chrysostomos <***@cpan.org>
Date: Wed Aug 29 08:41:41 2012 -0700

toke.c:S_scan_heredoc: Put stream-based parser in else block

We currently have the code laid out like this:

if (peek) {
... peek inside the parent linestr buffer
}
else if (eval) {
... grab the heredoc body from linestr ...
}
else
start with an empty string for the heredoc body

... parse the body of the heredoc from the input stream ...

The final bit is inside a while loop whose condition is never true
after either of the first two branches of the if/else has executed.
But the code is very hard to read, and it is difficult to fix bugs, as
code cannot be added before the while loop, and the while loop condi-
tion cannot change, without affecting heredocs in string eval.

So put the final parser inside the else. Future commits will
depend on this.

M toke.c

commit 956be2d47c8be1238119303e892c672d5de736ad
Author: Father Chrysostomos <***@cpan.org>
Date: Wed Aug 29 08:36:40 2012 -0700

Avoid uninit warning for qq|${\<<FOO}|

If a here-doc occurs inside a single-line quote-like operator inside
a file (as opposed to an eval), it produces an uninitialized warning.
The goto I added in commit 99bd9d90 wentto the wrong place.

M t/op/heredoc.t
M toke.c

commit f68f7dc11e746ce69ed4eb3843aa2002a9487c20
Author: Father Chrysostomos <***@cpan.org>
Date: Tue Aug 28 22:37:10 2012 -0700

toke.c: S_scan_heredoc: prune dead code

This incorrect code (using a pointer after finding it to be null)
is the result of the refactoring in 60f40a3895. It was trying to
account for a string eval with no line break in it. But that can’t
happen as of 11076590 (if it could it would crash).

So remove it and add an assertion, along with a comment explaining the
assertion.

M toke.c
-----------------------------------------------------------------------

Summary of changes:
mg.c | 4 +
parser.h | 2 +
perl.h | 2 -
t/base/lex.t | 36 +++++++-
t/op/heredoc.t | 8 ++-
t/op/utf8cache.t | 34 +++++++-
t/test.pl | 26 ++++--
toke.c | 260 +++++++++++++++++++++++------------------------------
8 files changed, 211 insertions(+), 161 deletions(-)

diff --git a/mg.c b/mg.c
index 3972a87..089f9c6 100644
--- a/mg.c
+++ b/mg.c
@@ -213,6 +213,10 @@ Perl_mg_get(pTHX_ SV *sv)
if (mg->mg_flags & MGf_GSKIP)
(SSPTR(mgs_ix, MGS *))->mgs_magical = 0;
}
+ else if (vtbl == &PL_vtbl_utf8) {
+ /* get-magic can reallocate the PV */
+ magic_setutf8(sv, mg);
+ }

mg = nextmg;

diff --git a/parser.h b/parser.h
index 95083d6..97e016d 100644
--- a/parser.h
+++ b/parser.h
@@ -27,6 +27,8 @@ typedef struct yy_lexshared {
struct yy_lexshared *ls_prev;
SV *ls_linestr; /* mirrors PL_parser->linestr */
char *ls_bufptr; /* mirrors PL_parser->bufptr */
+ char *re_eval_start; /* start of "(?{..." text */
+ SV *re_eval_str; /* "(?{...})" text */
line_t herelines; /* number of lines in here-doc */
} LEXSHARED;

diff --git a/perl.h b/perl.h
index 7f907df..f42849c 100644
--- a/perl.h
+++ b/perl.h
@@ -3456,8 +3456,6 @@ struct _sublex_info {
U8 super_state; /* lexer state to save */
U16 sub_inwhat; /* "lex_inwhat" to use */
OP *sub_op; /* "lex_op" to use */
- char *re_eval_start;/* start of "(?{..." text */
- SV *re_eval_str; /* "(?{...})" text */
SV *repl; /* replacement of s/// or y/// */
};

diff --git a/t/base/lex.t b/t/base/lex.t
index 4c4981d..1201436 100644
--- a/t/base/lex.t
+++ b/t/base/lex.t
@@ -1,6 +1,6 @@
#!./perl

-print "1..66\n";
+print "1..72\n";

$x = 'x';

@@ -197,7 +197,7 @@ my $test = 42;
local $_ = "not ok ";
eval q{
s/^not /<<EOT/e and T '^main:\(eval \d+\):2$', $test++;
-# fuggedaboudit
+# uggedaboudit
EOT
print $_, $test++, "\n";
T('^main:\(eval \d+\):6$', $test++);
@@ -326,3 +326,35 @@ END

eval 'print qq ;ok 66 - eval ending with semicolon\n;'
or print "not ok 66 - eval ending with semicolon\n";
+
+print "not " unless qr/(?{<<END})/ eq '(?^:(?{<<END}))';
+foo
+END
+print "ok 67 - here-doc in single-line re-eval\n";
+
+$_ = qr/(?{"${<<END}"
+foo
+END
+})/;
+print "not " unless /foo/;
+print "ok 68 - here-doc in quotes in multiline re-eval\n";
+
+eval 's//<<END/e if 0; $_ = "a
+END
+b"';
+print "not " if $_ =~ /\n\n/;
+print "ok 69 - eval 's//<<END/' does not leave extra newlines\n";
+
+$_ = a;
+eval "s/a/'b\0'#/e";
+print 'not ' unless $_ eq "b\0";
+print "ok 70 - # after null in s/// repl\n";
+
+s//"#" . <<END/e;
+foo
+END
+print "ok 71 - s//'#' . <<END/e\n";
+
+eval "s//3}->{3/e";
+print "not " unless $@;
+print "ok 72 - s//3}->{3/e\n";
diff --git a/t/op/heredoc.t b/t/op/heredoc.t
index 0230d88..08b0af2 100644
--- a/t/op/heredoc.t
+++ b/t/op/heredoc.t
@@ -7,7 +7,7 @@ BEGIN {
}

use strict;
-plan(tests => 8);
+plan(tests => 9);


# heredoc without newline (#65838)
@@ -51,6 +51,12 @@ HEREDOC
{ switches => ['-X'] },
"crlf-terminated heredoc"
);
+ fresh_perl_is(
+ "print qq|\${\\<<foo}|\nick and queasy\nfoo\n",
+ 'ick and queasy',
+ { switches => ['-w'], stderr => 1 },
+ 'no warning for qq|${\<<foo}| in file'
+ );
}


diff --git a/t/op/utf8cache.t b/t/op/utf8cache.t
index 83ad4e8..a9e88a6 100644
--- a/t/op/utf8cache.t
+++ b/t/op/utf8cache.t
@@ -5,12 +5,14 @@ BEGIN {
chdir 't' if -d 't';
@INC = '../lib';
require './test.pl';
- skip_all_without_dynamic_extension('Devel::Peek');
}

use strict;

-plan(tests => 2);
+plan(tests => 5);
+
+SKIP: {
+skip_without_dynamic_extension("Devel::Peek");

my $pid = open CHILD, '-|';
die "kablam: $!\n" unless defined $pid;
@@ -36,6 +38,8 @@ my $utf8magic = qr{ ^ \s+ MAGIC \s = .* \n

unlike($_, qr{ $utf8magic $utf8magic }x);

+} # SKIP
+
# With bad caching, this code used to go quadratic and take 10s of minutes.
# The 'test' in this case is simply that it doesn't hang.

@@ -47,3 +51,29 @@ unlike($_, qr{ $utf8magic $utf8magic }x);
}
pass("quadratic pos");
}
+
+# Get-magic can reallocate the PV. Check that the cache is reset in
+# such cases.
+
+# Regexp vars
+"\x{100}" =~ /(.+)/;
+() = substr $1, 0, 1;
+"a\x{100}" =~ /(.+)/;
+is ord substr($1, 1, 1), 0x100, 'get-magic resets utf8cache on match vars';
+
+# Substr lvalues
+my $x = "a\x{100}";
+my $l = \substr $x, 0;
+() = substr $$l, 1, 1;
+substr $x, 0, 1, = "\x{100}";
+is ord substr($$l, 1, 1), 0x100, 'get-magic resets utf8cache on LVALUEs';
+
+# defelem magic
+my %h;
+sub {
+ $_[0] = "a\x{100}";
+ () = ord substr $_[0], 1, 1;
+ $h{k} = "\x{100}"x2;
+ is ord substr($_[0], 1, 1), 0x100,
+ 'get-magic resets uf8cache on defelems';
+}->($h{k});
diff --git a/t/test.pl b/t/test.pl
index bd5ff3b..aee1d24 100644
--- a/t/test.pl
+++ b/t/test.pl
@@ -109,6 +109,16 @@ sub _comment {
map { split /\n/ } @_;
}

+sub _have_dynamic_extension {
+ my $extension = shift;
+ unless (eval {require Config; 1}) {
+ warn "test.pl had problems loading Config: $@";
+ return 1;
+ }
+ $extension =~ s!::!/!g;
+ return 1 if ($Config::Config{extensions} =~ /\b$extension\b/);
+}
+
sub skip_all {
if (@_) {
_print "1..0 # Skip @_\n";
@@ -123,14 +133,9 @@ sub skip_all_if_miniperl {
}

sub skip_all_without_dynamic_extension {
- my $extension = shift;
+ my ($extension) = @_;
skip_all("no dynamic loading on miniperl, no $extension") if is_miniperl();
- unless (eval {require Config; 1}) {
- warn "test.pl had problems loading Config: $@";
- return;
- }
- $extension =~ s!::!/!g;
- return if ($Config::Config{extensions} =~ /\b$extension\b/);
+ return if &_have_dynamic_extension;
skip_all("$extension was not built");
}

@@ -454,6 +459,13 @@ sub skip_if_miniperl {
skip(@_) if is_miniperl();
}

+sub skip_without_dynamic_extension {
+ my ($extension) = @_;
+ skip("no dynamic loading on miniperl, no $extension") if is_miniperl();
+ return if &_have_dynamic_extension;
+ skip_all("$extension was not built");
+}
+
sub todo_skip {
my $why = shift;
my $n = @_ ? shift : 1;
diff --git a/toke.c b/toke.c
index c41a614..933503b 100644
--- a/toke.c
+++ b/toke.c
@@ -930,8 +930,8 @@ Perl_lex_grow_linestr(pTHX_ STRLEN len)
linestart_pos = PL_parser->linestart - buf;
last_uni_pos = PL_parser->last_uni ? PL_parser->last_uni - buf : 0;
last_lop_pos = PL_parser->last_lop ? PL_parser->last_lop - buf : 0;
- re_eval_start_pos = PL_sublex_info.re_eval_start ?
- PL_sublex_info.re_eval_start - buf : 0;
+ re_eval_start_pos = PL_parser->lex_shared->re_eval_start ?
+ PL_parser->lex_shared->re_eval_start - buf : 0;

buf = sv_grow(linestr, len);

@@ -944,8 +944,8 @@ Perl_lex_grow_linestr(pTHX_ STRLEN len)
PL_parser->last_uni = buf + last_uni_pos;
if (PL_parser->last_lop)
PL_parser->last_lop = buf + last_lop_pos;
- if (PL_sublex_info.re_eval_start)
- PL_sublex_info.re_eval_start = buf + re_eval_start_pos;
+ if (PL_parser->lex_shared->re_eval_start)
+ PL_parser->lex_shared->re_eval_start = buf + re_eval_start_pos;
return buf;
}

@@ -2471,8 +2471,6 @@ S_sublex_push(pTHX)
SAVEI32(PL_lex_starts);
SAVEI8(PL_lex_state);
SAVESPTR(PL_lex_repl);
- SAVEPPTR(PL_sublex_info.re_eval_start);
- SAVESPTR(PL_sublex_info.re_eval_str);
SAVEVPTR(PL_lex_inpat);
SAVEI16(PL_lex_inwhat);
SAVECOPLINE(PL_curcop);
@@ -2499,8 +2497,6 @@ S_sublex_push(pTHX)
PL_lex_repl = PL_sublex_info.repl;
PL_lex_stuff = NULL;
PL_sublex_info.repl = NULL;
- PL_sublex_info.re_eval_start = NULL;
- PL_sublex_info.re_eval_str = NULL;

PL_bufend = PL_bufptr = PL_oldbufptr = PL_oldoldbufptr = PL_linestart
= SvPVX(PL_linestr);
@@ -4683,7 +4679,7 @@ Perl_yylex(pTHX)
}
/* Convert (?{...}) and friends to 'do {...}' */
if (PL_lex_inpat && *PL_bufptr == '(') {
- PL_sublex_info.re_eval_start = PL_bufptr;
+ PL_parser->lex_shared->re_eval_start = PL_bufptr;
PL_bufptr += 2;
if (*PL_bufptr != '{')
PL_bufptr++;
@@ -4742,28 +4738,30 @@ Perl_yylex(pTHX)
re_eval_str. If the here-doc body’s length equals the previous
value of re_eval_start, re_eval_start will now be null. So
check re_eval_str as well. */
- if (PL_sublex_info.re_eval_start || PL_sublex_info.re_eval_str) {
+ if (PL_parser->lex_shared->re_eval_start
+ || PL_parser->lex_shared->re_eval_str) {
SV *sv;
if (*PL_bufptr != ')')
Perl_croak(aTHX_ "Sequence (?{...}) not terminated with ')'");
PL_bufptr++;
/* having compiled a (?{..}) expression, return the original
* text too, as a const */
- if (PL_sublex_info.re_eval_str) {
- sv = PL_sublex_info.re_eval_str;
- PL_sublex_info.re_eval_str = NULL;
- SvCUR_set(sv, PL_bufptr - PL_sublex_info.re_eval_start);
+ if (PL_parser->lex_shared->re_eval_str) {
+ sv = PL_parser->lex_shared->re_eval_str;
+ PL_parser->lex_shared->re_eval_str = NULL;
+ SvCUR_set(sv,
+ PL_bufptr - PL_parser->lex_shared->re_eval_start);
SvPV_shrink_to_cur(sv);
}
- else sv = newSVpvn(PL_sublex_info.re_eval_start,
- PL_bufptr - PL_sublex_info.re_eval_start);
+ else sv = newSVpvn(PL_parser->lex_shared->re_eval_start,
+ PL_bufptr - PL_parser->lex_shared->re_eval_start);
start_force(PL_curforce);
/* XXX probably need a CURMAD(something) here */
NEXTVAL_NEXTTOKE.opval =
(OP*)newSVOP(OP_CONST, 0,
sv);
force_next(THING);
- PL_sublex_info.re_eval_start = NULL;
+ PL_parser->lex_shared->re_eval_start = NULL;
PL_expect = XTERM;
return REPORT(',');
}
@@ -5252,8 +5250,8 @@ Perl_yylex(pTHX)
PL_faketokens = 0;
#endif
if (PL_lex_state != LEX_NORMAL ||
- (PL_in_eval && !PL_rsfp && !PL_parser->filtered)) {
- if (*s == '#' && s == PL_linestart && PL_in_eval
+ (!PL_rsfp && !PL_parser->filtered)) {
+ if (*s == '#' && s == PL_linestart
&& !PL_rsfp && !PL_parser->filtered) {
/* handle eval qq[#line 1 "foo"\n ...] */
CopLINE_dec(PL_curcop);
@@ -5261,10 +5259,11 @@ Perl_yylex(pTHX)
}
if (PL_madskills && !PL_lex_formbrack && !PL_in_eval) {
s = SKIPSPACE0(s);
- if (!PL_in_eval || PL_rsfp || PL_parser->filtered)
+ if (PL_rsfp || PL_parser->filtered)
incline(s);
}
else {
+ const bool in_comment = *s == '#';
d = s;
while (d < PL_bufend && *d != '\n')
d++;
@@ -5278,7 +5277,11 @@ Perl_yylex(pTHX)
PL_thiswhite = newSVpvn(s, d - s);
#endif
s = d;
- incline(s);
+ if (in_comment && d == PL_bufend
+ && PL_lex_state == LEX_INTERPNORMAL
+ && PL_lex_inwhat == OP_SUBST && PL_lex_repl
+ && SvEVALED(PL_lex_repl) && d[-1] == '}') s--;
+ else incline(s);
}
if (PL_lex_formbrack && PL_lex_brackets <= PL_lex_formbrack) {
PL_lex_state = LEX_FORMLINE;
@@ -5929,7 +5932,10 @@ Perl_yylex(pTHX)
#endif
return yylex(); /* ignore fake brackets */
}
- if (*s == '-' && s[1] == '>')
+ if (PL_lex_inwhat == OP_SUBST && PL_lex_repl
+ && SvEVALED(PL_lex_repl))
+ PL_lex_state = LEX_INTERPEND;
+ else if (*s == '-' && s[1] == '>')
PL_lex_state = LEX_INTERPENDMAYBE;
else if (*s != '[' && *s != '{')
PL_lex_state = LEX_INTERPEND;
@@ -6041,7 +6047,7 @@ Perl_yylex(pTHX)
if (PL_expect == XSTATE && isALPHA(tmp) &&
(s == PL_linestart+1 || s[-2] == '\n') )
{
- if ((PL_in_eval && !PL_rsfp && !PL_parser->filtered)
+ if ((!PL_rsfp && !PL_parser->filtered)
|| PL_lex_state != LEX_NORMAL) {
d = PL_bufend;
while (s < d) {
@@ -9363,8 +9369,6 @@ S_scan_subst(pTHX_ char *start)
}
sv_catpvs(repl, "{");
sv_catsv(repl, PL_sublex_info.repl);
- if (strchr(SvPVX(PL_sublex_info.repl), '#'))
- sv_catpvs(repl, "\n");
sv_catpvs(repl, "}");
SvEVALED_on(repl);
SvREFCNT_dec(PL_sublex_info.repl);
@@ -9480,36 +9484,28 @@ S_scan_trans(pTHX_ char *start)
a whole string being evalled, or the contents of the current quote-
like operator.

- The three methods are:
- - Steal lines from the input stream (stream)
- - Scan the heredoc in PL_linestr and remove it therefrom (linestr)
- - Peek at the PL_linestr of outer lexing scopes (peek)
-
- They are used in these cases:
- file scope or filtered eval stream
- string eval linestr
- multiline quoted construct linestr
- single-line quoted construct in file stream
- single-line quoted construct in eval or quote peek
+ The two basic methods are:
+ - Steal lines from the input stream
+ - Scan the heredoc in PL_linestr and remove it therefrom

- Single-line also applies to heredocs that begin on the last line of a
- quote-like operator.
+ In a file scope or filtered eval, the first method is used; in a
+ string eval, the second.

- Peeking within a quote also involves falling back to the stream method,
- if the outer quote-like operators are all on one line (or the heredoc
- marker is on the last line).
+ In a quote-like operator, we have to choose between the two,
+ depending on where we can find a newline. We peek into outer lex-
+ ing scopes until we find one with a newline in it. If we reach the
+ outermost lexing scope and it is a file, we use the stream method.
+ Otherwise it is treated as an eval.
*/

STATIC char *
S_scan_heredoc(pTHX_ register char *s)
{
dVAR;
- SV *herewas;
I32 op_type = OP_SCALAR;
I32 len;
SV *tmpstr;
char term;
- const char *found_newline = 0;
char *d;
char *e;
char *peek;
@@ -9592,18 +9588,6 @@ S_scan_heredoc(pTHX_ register char *s)
s = olds;
}
#endif
- if ((infile && !PL_lex_inwhat)
- || !(found_newline = (char*)memchr((void*)s, '\n', PL_bufend - s))) {
- herewas = newSVpvn(s,PL_bufend-s);
- }
- else {
-#ifdef PERL_MAD
- herewas = newSVpvn(s-1,found_newline-s+1);
-#else
- s--;
- herewas = newSVpvn(s,found_newline-s);
-#endif
- }
#ifdef PERL_MAD
if (PL_madskills) {
tstart = SvPVX(PL_linestr) + stuffstart;
@@ -9612,14 +9596,8 @@ S_scan_heredoc(pTHX_ register char *s)
else
PL_thisstuff = newSVpvn(tstart, s - tstart);
}
-#endif
- s += SvCUR(herewas);

-#ifdef PERL_MAD
stuffstart = s - SvPVX(PL_linestr);
-
- if (found_newline)
- s--;
#endif

tmpstr = newSV_type(SVt_PVIV);
@@ -9635,34 +9613,45 @@ S_scan_heredoc(pTHX_ register char *s)

PL_multi_start = CopLINE(PL_curcop) + 1;
PL_multi_open = PL_multi_close = '<';
- if (PL_lex_inwhat && !found_newline) {
- /* Peek into the line buffer of the parent lexing scope, going up
- as many levels as necessary to find one with a newline after
- bufptr. See the comments in sublex_push for how IVX and NVX
- are abused.
- */
+ /* inside a string eval or quote-like operator */
+ if (!infile || PL_lex_inwhat) {
SV *linestr;
char *bufptr, *bufend;
- char * const olds = s - SvCUR(herewas);
- char * const real_olds = s;
+ char * const olds = s;
PERL_CONTEXT * const cx = &cxstack[cxstack_ix];
- do {
+ /* These two fields are not set until an inner lexing scope is
+ entered. But we need them set here. */
+ shared->ls_bufptr = s;
+ shared->ls_linestr = PL_linestr;
+ if (PL_lex_inwhat)
+ /* Look for a newline. If the current buffer does not have one,
+ peek into the line buffer of the parent lexing scope, going
+ up as many levels as necessary to find one with a newline
+ after bufptr.
+ */
+ while (!(s = (char *)memchr(
+ (void *)shared->ls_bufptr, '\n',
+ SvEND(shared->ls_linestr)-shared->ls_bufptr
+ ))) {
shared = shared->ls_prev;
+ /* shared is only null if we have gone beyond the outermost
+ lexing scope. In a file, we will have broken out of the
+ loop in the previous iteration. In an eval, the string buf-
+ fer ends with "\n;", so the while condition below will have
+ evaluated to false. So shared can never be null. */
+ assert(shared);
/* A LEXSHARED struct with a null ls_prev pointer is the outer-
most lexing scope. In a file, shared->ls_linestr at that
level is just one line, so there is no body to steal. */
if (infile && !shared->ls_prev) {
- s = real_olds;
+ s = olds;
goto streaming;
}
- else if (!shared) {
- s = SvEND(shared->ls_linestr);
- break;
- }
- } while (!(s = (char *)memchr(
- (void *)shared->ls_bufptr, '\n',
- SvEND(shared->ls_linestr)-shared->ls_bufptr
- )));
+ }
+ else { /* eval */
+ s = (char*)memchr((void*)s, '\n', PL_bufend - s);
+ assert(s);
+ }
bufptr = shared->ls_bufptr;
linestr = shared->ls_linestr;
bufend = SvEND(linestr);
@@ -9673,42 +9662,7 @@ S_scan_heredoc(pTHX_ register char *s)
++shared->herelines;
}
if (s >= bufend) {
- SvREFCNT_dec(herewas);
- SvREFCNT_dec(tmpstr);
- CopLINE_set(PL_curcop, (line_t)PL_multi_start-1);
- missingterm(PL_tokenbuf + 1);
- }
- if (CxTYPE(cx) == CXt_EVAL && CxOLD_OP_TYPE(cx) == OP_ENTEREVAL
- && cx->blk_eval.cur_text == linestr) {
- cx->blk_eval.cur_text = newSVsv(linestr);
- SvSCREAM_on(cx->blk_eval.cur_text);
- }
- sv_setpvn(herewas,bufptr,d-bufptr+1);
- sv_setpvn(tmpstr,d+1,s-d);
- s += len - 1;
- sv_catpvn(herewas,s,bufend-s);
- Copy(SvPVX_const(herewas),bufptr,SvCUR(herewas) + 1,char);
- SvCUR_set(linestr,
- bufptr-SvPVX_const(linestr)
- + SvCUR(herewas));
-
- s = olds;
- goto retval;
- }
- else if (!infile || found_newline) {
- char * const olds = s - SvCUR(herewas);
- PERL_CONTEXT * const cx = &cxstack[cxstack_ix];
- d = s;
- while (s < PL_bufend &&
- (*s != '\n' || memNE(s,PL_tokenbuf,len)) ) {
- if (*s++ == '\n')
- ++shared->herelines;
- }
- if (s >= PL_bufend) {
- SvREFCNT_dec(herewas);
- SvREFCNT_dec(tmpstr);
- CopLINE_set(PL_curcop, (line_t)PL_multi_start-1);
- missingterm(PL_tokenbuf + 1);
+ goto interminable;
}
sv_setpvn(tmpstr,d+1,s-d);
#ifdef PERL_MAD
@@ -9727,33 +9681,46 @@ S_scan_heredoc(pTHX_ register char *s)
/* s now points to the newline after the heredoc terminator.
d points to the newline before the body of the heredoc.
*/
+
+ /* We are going to modify linestr in place here, so set
+ aside copies of the string if necessary for re-evals or
+ (caller $n)[6]. */
/* See the Paranoia note in case LEX_INTERPEND in yylex, for why we
- check PL_sublex_info.re_eval_str. */
- if (PL_sublex_info.re_eval_start || PL_sublex_info.re_eval_str) {
+ check shared->re_eval_str. */
+ if (shared->re_eval_start || shared->re_eval_str) {
/* Set aside the rest of the regexp */
- if (!PL_sublex_info.re_eval_str)
- PL_sublex_info.re_eval_str =
- newSVpvn(PL_sublex_info.re_eval_start,
- PL_bufend - PL_sublex_info.re_eval_start);
- PL_sublex_info.re_eval_start -= s-d;
+ if (!shared->re_eval_str)
+ shared->re_eval_str =
+ newSVpvn(shared->re_eval_start,
+ bufend - shared->re_eval_start);
+ shared->re_eval_start -= s-d;
}
if (CxTYPE(cx) == CXt_EVAL && CxOLD_OP_TYPE(cx) == OP_ENTEREVAL
- && cx->blk_eval.cur_text == PL_linestr) {
- cx->blk_eval.cur_text = newSVsv(PL_linestr);
+ && cx->blk_eval.cur_text == linestr) {
+ cx->blk_eval.cur_text = newSVsv(linestr);
SvSCREAM_on(cx->blk_eval.cur_text);
}
/* Copy everything from s onwards back to d. */
- Move(s,d,PL_bufend-s + 1,char);
- SvCUR_set(PL_linestr, SvCUR(PL_linestr) - (s-d));
- PL_bufend = SvPVX(PL_linestr) + SvCUR(PL_linestr);
+ Move(s,d,bufend-s + 1,char);
+ SvCUR_set(linestr, SvCUR(linestr) - (s-d));
+ /* Setting PL_bufend only applies when we have not dug deeper
+ into other scopes, because sublex_done sets PL_bufend to
+ SvEND(PL_linestr). */
+ if (shared == PL_parser->lex_shared) PL_bufend = SvEND(linestr);
s = olds;
}
else
- sv_setpvs(tmpstr,""); /* avoid "uninitialized" warning */
- streaming:
- term = PL_tokenbuf[1];
- len--;
- while (s >= PL_bufend) { /* multiple line string? */
+ {
+ SV *linestr_save;
+ streaming:
+ sv_setpvs(tmpstr,""); /* avoid "uninitialized" warning */
+ term = PL_tokenbuf[1];
+ len--;
+ linestr_save = PL_linestr; /* must restore this afterwards */
+ d = s; /* and this */
+ PL_linestr = newSVpvs("");
+ PL_bufend = SvPVX(PL_linestr);
+ while (1) {
#ifdef PERL_MAD
if (PL_madskills) {
tstart = SvPVX(PL_linestr) + stuffstart;
@@ -9763,15 +9730,13 @@ S_scan_heredoc(pTHX_ register char *s)
PL_thisstuff = newSVpvn(tstart, PL_bufend - tstart);
}
#endif
- PL_bufptr = s;
+ PL_bufptr = PL_bufend;
CopLINE_set(PL_curcop,
PL_multi_start + shared->herelines);
if (!lex_next_chunk(LEX_NO_TERM)
&& (!SvCUR(tmpstr) || SvEND(tmpstr)[-1] != '\n')) {
- SvREFCNT_dec(herewas);
- SvREFCNT_dec(tmpstr);
- CopLINE_set(PL_curcop, (line_t)PL_multi_start - 1);
- missingterm(PL_tokenbuf + 1);
+ SvREFCNT_dec(linestr_save);
+ goto interminable;
}
CopLINE_set(PL_curcop, (line_t)PL_multi_start - 1);
if (!SvCUR(PL_linestr) || PL_bufend[-1] != '\n') {
@@ -9783,7 +9748,6 @@ S_scan_heredoc(pTHX_ register char *s)
stuffstart = s - SvPVX(PL_linestr);
#endif
shared->herelines++;
- PL_bufend = SvPVX(PL_linestr) + SvCUR(PL_linestr);
PL_last_lop = PL_last_uni = NULL;
#ifndef PERL_STRICT_CR
if (PL_bufend - PL_linestart >= 2) {
@@ -9801,25 +9765,22 @@ S_scan_heredoc(pTHX_ register char *s)
PL_bufend[-1] = '\n';
#endif
if (*s == term && memEQ(s,PL_tokenbuf + 1,len)) {
- STRLEN off = PL_bufend - 1 - SvPVX_const(PL_linestr);
- *(SvPVX(PL_linestr) + off ) = ' ';
- lex_grow_linestr(SvCUR(PL_linestr) + SvCUR(herewas) + 1);
- sv_catsv(PL_linestr,herewas);
+ SvREFCNT_dec(PL_linestr);
+ PL_linestr = linestr_save;
+ PL_linestart = SvPVX(linestr_save);
PL_bufend = SvPVX(PL_linestr) + SvCUR(PL_linestr);
- s = SvPVX(PL_linestr) + off; /* In case PV of PL_linestr moved. */
+ s = d;
+ break;
}
else {
- s = PL_bufend;
sv_catsv(tmpstr,PL_linestr);
}
+ }
}
- s++;
-retval:
PL_multi_end = CopLINE(PL_curcop);
if (SvCUR(tmpstr) + 5 < SvLEN(tmpstr)) {
SvPV_shrink_to_cur(tmpstr);
}
- SvREFCNT_dec(herewas);
if (!IN_BYTES) {
if (UTF && is_utf8_string((U8*)SvPVX_const(tmpstr), SvCUR(tmpstr)))
SvUTF8_on(tmpstr);
@@ -9829,6 +9790,11 @@ retval:
PL_lex_stuff = tmpstr;
pl_yylval.ival = op_type;
return s;
+
+ interminable:
+ SvREFCNT_dec(tmpstr);
+ CopLINE_set(PL_curcop, (line_t)PL_multi_start - 1);
+ missingterm(PL_tokenbuf + 1);
}

/* scan_inputsymbol

--
Perl5 Master Repository
Jerry D. Hedden
2012-08-31 20:50:19 UTC
Permalink
Post by Father Chrysostomos
+sub skip_without_dynamic_extension {
+ skip("no dynamic loading on miniperl, no $extension") if is_miniperl();
+ return if &_have_dynamic_extension;
+ skip_all("$extension was not built");
+}
Function should end with skip(); not skip_all().

Patch submitted here:
https://rt.perl.org:443/rt3/Ticket/Display.html?id=114686

Loading...