Class: RE2::Regexp

Inherits:
Object
Defined in:
ext/re2/re2.cc


Constructor Details

#initialize(pattern) ⇒ RE2::Regexp
#initialize(pattern, options) ⇒ RE2::Regexp

Returns a new RE2::Regexp object with a compiled version of pattern stored inside.

Overloads:

  • #initialize(pattern) ⇒ RE2::Regexp

    Returns a new RE2::Regexp object with a compiled version of pattern stored inside with the default options.

    Parameters:

    • pattern (String)

      the pattern to compile

    Raises:

    • (NoMemoryError)

      if memory could not be allocated for the compiled pattern

  • #initialize(pattern, options) ⇒ RE2::Regexp

    Returns a new RE2::Regexp object with a compiled version of pattern stored inside with the specified options.

    Parameters:

    • pattern (String)

      the pattern to compile

    • options (Hash)

      the options with which to compile the pattern

    Options Hash (options):

    • :utf8 (Boolean) — default: true

      text and pattern are UTF-8; otherwise Latin-1

    • :posix_syntax (Boolean) — default: false

      restrict regexps to POSIX egrep syntax

    • :longest_match (Boolean) — default: false

      search for longest match, not first match

    • :log_errors (Boolean) — default: true

      log syntax and execution errors to ERROR

    • :max_mem (Integer)

      approx. max memory footprint of RE2

    • :literal (Boolean) — default: false

      interpret string as literal, not regexp

    • :never_nl (Boolean) — default: false

      never match \n, even if it is in regexp

    • :case_sensitive (Boolean) — default: true

      match is case-sensitive (regexp can override with (?i) unless in posix_syntax mode)

    • :perl_classes (Boolean) — default: false

      allow Perl’s \d \s \w \D \S \W when in posix_syntax mode

    • :word_boundary (Boolean) — default: false

      allow \b \B (word boundary and not) when in posix_syntax mode

    • :one_line (Boolean) — default: false

      ^ and $ only match beginning and end of text when in posix_syntax mode

    Raises:

    • (NoMemoryError)

      if memory could not be allocated for the compiled pattern
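
As a rough illustration of how the defaults listed above combine with a caller's options hash, the sketch below layers overrides on top of the documented defaults in plain Ruby. The names `DOCUMENTED_DEFAULTS` and `effective_options` are hypothetical and not part of the re2 gem, and :max_mem is left out because no default is listed for it here.

```ruby
# Hypothetical model of the option defaults documented above. It does not
# call the re2 gem; it only mirrors the listed default values.
DOCUMENTED_DEFAULTS = {
  utf8: true,
  posix_syntax: false,
  longest_match: false,
  log_errors: true,
  literal: false,
  never_nl: false,
  case_sensitive: true,
  perl_classes: false,
  word_boundary: false,
  one_line: false
}.freeze

# Returns the options that would apply once the caller's overrides are
# layered over the documented defaults.
def effective_options(overrides = {})
  DOCUMENTED_DEFAULTS.merge(overrides).freeze
end

options = effective_options(case_sensitive: false, literal: true)
puts options[:case_sensitive] # prints false (overridden)
puts options[:utf8]           # prints true (default kept)
```

Any option the caller does not mention keeps its default, which is why the single-argument constructor behaves like passing an empty options hash in this model.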



# File 'ext/re2/re2.cc', line 887

static VALUE re2_regexp_initialize(int argc, VALUE *argv, VALUE self) {
  VALUE pattern, options;
  re2_pattern *p;

  rb_scan_args(argc, argv, "11", &pattern, &options);

  /* Ensure pattern is a string. */
  StringValue(pattern);

  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  if (RTEST(options)) {
    RE2::Options re2_options;
    parse_re2_options(&re2_options, options);

    p->pattern = new(std::nothrow) RE2(RSTRING_PTR(pattern), re2_options);
  } else {
    p->pattern = new(std::nothrow) RE2(RSTRING_PTR(pattern));
  }

  if (p->pattern == 0) {
    rb_raise(rb_eNoMemError, "not enough memory to allocate RE2 object");
  }

  return self;
}

Class Method Details

.compile ⇒ Object

.escape(unquoted) ⇒ String

Returns a version of unquoted with all potentially meaningful regexp characters escaped. The returned string, used as a regular expression, will exactly match the original string.

Examples:

RE2::Regexp.escape("1.5-2.0?")    #=> "1\.5\-2\.0\?"

Parameters:

  • unquoted (String)

    the unquoted string

Returns:

  • (String)

    the escaped string



# File 'ext/re2/re2.cc', line 1585

static VALUE re2_QuoteMeta(VALUE, VALUE unquoted) {
  StringValue(unquoted);

  std::string quoted_string = RE2::QuoteMeta(RSTRING_PTR(unquoted));

  return rb_str_new(quoted_string.data(), quoted_string.size());
}
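
To make the escaping rule concrete, here is a minimal pure-Ruby sketch that reproduces the documented example. It is a model only, not the gem's implementation: `quote_meta_sketch` is a hypothetical name, and it assumes that every ASCII character other than a letter, digit or underscore receives a backslash while non-ASCII characters pass through unchanged. NUL bytes are not modelled.

```ruby
# Hypothetical pure-Ruby model of the escaping rule; not the gem's code.
# Assumption: every ASCII character other than a letter, digit or underscore
# gets a backslash, and non-ASCII characters are copied as they are.
def quote_meta_sketch(unquoted)
  unquoted.gsub(/[^A-Za-z0-9_]/) do |char|
    char.ascii_only? ? "\\#{char}" : char
  end
end

puts quote_meta_sketch("1.5-2.0?") # prints 1\.5\-2\.0\?
```

Escaping more characters than strictly necessary is harmless here, since a backslash before ASCII punctuation still denotes that literal character.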

.quote(unquoted) ⇒ String

Returns a version of unquoted with all potentially meaningful regexp characters escaped. The returned string, used as a regular expression, will exactly match the original string.

Examples:

RE2::Regexp.escape("1.5-2.0?")    #=> "1\.5\-2\.0\?"

Parameters:

  • unquoted (String)

    the unquoted string

Returns:

  • (String)

    the escaped string



# File 'ext/re2/re2.cc', line 1585

static VALUE re2_QuoteMeta(VALUE, VALUE unquoted) {
  StringValue(unquoted);

  std::string quoted_string = RE2::QuoteMeta(RSTRING_PTR(unquoted));

  return rb_str_new(quoted_string.data(), quoted_string.size());
}

Instance Method Details

#===(text) ⇒ Boolean

Returns true or false to indicate a successful match. Equivalent to re2.match(text, 0).

Returns:

  • (Boolean)

    whether the match was successful



# File 'ext/re2/re2.cc', line 1446

static VALUE re2_regexp_match_p(const VALUE self, VALUE text) {
  VALUE argv[2] = { text, INT2FIX(0) };

  return re2_regexp_match(2, argv, self);
}
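
One practical reason to define #=== is that Ruby's case/when calls it on each candidate. The sketch below shows that mechanism with a hypothetical stand-in class, `BooleanMatcher`, which wraps Ruby's built-in Regexp rather than RE2; the class, the constants and `classify` are all assumptions made for illustration.

```ruby
# Hypothetical stand-in for a regexp object whose #=== returns a boolean.
# It wraps Ruby's built-in Regexp rather than RE2.
class BooleanMatcher
  def initialize(pattern)
    @regexp = Regexp.new(pattern)
  end

  # True or false only, like a match that requests zero submatches.
  def match?(text)
    @regexp.match?(text)
  end

  alias_method :===, :match?
  alias_method :=~, :match?
end

NUMBER = BooleanMatcher.new('\A\d+\z')
WORD = BooleanMatcher.new('\A[a-z]+\z')

def classify(token)
  case token
  when NUMBER then :number # case/when evaluates NUMBER === token
  when WORD then :word
  else :other
  end
end

puts classify("42")  # prints number
puts classify("woo") # prints word
```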

#=~(text) ⇒ Boolean

Returns true or false to indicate a successful match. Equivalent to re2.match(text, 0).

Returns:

  • (Boolean)

    whether the match was successful



# File 'ext/re2/re2.cc', line 1446

static VALUE re2_regexp_match_p(const VALUE self, VALUE text) {
  VALUE argv[2] = { text, INT2FIX(0) };

  return re2_regexp_match(2, argv, self);
}

#case_insensitive? ⇒ Boolean

Returns whether or not the regular expression re2 was compiled with the case_sensitive option set to false.

Examples:

re2 = RE2::Regexp.new("woo?", :case_sensitive => true)
re2.case_insensitive?    #=> false
re2.casefold?    #=> false

Returns:

  • (Boolean)

    the inverse of the case_sensitive option



# File 'ext/re2/re2.cc', line 1114

static VALUE re2_regexp_case_insensitive(const VALUE self) {
  return BOOL2RUBY(re2_regexp_case_sensitive(self) != Qtrue);
}

#case_sensitive? ⇒ Boolean

Returns whether or not the regular expression re2 was compiled with the case_sensitive option set to true.

Examples:

re2 = RE2::Regexp.new("woo?", :case_sensitive => true)
re2.case_sensitive?    #=> true

Returns:

  • (Boolean)

    the case_sensitive option



# File 'ext/re2/re2.cc', line 1097

static VALUE re2_regexp_case_sensitive(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  return BOOL2RUBY(p->pattern->options().case_sensitive());
}

#casefold? ⇒ Boolean

Returns whether or not the regular expression re2 was compiled with the case_sensitive option set to false.

Examples:

re2 = RE2::Regexp.new("woo?", :case_sensitive => true)
re2.case_insensitive?    #=> false
re2.casefold?    #=> false

Returns:

  • (Boolean)

    the inverse of the case_sensitive option



# File 'ext/re2/re2.cc', line 1114

static VALUE re2_regexp_case_insensitive(const VALUE self) {
  return BOOL2RUBY(re2_regexp_case_sensitive(self) != Qtrue);
}

#error ⇒ String, nil

If the RE2 could not be created properly, returns an error string; otherwise returns nil.

Returns:

  • (String, nil)

    the error string or nil



# File 'ext/re2/re2.cc', line 1172

static VALUE re2_regexp_error(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  if (p->pattern->ok()) {
    return Qnil;
  } else {
    return rb_str_new(p->pattern->error().data(), p->pattern->error().size());
  }
}

#error_arg ⇒ String, nil

If the RE2 could not be created properly, returns the offending portion of the regexp; otherwise returns nil.

Note: RE2 supports only UTF-8 and ISO-8859-1 encodings, so strings are returned in UTF-8 by default, or in ISO-8859-1 if the :utf8 option for the RE2::Regexp is set to false (behaviour with any other encoding is undefined).

Returns:

  • (String, nil)

    the offending portion of the regexp or nil



# File 'ext/re2/re2.cc', line 1193

static VALUE re2_regexp_error_arg(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  if (p->pattern->ok()) {
    return Qnil;
  } else {
    return encoded_str_new(p->pattern->error_arg().data(),
        p->pattern->error_arg().size(),
        p->pattern->options().encoding());
  }
}

#inspect ⇒ String

Returns a printable version of the regular expression re2.

Note: RE2 supports only UTF-8 and ISO-8859-1 encodings, so strings are returned in UTF-8 by default, or in ISO-8859-1 if the :utf8 option for the RE2::Regexp is set to false (behaviour with any other encoding is undefined).

Examples:

re2 = RE2::Regexp.new("woo?")
re2.inspect    #=> "#<RE2::Regexp /woo?/>"

Returns:

  • (String)

    a printable version of the regular expression



# File 'ext/re2/re2.cc', line 926

static VALUE re2_regexp_inspect(const VALUE self) {
  re2_pattern *p;

  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  std::ostringstream output;

  output << "#<RE2::Regexp /" << p->pattern->pattern() << "/>";

  return encoded_str_new(output.str().data(), output.str().length(),
      p->pattern->options().encoding());
}

#literal? ⇒ Boolean

Returns whether or not the regular expression re2 was compiled with the literal option set to true.

Examples:

re2 = RE2::Regexp.new("woo?", :literal => true)
re2.literal?    #=> true

Returns:

  • (Boolean)

    the literal option



# File 'ext/re2/re2.cc', line 1065

static VALUE re2_regexp_literal(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  return BOOL2RUBY(p->pattern->options().literal());
}

#log_errors? ⇒ Boolean

Returns whether or not the regular expression re2 was compiled with the log_errors option set to true.

Examples:

re2 = RE2::Regexp.new("woo?", :log_errors => true)
re2.log_errors?    #=> true

Returns:

  • (Boolean)

    the log_errors option



# File 'ext/re2/re2.cc', line 1033

static VALUE re2_regexp_log_errors(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  return BOOL2RUBY(p->pattern->options().log_errors());
}

#longest_match? ⇒ Boolean

Returns whether or not the regular expression re2 was compiled with the longest_match option set to true.

Examples:

re2 = RE2::Regexp.new("woo?", :longest_match => true)
re2.longest_match?    #=> true

Returns:

  • (Boolean)

    the longest_match option



# File 'ext/re2/re2.cc', line 1017

static VALUE re2_regexp_longest_match(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  return BOOL2RUBY(p->pattern->options().longest_match());
}

#match(text) ⇒ RE2::MatchData, Boolean
#match(text, 0) ⇒ Boolean
#match(text, number_of_submatches) ⇒ RE2::MatchData

Match the pattern against the given text and return either a boolean (if no submatches are required) or a MatchData instance with the specified number of submatches (defaults to the total number of capturing groups).

The number of submatches has a significant impact on performance: requesting one submatch is much faster than requesting more than one and requesting zero submatches is faster still.

Overloads:

  • #match(text) ⇒ RE2::MatchData, Boolean

    Returns a MatchData containing the matching pattern and all submatches resulting from looking for the regexp in text if the pattern contains capturing groups, or nil if there is no match.

    Returns either true or false indicating whether a successful match was made if the pattern contains no capturing groups.

    Examples:

    Matching with capturing groups

    r = RE2::Regexp.new('w(o)(o)')
    r.match('woo')    #=> #<RE2::MatchData "woo" 1:"o" 2:"o">

    Matching without capturing groups

    r = RE2::Regexp.new('woo')
    r.match('woo')    #=> true

    Parameters:

    • text (String)

      the text to search

    Returns:

    • (RE2::MatchData)

      if the pattern contains capturing groups

    • (Boolean)

      if the pattern does not contain capturing groups

    Raises:

    • (NoMemoryError)

      if there was not enough memory to allocate the submatches

  • #match(text, 0) ⇒ Boolean

    Returns either true or false indicating whether a successful match was made.

    Examples:

    r = RE2::Regexp.new('w(o)(o)')
    r.match('woo', 0) #=> true
    r.match('bob', 0) #=> false

    Parameters:

    • text (String)

      the text to search

    Returns:

    • (Boolean)

      whether the match was successful

    Raises:

    • (NoMemoryError)

      if there was not enough memory to allocate the submatches

  • #match(text, number_of_submatches) ⇒ RE2::MatchData

    Like match(text), but returns a specific number of submatches (padded with nils if necessary).

    Examples:

    r = RE2::Regexp.new('w(o)(o)')
    r.match('woo', 1) #=> #<RE2::MatchData "woo" 1:"o">
    r.match('woo', 3) #=> #<RE2::MatchData "woo" 1:"o" 2:"o" 3:nil>

    Parameters:

    • text (String)

      the text to search

    • number_of_submatches (Integer)

      the number of submatches to return

    Returns:

    • (RE2::MatchData)

    Raises:

    • (ArgumentError)

      if given a negative number of submatches

    • (NoMemoryError)

      if there was not enough memory to allocate the matches




# File 'ext/re2/re2.cc', line 1368

static VALUE re2_regexp_match(int argc, VALUE *argv, const VALUE self) {
  re2_pattern *p;
  re2_matchdata *m;
  VALUE text, number_of_submatches;

  rb_scan_args(argc, argv, "11", &text, &number_of_submatches);

  /* Ensure text is a string. */
  StringValue(text);

  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  int n;

  if (RTEST(number_of_submatches)) {
    n = NUM2INT(number_of_submatches);

    if (n < 0) {
      rb_raise(rb_eArgError, "number of matches should be >= 0");
    }
  } else {
    if (!p->pattern->ok()) {
      return Qnil;
    }

    n = p->pattern->NumberOfCapturingGroups();
  }

  if (n == 0) {
#ifdef HAVE_ENDPOS_ARGUMENT
    bool matched = p->pattern->Match(RSTRING_PTR(text), 0,
        RSTRING_LEN(text), RE2::UNANCHORED, 0, 0);
#else
    bool matched = p->pattern->Match(RSTRING_PTR(text), 0, RE2::UNANCHORED,
        0, 0);
#endif
    return BOOL2RUBY(matched);
  } else {
    /* Because match returns the whole match as well. */
    n += 1;

    VALUE matchdata = rb_class_new_instance(0, 0, re2_cMatchData);
    TypedData_Get_Struct(matchdata, re2_matchdata, &re2_matchdata_data_type, m);
    m->matches = new(std::nothrow) re2::StringPiece[n];
    RB_OBJ_WRITE(matchdata, &m->regexp, self);
    if (!RTEST(rb_obj_frozen_p(text))) {
      text = rb_str_freeze(rb_str_dup(text));
    }
    RB_OBJ_WRITE(matchdata, &m->text, text);

    if (m->matches == 0) {
      rb_raise(rb_eNoMemError,
               "not enough memory to allocate StringPieces for matches");
    }

    m->number_of_matches = n;

#ifdef HAVE_ENDPOS_ARGUMENT
    bool matched = p->pattern->Match(RSTRING_PTR(m->text), 0,
        RSTRING_LEN(m->text), RE2::UNANCHORED, m->matches, n);
#else
    bool matched = p->pattern->Match(RSTRING_PTR(m->text), 0,
        RE2::UNANCHORED, m->matches, n);
#endif
    if (matched) {
      return matchdata;
    } else {
      return Qnil;
    }
  }
}
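
The argument handling in the listing above can be summarised in a few lines of Ruby. This is a sketch of the decision logic only, under the assumption that the pattern compiled successfully (the listing returns nil early when it did not and no count is given). `match_plan` is a hypothetical helper, not part of the re2 gem, and `capturing_groups` stands in for the pattern's number of capturing groups.

```ruby
# Hypothetical helper (not part of the re2 gem) that models how #match
# chooses between a boolean result and a MatchData result.
def match_plan(capturing_groups, number_of_submatches = nil)
  if number_of_submatches && number_of_submatches < 0
    raise ArgumentError, "number of matches should be >= 0"
  end

  # With no explicit count, every capturing group in the pattern is requested.
  n = number_of_submatches || capturing_groups

  # Zero submatches: only a true/false answer is produced.
  return { returns: :boolean, slots: 0 } if n.zero?

  # One extra slot, because the whole match is stored with the submatches.
  { returns: :match_data_or_nil, slots: n + 1 }
end

puts match_plan(2)[:slots]      # prints 3
puts match_plan(2, 0)[:returns] # prints boolean
puts match_plan(2, 3)[:slots]   # prints 4
```

The slot count is what makes requesting fewer submatches cheaper: a call that asks for zero never allocates match storage at all.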

#match?(text) ⇒ Boolean

Returns true or false to indicate a successful match. Equivalent to re2.match(text, 0).

Returns:

  • (Boolean)

    whether the match was successful



# File 'ext/re2/re2.cc', line 1446

static VALUE re2_regexp_match_p(const VALUE self, VALUE text) {
  VALUE argv[2] = { text, INT2FIX(0) };

  return re2_regexp_match(2, argv, self);
}

#max_mem ⇒ Integer

Returns the max_mem setting for the regular expression re2.

Examples:

re2 = RE2::Regexp.new("woo?", :max_mem => 1024)
re2.max_mem    #=> 1024

Returns:

  • (Integer)

    the max_mem option



# File 'ext/re2/re2.cc', line 1049

static VALUE re2_regexp_max_mem(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  return INT2FIX(p->pattern->options().max_mem());
}

#named_capturing_groups ⇒ Hash

Returns a hash mapping the names of capturing groups to their indices.

Note: RE2 supports only UTF-8 and ISO-8859-1 encodings, so strings are returned in UTF-8 by default, or in ISO-8859-1 if the :utf8 option for the RE2::Regexp is set to false (behaviour with any other encoding is undefined).

Returns:

  • (Hash)

    a hash of names to capturing indices



# File 'ext/re2/re2.cc', line 1294

static VALUE re2_regexp_named_capturing_groups(const VALUE self) {
  re2_pattern *p;

  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);
  const std::map<std::string, int>& groups = p->pattern->NamedCapturingGroups();
  VALUE capturing_groups = rb_hash_new();

  for (std::map<std::string, int>::const_iterator it = groups.begin(); it != groups.end(); ++it) {
    rb_hash_aset(capturing_groups,
        encoded_str_new(it->first.data(), it->first.size(),
          p->pattern->options().encoding()),
        INT2FIX(it->second));
  }

  return capturing_groups;
}
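
For a feel of the shape of the returned hash, the following analogy uses Ruby's built-in Regexp instead of the re2 gem. Ruby's own `named_captures` maps each name to an array of indices, so the sketch takes the first index to get a name-to-index hash like the one described above. The pattern and variable names are made up for illustration.

```ruby
# Analogy using Ruby's built-in Regexp, not the re2 gem. Ruby maps each name
# to an array of group indices; taking the first index of each array gives a
# hash from name to a single index.
date = /(?<year>\d{4})-(?<month>\d{2})/
name_to_index = date.named_captures.transform_values(&:first)

puts name_to_index["year"]  # prints 1
puts name_to_index["month"] # prints 2
```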

#never_nl? ⇒ Boolean

Returns whether or not the regular expression re2 was compiled with the never_nl option set to true.

Examples:

re2 = RE2::Regexp.new("woo?", :never_nl => true)
re2.never_nl?    #=> true

Returns:

  • (Boolean)

    the never_nl option



# File 'ext/re2/re2.cc', line 1081

static VALUE re2_regexp_never_nl(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  return BOOL2RUBY(p->pattern->options().never_nl());
}

#number_of_capturing_groups ⇒ Integer

Returns the number of capturing subpatterns, or -1 if the regexp wasn’t valid on construction. The overall match ($0) does not count: if the regexp is “(a)(b)”, returns 2.

Returns:

  • (Integer)

    the number of capturing subpatterns



# File 'ext/re2/re2.cc', line 1278

static VALUE re2_regexp_number_of_capturing_groups(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  return INT2FIX(p->pattern->NumberOfCapturingGroups());
}

#ok? ⇒ Boolean

Returns whether the regular expression re2 was compiled successfully.

Examples:

re2 = RE2::Regexp.new("woo?")
re2.ok?    #=> true

Returns:

  • (Boolean)

    whether or not compilation was successful



# File 'ext/re2/re2.cc', line 969

static VALUE re2_regexp_ok(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  return BOOL2RUBY(p->pattern->ok());
}

#one_line? ⇒ Boolean

Returns whether or not the regular expression re2 was compiled with the one_line option set to true.

Examples:

re2 = RE2::Regexp.new("woo?", :one_line => true)
re2.one_line?    #=> true

Returns:

  • (Boolean)

    the one_line option



# File 'ext/re2/re2.cc', line 1159

static VALUE re2_regexp_one_line(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  return BOOL2RUBY(p->pattern->options().one_line());
}

#options ⇒ Hash

Returns a hash of the options currently set for re2.

Returns:

  • (Hash)

    the options



# File 'ext/re2/re2.cc', line 1226

static VALUE re2_regexp_options(const VALUE self) {
  re2_pattern *p;

  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);
  VALUE options = rb_hash_new();

  rb_hash_aset(options, ID2SYM(id_utf8),
      BOOL2RUBY(p->pattern->options().encoding() == RE2::Options::EncodingUTF8));

  rb_hash_aset(options, ID2SYM(id_posix_syntax),
      BOOL2RUBY(p->pattern->options().posix_syntax()));

  rb_hash_aset(options, ID2SYM(id_longest_match),
      BOOL2RUBY(p->pattern->options().longest_match()));

  rb_hash_aset(options, ID2SYM(id_log_errors),
      BOOL2RUBY(p->pattern->options().log_errors()));

  rb_hash_aset(options, ID2SYM(id_max_mem),
      INT2FIX(p->pattern->options().max_mem()));

  rb_hash_aset(options, ID2SYM(id_literal),
      BOOL2RUBY(p->pattern->options().literal()));

  rb_hash_aset(options, ID2SYM(id_never_nl),
      BOOL2RUBY(p->pattern->options().never_nl()));

  rb_hash_aset(options, ID2SYM(id_case_sensitive),
      BOOL2RUBY(p->pattern->options().case_sensitive()));

  rb_hash_aset(options, ID2SYM(id_perl_classes),
      BOOL2RUBY(p->pattern->options().perl_classes()));

  rb_hash_aset(options, ID2SYM(id_word_boundary),
      BOOL2RUBY(p->pattern->options().word_boundary()));

  rb_hash_aset(options, ID2SYM(id_one_line),
      BOOL2RUBY(p->pattern->options().one_line()));

  /* This is a read-only hash after all... */
  rb_obj_freeze(options);

  return options;
}

#pattern ⇒ String

Returns a string version of the regular expression re2.

Note: RE2 supports only UTF-8 and ISO-8859-1 encodings, so strings are returned in UTF-8 by default, or in ISO-8859-1 if the :utf8 option for the RE2::Regexp is set to false (behaviour with any other encoding is undefined).

Examples:

re2 = RE2::Regexp.new("woo?")
re2.to_s    #=> "woo?"

Returns:

  • (String)

    a string version of the regular expression



# File 'ext/re2/re2.cc', line 951

static VALUE re2_regexp_to_s(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  return encoded_str_new(p->pattern->pattern().data(),
      p->pattern->pattern().size(),
      p->pattern->options().encoding());
}

#perl_classes? ⇒ Boolean

Returns whether or not the regular expression re2 was compiled with the perl_classes option set to true.

Examples:

re2 = RE2::Regexp.new("woo?", :perl_classes => true)
re2.perl_classes?    #=> true

Returns:

  • (Boolean)

    the perl_classes option



# File 'ext/re2/re2.cc', line 1127

static VALUE re2_regexp_perl_classes(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  return BOOL2RUBY(p->pattern->options().perl_classes());
}

#posix_syntax? ⇒ Boolean

Returns whether or not the regular expression re2 was compiled with the posix_syntax option set to true.

Examples:

re2 = RE2::Regexp.new("woo?", :posix_syntax => true)
re2.posix_syntax?    #=> true

Returns:

  • (Boolean)

    the posix_syntax option



# File 'ext/re2/re2.cc', line 1001

static VALUE re2_regexp_posix_syntax(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  return BOOL2RUBY(p->pattern->options().posix_syntax());
}

#program_size ⇒ Integer

Returns the program size, a very approximate measure of a regexp’s “cost”. Larger numbers are more expensive than smaller numbers.

Returns:

  • (Integer)

    the regexp “cost”



# File 'ext/re2/re2.cc', line 1213

static VALUE re2_regexp_program_size(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  return INT2FIX(p->pattern->ProgramSize());
}

#scan(text) ⇒ Object

Returns a Scanner for scanning the given text incrementally.

Examples:

c = RE2::Regexp.new('(\w+)').scan("Foo bar baz")


# File 'ext/re2/re2.cc', line 1458

static VALUE re2_regexp_scan(const VALUE self, VALUE text) {
  /* Ensure text is a string. */
  StringValue(text);

  re2_pattern *p;
  re2_scanner *c;

  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);
  VALUE scanner = rb_class_new_instance(0, 0, re2_cScanner);
  TypedData_Get_Struct(scanner, re2_scanner, &re2_scanner_data_type, c);

  c->input = new(std::nothrow) re2::StringPiece(RSTRING_PTR(text));
  RB_OBJ_WRITE(scanner, &c->regexp, self);
  RB_OBJ_WRITE(scanner, &c->text, text);

  if (p->pattern->ok()) {
    c->number_of_capturing_groups = p->pattern->NumberOfCapturingGroups();
  } else {
    c->number_of_capturing_groups = 0;
  }

  c->eof = false;

  return scanner;
}

#source ⇒ String

Returns a string version of the regular expression re2.

Note: RE2 supports only UTF-8 and ISO-8859-1 encodings, so strings are returned in UTF-8 by default, or in ISO-8859-1 if the :utf8 option for the RE2::Regexp is set to false (behaviour with any other encoding is undefined).

Examples:

re2 = RE2::Regexp.new("woo?")
re2.to_s    #=> "woo?"

Returns:

  • (String)

    a string version of the regular expression



# File 'ext/re2/re2.cc', line 951

static VALUE re2_regexp_to_s(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  return encoded_str_new(p->pattern->pattern().data(),
      p->pattern->pattern().size(),
      p->pattern->options().encoding());
}

#to_s ⇒ String

Returns a string version of the regular expression re2.

Note: RE2 supports only UTF-8 and ISO-8859-1 encodings, so strings are returned in UTF-8 by default, or in ISO-8859-1 if the :utf8 option for the RE2::Regexp is set to false (behaviour with any other encoding is undefined).

Examples:

re2 = RE2::Regexp.new("woo?")
re2.to_s    #=> "woo?"

Returns:

  • (String)

    a string version of the regular expression



# File 'ext/re2/re2.cc', line 951

static VALUE re2_regexp_to_s(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  return encoded_str_new(p->pattern->pattern().data(),
      p->pattern->pattern().size(),
      p->pattern->options().encoding());
}

#to_str ⇒ String

Returns a string version of the regular expression re2.

Note: RE2 supports only UTF-8 and ISO-8859-1 encodings, so strings are returned in UTF-8 by default, or in ISO-8859-1 if the :utf8 option for the RE2::Regexp is set to false (behaviour with any other encoding is undefined).

Examples:

re2 = RE2::Regexp.new("woo?")
re2.to_s    #=> "woo?"

Returns:

  • (String)

    a string version of the regular expression



# File 'ext/re2/re2.cc', line 951

static VALUE re2_regexp_to_s(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  return encoded_str_new(p->pattern->pattern().data(),
      p->pattern->pattern().size(),
      p->pattern->options().encoding());
}

#utf8? ⇒ Boolean

Returns whether or not the regular expression re2 was compiled with the utf8 option set to true.

Examples:

re2 = RE2::Regexp.new("woo?", :utf8 => true)
re2.utf8?    #=> true

Returns:

  • (Boolean)

    the utf8 option



# File 'ext/re2/re2.cc', line 985

static VALUE re2_regexp_utf8(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  return BOOL2RUBY(p->pattern->options().encoding() == RE2::Options::EncodingUTF8);
}

#word_boundary? ⇒ Boolean

Returns whether or not the regular expression re2 was compiled with the word_boundary option set to true.

Examples:

re2 = RE2::Regexp.new("woo?", :word_boundary => true)
re2.word_boundary?    #=> true

Returns:

  • (Boolean)

    the word_boundary option



# File 'ext/re2/re2.cc', line 1143

static VALUE re2_regexp_word_boundary(const VALUE self) {
  re2_pattern *p;
  TypedData_Get_Struct(self, re2_pattern, &re2_regexp_data_type, p);

  return BOOL2RUBY(p->pattern->options().word_boundary());
}