LUGOD: Linux Users' Group of Davis
 
Page last updated:
2001 Dec 30 17:10

The following is an archive of a post made to our 'vox-tech mailing list' by one of its subscribers.

Re: [vox-tech] Help with regular expressions needed.



Everybody wrote lots of sh*t:

My 2 cents on the banter.


This really isn't a difficult problem.  (Now I hope I'm right, as I don't
use kmail; if not, I'm certainly going to look the fool)  :-)


I have worked up a quick expression checker in Perl:

A data file:
------------------------------------------------------------------------------
one Capitals in a row in one place
two CApitals in a row in one place
three CAPitals in a row in one place
four CAPItals in a row in one place

One Capitals in a row in Two places
TWo Capitals in a row in TWo places
THRee Capitals in a row in TWO places
FOUR CAPItals in a row in Two places

One Capitals in a row in Three places
TWo CApitals in a row in THree places
THRee CAPitals in a row in THRee places
FOUR CAPItals in a row in THREe places
------------------------------------------------------------------------------


A perl program:
------------------------------------------------------------------------------
#!/usr/bin/perl -w
#
#  An example script to check for patterns of capital letters in perl.
#
while(<>)
{
  chomp();

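  # Uncomment exactly one of the if() tests below.  Two active tests in a
  # row will not compile, so the first rule is commented out here and the
  # last rule is the active one.

  # NO lower case letters (this test is not negated)
#  if( /^.*[a-z]+.*$/ )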

  # 2 capitals in a row
#  if(! /^.*[A-Z]{2,}.*$/ )

  # 3 capitals in a row
#  if(! /^.*[A-Z]{3,}.*$/ )
  
  # 4 capitals in a row
#  if(! /^.*[A-Z]{4,}.*$/ )

  # At least 2 caps in a row in at least two places
#  if(! /^.*[A-Z]{2,}.+[A-Z]{2,}.*$/ )

  # At least 3 caps in a row in at least two places
#  if(! /^.*[A-Z]{3,}.+[A-Z]{3,}.*$/ )

  # At least 4 caps in a row in at least two places
#  if(! /^.*[A-Z]{4,}.+[A-Z]{4,}.*$/ )

  # At least 2 caps in a row in at least three places
#  if(! /^.*[A-Z]{2,}.+[A-Z]{2,}.+[A-Z]{2,}.*$/ )

  # At least 3 caps in a row in at least three places
#  if(! /^.*[A-Z]{3,}.+[A-Z]{3,}.+[A-Z]{3,}.*$/ )

  # At least 4 caps in a row in at least three places
  if(! /^.*[A-Z]{4,}.+[A-Z]{4,}.+[A-Z]{4,}.*$/ )

  {
    print "Good: $_\n";
  }
  else
  {  
    print "Bad:  $_\n";
  }
} 
------------------------------------------------------------------------------
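
One way to avoid editing the script every time you want a different rule is to keep the patterns in a table and pick one by name.  This is only a sketch, not the script as posted: the %rules hash, the rule names, and is_bad() are invented for illustration, and the patterns leave off the leading ^.* and trailing .*$ because an unanchored match finds the same chomped lines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical rule table: each name maps to a pattern that marks a line "Bad".
my %rules = (
    nolower => qr/^[^a-z]*$/,                          # no lower case letters at all
    caps2x1 => qr/[A-Z]{2,}/,                          # 2 caps in a row, one place
    caps4x1 => qr/[A-Z]{4,}/,                          # 4 caps in a row, one place
    caps4x2 => qr/[A-Z]{4,}.+[A-Z]{4,}/,               # 4 caps in a row, two places
    caps4x3 => qr/[A-Z]{4,}.+[A-Z]{4,}.+[A-Z]{4,}/,    # 4 caps in a row, three places
);

# Return 1 when $line matches the named rule, 0 otherwise.
sub is_bad {
    my ($rule, $line) = @_;
    my $re = $rules{$rule} or die "unknown rule: $rule\n";
    return $line =~ $re ? 1 : 0;
}

# Try one rule against a few lines taken from the data file above.
my @samples = (
    'four CAPItals in a row in one place',
    'FOUR CAPItals in a row in Two places',
    'FOUR CAPItals in a row in THREe places',
);
for my $line (@samples) {
    printf "%s %s\n", is_bad('caps4x3', $line) ? 'Bad: ' : 'Good:', $line;
}
```

With the caps4x3 rule only the last sample comes out Bad, which agrees with the sample run for the last rule.  Note that nolower also flags an empty line, the same way the first rule does in the script.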


A sample run for the rule in question (the first one):
------------------------------------------------------------------------------
tres@ares:~/program/quick$ ./regexp1.pl regexp1.dat
Good: one Capitals in a row in one place
Good: two CApitals in a row in one place
Good: three CAPitals in a row in one place
Good: four CAPItals in a row in one place
Bad: 
Good: One Capitals in a row in Two places
Good: TWo Capitals in a row in TWo places
Good: THRee Capitals in a row in TWO places
Good: FOUR CAPItals in a row in Two places
Bad: 
Good: One Capitals in a row in Three places
Good: TWo CApitals in a row in THree places
Good: THRee CAPitals in a row in THRee places
Good: FOUR CAPItals in a row in THREe places
Bad: 
Bad:  NO LOWER CASE AT ALL
tres@ares:~/program/quick$
------------------------------------------------------------------------------



A sample run (for the last rule):
------------------------------------------------------------------------------
tres@ares:~/program/quick$ ./regexp1.pl regexp1.dat
Good: one Capitals in a row in one place
Good: two CApitals in a row in one place
Good: three CAPitals in a row in one place
Good: four CAPItals in a row in one place
Good:
Good: One Capitals in a row in Two places
Good: TWo Capitals in a row in TWo places
Good: THRee Capitals in a row in TWO places
Good: FOUR CAPItals in a row in Two places
Good:
Good: One Capitals in a row in Three places
Good: TWo CApitals in a row in THree places
Good: THRee CAPitals in a row in THRee places
Bad:  FOUR CAPItals in a row in THREe places
tres@ares:~/program/quick$
------------------------------------------------------------------------------


A line dissected:

        if(! /^.*[A-Z]{4,}.+[A-Z]{4,}.+[A-Z]{4,}.*$/ )

        This line looks for at least 4 capital letters in a row, at least
        3 times in a line, by doing:
            Find the beginning of the line
            Find at least 0 occurrences of any character
            Find at least 4 caps in a row
            Find at least 1 occurrence of any character
            Find at least 4 caps in a row
            Find at least 1 occurrence of any character
            Find at least 4 caps in a row
            Find at least 0 occurrences of any character
            Find the end of the line
        Then report success/failure (the ! inverts the test, so a match
        prints "Bad").
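
The same steps can be written into the pattern itself with Perl's /x modifier, which ignores whitespace in the pattern and allows comments.  A small sketch only; the variable name $three_runs is made up here, and the two test subjects are lines from the data file.

```perl
use strict;
use warnings;

# The last rule, spelled out with /x so each step carries its own comment.
my $three_runs = qr/
    ^             # beginning of the line
    .*            # zero or more of any character
    [A-Z]{4,}     # first run of at least 4 capitals
    .+            # one or more of any character
    [A-Z]{4,}     # second run of at least 4 capitals
    .+            # one or more of any character
    [A-Z]{4,}     # third run of at least 4 capitals
    .*            # zero or more of any character
    $             # end of the line
/x;

for my $subject ('FOUR CAPItals in a row in THREe places',
                 'FOUR CAPItals in a row in Two places') {
    print( ($subject =~ $three_runs ? 'Bad:  ' : 'Good: '), "$subject\n" );
}
```

The first subject has three runs (FOUR, CAPI, THRE) and is reported Bad; the second has only two and is reported Good.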

As you can see, this will fail for the case of a blank subject or in the
case that Micah brought up.  Some of the following rules can handle some
more elaborate cases.

The next challenge is to change the step of repeating

[A-Z]{4,}.+

to account for multiple occurrences.
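
Here is one possible take on that challenge, offered as a sketch rather than a finished answer.  The helper names caps_runs_re() and count_caps_runs() are invented for this example.  The first builds the same pattern as the hand-written rules for any run length and number of places; the second counts the runs directly instead of repeating the pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: build the pattern for "at least $len caps in a row
# in at least $places places" by joining copies of the run with .+
sub caps_runs_re {
    my ($len, $places) = @_;
    my $run     = sprintf '[A-Z]{%d,}', $len;
    my $pattern = join '.+', ($run) x $places;
    return qr/$pattern/;
}

# Hypothetical helper: count the separate runs of at least $len capitals.
sub count_caps_runs {
    my ($len, $line) = @_;
    my $run   = sprintf '[A-Z]{%d,}', $len;
    my $count = () = $line =~ /$run/g;   # list assignment in scalar context counts matches
    return $count;
}

my $line = 'FOUR CAPItals in a row in THREe places';
my $re   = caps_runs_re(4, 3);
print "pattern says: ", ($line =~ $re ? 'Bad' : 'Good'), "\n";
print "runs counted: ", count_caps_runs(4, $line), "\n";
```

The two approaches do not always agree.  Because .+ only needs one character, a single long run such as ABCDEFGHI satisfies the two-place pattern (ABCD, then E, then FGHI), while the counting version sees it as one run.  Which behaviour is wanted depends on the filter.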

This was supposed to take 10 minutes.  40 for the program, testing, and
email isn't too bad, though.

Cheers,
Tres



