LUGOD: Linux Users' Group of Davis
 
Page last updated:
2008 Dec 02 05:57

The following is an archive of a post made to our 'vox-tech mailing list' by one of its subscribers.

[vox-tech] ARE (Tcl / Postgresql) REGEX question



Hi,

I have a rather complex (for me) regular expression that I am trying to figure 
out.

Here is an example that works just fine:

-- I am trying to extract the two colors:
-- 10YR 6/4 and 7.5YR 4/4 from the following block of text
SELECT regexp_matches('B11t Light yellowish brown (10YR 6/4) gravelly clay
loam, brown to dark brown (7.5YR 4/4) moist; weak coarse subangular blocky;
hard, friable, sticky and plastic; few very fine and many fine and medium
roots; many very fine and fine interstital and tubular pores; few thin clay
films lining pores; pH 5.4; clear smooth boundary.',
E'([0-9]?[\\.]?[0-9][Y|y|R|r]+[ ]+?[0-9]/[0-9]).*?([0-9]?[\\.]?[0-9][Y|y|R|r]+[ ]+?[0-9]/[0-9])');

      regexp_matches      
--------------------------
 {"10YR 6/4","7.5YR 4/4"}
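
For anyone who wants to poke at the pattern outside the database, here is a minimal sketch using Python's re module. That is an assumption on my part: Python's engine is not the Tcl/PostgreSQL ARE engine, and the two can differ on greedy/non-greedy details, although for this input it reproduces the result shown above. The names COLOR and TWO_COLORS are made up for the sketch, and the text is abbreviated. It also shows one thing worth knowing about the character class: inside brackets "|" is a literal character, not alternation, so [Y|y|R|r] accepts a pipe as well.

```python
import re

# One Munsell-style color such as "10YR 6/4" or "7.5YR 4/4", copied from the
# query above (the SQL escape E'\\.' becomes \. in a Python raw string).
COLOR = r"[0-9]?[\.]?[0-9][Y|y|R|r]+[ ]+?[0-9]/[0-9]"
TWO_COLORS = re.compile("(" + COLOR + ").*?(" + COLOR + ")")

# Abbreviated copy of the horizon description from the post.
text = ("B11t Light yellowish brown (10YR 6/4) gravelly clay loam, "
        "brown to dark brown (7.5YR 4/4) moist; weak coarse subangular blocky")

match = TWO_COLORS.search(text)
print(match.groups())  # ('10YR 6/4', '7.5YR 4/4')

# "|" inside a bracket expression is literal, so this odd string is accepted:
print(bool(re.fullmatch(COLOR, "10Y| 6/4")))  # True
```

Writing the class as [YyRr] would keep the intended letters and reject the pipe.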



However, the same pattern returns no match when the text contains only one color:

SELECT regexp_matches('B11t Light yellowish brown (10YR 6/4) gravelly clay
loam; weak coarse subangular blocky; hard, friable, sticky and plastic; few
very fine and many fine and medium roots; many very fine and fine interstital
and tubular pores; few thin clay films lining pores; pH 5.4; clear smooth
boundary.',
E'([0-9]?[\\.]?[0-9][Y|y|R|r]+[ ]+?[0-9]/[0-9]).*?([0-9]?[\\.]?[0-9][Y|y|R|r]+[ ]+?[0-9]/[0-9])');


I have tried making the second capturing group optional by appending the '?'
operator to it. The single-color example is then parsed correctly, but the
two-color example no longer captures the second color:

SELECT regexp_matches('B11t Light yellowish brown (10YR 6/4) gravelly clay
loam, brown to dark brown (7.5YR 4/4) moist; weak coarse subangular blocky;
hard, friable, sticky and plastic; few very fine and many fine and medium
roots; many very fine and fine interstital and tubular pores; few thin clay
films lining pores; pH 5.4; clear smooth boundary.',
E'([0-9]?[\\.]?[0-9][Y|y|R|r]+[ ]+?[0-9]/[0-9]).*?([0-9]?[\\.]?[0-9][Y|y|R|r]+[ ]+?[0-9]/[0-9])?');

  regexp_matches   
-------------------
 {"10YR 6/4",NULL}
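
A likely reason for the NULL: the non-greedy '.*?' is happy to match nothing at all, and once the second group is optional it can match nothing too, so the overall match succeeds right after the first color. The sketch below reproduces this with Python's re module and tries one possible workaround, making the gap and the second color optional together with a non-capturing group. Treat it as a sketch under assumptions: Python's engine is not the ARE engine, ARE decides greediness differently for mixed patterns, and I have not verified the workaround in Tcl or PostgreSQL. The names tail_optional, gap_optional and colors are hypothetical, and the sample texts are abbreviated.

```python
import re

# Single-color sub-pattern taken from the post (Python raw-string form).
COLOR = r"[0-9]?[\.]?[0-9][Y|y|R|r]+[ ]+?[0-9]/[0-9]"

# Abbreviated versions of the two horizon descriptions.
two = ("B11t Light yellowish brown (10YR 6/4) gravelly clay loam, "
       "brown to dark brown (7.5YR 4/4) moist; pH 5.4; clear smooth boundary.")
one = ("B11t Light yellowish brown (10YR 6/4) gravelly clay loam; "
       "weak coarse subangular blocky; pH 5.4; clear smooth boundary.")

# Variant from the post: only the second capturing group is optional.
tail_optional = re.compile("(" + COLOR + ").*?(" + COLOR + ")?")

# Possible workaround: the gap and the second color are optional together.
gap_optional = re.compile("(" + COLOR + ")(?:.*?(" + COLOR + "))?")


def colors(pattern, text):
    """Return the captured groups as a tuple, or None when nothing matches."""
    m = pattern.search(text)
    return m.groups() if m else None


print(colors(tail_optional, two))  # ('10YR 6/4', None)
print(colors(gap_optional, two))   # ('10YR 6/4', '7.5YR 4/4')
print(colors(gap_optional, one))   # ('10YR 6/4', None)
```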


Any ideas on how to improve this regex?
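
One direction that might be worth trying is to stop encoding "one or two colors" in the pattern at all: match a single color and ask the engine for every occurrence. The sketch below does that with Python's re.findall, which is my stand-in here and not the ARE engine. In PostgreSQL, regexp_matches accepts a 'g' flag that returns one row per match, which should allow the same approach, though that is not tested here. The tidied pattern and the name extract_colors are hypothetical, and the sample texts are abbreviated.

```python
import re

# Hypothetical tidied single-color pattern: [YyRr] drops the literal "|"
# characters that the original class [Y|y|R|r] also accepted.
COLOR = re.compile(r"[0-9]?\.?[0-9][YyRr]+ +[0-9]/[0-9]")


def extract_colors(text):
    """Return every color found in the text, in order of appearance."""
    return COLOR.findall(text)


two = ("B11t Light yellowish brown (10YR 6/4) gravelly clay loam, "
       "brown to dark brown (7.5YR 4/4) moist; pH 5.4; clear smooth boundary.")
one = ("B11t Light yellowish brown (10YR 6/4) gravelly clay loam; "
       "weak coarse subangular blocky; pH 5.4; clear smooth boundary.")

print(extract_colors(two))  # ['10YR 6/4', '7.5YR 4/4']
print(extract_colors(one))  # ['10YR 6/4']
```

This also copes with descriptions that list three or more colors, which the two-group pattern cannot express.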

Thanks!

Dylan


-- 
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341
_______________________________________________
vox-tech mailing list
vox-tech@lists.lugod.org
http://lists.lugod.org/mailman/listinfo/vox-tech


