LUGOD: Linux Users' Group of Davis
 
Page last updated:
2008 Jan 20 08:42

The following is an archive of a post made to our 'vox-tech mailing list' by one of its subscribers.

Re: [vox-tech] Finding the right tool for parsing



On Sat, 19 Jan 2008 13:50:08 -0800
Alex Mandel <tech_dev@wildintellect.com> wrote:

> I've got a big text file to parse(example below)
> Now I was about to figure out how to parse what I need from this file 
> using python since it's what I'm used to but I realized it was going
> to be hard and some of you seem to love sed, awk etc. Which I have no
> idea how to use.

How about in Ruby?

If your data is in the constant TEXT, you can parse it into nice
Directions structs:

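# One struct per route; fields follow the order of the lines in each record.
Directions = Struct.new(:address, :distance, :time, :directions, :destination)

# Records are separated by the "Map data" attribution line.
result = TEXT.split(/\s+\.\s+Map data ©2008 Tele Atlas\s+/m).collect do |record|
  lines = record.split("\n")
  # The second line looks like "2.0 mi (about 9 mins)".
  distance, time = lines[1].match(/^(.+)\((.+)\)/).captures
  Directions.new(lines[0], distance, time, lines[2..-2], lines[-1])
end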

The results would be:

[#<struct Directions
  address="618 Lessley Pl, Davis, CA 95616",
  distance="2.0 mi ",
  time="about 9 mins",
  directions=
   ["1.\tHead north on Lessley Pl toward Lehigh Dr\t335 ft",
    "2.\tTurn left at Lehigh Dr\t0.1 mi",
    "3.\tTurn right at Colgate Dr\t0.2 mi",
    "4.\tTurn left at L St\t0.2 mi",
    "5.\tTurn right at 5th St\t0.6 mi",
    "6.\tTurn left at B St\t427 ft",
    "7.\tTurn right at 4th St\t367 ft",
    "8.\tTurn left at University Ave\t499 ft",
    "9.\tTurn right at 3rd St\t0.2 mi",
    "10.\tTurn left at E Quad\t0.2 mi",
    "11.\tTurn right at Peter J Shields Ave/Shields Ave\t0.2 mi"],
  destination="\t1 Shields Ave, Davis, CA 95616">,
 #<struct Directions
  address="618 Lessley Pl, Davis, CA 95616",
  distance="2.0 mi ",
  time="about 9 mins",
  directions=
   ["1.\tHead north on Lessley Pl toward Lehigh Dr\t335 ft",
    "2.\tTurn left at Lehigh Dr\t0.1 mi",
    "3.\tTurn right at Colgate Dr\t0.2 mi",
    "4.\tTurn left at L St\t0.2 mi",
    "5.\tTurn right at 5th St\t0.6 mi",
    "6.\tTurn left at B St\t427 ft",
    "7.\tTurn right at 4th St\t367 ft",
    "8.\tTurn left at University Ave\t499 ft",
    "9.\tTurn right at 3rd St\t0.2 mi",
    "10.\tTurn left at E Quad\t0.2 mi",
    "11.\tTurn right at Peter J Shields Ave/Shields Ave\t0.2 mi"],
  destination="\t1 Shields Ave, Davis, CA 95616">,
 #<struct Directions
  address="1600 Amphitheatre Pkwy, Mountain View, CA 94043",
  distance="105 mi ",
  time="about 1 hour 54 mins",
  directions=
   ["1.\tHead west on Amphitheatre Pkwy toward Garcia Ave\t0.5 mi",
    "2.\tMerge onto US-101 N via the ramp to San Francisco\t4.7 mi",
    "3.\tExit onto CA-114/Willow Rd toward Fremont/State Hwy 84 E\t1.0 mi",
    "4.\tSlight right toward Bayfront Expy/CA-84\t495 ft",
    "5.\tSlight right at Bayfront Expy/CA-84 Continue to follow CA-84\t8.4 mi",
    "6.\tMerge onto I-880 N via the ramp to Oakland\t22 mi",
    "7.\tSlight right at I-980 E (signs for I-980/State Hwy 24/Walnut Creek) 1.5 mi",
    "8.\tTake the exit onto I-580 W toward San Francisco\t5.9 mi",
    "9.\tContinue on I-80 E (signs for Vallejo/Sacramento) Partial toll road\t60 mi",
    "10.\tTake exit 72 for Richards Blvd\t0.3 mi",
    "11.\tTurn right at Richards Blvd (signs for Davis)\t0.3 mi",
    "12.\tSlight right to stay on Richards Blvd\t0.2 mi",
    "13.\tTurn left at 1st St\t0.3 mi",
    "14.\tTurn right at A St\t0.2 mi",
    "15.\tTurn left at 3rd St\t0.1 mi",
    "16.\tTurn left at E Quad\t0.2 mi",
    "17.\tTurn right at Peter J Shields Ave/Shields Ave\t0.2 mi"],
  destination="\t1 Shields Ave, Davis, CA 95616">]
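
To try the same approach without the original file, here is a minimal, self-contained sketch. The SAMPLE data is hypothetical: its layout (address line, "distance (time)" line, numbered steps, destination line, then a "." line and the "Map data" attribution) is only my guess, inferred from the parsed output above. The names SEPARATOR, SAMPLE and parse_directions are made up for illustration, and the strip calls are an optional extra to drop the stray whitespace seen in distance and destination.

```ruby
# Struct with one field per piece of a route record.
Directions = Struct.new(:address, :distance, :time, :directions, :destination)

# Assumed record separator: a lone "." line followed by the attribution line.
SEPARATOR = /\s+\.\s+Map data ©2008 Tele Atlas\s+/

# Hypothetical sample input; the layout is an assumption, not the real file.
SAMPLE = [
  "618 Lessley Pl, Davis, CA 95616",
  "2.0 mi (about 9 mins)",
  "1.\tHead north on Lessley Pl toward Lehigh Dr\t335 ft",
  "2.\tTurn left at Lehigh Dr\t0.1 mi",
  "\t1 Shields Ave, Davis, CA 95616",
  ".",
  "Map data ©2008 Tele Atlas",
  "1600 Amphitheatre Pkwy, Mountain View, CA 94043",
  "105 mi (about 1 hour 54 mins)",
  "1.\tHead west on Amphitheatre Pkwy toward Garcia Ave\t0.5 mi",
  "\t1 Shields Ave, Davis, CA 95616",
  ".",
  "Map data ©2008 Tele Atlas",
  ""
].join("\n")

# Split the text into records, then pick fields out of each record by position.
def parse_directions(text)
  text.split(SEPARATOR).map do |record|
    lines = record.split("\n")
    # Second line: everything before "(" is the distance, inside "()" the time.
    distance, time = lines[1].match(/^(.+)\((.+)\)/).captures
    Directions.new(lines[0], distance.strip, time, lines[2..-2], lines[-1].strip)
  end
end

routes = parse_directions(SAMPLE)
routes.each do |r|
  puts "#{r.address} -> #{r.destination}: #{r.distance}, #{r.time}, #{r.directions.size} steps"
end
```

This relies on every record having at least an address line, a "distance (time)" line and a destination line; a record that does not fit that shape would make match return nil and raise an error, so real data may need a guard there.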

-- 
Ken (Chanoch) Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/


_______________________________________________
vox-tech mailing list
vox-tech@lists.lugod.org
http://lists.lugod.org/mailman/listinfo/vox-tech

