LUGOD: Linux Users' Group of Davis
 
Page last updated:
2003 Jun 11 13:35

The following is an archive of a post made to our 'vox-tech mailing list' by one of its subscribers.

Re: [vox-tech] Parsing Html

On Wed, Jun 11, 2003 at 03:08:27PM -0500, Jay Strauss wrote:
> I have the mongo piece of html I read from an online source.  I want to
> parse it, particularly I'm interested in a specific table (one of many
> within the html).  I'd like to get at that table and basically turn it into
> a perl data structure I can use
> like: array of array refs, that is an element for each row that points to an
> array of cells
>
> I tried to read and use HTML::Parser but I was overwhelmed.  Anyone know an
> easy way to do this?

I need more detail.

- Can you give the URL of the table?
  or
  if you can't give the URL, describe the table in general terms and give a
  sample of one or two table elements.

- Are you interested in a one-time parse
  or
  will you be re-running this on the changing contents of the page every day?


Normally I just use s/aaa/bbb/ expressions to chop the html crap out of
a table, then an m/()()()/ style match to convert the elements into a
useful data structure...
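
A minimal sketch of that approach, since the shape of the real table is not known yet. The sample HTML, the three-column layout, and the one-row-per-line formatting are all assumptions made up for illustration; a real page would need its own substitutions and its own number of capture groups.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for the one table of interest.
my $html = <<'HTML';
<table border="1">
<tr><td><b>AAPL</b></td><td>17.45</td><td align="right">1,200</td></tr>
<tr><td><b>IBM</b></td><td>82.10</td><td align="right">900</td></tr>
</table>
HTML

my @rows;    # array of array refs: one inner array of cells per table row
for my $line (split /\n/, $html) {
    next unless $line =~ /<tr\b/i;                  # assumes one row per line
    $line =~ s{</t[dh]>\s*<t[dh]\b[^>]*>}{\t}gi;    # cell boundary -> tab
    $line =~ s/<[^>]+>//g;                          # chop out remaining tags
    # m/()()()/ style: one capture group per expected column (three here).
    if (my @cells = $line =~ /^([^\t]*)\t([^\t]*)\t([^\t]*)$/) {
        push @rows, \@cells;
    }
}

# Prints "AAPL | 17.45 | 1,200" and then "IBM | 82.10 | 900".
print join(' | ', @$_), "\n" for @rows;
```

Individual cells are then reachable as $rows[0][1] and so on. This is quick for a one-time parse of a page whose layout you can see, but it is likely to break if the page changes its markup, which is why the one-time versus every-day question matters.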

-- 
GPG key: http://simons-clan.com/~msimons/gpg/msimons.asc
Fingerprint: 524D A726 77CB 62C9 4D56  8109 E10C 249F B7FA ACBE

_______________________________________________
vox-tech mailing list
vox-tech@lists.lugod.org
http://lists.lugod.org/mailman/listinfo/vox-tech







