[kwlug-disc] Help!
William Park
opengeometry at yahoo.ca
Wed Dec 31 09:10:42 EST 2014
On Wed, Dec 31, 2014 at 08:42:15AM -0500, Joe Wennechuk wrote:
> Hello All,
> Slightly off topic, but I know you guys can help. I have applied for a
> job, and they have asked me to write a Java class that searches HTML
> from websites for links. I am using this regex:
>
>   Pattern pattern = Pattern.compile("<a[^>]*>(.*?)</a>",
>       Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
>
> to find them, but based on the constraints I don't think I'm doing it
> right, as I am not finding all of the links. Here are the constraints.
> Can anyone help?
>
> Implementation constraints:
> * For simplification, assume that the link is defined as
>   '<[whitespace]a[whitespace]' or '<[whitespace]A[whitespace]'
>   ('<a ', '< a h', '<A >', '<a attr=' are all valid links).
Are they testing your Java knowledge?
- You are supposed to account for whitespace, e.g. '< a h'. Your
  regex requires the 'a' to come right after the '<', so it misses
  those links. That may be the problem.
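
Here is a rough sketch of what accounting for the whitespace could look
like. It is not a full answer to the exercise: only the one constraint
quoted above is visible, and I am reading it as "optional whitespace
after '<', then 'a' or 'A', then at least one whitespace character".
The class and method names (LinkFinder, findLinkStarts) are made up for
illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkFinder {
    // Assumed reading of the constraint: '<', optional whitespace,
    // 'a' or 'A', then at least one whitespace character.
    private static final Pattern LINK_START = Pattern.compile("<\\s*[aA]\\s");

    /** Returns the offset in html where each link opening begins. */
    static List<Integer> findLinkStarts(String html) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = LINK_START.matcher(html);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // Three openings that fit the constraint, plus <abbr>, which must not match.
        String html = "<a href=x>one</a> < a h>two</a> <A >three</A> <abbr>no</abbr>";
        System.out.println(findLinkStarts(html));
    }
}
```

Note that under this reading a bare '<a>' with no whitespace after the
'a' is not counted, and neither is '<abbr>', which the original regex
would have matched.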
Or, do they just want the list of links?
- If so, there are better ways to get the list of links, e.g.
    lynx -dump -listonly http://...
--
William