[kwlug-disc] BASH compare items in two files
John Van Ostrand
john at netdirect.ca
Wed Nov 3 14:37:47 EDT 2010
----- Original Message -----
> ----- Original Message -----
> > ----- Original Message -----
> > > A higher level language (python, perl, C, etc) program would be your
> > > best bet as they already have the XML parsing and data handling
> > > libraries that will make this task a cinch.
> > >
> > > 1. Load the list of IDs into a searchable list
> > > 2. Using a SAX parser compare the ID of every node against the list
> > > 3. Done!
> >
> > Oops I should read more carefully. He asked what a real programmer
> > would do.
>
> So my take on it is this. Use the document object model in some
> language. (SAX is a streaming, event-driven alternative rather than a
> DOM.) Javascript can do this (jQuery), as can the perl-XML-XPathEngine.
> I suspect php-domxml can as well. One uses a query language to search
> for the XML tags and iterates through them looking for matches or
> misses.
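The three quoted steps (load the IDs, stream the XML, compare as you go) could be sketched roughly like this with PHP's event-based xml_* parser functions. This is only an illustration under assumptions: the function name count_ids_sax and the sample data are made up, the elements are assumed to be <foo id="..."> as in this thread, and the xml extension is assumed to be installed.

```php
<?php
// Hypothetical helper: count <foo> elements whose id is / is not in $ids,
// using PHP's event-based (SAX-style) xml_* parser functions.
function count_ids_sax(string $xml, array $ids): array
{
    // Step 1: a searchable list; array keys give fast lookups
    $lookup = array_flip($ids);
    $counts = ['match' => 0, 'nomatch' => 0];

    $parser = xml_parser_create();
    // Keep element and attribute names as written (the default upper-cases them)
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    // Step 2: the first callback runs once per opening tag as input is parsed
    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$counts, $lookup) {
            if ($name !== 'foo') {
                return;
            }
            // A <foo> without an id is counted as a miss
            $id = $attrs['id'] ?? '';
            $counts[isset($lookup[$id]) ? 'match' : 'nomatch']++;
        },
        function ($p, $name) {
            // Closing tags need no action here
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }
    return $counts;
}

// Sample data made up for illustration
$xml = '<root><foo id="a1"/><foo id="b2"/><foo id="c3"/></root>';
print_r(count_ids_sax($xml, ['a1', 'c3', 'z9']));
```

With the sample data this reports two matches (a1, c3) and one miss (b2). Unlike a DOM approach, nothing is kept in memory beyond the ID list and the counters, which may matter if the XML file is large.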
Here is a PHP version using the DOM extension:
#!/usr/bin/php
<?php
# Load the ID file into an array keyed by ID
if (!$fd = fopen("/tmp/id.txt", 'r')) {
    die("Unable to open /tmp/id.txt\n");
}
$id_list = array();
while (($line = fgets($fd)) !== false) {
    # Trim whitespace and skip blank lines
    $line = trim($line);
    if ($line === '') {
        continue;
    }
    # Add to array
    $id_list[$line] = true;
}
fclose($fd);
# Load the XML file
$d = new DOMDocument();
if (!$d->load("/tmp/id.xml")) {
    die("Unable to load /tmp/id.xml\n");
}
# Get a list of all <foo> tags
$nl = $d->getElementsByTagName("foo");
# Cycle through them, counting matches and misses
$match = 0;
$nomatch = 0;
$l = $nl->length;
for ($i = 0; $i < $l; $i++) {
    # Extract ID
    $id = $nl->item($i)->getAttribute("id");
    # Check array:
    if (isset($id_list[$id])) {
        $match++;
    } else {
        $nomatch++;
    }
}
echo "Matching : $match\n";
echo "Non-matching: $nomatch\n";
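The "query language" idea mentioned in the quoted text could also be tried with DOMXPath, which lets the query pick out the id attributes directly. A minimal sketch, assuming the same <foo id="..."> structure; the function name split_ids_xpath and the sample data are made up for illustration, and this variant returns the IDs themselves rather than just counts.

```php
<?php
// Hypothetical helper: split the id attributes of all <foo> elements into
// those found in $ids and those not, using an XPath query.
function split_ids_xpath(string $xml, array $ids): array
{
    // Array keys give fast lookups
    $lookup = array_flip($ids);

    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        throw new RuntimeException('Unable to parse XML');
    }

    $result = ['match' => [], 'nomatch' => []];
    $xpath = new DOMXPath($doc);
    // Select the id attribute node of every <foo> at any depth;
    // <foo> elements without an id are simply not selected
    foreach ($xpath->query('//foo/@id') as $attr) {
        $key = isset($lookup[$attr->value]) ? 'match' : 'nomatch';
        $result[$key][] = $attr->value;
    }
    return $result;
}

// Sample data made up for illustration
$xml = '<root><foo id="a1"/><foo id="b2"/><foo id="c3"/></root>';
print_r(split_ids_xpath($xml, ['a1', 'c3', 'z9']));
```

With the sample data, a1 and c3 land under match and b2 under nomatch. Note that, unlike the loop above, a <foo> with no id attribute is skipped here instead of being counted as a miss.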
--
John Van Ostrand
CTO, co-CEO
Net Direct Inc.
564 Weber St. N. Unit 12, Waterloo, ON N2L 5C6
Ph: 866-883-1172 x5102
Fx: 519-883-8533
Linux Solutions / IBM Hardware