[kwlug-disc] wget and variable assignment
John Van Ostrand
john at netdirect.ca
Thu Jun 3 18:53:18 EDT 2010
----- Original Message -----
> I have a simple screen-scrape to do.
>
> From the command line it works fine
>
> wget -q -O - http://www.openstreetmap.org/stats/data_stats.html | grep "<td>Number of users" | sed -e 's/[:a-zA-Z <>/:]//g'
>
> it returns the plain number
>
> 262086
>
> Cool, now to add it to a script
>
> This works fine
> GETTEE=`wget -q -O - http://www.openstreetmap.org/stats/data_stats.html | grep "<td>Number of users" | sed -e 's/[:a-zA-Z <>/:]//g'`
> echo "GETTEE = $GETTEE"
>
> gives: GETTEE = 262086
>
> But. I want to grab some other data from the same page, so I want to
> wget once, then grep / sed a couple of times. And I'm breaking it.
> The page appears to have been stripped of its \n and so grepping the
> line I want is failing.
>
> GETTEE=`wget -q -O - http://www.openstreetmap.org/stats/data_stats.html`
> echo "GETTEE = $GETTEE"
>
> This returns a mess.
>
> The quick and dirty is to wget four times for four numbers, but I
> don't want to do that. How do I assign the wget to a variable and
> keep \n ?
>
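That's because the newlines are lost to word splitting when the variable is expanded, not when it is assigned. IFS includes newline by default, so an unquoted expansion comes back as one long line. Quoting the expansion, as in echo "$PAGE", keeps the newlines; setting IFS to just space and tab also works: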
PAGE=`wget -q -O - http://www.openstreetmap.org/stats/data_stats.html`
export IFS="$(printf ' \t')"   # space and tab only, no newline
echo "$PAGE" | grep "<td>Number of users" | sed -e 's/[:a-zA-Z <>/:]//g'
echo "$PAGE" | grep "<td>Number of users" | sed -e 's/[:a-zA-Z <>/:]//g'
echo "$PAGE" | grep "<td>Number of users" | sed -e 's/[:a-zA-Z <>/:]//g'
echo "$PAGE" | grep "<td>Number of users" | sed -e 's/[:a-zA-Z <>/:]//g'
unset IFS
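
To make the fetch-once, grep-several-times idea concrete, here is a minimal sketch, assuming a POSIX shell. The sample PAGE text stands in for the wget output so the script runs offline, and the "Number of sample rows" row, the extract helper, and the variable names are all made up for illustration. It also swaps in a simpler sed pattern that deletes every non-digit, which matches the pattern above only when the row holds a single number.

```shell
#!/bin/sh
# Stand-in for the real fetch (assumption, so the sketch runs offline):
#   PAGE=`wget -q -O - http://www.openstreetmap.org/stats/data_stats.html`
# Only the "Number of users" label and its value come from this thread;
# the second row is invented for illustration.
PAGE='<table>
<tr><td>Number of users</td><td>262086</td></tr>
<tr><td>Number of sample rows</td><td>12345</td></tr>
</table>'

# Hypothetical helper: print the digits from the row whose label is $1.
# The quoted "$PAGE" keeps the newlines, so grep sees one row per line.
extract() {
    printf '%s\n' "$PAGE" | grep "<td>$1" | sed -e 's/[^0-9]//g'
}

USERS=$(extract "Number of users")
ROWS=$(extract "Number of sample rows")
echo "users = $USERS"
echo "rows = $ROWS"

# Count lines to compare the two expansions (grep -c '' counts lines).
QUOTED=$(echo "$PAGE" | grep -c '')   # newlines kept
UNQUOTED=$(echo $PAGE | grep -c '')   # word splitting joins it into one line
echo "quoted: $QUOTED lines, unquoted: $UNQUOTED line"
```

With the default IFS, the unquoted echo is the case that turns the page into a single line, so the grep for one row ends up matching everything at once.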
--
John Van Ostrand
CTO, co-CEO
Net Direct Inc.
564 Weber St. N. Unit 12, Waterloo, ON N2L 5C6
Ph: 866-883-1172 x5102
Fx: 519-883-8533
Linux Solutions / IBM Hardware