I recently discovered Wikipedia’s Current Events page, and find it a nice source of international news headlines. Too bad they don’t have an RSS feed, I thought.
A bit of PHP and “screen scraping”, and I came up with wikifeed.rss.
Feel free to subscribe to it yourselves (don’t be abusive, or I’ll have to take it down). Or, better yet, install your own version with the source file, wikifeed.php.
<?php
define('NL', "\n");
$root_url = 'http://en.wikipedia.org';
$source_url = $root_url . '/wiki/Current_events';
$source = file_get_contents($source_url);
$build_date = date('r');
header('Content-type: application/rss+xml');
echo <<< EOB
<?xml version="1.0"?>
<rss version="2.0">
<channel><title>WikiFeed: Current Events</title>
<link>{$source_url}</link>
<description>Current events, from Wikipedia, the free encyclopedia</description>
<language>en-us</language>
<lastBuildDate>{$build_date}</lastBuildDate>
<copyright>http://www.gnu.org/copyleft/fdl.html</copyright>
<generator>Colin Viebrock</generator>
EOB;
for ($i=0; $i<3; $i++) {
$now = mktime(0,0,0,date('m'),date('d')-$i,date('Y'));
$key = date('j_F_Y_.28l.29', $now);
$guid = date('Ymd', $now);
$pos = strpos($source, '/w/index.php?title=Current_events' );
$pos = strpos($source, $key, $pos);
$start = strpos($source,'<ul>', $pos);
$end = strpos($source,'</ul>', $start);
// Take only the text between '<ul>' and '</ul>' (4 = strlen('<ul>')).
$data = trim(substr($source,$start+4,$end-$start-4));
if (preg_match_all('/<li>(.*?)<\/li>/', $data, $matches)) {
$j = count($matches[1]);
foreach($matches[1] as $match) {
$clean = strip_tags($match);
$relinked = str_replace('href="/wiki', 'href="' . $root_url . '/wiki', $match);
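// Cut the title at the first space after 50 characters; if there is none
// (strpos() returns false), keep the whole headline.
$pos = (strlen($clean)>50) ? strpos($clean, ' ', 50) : false;
if ($pos === false) {
$pos = strlen($clean);
}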
echo '<item>' . NL;
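// htmlspecialchars() escapes only the characters XML reserves; named HTML
// entities such as &eacute; are not defined in plain XML.
echo '<title>' . htmlspecialchars(substr($clean,0,$pos)) . ' ...</title>' . NL;
echo '<description>' . htmlspecialchars($relinked) . '</description>' . NL;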
echo '<link>' . $source_url . '#' . $key . '</link>' . NL;
echo '<guid isPermaLink="false">' . $guid . '-' . $j . '</guid>' . NL;
echo '</item>' . NL;
$j--;
}
}
echo NL;
}
echo NL . '</channel></rss>';
?>
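The least obvious part of the script is probably the `$key` it searches for, so here is a minimal sketch of what that `date()` format produces. The function name `wikifeed_day_key` is my own invention for illustration and is not part of wikifeed.php, and the fixed UTC timezone is an assumption made only so the output is repeatable. The `.28` and `.29` pieces appear to be how the page's anchors encode `(` and `)`; to `date()` they are just literal characters.

```php
<?php
// Hypothetical helper (not in wikifeed.php): builds the anchor key the
// script looks for in the page source, given a Unix timestamp.
date_default_timezone_set('UTC'); // assumption: fixed timezone for repeatable output

function wikifeed_day_key($timestamp) {
    // 'j', 'F', 'Y' and 'l' are date() format characters (day, month name,
    // year, weekday name). '_', '.', '2', '8' and '9' are not, so date()
    // copies them through unchanged.
    return date('j_F_Y_.28l.29', $timestamp);
}

// Same mktime() pattern as the loop: midnight on a given day.
$day = mktime(0, 0, 0, 10, 17, 2005);
echo wikifeed_day_key($day), "\n"; // prints 17_October_2005_.28Monday.29
```

Because `mktime()` normalises out-of-range values, `date('d') - $i` keeps working across month boundaries, which is what lets the loop step back one day at a time.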
Comments and improvements welcome.
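One possible improvement, sketched here rather than tested against the live page: the title-shortening step could move into a small helper. The name `wikifeed_title` and the `$limit` parameter are hypothetical, not something in wikifeed.php. Unlike the script, this version only appends ` ...` when it actually cuts the text, and it keeps the whole headline when `strpos()` finds no space after the limit (it returns `false` in that case).

```php
<?php
// Hypothetical helper (not in wikifeed.php): shorten a headline at the
// first space found at or after $limit characters.
function wikifeed_title($text, $limit = 50) {
    if (strlen($text) <= $limit) {
        return $text; // short enough, nothing to cut
    }
    $pos = strpos($text, ' ', $limit);
    if ($pos === false) {
        // No space after the limit: keep the whole headline.
        return $text;
    }
    return substr($text, 0, $pos) . ' ...';
}

echo wikifeed_title('Short headline'), "\n";
echo wikifeed_title(str_repeat('word ', 20)), "\n";
```

The second call cuts on a word boundary, so the title never ends in the middle of a word.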
Copyright © 2000-2012 Colin Viebrock • All Rights Reserved
17 October 2005, 16:08