Wednesday, 25 December 2013
PHP 101 (part 14): No News is Good News
A Difficult Choice
After the workout I gave you last time, you’re probably either chomping at the
bit to build another PHP application or you’ve decided to give up PHP programming
and try growing cucumbers instead. If it’s the latter, you should stop reading
right now, because I can guarantee you that this concluding installment of
PHP 101 has absolutely nothing to teach you about vegetable farming.
If it’s the former, however, then you’re going to enjoy what’s coming up. Over the
next few pages, I’m going to be building a simple RSS news aggregator using
PHP, SQLite and SimpleXML. With this news aggregator, you can plug into RSS news
feeds from all over the web, creating a newscast that reflects your needs and
interests for your website. The best part: it updates itself automatically with
the latest stories every time you view it!
Come on in, and let’s get this show on the road!
Alphabet Soup
I’ll start with the basics. What the heck is RSS anyhow?
RSS (the acronym stands for RDF Site Summary) is a format originally devised
by Netscape to distribute information about the content on its My.Netscape.Com
portal. The format has gone through many iterations since its introduction in
early 1999 (take a look at http://backend.userland.com/stories/rss091 for information on RSS’s
long and complicated history) but most feeds use RSS 1.0 or RSS 0.91, both of
which are lightweight yet full-featured.
RSS makes it possible for webmasters to publish and distribute information about
what’s new and interesting on a particular site at a particular time. This
information, which could range from a list of news articles to stock market
data or weather forecasts, is published as a well-formed XML document, and can
therefore be parsed, processed and rendered by any XML parser – including the
SimpleXML parser that is part of PHP 5.
Quite a few popular web sites make an RSS or RDF news feed available to the public at
large. Freshmeat and
Slashdot both have
one, and so do many others, including the PEAR, PECL and
Zend sites. A quick Google search
for public RSS feeds will get you more links than you can shake a stick at.
An RSS document typically contains a list of resources (URLs), marked up with
descriptive metadata. Here’s an example:
As you can see, an RDF file is split up into clearly demarcated sections. First
comes the document prolog, namespace declarations, and root element.
This is followed by a <channel> block, which contains general
information on the channel that is described by this RDF file. In the example
above, the channel is Melonfire’s Trog column, which gets updated every week with new
technical articles and tutorials.
The <channel> block contains an <items> block,
which contains a sequential list of all the resources described within the RDF document.
Every resource in this block corresponds to a resource described in greater detail in a
subsequent <item> block. Every <item> block
describes a single resource in greater detail, providing a title, a URL and a description
of that resource. It’s this information that our application will use to generate a
personalized news feed.
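To make that layout concrete, here is a minimal sketch of an RSS 1.0 document being read with SimpleXML. The channel, titles and example.com URLs are invented purely for illustration and do not come from any real feed; the point is only to show where <channel>, <items> and <item> sit relative to each other.

```php
<?php
// Hypothetical RSS 1.0 (RDF) document: every title and URL below is
// made up for illustration.
$rdf = <<<XML
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://www.example.com/">
    <title>Example Column</title>
    <link>http://www.example.com/</link>
    <description>Technical articles and tutorials</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://www.example.com/article1"/>
        <rdf:li rdf:resource="http://www.example.com/article2"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://www.example.com/article1">
    <title>First article</title>
    <link>http://www.example.com/article1</link>
    <description>Summary of the first article</description>
  </item>
  <item rdf:about="http://www.example.com/article2">
    <title>Second article</title>
    <link>http://www.example.com/article2</link>
    <description>Summary of the second article</description>
  </item>
</rdf:RDF>
XML;

$feed = simplexml_load_string($rdf);

// General information about the feed lives in the <channel> block.
$channelTitle = (string) $feed->channel->title;

// In RSS 1.0 the <item> blocks are children of the root element,
// i.e. siblings of <channel> rather than children of it.
$titles = array();
foreach ($feed->item as $item) {
    $titles[] = (string) $item->title;
}

echo $channelTitle . "\n";
echo implode("\n", $titles) . "\n";
```

Running this prints the channel title followed by the two article titles, one per line.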
Laying the Foundation
Now that you know what RSS and RDF are all about, it’s time to start work. I’ll begin by
sitting down at a table near the window and doodling aimlessly on a sheet of paper until
I figure out exactly what my application is supposed to do, piece by piece (in
this case, the requirements are actually pretty basic):
- The application must support one or more RSS-compliant news feeds. On start-up, the
application should retrieve the latest versions of these feeds, parse them and display
their contents in an easy-to-read manner. A SQLite database is a good choice to store
this list of feeds.
- The user should be able to control the number of stories s/he picks up from each
feed. For example, a user might want to display more science and health news than
business news.
- The application should offer the user a web-based interface to add or delete news
feeds. This interface will use PHP’s SQLite API to run appropriate SQL queries on the
SQLite database file and alter the information stored in the database.
Keeping these requirements in mind, it’s possible to design a simple database
table to hold the (user-configurable) list of RSS news feeds. Here’s what it might
look like:
CREATE TABLE rss (
    id INTEGER NOT NULL PRIMARY KEY,
    title varchar(255) NOT NULL,
    url varchar(255) NOT NULL,
    count INTEGER NOT NULL
);
From the table above, it’s clear that every news feed will have three attributes: a
descriptive title, the URL to the feed itself, and a value indicating how many of the
stories in the feed you would like to see displayed in your own custom news page.
Let’s add some data to get things started:
INSERT INTO rss VALUES(1, 'Slashdot', 'http://slashdot.org/slashdot.rdf', 5);
INSERT INTO rss VALUES(2, 'Wired News', 'http://www.wired.com/news_drop/netcenter/netcenter.rdf', 5);
INSERT INTO rss VALUES(3, 'Business News', 'http://www.npr.org/rss/rss.php?topicId=6', 3);
INSERT INTO rss VALUES(4, 'Health News',
'http://news.bbc.co.uk/rss/newsonline_world_edition/health/rss091.xml', 3);
INSERT INTO rss VALUES(5, 'Freshmeat', 'http://www.freshmeat.net/backend/fm-releases.rdf', 5);
You can create all this directly from the schema file rss.sql using the SQLite
command .read from the command-line client, if you still have that on board
from Part Nine. In fact, now would be a good time for you to
download all the source code for this application, so that
you can check it out and refer to it easily throughout this tutorial. Note that you will
need a PHP 5-enabled web server to run this code.
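If you would rather build the table from PHP than from the command-line client, something along these lines should do it. This is only a sketch: it assumes the pdo_sqlite extension is available, and it uses an in-memory database so that it runs on its own, whereas the real thing would point at a database file kept outside the web root.

```php
<?php
// Assumption: pdo_sqlite is installed. 'sqlite::memory:' keeps the sketch
// self-contained; swap in the path to a real database file as needed.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Same table definition as the schema shown earlier.
$db->exec('CREATE TABLE rss (
    id INTEGER NOT NULL PRIMARY KEY,
    title varchar(255) NOT NULL,
    url varchar(255) NOT NULL,
    count INTEGER NOT NULL
)');

// Two of the feeds from the list above; the prepared statement takes
// care of quoting, and SQLite assigns the id automatically.
$insert = $db->prepare('INSERT INTO rss (title, url, count) VALUES (?, ?, ?)');
$insert->execute(array('Slashdot', 'http://slashdot.org/slashdot.rdf', 5));
$insert->execute(array('Business News', 'http://www.npr.org/rss/rss.php?topicId=6', 3));

$total = (int) $db->query('SELECT COUNT(*) FROM rss')->fetchColumn();
echo "Feeds stored: $total\n";
```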
Top Story
With the database safely in its web-inaccessible directory, the next step is to write
the code that uses the data inside it to connect to each news feed, parse it for news
data, and present a customized news page.
Here’s what that code, user.php, looks like:
Here’s what the output might look like (note that there will be a time lag in producing
the page, because PHP will be silently opening HTTP connections to each URL to retrieve
the corresponding RSS feed):
The code to accomplish this might look simple, but there’s actually a lot going on behind
the scenes. The first step is to obtain a list of the RSS feeds configured by the user
from the SQLite database. To accomplish this, a SQLite database handle is initialized, and
a SQL SELECT query is executed. A while() loop is used to iterate
through the resulting record collection.
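That first step can be sketched roughly as follows. Note the assumptions: this uses PDO with an in-memory database and two sample rows so that it runs on its own, which is not necessarily the API or the database path that user.php relies on.

```php
<?php
// Assumption: pdo_sqlite is available; the table is rebuilt in memory
// here only so the example is self-contained.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE rss (
    id INTEGER NOT NULL PRIMARY KEY,
    title varchar(255) NOT NULL,
    url varchar(255) NOT NULL,
    count INTEGER NOT NULL
)');
$db->exec("INSERT INTO rss VALUES (1, 'Slashdot', 'http://slashdot.org/slashdot.rdf', 5)");
$db->exec("INSERT INTO rss VALUES (3, 'Business News', 'http://www.npr.org/rss/rss.php?topicId=6', 3)");

// Run the SELECT query and walk through the records with a while() loop.
$result = $db->query('SELECT title, url, count FROM rss ORDER BY id');
$feeds = array();
while ($row = $result->fetch(PDO::FETCH_ASSOC)) {
    $feeds[] = $row;
    echo $row['title'] . ' (' . $row['count'] . ' stories): ' . $row['url'] . "\n";
}
```

Each row now carries everything the next step needs: the feed URL to fetch and the number of stories to show from it.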
For each URL thus obtained, the simplexml_load_file() function is used to
retrieve and read the RSS feed. Depending on the number of stories to be displayed, a
for() loop is executed and the appropriate number of <item>
elements in the feed are parsed. Notice that the path to access an <item>
differs depending on whether the feed is RSS 0.91 or RSS 1.0.
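Here is one way the difference in <item> paths might be handled. The helper get_stories() is a hypothetical name of my own, and the two tiny feeds are invented and loaded from strings with simplexml_load_string() so that nothing has to be fetched over HTTP; in the application itself, simplexml_load_file() would be pointed at the feed URL instead.

```php
<?php
// Hypothetical helper: returns up to $count stories from a parsed feed,
// each as an array holding a title and a link.
function get_stories(SimpleXMLElement $xml, $count)
{
    // RSS 0.91 nests <item> inside <channel>; RSS 1.0 places <item>
    // directly under the root element, next to <channel>.
    $items = isset($xml->channel->item) ? $xml->channel->item : $xml->item;

    $stories = array();
    // Never read past the number of items the feed actually contains.
    $limit = min((int) $count, count($items));
    for ($x = 0; $x < $limit; $x++) {
        $stories[] = array(
            'title' => (string) $items[$x]->title,
            'link'  => (string) $items[$x]->link,
        );
    }
    return $stories;
}

// Two invented feeds, one in each format.
$rss091 = simplexml_load_string(
    '<rss version="0.91"><channel><title>Sample 0.91 feed</title>'
    . '<item><title>Story A</title><link>http://www.example.com/a</link></item>'
    . '<item><title>Story B</title><link>http://www.example.com/b</link></item>'
    . '</channel></rss>'
);
$rss10 = simplexml_load_string(
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
    . ' xmlns="http://purl.org/rss/1.0/">'
    . '<channel rdf:about="http://www.example.com/"><title>Sample 1.0 feed</title></channel>'
    . '<item rdf:about="http://www.example.com/c"><title>Story C</title>'
    . '<link>http://www.example.com/c</link></item>'
    . '</rdf:RDF>'
);

$fromOld = get_stories($rss091, 1); // only the first of two stories
$fromNew = get_stories($rss10, 5);  // asks for five, the feed has one

foreach (array_merge($fromOld, $fromNew) as $story) {
    echo $story['title'] . ' - ' . $story['link'] . "\n";
}
```

Capping the loop at the number of items actually present is a small design choice of this sketch: it avoids reading empty elements when a feed carries fewer stories than the configured count.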
Note that if the database is empty, an error message will appear. In this example,
since I’ve already inserted a bunch of records into the database, you’ll never see
the error message at all; however, it’s good programming practice to ensure that all
eventualities are accounted for, even remote ones.
As before, the file config.php is included at the top of every script. This file
contains database access parameters, as below:
Point and Click
With the news display out of the way, all that’s left is to add a simple administrative
tool to manipulate the contents of the SQLite database. The code here is going to be
very similar to what you saw in PHP 101 Part 14: a start
page called admin.php that provides a snapshot of the current database, and a
form to add new entries. Here it is in full:
Here’s what it looks like:
As you can see, there are two sections in this script. The first half connects to the
database and prints a list of all the currently configured news feeds, with a “delete”
link next to each. The second half contains a form for the administrator to add a new
feed, together with its attributes.
Once the form is submitted, the data gets POST-ed to the script
add.php, which validates it and saves it to the database. Here’s the code for
add.php:
The lower half of the script should be familiar to you: it contains the usual function
calls to open an SQLite database and execute an INSERT query to save the
user’s data to the database. What’s interesting, though, is the top half of the script,
which contains a number of input tests to ensure that the data being saved doesn’t
contain gibberish.
There are three tests here. One checks for the presence of a descriptive title, another
uses the is_numeric() function to verify that the value entered for the
story count is a valid number, and the third uses the ereg() function to
check the format of the URL. If you read Part 13, you’ll
know all about the importance of validating user input; here’s that theory going into
action.
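The three tests could be sketched like this. Two things are assumptions on my part: the helper name validate_feed() is hypothetical, and preg_match() stands in for ereg(), which was removed in PHP 7. The URL pattern is also deliberately loose and is not the one used in add.php.

```php
<?php
// Hypothetical helper: returns a list of error messages, which is empty
// when the input passes all three checks.
function validate_feed($title, $url, $count)
{
    $errors = array();

    // Test 1: a descriptive title must be present.
    if (trim($title) === '') {
        $errors[] = 'Please enter a descriptive title';
    }

    // Test 2: the story count must be a valid number.
    if (!is_numeric($count)) {
        $errors[] = 'Please enter a valid story count';
    }

    // Test 3: the URL must look like a URL. Assumed pattern: http or
    // https, followed by at least one non-space character.
    if (!preg_match('#^https?://\S+$#i', trim($url))) {
        $errors[] = 'Please enter a valid URL';
    }

    return $errors;
}

$good = validate_feed('Slashdot', 'http://slashdot.org/slashdot.rdf', '5');
$bad  = validate_feed('', 'not a url', 'abc');

echo count($good) . " error(s) for the good input\n";
echo count($bad) . " error(s) for the bad input\n";
```

Collecting the messages in an array, rather than stopping at the first failure, lets the form report every problem in one go.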
That takes care of adding new RSS feeds. Now, what about removing them?
Remember how, in admin.php, each feed displayed in the list had a “delete” link,
which pointed to the script delete.php? This delete.php script takes care
of deleting a news feed from the table, given the feed ID (which is passed through the
link). Take a look at the code, and things will become clearer:
The record ID passed through the URL GET method is retrieved by
delete.php, and used with a DELETE SQL query to erase the
corresponding record. Try it out and see for yourself!
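A rough sketch of that deletion step follows, again with PDO and an in-memory database standing in for the real set-up. The fixed $id value is an assumption that takes the place of the value delete.php would read from the query string.

```php
<?php
// Assumption: pdo_sqlite is available; the table and its two rows are
// created in memory only so the example is self-contained.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE rss (
    id INTEGER NOT NULL PRIMARY KEY,
    title varchar(255) NOT NULL,
    url varchar(255) NOT NULL,
    count INTEGER NOT NULL
)');
$db->exec("INSERT INTO rss VALUES (1, 'Slashdot', 'http://slashdot.org/slashdot.rdf', 5)");
$db->exec("INSERT INTO rss VALUES (2, 'Wired News', 'http://www.wired.com/news_drop/netcenter/netcenter.rdf', 5)");

// In delete.php this would come from the link, e.g. $_GET['id'];
// a fixed value stands in for it here.
$id = '2';

// Only act on IDs made up entirely of digits, and let the prepared
// statement handle the value instead of pasting it into the SQL.
if (preg_match('/^[0-9]+$/', $id)) {
    $delete = $db->prepare('DELETE FROM rss WHERE id = ?');
    $delete->execute(array((int) $id));
}

$remaining = $db->query('SELECT title FROM rss ORDER BY id')->fetchAll(PDO::FETCH_COLUMN);
echo implode(', ', $remaining) . "\n";
```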











