July 15th 2006 12:24 am
A study on RSS – Part 1 XML DOM
RSS, today someone asked me, “what is this?”
Well, I know what it is, but at that moment the words failed me, especially the “non-programmer” words I needed to explain to “the common folk” what it really means and what it’s good for. So I decided to write this series of posts. I don’t know how many there will be or in what order, but I guarantee I will try to cover all the basics.
Wikipedia defines RSS as an XML specification for web feeds, used for web syndication of site content.
“So what?” you say, WTF you exclaim!.. well let’s take it in small steps..
XML is a markup format widely used on the web (technical information: it can encode any structured data in general), a kind of global language for communication between sites and systems. Being a standard means everyone understands its syntax, something like a grammar, I would say.
So that doesn’t get your gears going? Guess you are not a developer, so I’ll explain what it’s good for. But in case you are a developer, stick around and I’ll start you down the road to building an RSS feed.
RSS is the “language” spoken by feeds, which are really a summary of a site’s contents (RSS – Rich Site Summary). A feed sends summarized data (title, description and link) for each piece of site content. For example, at ComuniWEB (a Brazilian news site), where I’m currently employed, the RSS feed dishes out the latest news published on the site. With that I can use an “RSS Reader” to view the latest headlines, and if any of them catches my eye, I just click on it and it takes me directly to the full story.
You can check out an example in the sidebar under the title “ComuniWeb – Últimas”. This section reads the last 5 headlines from ComuniWEB.
Ok, so now you maybe understand a little bit more about RSS feeds. Go play around for a bit: find an RSS Reader like Google Reader and try out a few feeds. You can even add them to personal pages like myYahoo.
Ok, so you are a developer and you have a blog, a news site or some content you’d like to share? Want to know how to do it? Ok, I’ll start you on your way. In today’s post I’m going to start with the basics: creating an XML file from PHP data. In the next one I should get into the RSS file structure, and then, who knows? Parsing RSS.
First Step:
XML Compatibility on PHP
PHP has a few options for reading and writing XML, from bedrock-basic direct string creation to solid objects like DOM. If I have to pick one, DOM is the winner: it’s stable and has a simple but powerful logic, once you get the hang of it.
Let’s create a file with the following structure
<?xml version="1.0" encoding="utf-8"?>
<news>
<item id="162696">
<title>Brazil loses the World Cup</title>
<link>index.php?idpag=20&amp;idmat=162696</link>
</item>
</news>
First we need to create a new XML Doc via DOM
$xmlDoc = new DOMDocument('1.0', 'utf-8');
$xmlDoc->formatOutput = true;
Now the $xmlDoc variable contains a DOMDocument instance, and by setting the “formatOutput” property to true we tell it to produce tidy, indented output. Next we will create the root element, called “news” in the example:
$news = $xmlDoc->createElement('news');
$news = $xmlDoc->appendChild($news);
In the first line we create an element called news from the main XML Doc; it has no content yet. In the second line we attach that new node to the document, which makes it the root element, and store the result back in $news so we have an up-to-date reference to the node.
Now we need to create an item node with the attribute “id”, check it out:
$item = $xmlDoc->createElement('item');
$item->setAttribute('id','162696');
$item = $news->appendChild($item);
Once more we create a node, always from the root Doc. Next we set the attribute’s name and value. In the following line we attach it to its parent node, in this case $news. Noticed the difference? This means the item node is now a child of the news node created before.
Now we simply add the Title and the Link nodes to the item node finalizing the addition process, but notice the slight difference here:
$title = $xmlDoc->createElement('title',utf8_encode('Brazil loses the World Cup'));
$title = $item->appendChild($title);
$link = $xmlDoc->createElement('link',htmlentities('index.php?idpag=20&idmat=162696'));
$link = $item->appendChild($link);
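Notice that this time around we set a value on the node. For my Portuguese example, I note a few points:
1 – Convert Latin characters to UTF-8, here using utf8_encode(). This is only needed when your source strings are ISO-8859-1, and the function is deprecated as of PHP 8.2;
2 – Convert HTML entities. XML has problems with & because it starts an entity reference, so write &amp; instead. Here htmlentities() does that, though htmlspecialchars() is the safer choice for accented text, since named entities like &eacute; are not defined in plain XML;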
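As an alternative to escaping by hand, DOM can do it for you. The sketch below is not the approach used in this post's script; it assumes you create the element empty and append the value with createTextNode(), which stores the raw text and escapes the & only when the document is serialized.

```php
<?php
// Sketch: let DOM escape the value instead of calling htmlentities() ourselves.
$xmlDoc = new DOMDocument('1.0', 'utf-8');

// Create the element with no value...
$link = $xmlDoc->createElement('link');

// ...and attach the raw string as a text node. The bare "&" is fine here.
$link->appendChild($xmlDoc->createTextNode('index.php?idpag=20&idmat=162696'));
$xmlDoc->appendChild($link);

// On output the "&" is written as "&amp;".
$xml = $xmlDoc->saveXML();
echo $xml;
```

Reading $link->nodeValue back gives you the original string with the plain &, so the escaping only exists in the serialized XML.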
Ok, almost there. Now we need to finalize and spit it out to the screen:
header("Content-type:application/xml; charset=utf-8");
echo $xmlDoc->saveXML();
First we send a header so the browser interprets the content correctly, then we echo the result of the saveXML function, which returns the document as a string (its sibling save() writes it to a file instead).
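To make the string-versus-file difference concrete, here is a minimal sketch. The temporary file path is just an assumption for illustration; in a real site you would pick your own location.

```php
<?php
// Sketch: the two ways of getting XML out of a DOMDocument.
$xmlDoc = new DOMDocument('1.0', 'utf-8');
$xmlDoc->appendChild($xmlDoc->createElement('news'));

// saveXML() returns the whole document as a string (handy for echo).
$asString = $xmlDoc->saveXML();

// save() writes the document to a file and returns the number of
// bytes written, or false on failure. The temp file is a placeholder.
$path  = tempnam(sys_get_temp_dir(), 'news');
$bytes = $xmlDoc->save($path);

if ($bytes === false) {
    echo "Could not write $path\n";
} else {
    echo "Wrote $bytes bytes to $path\n";
}
```

Writing to a file is useful if you want to generate the feed once (say, whenever a story is published) instead of rebuilding it on every request.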
Check the complete script below and a working example here
<?php
$xmlDoc = new DOMDocument('1.0', 'utf-8');
$xmlDoc->formatOutput = true;
$news = $xmlDoc->createElement('news');
$news = $xmlDoc->appendChild($news);
$item = $xmlDoc->createElement('item');
$item->setAttribute('id','162696');
$item = $news->appendChild($item);
$title = $xmlDoc->createElement('title',utf8_encode('Brazil loses the World Cup'));
$title = $item->appendChild($title);
$link = $xmlDoc->createElement('link',htmlentities('index.php?idpag=20&idmat=162696'));
$link = $item->appendChild($link);
header("Content-type:application/xml; charset=utf-8");
echo $xmlDoc->saveXML();
?>
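Since the goal is to create XML from PHP data, the same steps can be wrapped in a loop. The following is only a sketch: buildNewsXml() is a hypothetical helper name, the $rows array stands in for whatever your database returns, and the second row is made-up sample data. It uses createTextNode() so DOM takes care of escaping the &.

```php
<?php
// Hypothetical helper: builds the <news> document from an array of rows.
function buildNewsXml(array $rows): string
{
    $xmlDoc = new DOMDocument('1.0', 'utf-8');
    $xmlDoc->formatOutput = true;

    // Root element
    $news = $xmlDoc->appendChild($xmlDoc->createElement('news'));

    foreach ($rows as $row) {
        // One <item id="..."> per row
        $item = $news->appendChild($xmlDoc->createElement('item'));
        $item->setAttribute('id', (string) $row['id']);

        // <title> and <link> children; text nodes are escaped by DOM
        foreach (['title', 'link'] as $field) {
            $node = $item->appendChild($xmlDoc->createElement($field));
            $node->appendChild($xmlDoc->createTextNode($row[$field]));
        }
    }

    return $xmlDoc->saveXML();
}

// Assumes the strings are already UTF-8. The second row is invented sample data.
$rows = [
    ['id' => 162696, 'title' => 'Brazil loses the World Cup', 'link' => 'index.php?idpag=20&idmat=162696'],
    ['id' => 162697, 'title' => 'Sample headline', 'link' => 'index.php?idpag=20&idmat=162697'],
];

$xml = buildNewsXml($rows);
echo $xml;
```

When serving this from a web page, send the Content-type header before the echo, just like in the script above.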
So I’ll wrap it up here for now. I hope this simple RSS introduction has already cleared up some of your questions and doubts in the XML and DOM fields. Next time around I’ll get into the RSS specifications and their evolution from 0.91 to 2.0, now known as Really Simple Syndication.
14 Comments »











php haxor for life on 16 Sep 2006 at 14:53 #
I was looking at the code in your banner image at the top of the page. You should escape user input (the $_POST variable) before you stick it into an SQL query.
Yousef Ourabi on 16 Sep 2006 at 19:58 #
Well, not only that but I believe the proper HTTP Content-Type header for rss is application/rss+xml and not just xml…
Cobol_l33t on 16 Sep 2006 at 20:39 #
pwned..
What is RSS? on 16 Sep 2006 at 20:54 #
[...] Since I began my job in TSSG a little under two months ago, several people have asked me what exactly I do in my job. Without going into too much detail, it revolves greatly around the technology of RSS. Even when I say “RSS” to people who came from the same college course as I did, or folk who have other degrees in computing, a question that more often than not is thrown back at me is “What is RSS?”. I came across this post today by Rafael Dohms, who explained pretty well, from both a developer’s and a non-developer’s point of view, some of the features of the technology and how you yourself can embrace its power! This is the first of a series of posts on the topic, so I’ll keep track of it to see if Rafael comes up with any more pearls of wisdom on the area [...]
Rafael Dohms on 16 Sep 2006 at 21:06 #
Yes, you are actually right about the content-type. Even though it works both ways, the proper thing is to add the rss, my bad.
Thanks for the feedback.
Parts 2 and 3 of the article are already written, but in Portuguese. I’ll take some time this week to translate them and publish them in English too.
Thank you all for the feedback and trackbacks.
Me on 17 Sep 2006 at 15:41 #
Sweet, exactly what I was looking for !!!!
its about time» Blog Archive » links for 2006-09-17 on 17 Sep 2006 at 20:35 #
[...] Rafael Dohms » A study on RSS – Part 1 XML DOM (tags: article feed howto RSS tutorial programming webdev web2.0 xml php blog tech scripting) [...]
Rafael Dohms » A study on RSS - Part 2: The RSS format on 18 Sep 2006 at 20:22 #
[...] Part 1: What is RSS and how do I build an XML using XML DOM? [...]
PabloG » Blog Archive » links for 2006-09-19 on 18 Sep 2006 at 22:19 #
[...] Rafael Dohms » A study on RSS – Part 1 XML DOM (tags: rss php rssreaders programming tutorial XML) [...]
EveryDigg » Blog Archive » A study on RSS - XML DOM on 04 Oct 2006 at 8:59 #
[...] Well, I know what it is, but at that moment the words failed me, especially the "non-programmer" words, so as to explain to "the common folk" the true meaning of it and some of its uses.read more | digg story [...]
BrokenToy on 26 Oct 2006 at 8:49 #
Although this works fine on Apache, I’ve had a strange problem running this through lighttpd+FastCGI. For some reason the optional BOM marker ef bb bf is turning up at the end of the script. This means the output doesn’t validate. I’ve checked the script on other machines running Apache and the output validates.
Any ideas?
Additionally, I thought
$this->outputDocument->save('php://output');
is a lot neater than echoing (apparently it’s faster too, but I’m not testing XML docs big enough).
Romina on 03 Mar 2007 at 13:11 #
Beautiful site!