As I promised my brother, I started working on a news feed for the marketplace of jzr-threewheeler.de. Contrary to my expectations, the problem in this case was not so much the caching mechanism I had to implement. I had done that before and could reuse most of the code. What took me so long was the fact that the marketplace page seems to be created and updated manually, resulting in a chaotic structure of both content and HTML. The page has over 181 000 lines, but only about 1 200 of them contain anything other than whitespace. The content doesn’t follow any strict rules either. All this made it quite complex to write a good parser that turns the information into a nice (and valid) feed.
Well, I managed to get it done and here it is:
I hadn’t even finished this when Adrian came up with the next task for me. He wanted a news feed to follow all recent posts at simsonforum.de. This was a much easier job. Message board software follows strict rules when compiling its pages, so parsing them is quite easy: you don’t need to take all kinds of exceptions into account, and a set of simple rules is sufficient to process all the data. Using the board’s search function, my PHP script downloads a list of all recently changed threads and generates the feed from it. Because this puts quite a heavy load on both servers involved (the forum’s and mine), results are cached for at least an hour before an update is made. So that is another script using the caching code. 🙂
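The caching idea is simple enough to sketch in a few lines. This is not the actual script (that one is PHP), just a rough Python illustration of the principle: serve the stored feed while it is younger than an hour, otherwise rebuild it. The names `cached_feed` and `build_feed`, the `now` parameter and the cache file path are all made up for this example.

```python
import os
import tempfile
import time

CACHE_TTL_SECONDS = 3600  # "at least an hour", as described above


def cached_feed(cache_path, build_feed, ttl=CACHE_TTL_SECONDS, now=None):
    """Return the cached feed if it is younger than ttl, otherwise rebuild it.

    build_feed is a hypothetical callable that does the expensive work
    (querying the board and generating the feed XML) and returns a string.
    """
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(cache_path)
    except OSError:
        age = None  # no cache file yet

    if age is not None and age < ttl:
        # Cache is still fresh: no request to the forum is made.
        with open(cache_path, encoding="utf-8") as fh:
            return fh.read()

    feed_xml = build_feed()  # the expensive part

    # Write to a temporary file first, then rename it into place, so a
    # reader never sees a half-written feed.
    directory = os.path.dirname(os.path.abspath(cache_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(feed_xml)
    os.replace(tmp_path, cache_path)
    return feed_xml
```

Using the file's modification time as the timestamp means no extra bookkeeping is needed: calling `cached_feed("feed.xml", build_feed)` on every request only triggers the expensive download once the file is more than an hour old.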
You can find this feed here: