Archive for the ‘Internet’ Category

Facebook Applications not working properly? Won’t submit forms?

Wednesday, August 29th, 2007

This is due to a change made to FBML this morning to use the new login procedures, which appear to be broken. Simple fix: add requirelogin="0" to your <form> tags.

Hopefully Facebook will fix this issue soon.

(For more info, see here: http://soton.facebook.com/developers/message.php#msg_60)

Benjie.

SimplePie Memory Leak

Monday, August 27th, 2007

Quote from SimplePie wiki:

When processing a large number of feeds (via a cron job or MySQL loop), a memory leak can occur causing PHP to run out of memory. This is due to PHP Bug #33595 where PHP doesn’t release memory when making recursive (i.e. self-referential) object calls.

This bug has troubled me for a while with Blog Friends, and caused me to stay with older versions of SimplePie. However, today I think I may have fixed it.

I have put up details of my fix on the SimplePie wiki page linked above, but reproduce them here (without syntax highlighting) for your convenience…

Possible Solution

The problem is due to recursive references within SimplePie (and PHP’s poor handling of said references). A solution that works for me is patching the vanilla SimplePie 1.0.1 with this:

<code>
--- simplepie1/simplepie.inc (revision 528)
+++ simplepie1/simplepie.inc (working copy)
@@ -668,6 +668,12 @@
             $this->init();
         }
     }
+    function __destruct() {
+        if (isset($this->data['items']) && is_array($this->data['items'])) foreach (array_keys($this->data['items']) as $k) {
+            $this->data['items'][$k]->__destruct();
+            unset($this->data['items'][$k]);
+        }
+    }

     /**
      * Used for converting object to a string
@@ -1521,6 +1527,7 @@
                 return false;
             }
         }
+        $locate->__destruct();
         $locate = null;

         $headers = $file->headers;
@@ -2703,6 +2710,9 @@
         $this->feed = $feed;
         $this->data = $data;
     }
+    function __destruct() {
+        unset($this->feed);
+    }

     function __toString()
     {
@@ -10013,6 +10023,9 @@
         $this->timeout = $timeout;
         $this->max_checked_feeds = $max_checked_feeds;
     }
+    function __destruct() {
+        unset($this->file);
+    }

     function find($type = SIMPLEPIE_LOCATOR_ALL)
     {
</code>

Then all you have to do in your code is ensure that you call $sp->__destruct() once you are done with the SimplePie instance.

If you don’t need multiple concurrent SimplePie instances (i.e. you use SimplePie in a serial fashion) you could use a custom loader to do that… for example:

<code php>
function SimplePie_Loader($url) {
    static $sp = NULL;
    if ($sp !== NULL) {
        $sp->__destruct();
        $sp = NULL;
    }
    $sp = new SimplePie($url);
    return $sp;
}
</code>
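To see why the patch helps, here is a minimal sketch of the same cycle-breaking pattern without SimplePie. DemoFeed and DemoItem are hypothetical stand-ins, not SimplePie classes: the feed holds its items and each item holds a reference back to its feed, so neither can be freed by reference counting alone until one side lets go.

```php
<?php
// Hypothetical stand-ins for the feed/item relationship; not SimplePie code.
class DemoFeed
{
    public $items = array();

    public function addItem($title)
    {
        // Each item keeps a back-reference to its feed: this creates the cycle.
        $this->items[] = new DemoItem($this, $title);
    }

    public function __destruct()
    {
        // Break the cycle from the feed side, mirroring the patch above.
        foreach (array_keys($this->items) as $k) {
            $this->items[$k]->__destruct();
            unset($this->items[$k]);
        }
    }
}

class DemoItem
{
    public $feed;
    public $title;

    public function __construct($feed, $title)
    {
        $this->feed = $feed;
        $this->title = $title;
    }

    public function __destruct()
    {
        // Drop the back-reference so the feed is no longer kept alive by its items.
        unset($this->feed);
    }
}

// Usage: release the object graph explicitly once you are done with it.
$feed = new DemoFeed();
$feed->addItem('First post');
$feed->__destruct();
$feed = null;
```

Because the destructors only unset things, it is harmless that PHP calls them a second time when the objects are actually freed.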

I hope this helps someone.

Blog Friends reaches 100 users!

Wednesday, July 18th, 2007

Less than 24 hours after release, Blog Friends has reached 100 users!

Blognation have done a review of Blog Friends here.

If you haven’t already, why not read my previous article: Blog Friends RELEASED!

Blog Friends RELEASED!

Tuesday, July 17th, 2007

Brain Bakery Ltd. have been working for the last two weeks on a Facebook application called Blog Friends, and finally it has been released! I have been logging the hours I spent on the project and it appears that in the last 14 days I have worked the equivalent of almost 4 standard 40-hour weeks – and that’s just me, the others at Brain Bakery have also worked quite a bit! That’s pretty hardcore, I think, so I will be chilling out a bit now…

Anyway – if you have a blog, and you would like your friends’ blog posts filtered for you according to a list of your interests/dislikes, why not give Blog Friends a try? If you have problems with the software, please leave a comment or email me and I will get back to you ASAP. Also, if you have suggestions for version two, feel free to post them here, though Blog Friends has absolutely loads of features in the pipeline…

Hope you enjoy it!

Killer recursive download command

Sunday, July 15th, 2007

I made this command to download a series of websites including all files, and to do so without being stopped by any automated protection methods (e.g. robots.txt, request frequency analysis, …). It served its purpose well. The command is this:

wget -o log -erobots=off -r -t 3 -N -w 2 --random-wait --retry-connrefused --protocol-directories --ignore-length --user-agent="Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.8.1.2) Gecko/20060601 Firefox/2.0.0.2 (Ubuntu-edgy)" -l 100 -E -k -K -p http://web.site.here/

Quite long, don’t ya think? Read on for a description of what it does (that is, if you don’t have the wget manpage memorized…)

Basically it does this:

  • -o log --> writes messages to the log file "log" instead of the terminal. Allows for easier debugging (and you can tail -F it in another terminal anyway…)
  • -erobots=off --> tells wget NOT to respect the robots.txt file – i.e. abuse webservers. This is naughty, but was necessary for the sites I wanted.
  • -r --> recursive
  • -t 3 --> try each URL up to 3 times
  • -N --> turn on timestamping (only re-download a file if the remote copy is newer than the local one)
  • -w 2 --> wait 2 seconds between requests
  • --random-wait --> instead of waiting exactly 2 seconds, wait a random time that averages 2 seconds
  • --retry-connrefused --> for sites that go down frequently (e.g. ones on home computers) use this to retry if the connection is refused, rather than just skipping
  • --protocol-directories --> makes files from http://web.site.here/dir/ be laid out as ./http/web.site.here/dir/, i.e. puts https files in a different directory to http files
  • --ignore-length --> if the webserver sends the wrong Content-Length header, ignore it.
  • --user-agent="" --> pretend to be Firefox rather than wget – prevents connections being refused by anti-spidering measures that filter just by user agent.
  • -l 100 --> go up to 100 levels deep
  • -E --> put .html on the end of files that are HTML but whose names don’t end in .html. This is useful so that you can download a dynamic site and stick it in a static webserver and it still works
  • -k --> convert links – VERY ADVANCED – edits the HTML files and changes the links to be relative and point to the right file names (e.g. with the .html added by the previous option). NOTE: this only works for HTML files – you will have to modify the CSS or JavaScript yourself.
  • -K --> when doing -k, keep a backup of the original file, without links converted.
  • -p --> download everything needed to properly display the files you download, e.g. all style sheets, JS files, images, …

So as you can see, a lot of work went into making it (I read the entire wget man page) but I think it was worth it. I now keep it in a file so whenever I need to do the same thing again, I have it. And now I have a blog, I thought "why not share?!" :-)

Enjoy. And please tell me if you find it useful.

How I made an AJAX chat client more efficient.

Friday, June 29th, 2007

I am currently working on an AJAX chat. This has had many issues associated with it, the major one being bandwidth usage. By default the headers sent/received in a basic AJAX request under Firefox with mootools are around 1 kB! That is huge, and will put a massive strain on bandwidth. For this reason, I tried to minimize the headers as best I could, and got them down to around 150 bytes each way, total. Read on for what I did…

I used the following options for the client (mootools, but you should be able to get the gist):

Client headers (JavaScript with Mootools):

new XHR({method: 'get', headers: {'Accept': '', 'Accept-Language': '', 'Accept-Encoding': '', 'Accept-Charset': '', 'User-Agent': ''}});

This saved a lot of unnecessary detail being sent (if you require these headers, why not cache them in your $_SESSION variable in PHP?). The biggest of these headers was actually the User-Agent header. Turning this off gave me issues with sessions under CakePHP though. It turns out that Cake uses the User-Agent string to increase the security of sessions. Well, for now I class the bandwidth saved as more important than the security implications, so I added the following lines to /app/webroot/index.php in Cake:

Fix sessions (Apache server with CakePHP):

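// Keep the real value (if one was sent) and give Cake a constant string to check instead.
$_SERVER['OLD_HTTP_USER_AGENT'] = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$_SERVER['HTTP_USER_AGENT'] = "REMOVED";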

Which made Cake "think" that all web browsers were called "REMOVED." An alternative method could be to introduce a new session purely for usage through the chat interface, but that seemed like too much hassle to me.

I also removed the following headers from the server side, using PHP code:

Server headers (Apache, PHP and CakePHP):

header('X-Powered-By:');
header('Cache-Control:');
header('Expires:');
header('Vary:');
header('P3P:');
header('Server:', true);
header('Pragma:');

And edited my Apache config to reduce the size of the ‘Server’ header (as apparently I cannot remove it using PHP):

Reducing Apache signature header:

ServerSignature Off
ServerTokens Prod

The ServerSignature option is more of a security thing – stopping people from knowing exactly what versions of software my server runs.

This worked well. I then compressed all my XML output as far as I could, using one-character tags with one-character attributes, and reducing empty f tags to <f/> wherever possible. This worked fine. But I realised that we were still sending 0.5 kB/second/client. It was at this point that I thought:

Why not get the server to wait for new content?

So, using PHP’s usleep() function (which takes microseconds), I had the data-fetching function poll the database every $delay milliseconds. If there were changes, it would output them and return. If there were no changes after 30 seconds, it would exit anyway with an "empty" message: <f/>. If the client disconnected, it would stop.
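The wait loop described above could be sketched roughly like this. It is not the original chat code: wait_for_changes and the $fetchChanges callable are hypothetical names, and the database query is left to the caller.

```php
<?php
/**
 * Block until there is something to send, or until the timeout passes.
 *
 * $fetchChanges is a hypothetical callable: it returns a string of new
 * content, or null when nothing has changed since the last check.
 */
function wait_for_changes($fetchChanges, $delayMs, $timeoutSeconds = 30)
{
    $deadline = microtime(true) + $timeoutSeconds;

    do {
        $changes = call_user_func($fetchChanges);
        if ($changes !== null) {
            return $changes;
        }
        // PHP usually only notices a dropped connection once it tries to
        // send output, so treat this check as best-effort.
        if (connection_aborted()) {
            return null;
        }
        usleep((int) ($delayMs * 1000)); // usleep() takes microseconds
    } while (microtime(true) < $deadline);

    return '<f/>'; // the "empty" message from the post
}
```

A request handler would then do something like echo wait_for_changes('fetch_new_messages', 500); where fetch_new_messages is whatever function checks the database for new posts.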

This seemed to be working great, except that new posts I wrote to the chat did not appear until the aforementioned 30 seconds had expired. It took me a while to track down the reason: CakePHP and PHP serialize session access so as to prevent race conditions, so my request to post a message was blocked until the waiting request had finished. So I had my chat function cache all the session data it needed (it did not need to edit it) and then run

Releasing the session (so other connections work):

session_write_close();

in order to release the session, so other connections could continue. Now that I am aware of this issue, I think I will be building session_write_close() into many more of my functions!
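As a rough illustration of that cache-then-release step, assuming plain PHP sessions rather than CakePHP's session handling (snapshot_session and the 'user_id' key are hypothetical names):

```php
<?php
/**
 * Copy the session values a long-running request needs, then release
 * the session so other requests from the same browser are not blocked.
 */
function snapshot_session(array $keys)
{
    session_start();

    $copy = array();
    foreach ($keys as $key) {
        $copy[$key] = isset($_SESSION[$key]) ? $_SESSION[$key] : null;
    }

    // After this call, writes to $_SESSION are no longer saved.
    session_write_close();

    return $copy;
}
```

The waiting function would call snapshot_session(array('user_id')) first and work only from the returned copy for the rest of the request.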

But now I have a chat solution with variable delays (currently set to 0.5s) that allows you to have AJAX chat without requiring huge amounts of bandwidth. I can even increase the 30 seconds by changing PHP’s max_execution_time.

What if the server suddenly comes under attack?

I didn’t mention: I also (optionally) send out the delay number to the scripts, so if I want to I can change the delay from 0.5s to 5s across all currently connected clients within 30 seconds, without any of them having to reload, or even noticing the change! This means that if bandwidth is suddenly a problem, I can ask the web browsers to reduce their pounding of the servers, and they will almost instantly. Pretty neat, eh?
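One way the server side of this could look, as a sketch only: the settings file, the function names and the d attribute on the <f> tag are all assumptions for illustration, not the actual wire format. The point is that the delay is re-read for every response, so a change reaches each connected client with its next reply.

```php
<?php
/**
 * Read the current poll delay from a one-line settings file, falling
 * back to a default. The file and its format are hypothetical.
 */
function current_delay_ms($settingsFile, $defaultMs = 500)
{
    if (is_readable($settingsFile)) {
        $value = (int) trim(file_get_contents($settingsFile));
        if ($value > 0) {
            return $value;
        }
    }
    return $defaultMs;
}

/**
 * Build an "empty" response that also carries the delay, using a
 * hypothetical one-character attribute "d".
 */
function empty_response($delayMs)
{
    return sprintf('<f d="%d"/>', $delayMs);
}
```

Since every waiting request returns within 30 seconds at most, editing the settings file is enough to slow down all clients within that window.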

I am beginning to think that AJAX is a viable solution to chat, so long as you think about it enough.

Keywords for Google:

Increasing AJAX efficiency, making efficient AJAX code, make AJAX more efficient, reduce header count, decrease number of headers, remove headers, reduce AJAX bandwidth

Ideas for the future:

Currently the chat is stored in tables in a MySQL database, but I think we will be using memcached to improve efficiency some time soon. We will also probably remove the overhead of CakePHP as a framework just for the processor-intensive chatroom feed script.

Facebook Applications – Feeds and SMS

Tuesday, June 26th, 2007

I wrote two Facebook applications yesterday. Yes, you heard me – two, yesterday. Admittedly I did work for 12 hours almost solid yesterday.

One is an application that lists new posts on blog feeds that you are interested in, and the other allows you to SMS your Facebook page to tell your mates "I’m down the pub, why don’t you join me?" They are both, obviously, in their infancy. The SMS one took me another day’s coding a few weeks ago (by day, I mean 10-12 hours…) to get SMSs sent to my phone passed on to my computer reliably, but it works very well now: I tested it last night!

So… watch this space. When they get a little more professional I will consider releasing them. Before then, I need a new phone that I can dedicate to the purpose, or a Nokia USB data cable for my 3310.
