Archive for the ‘Code’ Category

PHP Segfault-ing: preg_replace

Friday, September 28th, 2007

Update: issue now fixed.


ARGH! I cannot figure this one out yet. This code segfaults (or gives memory allocation errors on) my copy of PHP, and also the one on simplepie.org, but not the one at php5.simplepie.org (the dev server). So there must be a difference in how PHP was compiled or in its dependencies, as I am told that the PHP versions are the same.

And yes, the & stuff is deliberate.

This is really frustrating. Ideas, anyone?

<?php
$attrib = "id";
$data = <<<END
&amp;amp;amp;amp;amp;nbsp; width=&amp;amp;amp;amp;amp;quot;340&amp;amp;amp;amp;amp;quot; height=&amp;amp;amp;amp;amp;quot;289&amp;amp;amp;amp;amp;quot; id=&amp;amp;amp;amp;amp;quot;player&amp;amp;amp;amp;amp;quot; align=&amp;amp;amp;amp;amp;quot;middle&amp;amp;amp;amp;amp;quot;&amp;amp;amp;amp;amp;gt;
&amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;lt;param name=&amp;amp;amp;amp;amp;quot;movie&amp;amp;amp;amp;amp;quot; value=&amp;amp;amp;amp;amp;quot;http://cdn.last.fm/videoplayer/21/VideoPlayer.swf&amp;amp;amp;amp;amp;quot; /&amp;amp;amp;amp;amp;gt;
&amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;lt;param name=&amp;amp;amp;amp;amp;quot;menu&amp;amp;amp;amp;amp;quot; value=&amp;amp;amp;amp;amp;quot;false&amp;amp;amp;amp;amp;quot; /&amp;amp;amp;amp;amp;gt;
&amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;lt;param name=&amp;amp;amp;amp;amp;quot;quality&amp;amp;amp;amp;amp;quot; value=&amp;amp;amp;amp;amp;quot;high&amp;amp;amp;amp;amp;quot; /&amp;amp;amp;amp;amp;gt;
&amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;lt;param name=&amp;amp;amp;amp;amp;quot;bgcolor&amp;amp;amp;amp;amp;quot; value=&amp;amp;amp;amp;amp;quot;#000000&amp;amp;amp;amp;amp;quot; /&amp;amp;amp;amp;amp;gt;
&amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;lt;param name=&amp;amp;amp;amp;amp;quot;allowFullScreen&amp;amp;amp;amp;amp;quot; value=&amp;amp;amp;amp;amp;quot;true&amp;amp;amp;amp;amp;quot; /&amp;amp;amp;amp;amp;gt;
&amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;lt;param name=&amp;amp;amp;amp;amp;quot;flashvars&amp;amp;amp;amp;amp;quot; value=&amp;amp;amp;amp;amp;quot;creator=The+Chemical+Brothers&amp;amp;amp;amp;amp;amp;title=Do+It+Again&amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;uniqueName=Do+It+Again&amp;amp;amp;amp;amp;amp;albumArt=http://cdn.last.fm/coverart/130×130/3341430-870824884.jpg&amp;amp;amp;amp;amp;amp;album=We+Are+The+Night&amp;amp;amp;amp;amp;amp;duration=278&amp;amp;amp;amp;amp;amp;image=http://panther3.last.fm/storable/videocap/7808/0/original.jpg&amp;amp;amp;amp;amp;amp;FSSupport=true&amp;amp;amp;amp;amp;quot; /&amp;amp;amp;amp;amp;gt;
&amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;lt;embed src=&amp;amp;amp;amp;amp;quot;http://cdn.last.fm/videoplayer/21/VideoPlayer.swf&amp;amp;amp;amp;amp;quot;
&amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;nbsp; menu=&amp;amp;amp;amp;amp;quot;false&amp;amp;amp;amp;amp;quot;
&amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;nbsp; quality=&amp;amp;amp;amp;amp;quot;high&amp;amp;amp;amp;amp;quot;
&amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;nbsp; bgcolor=&amp;amp;amp;amp;amp;quot;#000000&amp;amp;amp;amp;amp;quot;
&amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;nbsp; width=&amp;amp;amp;amp;amp;quot;340&amp;amp;amp;amp;amp;quot; height=&amp;amp;amp;amp;amp;quot;289&amp;amp;amp;amp;amp;quot;
&amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;nbsp; name=&amp;amp;amp;amp;amp;quot;player&amp;amp;amp;amp;amp;quot;
&amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;nbsp; align=&amp;amp;amp;amp;amp;quot;middle&amp;amp;amp;amp;amp;quot;
&amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;nbsp; allowFullScreen=&amp;amp;amp;amp;amp;quot;true&amp;amp;amp;amp;amp;quot;
&amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;nbsp; flashvars=&amp;amp;amp;amp;amp;quot;creator=The+Chemical+Brothers&amp;amp;amp;amp;amp;amp;title=Do+It+Again&amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;uniqueName=Do+It+Again&amp;amp;amp;amp;amp;amp;albumArt=http://cdn.last.fm/coverart/130×130/3341430-870824884.jpg&amp;amp;amp;amp;amp;amp;album=We+Are+The+Night&amp;amp;amp;amp;amp;amp;duration=278&amp;amp;amp;amp;amp;amp;image=http://panther3.last.fm/storable/videocap/7808/0/original.jpg&amp;amp;amp;amp;amp;amp;FSSupport=true&amp;amp;amp;amp;amp;quot;
&amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;nbsp; type=&amp;amp;amp;amp;amp;quot;application/x-shockwave-flash&amp;amp;amp;amp;amp;quot;
&amp;amp;amp;amp;amp;nbsp; &amp;amp;amp;amp;amp;nbsp; pluginspage=&amp;amp;amp;amp;amp;quot;http://www.macromedia.com/go/getflashplayer&amp;amp;amp;amp;amp;quot; /&amp;amp;amp;amp;amp;gt;
&amp;amp;amp;amp;amp;lt;/object&amp;amp;amp;amp;amp;gt;
&nbsp; &nbsp;&nbsp; &nbsp;&nbsp;

END;

$data = preg_replace('/ '. trim($attrib) .'=(\w|\s|=|-|:|;|\/|\.|\?|&|,|#|!|\(|\)|\+|{|})*/i', '', $data);

echo "No segfault here";


Facebook Applications not working properly? Won’t submit forms?

Wednesday, August 29th, 2007

This is due to a recent change (this morning) to FBML using the new login procedures. However, they appear to be broken. Simple fix: add requirelogin="0" to your <form> tags.

Hopefully Facebook will fix this issue soon.

(For more info, see here: http://soton.facebook.com/developers/message.php#msg_60)

Benjie.


SimplePie Memory Leak

Monday, August 27th, 2007

Quote from SimplePie wiki:

When processing a large number of feeds (via a cron job or MySQL loop), a memory leak can occur causing PHP to run out of memory. This is due to PHP Bug #33595 where PHP doesn’t release memory when making recursive (i.e. self-referential) object calls.

This bug has troubled me for a while with Blog Friends, and caused me to stay with older versions of SimplePie. However, today I think I may have fixed it.

I have put up details of my fix on the SimplePie wiki page linked above, but reproduce them here (without syntax highlighting) for your convenience…

Possible Solution

The problem is due to recursive references within SimplePie (and PHP’s poor handling of said references). A solution that works for me is patching the vanilla SimplePie 1.0.1 with this:

<code>
--- simplepie1/simplepie.inc (revision 528)
+++ simplepie1/simplepie.inc (working copy)
@@ -668,6 +668,12 @@
             $this->init();
         }
     }
+    function __destruct() {
+        if (isset($this->data['items']) && is_array($this->data['items'])) foreach (array_keys($this->data['items']) as $k) {
+            $this->data['items'][$k]->__destruct();
+            unset($this->data['items'][$k]);
+        }
+    }

     /**
      * Used for converting object to a string
@@ -1521,6 +1527,7 @@
                     return false;
                 }
             }
+            $locate->__destruct();
             $locate = null;

             $headers = $file->headers;
@@ -2703,6 +2710,9 @@
         $this->feed = $feed;
         $this->data = $data;
     }
+    function __destruct() {
+        unset($this->feed);
+    }

     function __toString()
     {
@@ -10013,6 +10023,9 @@
         $this->timeout = $timeout;
         $this->max_checked_feeds = $max_checked_feeds;
     }
+    function __destruct() {
+        unset($this->file);
+    }

     function find($type = SIMPLEPIE_LOCATOR_ALL)
     {
</code>

Then all you have to do in your code is ensure that you call $sp->__destruct() once you are done with the SimplePie instance.

If you don’t need multiple concurrent SimplePie instances (i.e. you use SimplePie in a serial fashion) you could use a custom loader that destroys the previous instance before creating the next one… for example:

<code php>
function SimplePie_Loader($url) {
    static $sp = NULL;
    if ($sp !== NULL) {
        $sp->__destruct();
        $sp = NULL;
    }
    $sp = new SimplePie($url);
    return $sp;
}
</code>

I hope this helps someone.


Killer recursive download command

Sunday, July 15th, 2007

I made this command to download a series of websites including all files, and to do so without being stopped by any automated protection methods (e.g. robots.txt, request frequency analysis, …). It served its purpose well. The command is this:

wget -o log -erobots=off -r -t 3 -N -w 2 --random-wait --retry-connrefused --protocol-directories --ignore-length --user-agent="Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.8.1.2) Gecko/20060601 Firefox/2.0.0.2 (Ubuntu-edgy)" -l 100 -E -k -K -p http://web.site.here/

Quite long, don’t ya think? Read on for a description of what it does (that is, if you don’t have the wget manpage memorized…)

Basically it does this:

  • -o log --> writes messages to the log file "log" instead of the terminal. This makes debugging easier (and you can tail -F it in another terminal anyway…)
  • -erobots=off --> tells wget NOT to respect the robots.txt file - i.e. abuse webservers. This is naughty, but was necessary for the sites I wanted.
  • -r --> recursive
  • -t 3 --> try each URL up to 3 times
  • -N --> turn on timestamping, so a file is only re-downloaded if the server’s copy is newer than the local one
  • -w 2 --> wait 2 seconds between requests
  • --random-wait --> instead of waiting exactly 2 seconds, wait a random time between 0.5 and 1.5 times that, which averages out to 2 seconds
  • --retry-connrefused --> for sites that go down frequently (e.g. ones on home computers) use this to retry if the connection is refused, rather than just skipping
  • --protocol-directories --> makes files from http://web.site.here/dir/ be laid out as ./http/web.site.here/dir/ e.g. puts https files in a different directory to http files
  • --ignore-length --> if the webserver gives the wrong Content-Length: header, ignore it.
  • --user-agent="" --> pretend to be Firefox rather than wget - prevents connections being refused by anti-spidering measures that filter just by user agent.
  • -l 100 --> recurse up to 100 levels deep
  • -E --> put .html on the end of files that don’t end with .html but are HTML files. This is useful so that you can download a dynamic site and stick it in a static webserver and it still works
  • -k --> convert links - VERY ADVANCED - edits the HTML files and changes the links to be relative and point to the right file names (e.g. adds the .html from the previous option). NOTE: this only works for HTML files - you will have to modify the CSS or JavaScript yourself.
  • -K --> when doing -k, keep a backup of the original file, without links converted.
  • -p --> download everything needed to properly display the files you download, e.g. all style sheets, JS files, images, …
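
For anyone who wants to keep the command in a file, here is a minimal sketch of how the options listed above could be wrapped in a shell function. The names mirror_site and DRY_RUN are my own inventions for illustration, not anything wget provides; the flags and the user-agent string are exactly the ones from the command above.

```shell
#!/bin/sh
# Sketch of a reusable wrapper around the mirroring command.
# mirror_site and DRY_RUN are illustrative names, not wget features.

UA='Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.8.1.2) Gecko/20060601 Firefox/2.0.0.2 (Ubuntu-edgy)'

mirror_site() {
    url="$1"
    # Replace the positional parameters with the full option list,
    # in the same order as the one-line command.
    set -- \
        -o log -erobots=off \
        -r -t 3 -N \
        -w 2 --random-wait --retry-connrefused \
        --protocol-directories --ignore-length \
        --user-agent="$UA" \
        -l 100 -E -k -K -p \
        "$url"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print one argument per line instead of contacting the server.
        printf '%s\n' wget "$@"
    else
        wget "$@"
    fi
}
```

Source the file, then run mirror_site http://web.site.here/ to start a download, or DRY_RUN=1 mirror_site http://web.site.here/ to see the arguments without touching the network.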

So as you can see, a lot of work went into making it (I read the entire wget man page) but I think it was worth it. I now keep it in a file so whenever I need to do the same thing again, I have it. And now I have a blog, I thought "why not share?!" :-)

Enjoy. And please tell me if you find it useful.


Facebook Applications - Feeds and SMS

Tuesday, June 26th, 2007

I wrote two Facebook applications yesterday. Yes, you heard me - two, yesterday. Admittedly I did work for 12 hours almost solid yesterday.

One is an application that lists new posts on blog feeds that you are interested in, and the other allows you to SMS your Facebook page to tell your mates "I’m down the pub, why don’t you join me?" They are both, obviously, in their infancy. The SMS one also took me another day’s coding a few weeks ago (by day, I mean 10-12 hours…) to get SMSs sent to my phone passed on to my computer reliably… but it works very well now. I tested it last night!

So… watch this space. When they get a little more professional I will consider releasing them. Before then, I need a new phone that I can dedicate to the purpose, or a Nokia USB data cable for my 3310.


WebDAV on Windows XP SP2

Thursday, June 21st, 2007

When Microsoft released Windows XP’s SP2, they broke a lot of stuff. Amongst it was WebDAV Basic Authentication. They "disabled it by default" because it is a "security risk." Well, that is true, but there is no option in menus anywhere to re-enable it. You have to do a registry edit! Well… I have just had to write the following excerpt for BrainBakery’s CMS product:

If you are using Windows XP SP2, and you cannot connect to WebDAV, then you need to enable "Basic Authentication" by running this command (all one line) from Start -> Run:

REG ADD "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\WebClient\Parameters" /v UseBasicAuth /t REG_DWORD /d 2

And then restart your computer. More information can be found here: http://support.microsoft.com/kb/841215.

Now, this is not live yet, as I have not tested it. I will test it as soon as Jem stops playing The Sims 2, so I can use her Windows XP machine. Here’s hoping it doesn’t break anything.

Incidentally, if you are wondering why I used "2" and not "1", the reason is that 2 enables Basic Authentication over plain HTTP (not HTTPS) under Windows Vista. I don’t know if the same command will work for Vista though.

This post could be useful for users of the PEAR module HTTP_WebDAV_Server, as I think that currently only supports Basic Authentication.

If you find this useful, please leave me a comment! Thanks.
