Forum Moderators: coopster
/**
* Write a file to the server.
*
* First open a temp file for writing and acquire a lock.
* Proceed to write, unlock and copy the file from a temp file. If the
* file size is the same, the write worked, clean up and go home. if
* not, clean up and hope it works the next time.
*/
function cms_writeFile($filename, $tempfile, $data) {
$ft = fopen($tempfile, 'w');
if(flock($ft, LOCK_EX)) {
fwrite($ft, $data);
flock($ft, LOCK_UN);
fclose($ft);
if(copy($tempfile, $filename) && filesize($tempfile) == filesize($filename)) {
unlink($tempfile);
return true;
} else { // The whole process failed.
unlink($tempfile);
unlink($filename);
return false;
}
} else return false;
}
I pass it a random tempfile name, so at that point I think we avoid any race condition issues, but to be paranoid I check the file size after the copy. If the files are of a different size, I drop them both and hope this works the next time.
An example to use the function:
if(cms_writeFile('test.php', 'randomg-temp.php', 'some data')) {
echo 'written';
} else {
echo 'notwritten';
}
Seems right, and it works, am I missing anything?
So that replaces the top of the function now. You can pass it a tempfile if you want, otherwise it takes care of that for you...
It also does not account for the possibility that two editions of the function write files of the same size.
if two calls pass the same tempfile name, then you are back to the original race condition that you were seeking to solve.
Also, if you used
rename() instead of copy() you would have less clean up to do. it is not necessary to roll your own tmp file name as
tempnam ( string dir, string prefix ) does this for you. There is a standard algorithm that can be used to get around your problem, and prevent file overwriting without locks.
if we assume you have a file "lockfile" that already exists and is locked as if it was a mutex, you could get:
[pre]
function cms_writeFile($filename, $tempfile, $data, $mutex='cms_writeFile') {
$retval=false; //assume failure of function
if(! $tempfile) {
$tempfile = tempnam(dirname($filename),basename($filename));
}
$fm=fopen($mutex, 'w');// existing file that is always present to be locked as a mutex
if (flock($fm), LOCK_EX) { //Use the mutex to ensure only one write is happening
$ft = fopen($tempfile, 'w');
if(flock($ft, LOCK_EX)) { //Now not essential, but still good practice
fwrite($ft, $data);
flock($ft, LOCK_UN);
fclose($ft);
if(rename($tempfile, $filename)) { // check for size became unnecessary
$retval=true; // The only path to success
} else { // The whole process failed.
unlink($tempfile);
}
}
flock($fm,LOCK_UN); // Only unlock mutex when whole atomic action has completed.
fclose($fm);
}
return $retval;
}
[/pre] 1. rename - thanks (must read more manual).
2. tempnam - thanks (see #1!).
3. If the revised function is being used to write many different files, then it will form a bottleneck
So, this being a common pattern for file locking, are people accepting of this being a bottleneck?
If I was to use this to cache queries, for example, it could theoretically be much slower for a given request with many queries, no?
Also, looking at the function, while the mutex is locked any requests to cms_writeFile will return false and nothing will be written. So for a given request, multiple calls to cms_writeFile won't queue or fire, they will simply die silently, no?