Wednesday, September 1, 2010

long time


Just made it a point to put in a snap of my little princess here, born on 2nd Nov 2009 ...

Saturday, June 20, 2009

File locking protocol with only shell scripting primitives

Writing this post out of some faint feeling of guilt and disappointment of not being able to keep up to the writing habit and discipline. It happens to be my first post ofthe year 2009. A lot many things have happened over these months, but as usual,the more things change, the more they seem to remain the same. So I jump directly to the point that pushed me to post this.


Encountered a problem today, regarding certain shell script running on a linux box.The script could be invoked by many users simultaneously. Now the problem is that the script used to write some text to a file. The changes could be updates through a redirect ('>>') or even deletes through 'gnu_sed'. As you might have guessed, the file being written to by the script was common to all users as well. I was looking for a way to serialize the 'write' access. Several incidents showed that multiple concurrent users eventually messed up the file contents through the concurrent shell script invocations.


Now, someone suggested a file locking mechanism. Each invocation of the script will lock the file and other invocations wait till it is released. I don't know for sure, but there is no shell interface for locking some file. I am aware of seeing something like that interface only at the system/C programming level through fcntl/flock calls. I had to think for a while before getting to the following algorithm, that is purely based on shell scripting primitives and guarantees a predictable first-in-first-out ordering. It also ensures that starvation cannot occur. Deadlocks are out of question, with there being only one resource to lock.


Here it is,
1/ Lets assume that a shell script (S) modifies the file (F) and we have a set of processes {P0, P1, ... Pn} that can concurrently invoke S to modify F according to some arbitrary logic.
2/ We make a kind of wrapper script (S2) to control access to S and add some logic.
3/ S2 works like this,
It creates a temporary file in the same directory with a name whose uniqueness is guaranteed e.g - a file named lock., pid - known quantity of the process P.
4/ Next, it checks for the creation dates on all lock.<*> files in that directory.
case 1 : If it finds that the date on its own lock file is earliest, it can invoke the script.
case 2 : If it finds another lock file with an earlier date, it sleeps for a configurable duration and repeats step '4/' once it wakes up. If the dates of another lock file co-incide and they are earliest among all locks, we can make the process with the lower pid the winner. The higher pid process waits and repeats 4/ after sleep.
5/ S2 finally invokes S which does all he work in exclusive mode for 'writes'.
6/ Delete the lock file created by self (having the own pid) after done and S returns control to S2.



We can assert that the FIFO ordering will be maintained, ensured by having the creation dates as the decider.Starvation is prevented by having the processes first create a lock file and thus marking the availability for ascript invocation.A process also has the luxury of relinquishing lock on the file by simply deleting its lock file, regaining lock by simply re-creating the lock file when it needs and queueing up again for access. If a process wants to justread the file, we can allow the script to make a copy of the file and release the lock as soon as possible.



Issues -
1/ The script S should be 'well-behaved'. Otherwise, it can simply use 'exit' to finish the processinstead of returning control to S2 for cleaning up the lock. file of that process. Ifany such pid file is left out in the system ,we can end up with a total starvation because noprocess will find its lock file better than the dead process' lock. A simple workaround is thatif a process P finds that is is deprived of a lock because of only one other process P, it verifies if the process who owns the lock P is still alive with the 'ps -ax' commands.If the process P is not alive, this process P can delete the lock file created by the dead process and assumeto be owner. Only one process will do this check because only one process will be in the 'second' position from gaining access.



2/ Another problem is much tougher to crack. The overall response/efficiency of this locking protocol heavily dependson the sleep time for the processes. The process has to go to a sleep if it finds that the script is already locked. We need the sleep time to be equal to the approximate time that the script S takes to run. If the sleep time is too low, processes will spend computing resources checking locks too frequently, only to find that theyhave to continue sleeping. If it is too high, the processing is under-utilised. In any case, the sleep has to happen, since I cannot find a way that the OS notifies the sleepingprocesses with this scheme. The OS does not participate in any locking here so it cannot help. Overall, If the requesting processes are assumed to be arriving frequently and steadily, then I will prefer having the sleep time on the higher-end. This is because once enough processes are sleeping, there will be a steady state when enough processes keep waking up and continue processing without the under-utilization problem. On the other hand, if the incoming processes are bursty in nature, then I think the sleep time needs tobe kept on the lower-end.
The above protocol certanly solves the central problem mentionned in the link, http://members.toast.net/art.ross/rute/node24.html.


- QED -

Friday, September 26, 2008

Challenge problem - From a classic Steven Krantz book -

I have simply been unable to crack this one, particularly the general version of this problem... any hints ?
You have six points on a piece of paper. Every point will be joined to every other point by either by a red line segment or by a blue line segment. You can colour any line in anyway you choose. Show that there will have to be either be a triangle that is all blue or a triangle that is all red.
Formulate a generalization of the last problem. Suppose that you have k colors. How many points are required to guarantee that the process of joining all possible pairs of points with line segment segments of one of these colors will guarantee that there is a triangle of just one color?

Saturday, September 13, 2008

Interesting problems - first batch

Housie Magic -
This one arose from a party in my company, where we played Housie. The game goes like this - Every participant is distributed Housie tickets, each of which has 15 random numbers (distinct on a given ticket) printed on them. Numbers are between 1 to 100. Next, random numbers in the same range are selected and the numbers are called out. All participants who have the number on their ticket cross them. The first participant who gets all numbers crossed on their own ticket wins, and gets the prize.
Certainly, all numbers on the ticket are random and those called are random too. So everyone has an equal chance of winning. But there is a twist .....
What if you are allowed to take up more than one ticket ? You then get a bigger range of numbers to cross. Does that increase your chances of winning the prize ? Remember that to win, you need to have all numbers on any single ticket crossed.
Comments welcome ...
Sorting -
I ask this one for interviews and get varying answers... Given an array of integers (of size N), what is the most efficient algorithm to find out the 'K' largest numbers ? Note that K <= N, K > 0.
We don't want the largest K numbers to be sorted by themselves, just find the largest set. Is there a way to find the K numbers in less than O(N*K) ?

Wednesday, May 7, 2008

IPL - Manoranjan ka Baap

I and at least the few gentlemen around me have begun to wonder when the IPL circus is going to end ... Not that I don't like cricket - I have watched matches even in my university exam days. With IPL, its like if you love chocolate cake, you are asked to eat it everyday for two months. The organisers should have known how cricket tournaments tend drag on and on.

I watch cricket to get a sense of competition, aggression, and the drive to win among the teams. To give you an idea, remember the classic moments - Javed Miandad's last ball match winning six, Sachin's consecutive Sharjah centuries against Australia or Kluesener's 1999 world cup exploits. But the overwhelming feeling I get while watching IPL is that of a money-spinning exercise, the junk food of cricket. You don't get to savour the chocolate cake, because you are made to eat it rapidly before another helping arrives.

Saving grace - IPL's all purely market-driven. If it becomes unpopular, it won't sustain even with Cricket Board's backing ... Nobody needs to kill it... like dead rubber Test Matches (imagine a Test match played between India - Sri Lanka on flat, dusty wicket). If spectators stick with it, it will live, else it will die on its own.

Saturday, May 3, 2008

Starting up...

Tune in to this spot for thoughts on Computing, Economics, Science, Literature and just about everything under the Sun.