Saturday, July 26, 2014

About API design and copy semantics

This is a tale of how a combination of counter-intuitive copy semantics and API design kept me debugging a piece of code late into the night, for several hours. Late-night coding can be fun, but there is never any fun in working in crisis mode, under time pressure and against a bug.

I suspect that the reasons behind my ordeal are also the reasons why most programmers tend to vastly underestimate the time it will take them to finish a programming task.

Sequences, or contiguously allocated arrays, usually offer some facility to copy their internal representation that is efficient and shallow, and certainly more efficient than iterating over the entire sequence and copying each object by value. Typically such a sequence has a raw buffer where the data is kept, a 4 or 8 byte 'length' recording the size of that buffer, and a flag recording whether the sequence 'owns' the buffer or not.

Ownership of the raw buffer is the "hot potato": the sequence that ends up with it should be the one that releases the buffer's memory. The consequences are ghastly if both sequences release it, or if neither does.

When a sequence is copied into another sequence, the pointer to the raw buffer is assigned, the 'length' is copied, and the original sequence either relinquishes ownership or keeps it, depending on what the caller wants.
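To make the three ingredients concrete, here is a minimal sketch of such a sequence. 'ToySeq', 'resize' and 'shallow_copy_from' are hypothetical names made up for illustration; this is not the Orbacus class, only one way the buffer, length and ownership flag could fit together.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical toy sequence for illustration; this is not the Orbacus class.
struct ToySeq {
    int*        buffer = nullptr;  // raw buffer holding the elements
    std::size_t length = 0;        // number of elements in the buffer
    bool        owns   = false;    // true if this sequence must free the buffer

    ToySeq() = default;
    ToySeq(const ToySeq&) = delete;             // no implicit copies:
    ToySeq& operator=(const ToySeq&) = delete;  // ownership only moves explicitly
    ~ToySeq() { if (owns) delete[] buffer; }

    // Allocate a zero-filled buffer of n elements that this sequence owns,
    // discarding any buffer it owned before.
    void resize(std::size_t n) {
        if (owns) delete[] buffer;
        buffer = new int[n]();
        length = n;
        owns   = true;
    }

    // Shallow copy: share the source's buffer pointer and copy its length.
    // With take_ownership set, the duty to free the buffer moves from the
    // source to this sequence, so there is never more than one owner.
    void shallow_copy_from(ToySeq& source, bool take_ownership) {
        assert(&source != this);
        if (owns) delete[] buffer;
        buffer = source.buffer;
        length = source.length;
        owns   = take_ownership && source.owns;
        if (owns) source.owns = false;
    }
};
```

No element is copied: after 'target.shallow_copy_from(source, true)' both sequences point at the same buffer, and only 'target' will free it. If the flag is 'false', the source keeps the hot potato and must outlive the target.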

Consider such a feature in a CORBA sequence, as implemented in 'Orbacus':
typedef OB::FixSeq< ::CORBA::ULong, int > MSeq; /* CORBA's fixed sequence of integers */

MSeq source;
source.length(2); /* Fill some data to test copy */
source[0] = 1;
source[1] = 2;

MSeq target; /* Empty sequence */

bool takeOwnership = true; /* Let's have target sequence acquire ownership, and source sequence relinquish it too ! */

/* Now actual call to copy sequence into target  */

I noted the following painful aspects of the 'replace()' call, which is supposed to do the efficient copy:

1. The name itself. 'replace' suggests an action that 'changes the guts' of the target sequence, but says nothing about the fact that the source sequence can also be totally 'gutted' when the options passed transfer ownership.

2. How much grunt work is expected from the caller! The function takes 4 arguments, including the length and maximum allocation of the source sequence and its buffer, all of them internal details of the source sequence. Why does it not make do with just a reference to the source sequence and an ownership option, and read those internal details from the source sequence itself? After all, 'replace()' is a member function.

3. The boolean option 'takeOwnership' is accepted twice in the same function call: first to tell the source that it should relinquish ownership, and second to have the target sequence acquire it. I can't think of a situation where the caller would want to pass 'takeOwnership' as 'true' in the first place and 'false' in the second, or the other way around. In fact, using different values for the ownership caused the crash that I debugged. The code was deep in the implementation, and the results are totally counter-intuitive.

/* This code crashes  */
/* We are copying between 3 sequences : original -> intermediate -> final */
#include <iostream>
#include "OB/CORBA.h"

int main(){

typedef OB::FixSeq< ::CORBA::ULong, int > MSeq;

MSeq original;
original.length(2); /* Allocate two elements before indexing */
original[0] = 1;
original[1] = 2;

MSeq intermediate; 

std::cout << intermediate.length() << " " << intermediate[0] << " " << intermediate[1] << "\n";

MSeq final;

std::cout << final.length() << "\n"; /* Length is as expected ==> 2 */

std::cout << final[0] << " " << final[1] << "\n"; // Still ... Crashes here...

return 0;
}

To build :
g++ -I  -L  -lOB -lJTC -ldl -lpthread 


What is certainly bad is that the length of the 'final' sequence is 2 but the internal buffer is just not there, so any access causes a crash. The resulting sequence is simply not internally consistent.
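One plausible way to reach such a state can be sketched with a toy model. Everything here is an assumption for illustration: 'ToySeq' is a hypothetical stand-in and not the Orbacus class, and its 'get_buffer()' is modelled on my reading of the CORBA C++ mapping, where asking a sequence to orphan a buffer it does not own yields a null pointer. 'replace_from()' is a hypothetical wrapper showing how a single source reference and a single flag would rule the mismatch out.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a CORBA-style sequence; not the Orbacus class.
class ToySeq {
public:
    ToySeq() = default;
    ToySeq(const ToySeq&) = delete;
    ToySeq& operator=(const ToySeq&) = delete;
    ~ToySeq() { if (release_) delete[] data_; }

    // Allocate an owned, zero-filled buffer of n elements (old contents dropped).
    void length(std::size_t n) {
        if (release_) delete[] data_;
        data_ = new int[n]();
        max_ = len_ = n;
        release_ = true;
    }
    std::size_t length() const { return len_; }
    std::size_t maximum() const { return max_; }
    const int* data() const { return data_; }
    bool owns() const { return release_; }

    // ASSUMED semantics: orphan == false lends the buffer. orphan == true hands
    // the buffer over and resets this sequence, but only if the buffer is owned;
    // otherwise it returns nullptr.
    int* get_buffer(bool orphan) {
        if (!orphan) return data_;
        if (!release_) return nullptr;
        int* handed_over = data_;
        data_ = nullptr;
        max_ = len_ = 0;
        release_ = false;
        return handed_over;
    }

    // Four-argument replace: trusts whatever the caller passes in.
    void replace(std::size_t max, std::size_t len, int* data, bool release) {
        if (release_) delete[] data_;
        max_ = max; len_ = len; data_ = data; release_ = release;
    }

private:
    int* data_ = nullptr;
    std::size_t max_ = 0, len_ = 0;
    bool release_ = false;
};

// Lend original -> intermediate without ownership, then ask intermediate to
// hand over a buffer it does not own. Returns true if the last sequence ends
// up claiming length 2 while holding no buffer at all.
inline bool mismatched_flags_lose_buffer() {
    ToySeq original, intermediate, last;
    original.length(2);
    intermediate.replace(original.maximum(), original.length(),
                         original.get_buffer(false), false);
    // Read the sizes first: get_buffer(true) may reset the source.
    const std::size_t max = intermediate.maximum();
    const std::size_t len = intermediate.length();
    last.replace(max, len, intermediate.get_buffer(true), true);
    return last.length() == 2 && last.data() == nullptr;
}

// Hypothetical wrapper: one source reference, one flag. Ownership is only
// transferred when the source actually has it to give.
inline void replace_from(ToySeq& target, ToySeq& source, bool take_ownership) {
    assert(&target != &source);
    const std::size_t max = source.maximum();
    const std::size_t len = source.length();
    const bool transfer = take_ownership && source.owns();
    target.replace(max, len, source.get_buffer(transfer), transfer);
}
```

In this model the length and the buffer travel as separate arguments, so nothing stops them from disagreeing. With 'replace_from()' the same three-sequence chain leaves the last sequence with a length of 2 and a valid, borrowed buffer, because the wrapper never asks a non-owner to give its buffer away.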

The 'replace()' API is designed to accept too many arguments that it should have deduced from a reference to the source sequence, and that makes it easy for a caller to misuse. In that sense, it lacks certain characteristics described in an excellent piece by Joshua Bloch.
