Monday, April 25, 2011

XML file comparison

Another problem is comparing two XML files that share the same structure and producing a result that spells out the differences between them. Again, I set out to discuss a solution more involved than the simple one we actually chose to implement.

Simple approach :
A simple approach is to refer to the topmost parent node by name in the code and look for it in both XML files. If the node is found in both files, there is no difference at the first level and we can proceed to the next, deeper level of nodes. The actual XML tags are thus hard-coded, and the same process is repeated at each level. If the node is missing from one of the files, the first difference has been found. We note this difference and generate some other structure to describe it: a "delta", which we can call the difference structure. If the XML represents a relational database, the difference structure can then be translated into a set of RDBMS queries that equalise the database contents. It is therefore quite plausible for the difference structure to be linear rather than tree-like, unlike the XML.

Drawbacks of the simple approach :
The logic to parse the structure is inextricably mixed up with the logic that generates the linear difference structure. If the XML structure changes, the parsing logic must change too, and changes to the generated delta follow. Ideally, the parsing of nodes would be independent of what the XML is about, and only the generated difference structure (and the subsequent RDBMS queries) would depend on the actual differences in the XML. It's the changes needed to the parsing code that are the matter of contention. I think we can do better.

A better approach :
As usually happens with computer algorithms, the better approach is more complex than the simple one. As an aside, the situation is quite different for mathematical statements and proofs: the better proofs are shorter, have a quality of "elegance", and in the long run prove to be more readily intuitive.
OK, here we should let the coding logic be independent of any actual XML tags. We create a multi-dimensional array of strings and "hand-write" the XML tags into it. So if the XML has a hierarchy that goes two levels deep, the array will actually be a list of lists of strings. We just create a string structure that mirrors the XML. This looks tedious but is far simpler than coding that structure into parsing logic that the compiler will accept.
The parsing code traverses this string array and treats it as a descriptor of the XML. It does the same for both files, and when it finds a difference, it generates an in-memory linear difference structure. A separate piece of code translates the differences into something like a set of RDBMS queries as required; the important point is that this query generator is strictly separate from the parser.
If the XML structure changes (or, more likely, is extended), no change is needed to the parsing logic. Only the hand-written string array has to change to reflect it. Some skill is involved in coding the parsing logic, but once that is done, this approach scores over the earlier one on maintainability and extensibility.
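To make the idea concrete, here is a minimal sketch in Python using the standard xml.etree.ElementTree module. It is not the code we wrote at work: the descriptor layout, the function names (compare, describe) and the tag names are all assumptions made up for illustration, and it only checks whether each described node is present in both files.

```python
import xml.etree.ElementTree as ET

# Hypothetical descriptor mirroring the XML: (tag name, list of child entries).
# Extending the XML should only require editing this structure.
DESCRIPTOR = [
    ("Service_Node", [
        ("Prop", []),
    ]),
]


def compare(left, right, descriptor, path=""):
    """Walk the descriptor over both trees; return a flat list of (path, kind)."""
    delta = []
    for tag, children in descriptor:
        here = f"{path}/{tag}"
        in_left = left.find(tag)    # first direct child with this tag, or None
        in_right = right.find(tag)
        if in_left is None and in_right is None:
            continue  # absent from both files: nothing to report
        if in_left is None:
            delta.append((here, "missing in first file"))
        elif in_right is None:
            delta.append((here, "missing in second file"))
        else:
            # Present in both: descend one level, driven by the descriptor.
            delta.extend(compare(in_left, in_right, children, here))
    return delta


def describe(delta):
    """Separate 'generator' step: turn the linear delta into report lines.
    A real generator might emit RDBMS queries here instead."""
    return [f"{path}: {kind}" for path, kind in delta]
```

The parser (compare) never mentions a concrete tag, and the generator (describe) never touches the XML; only DESCRIPTOR knows what the document looks like.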

I see parallels to this approach elsewhere; it could be described as a "data-driven" approach (my term, my quotes). The actual code is little more than a markup processor, very generic and simplified. The key point is not that compilation is avoided but that the changes are made not in code but in an array-like structure of simple strings or values.

XML specification and duplicate tag processing

XML has long been touted as a very promising method for information exchange. Some consider it too verbose and doubt how efficient it is when the information is voluminous. However, XML still reigns as the most widely accepted way to convey structured data in a human-readable, extensible form for which parsers are widely available.

One usage pattern I noticed in the product at my workplace: referring to another tag in order to copy its content -

A huge XML file that carries the product control configuration of the entire application is usually edited by humans. It basically stores configuration properties for the various services that run as part of the product. What should we do if there are multiple duplicate services with essentially identical properties?

For example -

<Top_Parent_Node attr1="val1">
<Service_Node attr2="val21">
<Prop attr3="val3">
....
... Complex set of enclosed tags ....
....
</Prop>
</Service_Node>

<Service_Node attr2="val22"> <!-- duplicated service tag : we need this for the application -->
<Prop attr3="val3"> <!-- Forced to repeat this from the previous tag -->
....
... Complex set of enclosed tags ....
....
</Prop>
</Service_Node>
.... More such repetitions ....
</Top_Parent_Node>

The simplest way is to repeat the properties at both locations by copy-paste. We are rather good at that.
We, however, screw up miserably when it comes to propagating changes to one set of properties to all other identical locations.

I have a suspicion that this is a common situation that others run into as well, which makes a good case for formalising this requirement in the XML specification itself. The XML specification should allow a choice: either specify the tags in full, or make a reference to another tag whose content is treated as if copied into this tag while parsing.
For example -

<Top_Parent_Node attr1="val1">
<Service_Node attr2="val21" ?xmlref="N1" > <!-- Label this tag as a reference -->
<Prop attr3="val3">
....
... Complex set of enclosed tags ....
....
</Prop>
</Service_Node>

<Service_Node attr2="val22"> <!-- duplicated service tag : we need this for the application -->
<?xmlref="N1" /> <!-- No need to repeat - referred label is treated as copied -->
</Service_Node>
.... More such repetitions ....
</Top_Parent_Node>

A few points to note :
- The entire spec of a node that may be duplicated resides in only one place.
- Any change made in that one place is reflected in all the places that refer to it.
- The first Service_Node, which carries the complete spec, is labelled in a unique manner. This label is part of the XML specification itself, and any node can be labelled this way. Thus it need not appear in any DTDs or schemas (XSDs) as an available attribute.
- Any node can refer to this label by enclosing a <?xmlref> with a label identifier. The parser should copy the entire specification within the referred node into this node.
- The referring node and the referred node need not be in the same hierarchy or at the same tree depth. The parser should cope with a referring node appearing before the referred node in the file, which keeps XML parsing independent of ordering. If the referred node is not found, the parser should throw an exception. I can see that DOM parsers can handle this in a straightforward manner. A SAX parser, however, would need to parse to the end of the file in search of a referred node.
- I don't quite see a way to provide partial overriding without unnecessarily complicating the idea and obfuscating the XML specification.
- The fact that integrity is easily maintained when the spec changes gives this idea some credence and value, even though the readability of the XML is somewhat hampered.
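Since no standard parser understands the proposed <?xmlref> syntax, the sketch below emulates it in Python with ordinary XML. The xmlref-id attribute (labelling the full node), the <xmlref ref="..."/> child element (referring to it) and the resolve_references function are all names I am assuming for illustration. It only aims to show that a DOM-style, two-pass resolution handles a reference appearing before its target.

```python
import copy
import xml.etree.ElementTree as ET


def resolve_references(root):
    """Expand every <xmlref ref="..."/> child into copies of the labelled
    node's children. Assumed conventions, not part of any XML standard."""
    # Pass 1: collect all labelled nodes, so ordering in the file is irrelevant.
    labelled = {
        el.get("xmlref-id"): el for el in root.iter() if el.get("xmlref-id")
    }
    # Pass 2: rebuild each parent's child list, replacing references by copies.
    # Note: references nested inside copied content are not expanded further.
    for parent in list(root.iter()):
        expanded = []
        for child in list(parent):
            if child.tag == "xmlref":
                label = child.get("ref")
                if label not in labelled:
                    raise KeyError(f"xmlref target not found: {label}")
                expanded.extend(copy.deepcopy(c) for c in labelled[label])
            else:
                expanded.append(child)
        parent[:] = expanded
    return root
```

A streaming (SAX-style) version would have to buffer unresolved references until the end of the document, which is the extra cost mentioned in the list above.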

When I encountered this problem at my workplace, I must say that it was solved at the application layer, i.e. a new tag was added inside the duplicates to refer to the other node. It was a simple hack, but it seems to work around the problem rather than solve it. As you would know, this is what happens in a commercial context under time pressure.

Friday, February 4, 2011

Software development (requirements and design)

Two strictly random thoughts on two aspects of software development, requirements and design, follow. Just something to play with over the weekend -

1/ Systematic and scientific approach to software development instead of an intuitive, programmer-driven one.

An intuitive, programmer-driven approach is the one programmers typically take. It is heavily based on experience in solving problems of a similar nature and on observing other programmers' practices. Programmers are well trained to follow patterns of solutions. Most of the time this goes OK, since the thought process has been reviewed by several programmers and has proven workable. However, what if a fresh look is taken in situations where the chosen solution is based only on tried and tested formulas? How about deriving a solution through a totally rational, logical and scientific approach? Can it find gaps in understanding, or other, hitherto overlooked determining factors? Most research in academia takes this approach and does succeed in finding better solutions than those in the industry. The industry is sometimes too focused on finding cost-effective, workable and time-bound solutions, to the detriment of doing something that will prove profitable and efficient in the long term. The hard, complex problems in software suffer in this regard much more than the run-of-the-mill kind of software, such as CRUD applications.
As a sunshine industry, software is still evolving, with new practices being thought of and proposed at regular intervals. It should be a place full of such opportunities for betterment.

2/ Allowing maximum configurability but not exposing it to users.

Customers use software built by developers, but they usually end up complaining more often about features not implemented the way they wanted and less often about things that don't work correctly as per the specifications. If something does not work as per the specifications, it's a clear bug and the developers feel obliged to correct it. However, missed specifications, misunderstood specifications and differing conventions are not so easy to rectify.
Can a solution be to allow configuring anything and everything that is feasible? Users will be quick to complain about a huge number of configuration choices and complicated installation tasks. So the out-of-the-box configuration is chosen to be a vanilla, typically acceptable one, and the rest is not exposed to users at all. If some user needs something different from how the software behaves, we will need to customize and configure, but we are very likely to find a way out without making code changes, since we made the software as configurable as possible even though we never expected users to change it.
When I thought of this, I was really thinking about the issues faced with various customers and the varied preferences each one has for the same software. Some customers complain about software components needing bi-directional access on network ports and interfering with firewalls; others crib about the GUI of administration consoles and its layout. Somehow, it is a bad bet for the developer to make a design choice, code accordingly, and then face urgent situations because some customer expected otherwise. Even if we take care to implement according to well-known standards and conventions, are we in a position to deny something to a prospective customer (and keep the same specs) if they don't like how the software behaves?

Friday, November 26, 2010

Software Product Features

Software Product Features, Qualities and Best Practices

This document is about the desirable features a software product should have and the practices to be followed during its conception, design and development. They are, in a way, guidelines at a most generic level, applicable to most software. Certainly, this document is incremental in nature.

For a diverse, dispersed team trying to develop software, it is important to follow certain common practices, so that the integration phase is simplified in the later stages. This document aims to just lay out broad principles and techniques for achieving that. I just gathered several software features, processes, tools and techniques that I have found impressive. I need to expand on each section in much more detail.

The various design and coding issues to be taken care of are listed below, along with their desired solutions/techniques:

A. Platform –
Hardware as well as software infrastructure should be chosen carefully to satisfy all functional and other (financial…) constraints. For now, we consider only PC desktops and open systems. The software itself should be developed to be compliant with the widest possible range of infrastructure. The common way to achieve this is to use conditional compilation, as is possible in C++-like languages, so that different versions of platform-specific code build from the same source on different platforms. To keep C++-like code portable, for instance to a 64-bit OS, it is imperative to use explicit type definitions for data types of different sizes. If any third-party software is being used, such as an RDBMS, it is preferable to use a wrapper library, such as SQLAPI. For XML parsing, Xerces is certainly preferred. Follow the common denominator among database standards: for an RDBMS, use ANSI SQL data types, and wrap any table/column names in appropriate prefixes and namespaces to avoid name clashes and conflicts with keywords in a particular database.

B. Architecture (assuming PC-based software!)
It could be web-based, desktop (n-tier) or custom (both). The first two should be clear. The third requires explanation. As usual, we try here to separate the engine (business logic) from the UI part. The UI part is further separated into ‘Content’ and ‘Presentation’. The Content expresses the widgets (components) to be used and their placement on the screen. The Presentation takes care of converting the Content into a format suitable for the medium. Thus, given a screen, the Content can be expressed in the form of an XMLdlg (an XML document specifying the dialog's look and feel). The web presentation layer is then generic code that converts the XMLdlg to HTML, suitable for browser display. The desktop presentation is again generic, and converts the XMLdlg into MFC/Xmotif API calls.
Most ‘saleable and business’ software is transactional in nature. The business logic generally consists of certain objects going through phases of operation (like guests doing check-in, stay and check-out in hotel management software). I propose to keep the business logic generic by making use of a Workflow infrastructure, so that the main engine, once done, can be reused for many other applications. If a client-server/distributed architecture is involved, it is better to observe restrictions such as not having server call-backs on the client, for compatibility with firewalls, SSL etc. Use distributed components to enhance scalability, fault tolerance and load balancing (the latter two through replication). Consider a service-oriented architecture for generic interfaces. Create clean, terse and generic interfaces that clients can query and discover. The importance of “loose coupling and high cohesion” cannot be overstated. Consider deployment constraints such as blocked ports, physical memory, utilization of resources like multi-core processors, and threading models, and design for scalability.

C. Framework
A common framework binds the various application components together and provides a consistent set of rules/coding conventions that each part of the application can follow. A good framework should provide wrapper functions (for platform-specific areas, threading/locking, middleware, file-systems and other applications like databases), support component creation & extension (like COM, CAA), help integrate with third-party software (like SSL), and provide utility functions (like math libraries) and a central and localized logging engine.

C. Configurability
This refers to the ability to quickly change certain aspects of the software according to the customer's environment. The use of XMLdlg, as mentioned above, makes the software configurable as web-based or desktop-based without changing any code at all. Other practices for ensuring maximum Configurability are the commonly known ones, like not hard-coding essential parameters of operation (server names, database ports etc.). Using plain-text settings files is OK, but XML is preferred.

D. Localization
The product should allow the complete user interface (all visible text) to be in any local language (Hindi, Marathi etc.). The common technique to achieve this is to set up text files (Nls files) that map the ‘keys’ used in code to the ‘specific language text’. Each language has its own Nls file. The local language for the software should, in turn, be configurable! The software should not assume common conventions such as the decimal point: certain languages use a ‘comma’ instead of a ‘dot’ as the decimal separator. The suitable data structures are programming-language dependent but should be Unicode capable.
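As a rough illustration of the Nls idea, here is a small Python sketch. The "Key = text" file format, the function names and the sample keys are assumptions of mine, not a description of any particular product's Nls files.

```python
def parse_nls(text):
    """Parse 'Key = text' lines into a dict; blank lines and '#' comments
    are skipped. The file format is an assumption for this sketch."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        table[key.strip()] = value.strip()
    return table


def translate(key, table, fallback):
    """Look the key up in the chosen language, then in the fallback language,
    and finally show the key itself so missing entries are visible."""
    return table.get(key, fallback.get(key, key))


def format_decimal(value, decimal_sep):
    """Format with two decimals using the locale's separator, not a fixed dot."""
    return f"{value:.2f}".replace(".", decimal_sep)
```

The code only ever mentions keys such as "Greeting"; which Nls table is passed in is itself a configuration choice.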

E. Customizability
It adds more control over specific behaviour of the software. For example, one banking customer could have two types of accounts (Saving, Current) and some other customer might have four (Saving, Current, Salary and Fixed Deposit). There is no reason why the first customer should see all four account types while creating new bank accounts. The software design should be such that no code changes are required for such customization. Again, read the customizable information from a text file instead of hard-coding it.
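The banking example could look something like the following Python sketch, which uses the standard configparser module. The [accounts] section, the types key and the account_types function are hypothetical names chosen for illustration.

```python
import configparser

# Vanilla defaults shipped with the product (assumed layout).
DEFAULT_SETTINGS = """
[accounts]
types = Saving, Current
"""


def account_types(customer_settings):
    """Return the account types to offer, letting the customer's settings
    text override the shipped defaults without any code change."""
    parser = configparser.ConfigParser()
    parser.read_string(DEFAULT_SETTINGS)
    parser.read_string(customer_settings)  # later values win
    raw = parser.get("accounts", "types")
    return [name.strip() for name in raw.split(",") if name.strip()]
```

A customer with four account types only ships a different settings file; the code that builds the "new account" screen stays the same.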

F. Licensing
It differs from Customizability or Configurability. It is the ability to have different versions of the software, like a light demo version, a typical version and a full-featured version, for different customers. Further granularity should be added, like enabling specific functions for certain customers through ‘environment flags’.

G. Code placement (directory hierarchy)
Related code should go into specific directories (frameworks). Code of the same functionality should go into a single directory inside a framework (module). Each module should compile into a .dll or .so (or a Java package jar!). The interfaces to be shared between frameworks should be collected in a directory at the framework level of the hierarchy. We may need to create special make-files. The build should allow maximum granularity and flexibility in packaging.

H. Testing
Unit testing should include the creation of automatic regression tests. Testing scripts or code should be gathered in their own test framework. For an automatic test, typically a shell or batch script starts a separate test program executable. The executable contains calls to the functional code. The test produces an output trace. The shell script then compares the output trace to a reference trace, known to be correct output. If there is a difference, the test fails and the regression is automatically detected. Such debug tracing should be enabled only while running tests or debugging, not in production. If the UI is to be tested, this facility will need support from within the UI wrappers/framework classes, which will run in a special mode and not accept any user input while tests are being run. Only the recorded UI inputs can be re-dispatched by the test UI to the functionality.
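The trace-comparison step can be sketched in a few lines of Python with the standard difflib module. Here run_scenario stands in for the test executable and its trace lines are invented; in practice the reference trace would be read from a file kept alongside the test.

```python
import difflib


def run_scenario():
    """Stand-in for the test executable: calls the functional code and
    records an output trace (the trace lines here are invented)."""
    trace = []
    total = sum([10, 20, 30])
    trace.append(f"total={total}")
    trace.append("status=OK")
    return trace


def compare_traces(actual, reference):
    """Return unified-diff lines between the reference and the actual trace.
    An empty list means the traces match, i.e. no regression was detected."""
    return list(
        difflib.unified_diff(reference, actual, "reference", "actual", lineterm="")
    )
```

The driver script then only needs to fail the test whenever compare_traces returns a non-empty list, and can print that list as the failure report.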

I. Accessibility
It is recommended to maintain compatibility with accessibility software such as JAWS, popularly used by visually impaired people.

J. Scripting Engine
A scripting engine is vital for some applications, in which either admin operations are involved or a large number of inputs are required for an operation. This is even more important when a UI is not feasible for some operations. A JavaScript-compliant scripting engine can be used to expose script objects to user scripts; for reference, see the Mozilla project's JavaScript engine.

K. Version Control

1/ http://en.wikipedia.org/wiki/Distributed_revision_control -
Main advantages - a/ Some project members can have privileges to decide on what changes to merge.
b/ Web of Trust - Changes from many repositories can be merged based on quality of changes.
c/ Network not required all the time - Check-ins can be into a private workspace/View.

Some open source software is available which will fulfill main requirements of distributed development (version control, release management) & security.

2/ The entire application source code can be split into workspaces, each a logical partition of the source code (organized by directories) that a developer can work on. Each workspace is a set of directories for which the developer can check out source code. Source code is not available to the developer for the other directories, for which he or she has no check-out privileges. Therefore, the developer must get the executables/libraries built from the protected source in order to build and run in the local workspace.

3/ Intellectual property/unauthorized usage will have to be additionally protected by licensing.
Licensing can be done in two ways -
a/ Simpler is to have environment flags, which are read and verified by the application to detect authorization.
b/ Some licensing utility can be built, that generates a license for a developer machine. The source code protected part of the software will check on the license before allowing an application startup.
This license checking component of the application (like other protected source code) will only be available to all developers as libraries/executable.

4/ Options -
a/ CVS / Open CVS - Open source, Client-Server - http://www.nongnu.org/cvs/
It has an emphasis on security and source code correctness; a web interface is available as well as a desktop one.
b/ Subversion - Open Source, Client-Server - http://subversion.tigris.org/
Offers directory-level security. It is not strictly distributed. It can use Apache/HTTP as the web server, and desktop clients are available.
c/ Git - Open source, distributed version control - http://git-scm.com/
It offers desktop & web hosting.

L. Release Management
Also see section on “Version Control”.

M. Report Generation
TODO - Can we have configurable report generation ?

N. Documentation
This needs to be done for 5 areas/users - Interfaces, Code, Users, Administrators and Implementers. Create Release Notes for a release and a Pre-Release Bulletin. Follow the coding standards needed by ‘Doxygen’ or some other common tool.

O. Logging
Logs need to be generated principally for the administrators and users, with support for enabling special DEBUG logs so that developers can fix bugs.

P. Administration console or UI
For use by the administrators, offering an authenticated and full view of the system. It should be possible to configure the system at runtime, as well as install and uninstall it, from this console alone – from the cradle to the grave!

Q. Security
Integrate with other authentication mechanisms like PAM, LDAP, SSO and NTLM.

R. Programming Language
It should be garbage collected, efficient and clean, should focus on minimizing the pain for the developer, and should make it easy to code. It should allow the developer to focus on the algorithm rather than make a mess with low-level management and intricate syntax. Multiple languages may be used, according to their suitability for the functionality offered by each component.

S. Build Management
Daily builds. Multi-level workspaces. Limited workspace view for each project and for each developer. Views and check-outs to workspaces authenticated on the developer roles. Unit and Integration testing at each level in the workspace hierarchy. Automated tests to run in highest level daily.

T. Quality Metrics
Metrics based on proportion of failing automation tests (memory leaks, memory overwrites and null pointer checks if applicable). Test quality based on Code coverage. Performance tests to detect performance regression in addition to functionality.

U. Aspiration stuff
Some very advanced features … User Exits and Public Interfaces! For customers, who are programmers themselves and who want to build upon the existing product or make acute customizations (and other reasons), User Exits and Public Interfaces should be provided. User Exits are calls that the existing software makes to code written by customers. It allows extreme levels of customizations. Public interfaces are interfaces placed at the Framework level. These interfaces are implemented by the vendors and provided to the customers (along with usual libraries). The customers can compile their code, which makes calls to these interfaces.
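A User Exit can be pictured as a named hook that the vendor's code calls at a fixed point. The Python sketch below is only an illustration under my own assumptions: the UserExits class, the "before_invoice" exit point and create_invoice are hypothetical names, not part of any existing product.

```python
class UserExits:
    """Registry of customer-written callbacks, keyed by exit-point name."""

    def __init__(self):
        self._hooks = {}

    def register(self, point, func):
        """Called by customer code to attach a function to an exit point."""
        self._hooks.setdefault(point, []).append(func)

    def call(self, point, value):
        """Called by vendor code: pass the value through every registered
        function in registration order; with no hooks it is returned as-is."""
        for func in self._hooks.get(point, []):
            value = func(value)
        return value


exits = UserExits()


def create_invoice(amount):
    """Vendor code: builds an invoice, letting customer code adjust the amount."""
    amount = exits.call("before_invoice", amount)
    return {"amount": amount}
```

A customer would call exits.register("before_invoice", ...) from their own module; the vendor's create_invoice does not change.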
There is a need for a special tool to record automatic regression tests, which involve the UI. The tool should be able to detect if the software is being run in a Test mode (as opposed to an interactive mode). On such occasions, it should be able to create a trace dump of the UI for test comparisons as stated earlier.
Another possibility is interfacing with Scanning or OCR hardware, which allows direct data entry?

That is all for now. The key thread running through the whole document is the indisputable need to invest enough into design and architecture so that the involvement in improving / customizing / bug-fixing the software is kept to a minimum. Some of the techniques mentioned above are well known, while others are a product of experience and a bit of imagination. These guidelines will drive further software development.


Hrishikesh Kulkarni.
20th Jan 2007

Last Updated – 8th Dec 2010
Updated – 26th Nov 2010
Updated – 11th Apr 2009
Updated – 4th Feb 2009
Updated – 21st Jan 2007

Tuesday, October 26, 2010

Quotes

This is my collection of quotes, gathered over the years.
I have strived not to record any of the soapy, flowery stuff that goes around as "inspirational" and "cutesy". I have also hated any cheap slapstick or contradictions.
Many of the quotes below have a historical, political or scientific reference and relevance, and have an element of wit. I want them to reflect reality, so they have been uttered by (or have been attributed to) well-known personalities.
Here they are - Enjoy !


****


There cannot be a crisis next week. My schedule is already full. -- Henry Kissinger

****

Have you ever noticed ? Anyone walking slower than you is an idiot and anyone walking faster is a maniac.

****

Soap and education are not as sudden as a massacre, but they are more deadly in the long run. -- Mark Twain

****

Man is the only animal that blushes ... or needs to. -- Mark Twain

****

Show me a guy who's afraid to look bad, and I'll show you a guy you can beat every time. -- Lou Brock

****

To doubt everything or to believe everything are two equally convenient solutions; both dispense with the necessity of reflection. -- H. Poincare

****

Woolsey-Swanson Rule: People would rather live with a problem they cannot solve than accept a solution they cannot understand.

****

The greatest love is a mother's, then a dog's, then a sweetheart's. -- Polish proverb

****

Superstition, idolatry, and hypocrisy have ample wages, but truth goes a-begging. -- Martin Luther

****

Once at a social gathering, Gladstone said to Disraeli, "I predict, Sir, that you will die either by hanging or of some vile disease". Disraeli replied, "That all depends, Sir, upon whether I embrace your principles or your mistress."

****

The second best policy is dishonesty.

****

Nothing in life is to be feared. It is only to be understood.

****

Certainly there are things in life that money can't buy, but it's very funny -- did you ever try buying them without money? -- Ogden Nash

****

Meekness is uncommon patience in planning a worthwhile revenge.

****

To a Californian, the basic difference between the people and the pigeons in New York is that the pigeons don't shit on each other. -- From "East vs. West: The War Between the Coasts"

****

He who knows others is wise. He who knows himself is enlightened. -- Lao Tsu

****

The fact that an opinion has been widely held is no evidence that it is not utterly absurd; indeed, in view of the silliness of the majority of mankind, a widespread belief is more often likely to be foolish than sensible. -- Bertrand Russell, in "Marriage and Morals", 1929

****

In War : Resolution
In Peace : Goodwill
In Victory : Magnanimity
In Defeat : Defiance
-- Winston Churchill

****

Associate with well-mannered persons and your manners will improve. Run with decent folk and your own decent instincts will be strengthened. Keep the company of bums and you will become a bum. Hang around with rich people and you will end by picking up the check and dying broke. -- Stanley Walker

****

Good Judgment Comes From Experience. Experience Comes From Bad Judgment.

****

Murphy's Law of Research: Enough research will tend to support your theory.

****

Don't interrupt me,
While I'm interrupting.
-- Winston Churchill.

****

Alcohol doesn't solve any problems,
but then again, neither does milk.

****

There are three kinds of lies: lies, damned lies, and statistics.
-- Benjamin Disraeli

****

I refuse to join any club that would have me as a member.
-- Groucho Marx

****

"Diplomacy" is letting them have it your way.

****

If you cannot convince them, confuse them.
-- Harry S. Truman

****

Common sense is the collection of prejudices acquired by age 18.
-- Albert Einstein

****

If you want a friend in Washington, get a dog.
-- Harry S. Truman

****

Politicians are not born; they are excreted.
-- Cicero

****

Science is a differential equation and Religion is a boundary condition
-- Alan Turing

****

Time is God's way of keeping things from happening all at once.
-- Texas road-sign

****

The death of one man is a tragedy. The death of millions is a statistic.
-- Joseph Stalin

****

Death solves all problems - no man, no problem.
-- Joseph Stalin

****

Gratitude is a sickness suffered by dogs.
-- Joseph Stalin

****

When we hang the capitalists, they will sell us the rope
-- Joseph Stalin

****

In wartime, truth is so precious, that it has to be protected by bodyguards of lies.
-- Joseph Stalin

****

If I can't be loved, I'll find a way to be admired.

****

The trouble with the world is that the stupid are cocksure and the intelligent are full of doubt.
-- Bertrand Russell

****

Let us swear while we may, for in heaven it will not be allowed
-- Mark Twain

****

In Paris they simply stared when I spoke to them in French; I never did succeed in making those idiots understand their own language
-- Mark Twain

****

Any Universe simple enough to be understood cannot produce a mind complex enough to understand it.
-- John Barrow

****

Moderation is a fatal thing. Nothing succeeds like excess.
-- Oscar Wilde

****

Society often forgives the criminal; it never forgives the dreamer.
-- Oscar Wilde

****

Women love us for our defects. If we have enough of them, they will forgive us everything, even our gigantic intellects.
-- Oscar Wilde

****

The well bred contradict other people. The wise contradict themselves.
-- Oscar Wilde

****

A dreamer is one who can only find his way by moonlight, and his punishment is that he sees the dawn before the rest of the world.
-- Oscar Wilde

****

And he goes through life, his mouth open, and his mind closed
-- Oscar Wilde

****

Ridicule is the tribute paid to the genius by the mediocrities.
-- Oscar Wilde

****

A little sincerity is a dangerous thing, and a great deal of it is absolutely fatal.
-- Oscar Wilde

****

Don't give a woman advice: one should never give a woman anything she can't wear in the evening.
-- Oscar Wilde

****

How can a woman be expected to be happy with a man who insists on treating her as if she were a perfectly normal human being.
-- Oscar Wilde

****

Ah, well, then I suppose I shall have to die beyond my means.
-- Oscar Wilde

****

He has no enemies, but is intensely disliked by his friends.
-- Oscar Wilde

****

Be yourself; everyone else is already taken.
-- Oscar Wilde

****

Some cause happiness wherever they go; others whenever they go.
-- Oscar Wilde

****

We are all in the gutter, but some of us are looking at the stars.
-- Oscar Wilde

****

Consistency is the last refuge of the unimaginative.
-- Oscar Wilde

****

The true sign of intelligence is not knowledge but imagination.
-- Albert Einstein

****

The important thing is not to stop questioning.
-- Albert Einstein

****

I don't know, I don't care, and it doesn't make any difference!
-- Albert Einstein

****

Any man who reads too much and uses his own brain too little falls into lazy habits of thinking.
-- Albert Einstein

****

No problem can be solved from the same level of consciousness that created it.
-- Albert Einstein

****

Only two things are infinite, the universe and human stupidity, and I'm not sure about the former.
-- Albert Einstein

****

If the facts don't fit the theory, change the facts.
-- Albert Einstein

****

A perfection of means, and confusion of aims, seems to be our main problem.
-- Albert Einstein

****

After a certain high level of technical skill is achieved, science and art tend to coalesce in esthetics, plasticity, and form. The greatest scientists are always artists as well.
-- Albert Einstein

****

The only thing that interferes with my learning is my education.
-- Albert Einstein

****

If my theory of relativity is proven successful, Germany will claim me as a German, and France will declare that I am a citizen of the world. Should my theory prove untrue, France will say that I am a German, and Germany will declare that I am a Jew.
-- Albert Einstein

****

Never wear your best trousers when you go out to fight for freedom and truth.
-- Henrik Ibsen

****

The illegal we do immediately. The unconstitutional takes a little longer.
-- Henry Kissinger

****

The absence of alternatives clears the mind marvelously.
-- Henry Kissinger

****

In crises the most daring course is often safest.
-- Henry Kissinger

****

The conventional army loses if it does not win. The guerrilla wins if he does not lose.
-- Henry Kissinger

****

The ink of the scholar is more sacred than the blood of the martyr.
-- Muhammad

****

Half the lies they tell about me aren't true.
-- Yogi Berra

****

The various modes of worship, which prevailed in the Roman world, were all considered by the people as equally true; by the philosopher, as equally false; and by the magistrate, as equally useful.
-- EDWARD GIBBON
(The Decline and Fall of the Roman Empire)

****

The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.
-- E. W. DIJKSTRA

****

In times of tumult and discord bad men have the most power; mental and moral excellence require peace and quietness.
-- TACITUS

****

Most people are willing to pay more to be amused than to be educated.

****

It is a miracle that curiosity survives formal education.
-- ALBERT EINSTEIN

****

Creative minds have always been known to survive any kind of bad training.
-- ANNA FREUD

****

Everything that is really great and inspiring is created by the individual who can labour in freedom.
-- ALBERT EINSTEIN

****

The believer is happy; the doubter is wise.
-- Hungarian proverb

****

The more things a man is ashamed of, the more respectable he is.
-- GEORGE BERNARD SHAW

****

No object is so beautiful that, under certain conditions, it will not look ugly.
-- Oscar Wilde

****

The Heart was made to be broken.
-- Oscar Wilde

****

Take care to get what you like or you will be forced to like what you get.
-- GEORGE BERNARD SHAW

****

Truth comes out of error more easily than out of confusion
-- F Bacon

****

To achieve the impossible, one must think the absurd; to look where
everyone else has looked, but to see what no one else has seen.

****

Anything good in life is either illegal, immoral, or fattening.
-- Pardo

****

Research is what I'm doing when I don't know what I'm doing.
-- von Braun

****

It is easier to get forgiveness than permission.
-- Stewart's Law of Retroaction

****

When in doubt, predict that the present trend will continue.
-- Merkin's Maxim

****

If you want to make enemies, try to change something.
-- Woodrow Wilson

****

As the honourable senator from New York knows, I am in favour of the
constitution, institution, and pros ... perity.

****

In American politics you ask the rich for money and the poor for votes and
expect them both to believe you.
-- A. Lincoln

****

When you don't know where you're going any road will take you there.
-- Alice

****

The most beautiful things are those that madness inspires and reason
writes.
-- Andre Gide

****

If you want to make an apple pie from scratch, you must first create the universe.
-- Dr. Carl Sagan

****

Believe those who are seeking the truth. Doubt those who find it.
-- Andre Gide

****

Judge a man by his questions rather than by his answers.
-- Voltaire

****

Not everyone who wanders is lost.
-- J.R.R Tolkien

****

Intelligence is not to make no mistakes, but quickly to see how to make them good.
-- Bertolt Brecht

****

A mask tells us more than a face.
-- Oscar Wilde

****

The old believe everything; the middle aged suspect everything: the young know everything.
-- Oscar Wilde

****

If we begin with certainties, we shall end in doubts; but if we begin with doubts, and are patient in them, we shall end in certainties.
-- Francis Bacon

****

Iron rusts from disuse; water loses its purity from stagnation ... even so does inaction sap the vigour of the mind.
-- Leonardo da Vinci

****

Progress is man's ability to complicate simplicity.
-- Thor Heyerdahl. Norwegian ethnologist, 1914-2002

****

Life's disappointments are harder to take when you don't know any swear words.
-- Calvin & Hobbes. Fictional characters from comic series

****

The thing that is really hard, and really amazing, is giving up on being perfect and beginning the work of becoming yourself.
-- Anna Quindlen. American bestselling Author and Journalist, b.1953

****

Art is the most intense mode of individualism that the world has known.
-- Oscar Wilde

****

If you are going to achieve excellence in big things, you develop the habit in little matters. Excellence is not an exception, it is a prevailing attitude.
-- Colin Powell - American Military leader and Statesman. Chairman of the US Joint Chiefs of Staff (1989-93). US Secretary of State (2001-2004). b.1937

****

You know you've read a good book when you turn the last page and feel a little as if you have lost a friend.
- Paul Sweeney.

****

Risk! Risk anything! Care no more for the opinions of others, for those voices. Do the hardest thing on earth for you. Act for yourself. Face the truth.
- Katherine Mansfield. New Zealander Writer, 1888-1923

****

If we're growing, we're always going to be out of our comfort zone.
- John Maxwell. American Author and motivational speaker

****

One of the most important lessons that experience teaches is that, on the whole, success depends more upon character than upon either intellect or fortune
- William Edward Hartpole Lecky. Irish Historian and Essayist. 1838-1903

****

To sin is a human business, to justify sins is a devilish business.
- Leo Nikolaevich Tolstoy. Russian moral Thinker, Novelist and Philosopher, notable for his influence on Russian literature and politics. 1828-1910

****

I like men who have a future and women who have a past
- Oscar Wilde.

****

All that we are is the result of what we have thought. The mind is everything. What we think we become.
- Buddha.

****

He who does not understand your silence will probably not understand your words.
- Elbert Hubbard. American editor, publisher and writer, 1856-1915

****

Do not worry about your difficulties in mathematics, I assure you that mine are greater.
-Einstein

****

Computers are useless. They can only give you answers.
-Pablo Picasso

****

Haste is of the devil. Slowness is of God.
-H L Mencken

****

Some problems are so complex that you have to be highly intelligent and well informed just to be undecided about them.
-Laurence J. Peter

****

There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
-C.A.R. Hoare

****

I have always found that plans are useless, but planning is indispensable.
-Dwight Eisenhower

****

It's not at all important to get it right the first time. It's vitally important to get it right the last time.
-Andrew Hunt and David Thomas

****

Don't worry about the world coming to an end today. It's already tomorrow in Australia.
-Charles Schulz

****

Many journalists have fallen for the conspiracy theory of government. I do assure you that they would produce more accurate work if they adhered to the cock-up theory.
-Sir Bernard Ingham

****

A painting in a museum hears more ridiculous opinions than anything else in the world.
-Edmond de Goncourt

****

Every time I paint a portrait, I lose a friend
-John Singer Sargent

****

"Therefore" is a word the poet must not know.
-Andre Gide

****

One does not discover new lands without consenting to lose sight of the shore for a very long time.
-Andre Gide

****

Nothing is so silly as the expression of a man who is being complimented.
-Andre Gide

****

Nothing prevents happiness like the memory of happiness.
-Andre Gide

****

It is unthinkable for a Frenchman to arrive at middle age without having syphilis and the Cross of the Legion of Honor.
-Andre Gide

****

It is better to be hated for what you are than to be loved for something you are not.
-Andre Gide

****

A samurai once asked Zen Master Hakuin where he would go after he died. Hakuin answered 'How am I supposed to know?'
'How do you not know? You're a Zen master!' exclaimed the samurai. 'Yes, but not a dead one,' Hakuin answered. ( - Zen)
****

Do not seek the truth, only cease to cherish your opinions. ( - Zen)
****

If you understand, things are just as they are; if you do not understand, things are just as they are. ( - Zen)
****

It takes a wise man to learn from his mistakes, but an even wiser man to learn from others. ( - Zen)
****

The ten thousand questions are one question. If you cut through the one question, then the ten thousand questions disappear. ( - Zen)
****

The tighter you squeeze the less you have. ( - Zen)
****

Though the bamboo forest is dense, water flows through it freely. ( - Zen)
****

To do a certain kind of thing, you have to be a certain kind of person. ( - Zen)
****

To follow the path, look to the master, follow the master, walk with the master, see through the master, become the master. ( - Zen)
****

When the pupil is ready to learn, a teacher will appear. ( - Zen)
****

When you reach the top, keep climbing. ( - Zen)
****

A weed is a plant whose virtues are only waiting to be discovered. ( - Zen)
****

Women may spend their whole lives looking for true love. If you wish for true love, learn to love yourself. ( - Zen)
****

You do not wait for fulfillment, but brace yourself for failure. ( - Zen)
****

Zen students must learn to waste time conscientiously. ( - Zen)
****

Truth is beautiful, without doubt; but so are lies - Emerson
****

"There is no security on this earth; there is only opportunity." (Douglas MacArthur)

****

"Economic history is a never-ending series of episodes based on falsehoods and lies, not truths. It represents the path to big money. The object is to recognize the trend whose premise is false, ride that trend, and step off before it is discredited." (George Soros)

****

A civilized society is one which tolerates eccentricity to the point of doubtful sanity.
Robert Frost

****

If you want peace, prepare for war.
Roman Aphorism

****

Thursday, October 21, 2010

The Man Who Loved China - Simon Winchester

I just finished reading a lively biography of Prof. Joseph Needham of Cambridge (the old one), titled "The Man Who Loved China" and written by Simon Winchester. Winchester describes the life and work of Prof. Needham in a manner that kept me hooked for the whole week, until I finished it. He is plainly blunt at times, but always colorful in describing the life and times of Prof. Needham.
I hadn't heard of the 'sinophile' professor before I read this book, but the man is most remarkable. The professor began as a researcher in biochemistry at Caius College, Cambridge. Prof. Needham was eclectically intellectual, charming, outspoken and eccentric. Having made his mark in top-line research in his area in his early years, he shifted track. Enticed and egged on by his Chinese mistress, he learnt the Chinese language and dabbled in calligraphy. He undertook a dangerous mission into China at the height of the Japanese invasion in the Second World War. He pursued a deep study of China and its history of scientific development, eventually creating systematic and detailed notes. He wrote and published 24 large volumes on the scientific and intellectual history of China. He smoked cigars, researched, ventured, lectured and lived a life of fruitful writing till the very end.

He lived a long life, but I still found the sheer volume of his research and intellectual output staggering. His roving eye for pretty women did not waver even at age 95. The author tells us that at age 94, when the professor's wife died, he married his Chinese mistress after a courtship of more than 50 years. When she too died, he proposed to three other women, all of whom politely declined.

I got glimpses of the life of the 'dons' at Cambridge University from the book. It sounds like a utopia for serious researchers; I wonder if the IITs are a poor imitation.

The sad part is that Prof. Needham was a socialist (but not a communist, according to the author; I tend to disagree). He came right out of the "reds' nest" that Cambridge seems to have become prior to the 1950s. I am dismayed that communism had so much appeal for the intellectual class in that period of history. It's not as if the murderous excesses of the Stalins and Maos were unknown to the world at large, but their heads were buried deep in the sand. Prof. Needham was found to be an unequivocal and fanatical supporter of almost everything that the Chinese government stood for. He was certainly charmed by, and in love with, Chinese history, science and language, but he mistook the voice of the Chinese communist government for the voice of the Chinese people.

There is also the fact that the professor could not really get to an analysis of why contemporary Chinese science lagged so far behind, even though Chinese science had historically been so far ahead of the Europeans. One notable cause was that the Chinese did not develop a competitive mercantile class that needed to innovate to stay in business. The biggest ambition of medieval Chinese youth was to join the corrupt and burgeoning bureaucracy and, in that manner, earn the security and continuance of the government establishment. It just struck me that this idea runs parallel to the same trend among Maharashtrian youth, and the lag of Marathis behind other communities is also evident.

Prof. Needham has published the most comprehensive and systematic treatment of Chinese scientific history. I don't think I would have the stamina to read all 24 of those volumes, but Winchester's bite-sized offering suits me fine. It's similar to what I thought on reading "Koba the Dread" by Martin Amis. I could not imagine myself reading all the volumes by Solzhenitsyn on life in the Gulags. Amis's work covered all of it in a far more colourful and concise manner. Compilations from research are needed, but commentaries on those same research subjects, if well written, are much more worth my time. A big thank you to the Amises and Winchesters.

Saturday, October 2, 2010

Project Euler problem involving optimization

This is about my attempt at a moderate-level programming problem from Project Euler, problem #15.
With my usual approach (and attitude), I thought I would be done with this one in 15 minutes, but it stretched me to six times that estimate.
The problem is to find the number of routes through an n*n grid, starting from the top-left and ending at the bottom-right (for n=20).

My sequence of attempts -
1/ The simplest recursive function, which traversed right and down and counted routes on reaching the destination. Too bad, it took unacceptably long for the n=20 case. In retrospect, I should have avoided coding this approach, knowing the size of the inputs.
2/ Making a recurrence relation, with the hope that I would not even need a program to calculate this. That is, if the problem of size N can be expressed in terms of problems of size < N, and if I could get a closed form, a simple calculation would have done it.
I got this -
For a grid of size (m*n),
T(m, n) = T(m-1, n) + T(m, n-1)    if m <> n
T(m, n) = 2*T(m-1, n)              if m == n

For the case when m==n, we have utilized the symmetry : T(m-1, n)=T(m, n-1).

Bringing this recurrence to a closed form is beyond my technical means today, so I modified the code to follow this recurrence and calculate. This too failed since the size n=20 was overwhelming my 64-bit dual core server even then.

3/ After almost giving up following some frustrating debugging and an old-fashioned pen-and-paper workout of the recursion, I decided to create a cache of the T(m, n) values that were being calculated. Certainly, many values were being calculated repeatedly in various "branches" of the recursion. In this case, the cache needed was no bigger than n*n=400. So now the code calculated a T(m, n) only the first time it was needed and re-used the stored value afterwards.
Now the code was terminating in a flash even for large input sizes, but throwing wrong answers.
4/ The last mistake was about the return data type of the function, something that needs to be kept in mind for almost all Euler problems. It was a silly 'int'. C++ did not warn of truncation, never mind what the compiler option was.
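
Steps 3 and 4 together can be sketched roughly as below. This is not my original code, only a minimal reconstruction under some assumptions: the names countRoutes and routes are made up for illustration, the base case T(m, 0) = T(0, n) = 1 is assumed (it is not spelled out in the recurrence above), and the cache is sized (m+1)*(n+1) so that it can be indexed directly by m and n.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: computes T(m, n) from the recurrence, storing each
// value in the cache the first time it is calculated.
// A stored 0 means "not calculated yet"; every real route count is >= 1.
static std::uint64_t routes(int m, int n,
                            std::vector<std::vector<std::uint64_t>>& cache) {
    if (m == 0 || n == 0) return 1;            // assumed base case: one straight route
    if (cache[m][n] != 0) return cache[m][n];  // re-use a value calculated earlier

    std::uint64_t result = (m == n)
        ? 2 * routes(m - 1, n, cache)                        // symmetry: T(m-1, n) == T(m, n-1)
        : routes(m - 1, n, cache) + routes(m, n - 1, cache);

    cache[m][n] = result;
    return result;
}

// Returns the number of routes through an m*n grid.
// The return type is 64-bit on purpose: the n=20 answer does not fit in a 32-bit int.
std::uint64_t countRoutes(int m, int n) {
    assert(m >= 0 && n >= 0);
    std::vector<std::vector<std::uint64_t>> cache(
        m + 1, std::vector<std::uint64_t>(n + 1, 0));
    return routes(m, n, cache);
}
```

Calling countRoutes(2, 2) gives 6, and countRoutes(20, 20) gives 137846528820, which is well beyond the range of a 32-bit int and explains the wrong answers in step 3.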

So I have a few things to take home today, caching at appropriate places being the most important of them.