Reflections and Novelties: November 2010

Software Product Features, Qualities and Best Practices

This document is about the desirable features a software product should have and the practices to be followed during its conception, design and development. They are, in a way, guidelines at a most generic level, applicable to most software. Certainly, this document is incremental in nature.

For a diverse, dispersed team trying to develop software, it is important to follow certain common practices, so that the integration phase is simplified in the later stages. This document aims to just lay out broad principles and techniques for achieving that. I just gathered several software features, processes, tools and techniques that I have found impressive. I need to expand on each section in much more detail.

The various design and coding issues that are to be taken care of are listed below along with their desired solutions/techniques,

A. Platform –
Hardware as well as Software infrastructure should be chosen carefully to satisfy all functional or other (financial…) constraints. For now, we consider only PC desktops and open systems. The software itself should be developed to be compliant with widest possible range of infrastructure. The common way to achieve this is to use conditional compilation possible in C++ like language to allow different versions of platform specific code to run without changes on different platforms. To maintain portability for C++ like code for instance on 64 bit OS, it is imperative to use explicit type definitions for data types of different sizes. If any third party software is being used, such as RDBMS, it is preferred to use a wrapper library, such as SQLAPI. For XML parsing, it is certainly preferred to use Xerces. Follow common denominator among database standards like for an RDBMS, using ANSI SQL data-types and wrapping any table/column names by appropriate prefixes and namespaces to avoid name clashes and conflicting with keywords in a particular database.

B. Architecture (assuming PC-based software!)
It could be web-based, Desktop (n tier) or custom (both). The first two should be clear. The third requires explanation. As usual, we try here to separate the engine (business logic) from the UI part. The UI part is further separated into ‘Content’ versus the ‘Presentation’. The Content expresses the widgets (components) to be used and also their placement on the screen. The presentation takes care of converting the content in a format suitable for the medium. Thus, given a screen, the Content can be expressed in the form of an XMLdlg (XML document specifying the dialog look and feel). The web presentation layer is then a generic code to convert the XMLdlg to HTML, suitable for browser display. The desktop presentation is again generic, and converts XMLdlg into MFC/ Xmotif API calls.
Most ‘saleable and business’ software is of transactional nature. The business logic generally consists of certain objects going through phases of operation (like guests doing check-in, stay and check-out in a Hotel mgmt software). I propose to keep the business logic generic enough by making use of the Workflow infrastructure, so that the main engine could be reused for many other applications once done. If a client-server/distributed architecture is involved, it is better to observe restrictions like – Not having server call-backs on client for compliance with firewalls, SSL etc. Use distributed components to enhance scalability and fault tolerance (through replication), load-balancing (also through replication). Consider service oriented architecture for generic interfaces. Create clean, terse and generic interfaces, about which clients can query and find out. Importance of “loose coupling and high cohesion” cannot be understated. Consider deployment constraints like - blocking of ports, physical memory, utilization of any resources like multi-core processors, threading models, and be scalable.

C. Framework
A common framework binds the various application components together and provides a consistent set of rules/coding conventions that each part of the application can follow. A good framework should support – wrapping functions (for platform specific areas, threading/locking, middleware, file-systems and other applications like databases), support component creation & extension (like COM, CAA), and help integrate well with third party software (like SSL), provide utility functions (like math libraries), central and localized logging engine.

C. Configurability
This refers to the ability to quickly change certain aspects of the software according to the customers’ environment. The usage of XMLdlg as mentioned above, results in Configurability of the software to web-based or desktop-based without changing any code at all. Other practices for ensuring maximum Configurability are commonly known ones, like not hard-coding essential parameters of operation (Server names, database ports etc.). Using plain-text Settings files is ok, but preferred is XML.

D. Localization
The product should allow complete user interface (All visible text) to be in any local languages (Hindi, Marathi etc.). The common technique to achieve this is to set up text files (Nls files), which map ‘Keys’, used in code, versus the ‘specific language text’. Each language will have its own Nls file. The local language for the software should, in turn, be configurable! The software should not assume common conventions like e.g.- the decimal point. Certain languages have ‘comma’ instead of ‘dot’ as decimal point. The suitable data-structures are programming language dependent but should be Unicode capable.

E. Customizability
It adds more control to specific behaviour of the software. For e.g.- one banking customer could have two types of accounts (Saving, Current) and some other customer might have four types (Saving, Current, Salary and Fixed Deposit). There is no reason why the first customer should see all the four account types while it in the process to create new bank accounts. The software design should be such that no code changes should be required for such customization. Again, read customizable information from a text file, not hard-coding.

F. Licensing
It differs from Customizability or Configurability. It is the ability to have different versions of the software like Light Demo version, typical version and Full-featured version for different customers. Further granularity should be added, like enabling specific functions to certain customers by the facility of ‘environment flags’.

G. Code placement (directory hierarchy)
Related code should go into specific directories (frameworks). Code of the same functionality should go into a single directory inside a framework (module). Each module should compile into a .dll or .so (or a Java package jar!). The interfaces to be shared between frameworks should be collected in a directory at the framework level of the hierarchy. We may need to create special make-files. The build should allow maximum granularity and flexibility in packaging.

H. Testing
The Unit testing should include creation of automatic regression tests. Testing scripts or code should be gathered in its own test framework. For automatic test, typically a shell or batch script starts a separate Test program executable. The executable contains calls to the functional code. The test results in an output trace. The shell script then compares the output trace to a reference trace, known to be correct output. If there is a difference, the test is failed and regression is automatically detected. Such debug tracing should be enabled at the time of doing tests or debugging, not in production. If UI is to be tested, this facility will need support from within the UI wrappers/framework classes which will run in a special mode and not accept any User inputs while tests are being run. Only the recorded UI inputs can be re-dispatched by the test UI to the functionality.

I. Accessibility
It is recommended to maintain compatibility with an accessibility aiding software like JAWS, popularly used by visually impaired persons.

J. Scripting Engine
A scripting engine is vital for some applications, in which either admin operations are involved or if a large number of inputs are required for an operation. This is even more important when UI is not feasible for some operations. A Javascript compliant scripting engine can be used to expose script objects to user scripts, referring to the Mozilla projects’ Javascript engine.

K. Version Control

1/ http://en.wikipedia.org/wiki/Distributed_revision_control -
Main advantages - a/ Some project members can have privileges to decide on what changes to merge.
b/ Web of Trust - Changes from many repositories can be merged based on quality of changes.
c/ Network not required all the time - Check-ins can be into a private workspace/View.

Some open source software is available which will fulfill main requirements of distributed development (version control, release management) & security.

2/ The entire application source code can be split into workspaces, that can be logical partitioning of source code (organized by directories) that each developer can work on. Each workspace can be a set of directories, for which the developer can check-out source code. The source code is not available for all other directories, for which the individual developer has no check-out privileges. Therefore, the developer must get all executables/libraries for the protected source to be able to build and run in local workspace.

3/ Intellectual property/unauthorized usage will have to be additionally protected by licensing.
Licensing can be done in two ways -
a/ Simpler is to have environment flags, which are read and verified by the application to detect authorization.
b/ Some licensing utility can be built, that generates a license for a developer machine. The source code protected part of the software will check on the license before allowing an application startup.
This license checking component of the application (like other protected source code) will only be available to all developers as libraries/executable.

4/ Options -
a/ CVS / Open CVS - Open source, Client-Server - http://www.nongnu.org/cvs/
It has emphasis on security and source code correctness, web interface available as well desktop.
b/ Subversion - Open Source, Client-Server - http://subversion.tigris.org/
Offers directory level security, It's not strictly distributed, can use Apache/HTTP as web server & desktop.
c/ Git - Open source, distributed CVS - http://git-scm.com/
It offers desktop & web hosting.

L. Release Management
Also see section on “Version Control”.

M. Report Generation
TODO - Can we have configurable report generation ?

N. Documentation
This needs to be done for 5 areas/users - Interfaces, Code, Users, Administrators and Implementers. Create Release Notes for a release and a Pre-Release Bulletin. Follow the coding standards needed by ‘Doxygen’ or some other common tool.

O. Logging
Logs need to be generated for the Administrators and Users principally and support enabling of special DEBUG logs for bug fixing with developers.

P. Administration console or UI
For use by the administrators, offering authenticated and full view of the system. Should be able to configure it at runtime as well as install and uninstall just based on this – from the cradle to the grave!

Q. Security
Integrate with other authentication mechanisms like PAM, LDAP, SSO and NTLM.

R. Programming Language
It should be garbage collected, efficient and clean, which focuses on minimizing the pain for the developer, and makes it easy to code. It should allow the developer to focus on the algorithm than make a mess with low level management and intricate syntax. There may be multiple languages used as per suitability and the functionality that is being offered by that component.

S. Build Management
Daily builds. Multi-level workspaces. Limited workspace view for each project and for each developer. Views and check-outs to workspaces authenticated on the developer roles. Unit and Integration testing at each level in the workspace hierarchy. Automated tests to run in highest level daily.

T. Quality Metrics
Metrics based on proportion of failing automation tests (memory leaks, memory overwrites and null pointer checks if applicable). Test quality based on Code coverage. Performance tests to detect performance regression in addition to functionality.

U. Aspiration stuff
Some very advanced features … User Exits and Public Interfaces! For customers, who are programmers themselves and who want to build upon the existing product or make acute customizations (and other reasons), User Exits and Public Interfaces should be provided. User Exits are calls that the existing software makes to code written by customers. It allows extreme levels of customizations. Public interfaces are interfaces placed at the Framework level. These interfaces are implemented by the vendors and provided to the customers (along with usual libraries). The customers can compile their code, which makes calls to these interfaces.
There is a need for a special tool to record automatic regression tests, which involve the UI. The tool should be able to detect if the software is being run in a Test mode (as opposed to an interactive mode). On such occasions, it should be able to create a trace dump of the UI for test comparisons as stated earlier.
Another possibility is interfacing with Scanning or OCR hardware, which allows direct data entry?

That is all for now. The key thread running through the whole document is the indisputable need to invest enough into design and architecture so that the involvement in improving / customizing / bug-fixing the software is kept to a minimum. Some of the techniques mentioned above are well known, while others are a product of experience and a bit of imagination. These guidelines will drive further software development.

Hrishikesh Kulkarni.
20th Jan 2007

Last Updated – 8th Dec 2010
Updated – 26th Nov 2010
Updated – 11th Apr 2009
Updated – 4th Feb 2009
Updated – 21st Jan 2007

Friday, November 26, 2010

Software Product Features