Saturday, September 19, 2020

India's China war

It's 1914, and the British are making a royal mess, one that has continued to trouble generations in India, Tibet and China!

Henry McMahon had just drawn a line stretching from near Bhutan to the forests near Burma, which was supposed to mark the border between India and China. This McMahon line was to be ratified in a tripartite agreement between British India, China and the ruling classes of Tibet. The underhandedness of the British lay in the fact that China, which never agreed to the demarcation at the time, was left out; the agreement was reached with just the Tibetan rulers and thereafter propagated as the border. Tibet was a signatory but never followed the line in practice, as it involved a partition of the Tawang tract - a wedge-shaped land that was a passage from Assam to Tibet, never really considered part of India but very much a part of Tibet. The British wanted a buffer zone between India and China and therefore chose to put large parts of the Tawang tract inside the Indian borders. For similar reasons, the British wanted a buffer area further north in Ladakh, to avoid sharing a direct border with Russia, thereby sowing the seeds of an Indian claim to Aksai Chin. Earlier, in 1899, the British had concluded agreements with Russia on the fate of Aksai Chin without involving China, and simply included it inside Indian borders. China considered the whole of Tibet, including Tawang and Aksai Chin, as its own province and had sent conquering armies to Tibet in 1909. This unfortunate backdrop forms the historical perspective in Neville Maxwell's book "India's China War".

Neville Maxwell worked as a journalist for 'The Times' in New Delhi in the years preceding the 1962 war. Most of the events in the book describe the details of what happened in India - the statements from the Indian government, opposition leaders and the people. On the other hand, very little has been written about how the Chinese government came to decide and act in the way it did. Only the official Chinese government statements tell us anything about the thinking on the Chinese side, and only in cases such as the Chinese premier Chou Enlai's visit to New Delhi in 1959 to discuss the border issue do we get more insight into how the Chinese side worked. Overall, the book is very India-centric, and because the Indian leadership happened to show maximum possible incompetence and naivete, the Indian side of the dispute is seen in poor light, while Chinese aggression and war crimes are covered only in fleeting mentions. It is indeed quite damning of the Indian political as well as military leadership just prior to and during the 1962 border debacle with China. The title of the book itself suggests that it was plainly the Indian side that indulged in a fantastic overestimation of its own military capabilities, failed even to negotiate with the Chinese and then triggered a war over long-unresolved border issues with a dim-witted "forward policy".

After independence from the British, the British mission office in Lhasa was simply taken over by an Indian commissioner; the only visible change was in the flag that flew over it. The government of India simply chose to maintain the ambiguity around the northern boundaries with China, presumably considering that such boundary negotiations are best done from a position of strength, which India clearly was not in at the time. China, too, played very passively and never formally raised the question of fixing the border right through the years from 1949 to 1959. On official Indian maps, the Indian border was a rough demarcation indicating an indeterminate situation, but it still included Aksai Chin. It also included a highway that China had built from Xinjiang to Tibet, passing through Aksai Chin, a road that China considered a vital connection to Tibet. The first prime minister of independent India, Jawaharlal Nehru, indicated to parliament that India intended to hold and defend all the land south of the McMahon line as its own, but his claim also included Aksai Chin, which the Chinese held. Nehru seemed to be in two minds even on the question of whether the border issue should be discussed and negotiated with China, and got nowhere near a mutually acceptable settlement. So incredible was the level of indecision that Chou Enlai spent a week in New Delhi meeting the higher-ups in the Indian government trying to negotiate a border agreement, and all he got was pleasantries, promises and zero commitments. Right when Chou Enlai boarded a plane back to China, the Indian side went back to making claims on land under Chinese control, something the Chinese never agreed to, even though at that point they did accept that the land south of the McMahon line was India's.

The height of the tomfoolery was India's forward policy: have the army slowly and steadily establish posts and patrols in areas so as to expand the Indian borders. The top politicians - Nehru and the defence minister V. K. Krishna Menon - had total disregard for the objections of Gen. Thimayya, who stated that if the Chinese retaliated, the Indian Army was in no position to hold the forward areas, given the lack of equipment and personnel and the over-stretched logistics in that remote, hilly and forested land. The general resigned and was replaced by a political lackey, alongside another curious character, B. M. Kaul, a favourite of Nehru who got a fast promotion to Chief of the General Staff.

The actual trigger for the war was the attempt to cross a bridge near Thag La. Thag La itself is a remote, high-altitude, thickly forested area, reachable only by a minimum two-day march on foot from the Indian side, but reachable in a few hours from the Chinese side courtesy of a proper road inside the Chinese border. The Indian defence minister with the bright ideas kept pushing for a forward move on Thag La, brushing aside the advice of the commanders on the ground, such as Brig. Dalvi, whose book "Himalayan Blunder" also traces this period in accurate detail. At Thag La, the Indian army was neither as well equipped as the Chinese nor adequately staffed. CGS Kaul availed himself of the luxury of a horse ride instead of a two-day trek to Thag La and then immediately returned to Tezpur citing altitude sickness. As the misadventure at Thag La became evident and the Chinese finally retaliated with overwhelming force, the CGS was quick to fly away to Delhi citing more serious health problems, though he had not been formally relieved. The Indian forward posts were quickly and easily wiped out, and in the next few weeks the Chinese army took over Tawang, Walong and even the 13000-feet-high Se La pass, where the Indian army had hoped to hold fort. Panic set in to such an extent that even Tezpur, in today's Assam, was at the point of desertion and capitulation. The situation in the western sector (Ladakh) was better than in NEFA, where soldiers from the fallen Indian posts were scattered and frantically retreating homeward, but it was still a loss. The Indian army faced a comprehensive rout with 2000 to 3000 dead and missing, and the Chinese seemed to have achieved all of their objectives by November 1962. Nehru was still dithering about reaching out to the USA and Russia for military help because he could not get off the high horse of non-alignment. When he did reach out to President Kennedy, it was too late; China announced a unilateral ceasefire shortly afterwards.

The Chinese ceasefire announcement was unexpected but the real rub was that they also announced their intention to withdraw completely from conquered lands and go back to positions that were held in 1959! That was exactly the Chinese view of the Indo-China border on the negotiation table at that time. It implied that the only reason for China to start the war was to teach a lesson and establish once and for all the right Indo-China border. Any further negotiations with India would be from a position of strength for the Chinese.

Were proper lessons learnt in India from the disaster? The political leadership that had been debating how to beat back the enemy in the blink of an eye went into shock. Nehru obtained a resignation from Krishna Menon, though both were reluctant about it. Nehru himself was on the wane from then until his death in 1964. There was a shuffle in the army leadership, and the succeeding prime minister, Lal Bahadur Shastri, made it a point to listen better to the army generals in the 1965 war with Pakistan. But the detailed findings of the investigations into the 1962 loss remain unimplemented, and misunderstood, even today, as the Indian polity prefers to bury its dirty linen instead of washing it and letting it dry in the open air. You won't find a chapter in the regular history books about 1962; you will read about it only in books like Neville Maxwell's.

I liked this book, but it tells a dark tale and carries a stark warning for people who are unprepared, who prefer to remain out of touch with reality and yet wear their pride on their sleeves.

Monday, May 4, 2020

Efficiency pitfalls in Haskell

This is my third note on programming in Haskell with performance-centric commentary. The first two are here and here. In this post, I hope to consolidate in one place several best practices and gotchas regarding runtime performance in Haskell, at an elementary level.

When I first learned Haskell, I would tend to solve any problem involving speed by falling back on the same old imperative tricks and techniques, with just a 'for loop' replaced by a map, aggregating actions replaced by a fold and, of course, explicit recursion. If some data had to be mutated, I would tend to introduce additional arguments to the function involved to bypass the need to mutate data.

Example,

raisedToLinear :: Int -> Int -> Int
raisedToLinear _ 0 = 1
raisedToLinear x 1 = x
raisedToLinear 1 _ = 1
raisedToLinear x y = x * raisedToLinear x (y - 1)

-- Linear time w.r.t exponent. To make it efficient...
-- Divide and Conquer makes this logarithmic time
raisedTo :: Int -> Int -> Int
raisedTo _ 0 = 1
raisedTo x 1 = x
raisedTo 1 _ = 1
raisedTo x y
| y `mod` 2 == 0 = sq * sq
| otherwise = x * sq * sq
where sq = raisedTo x (y `div` 2)


Here is another example where carrying forward the problem-solving thinking from imperative languages can still help obtain an efficient solution in Haskell. It is an O(N) time solution for the maximum subarray problem, the one that every job applicant to the tech companies has learned to write in their sleep,



acf :: Int -> (Int, Int) -> (Int, Int)
acf elem (rs, best)
  | rs + elem <= 0 = (0, best)
  | otherwise      = (rs + elem, max (rs + elem) best)

maxsub :: [Int] -> Int
maxsub a = snd (foldr acf (0, 0) a)



It is the same logic expressed in Haskell, as is well known, except that the mutable variable that sums up incoming list elements is now handled as an accumulator while 'reducing' in a foldr. It's linear time all right, but what about space? Refer here for the implications of using foldr on memory requirements in a lazy setting.
Also observe how passing an 'accumulator' argument helps the function acf improve efficiency. It uses the common technique of tupling the accumulator to carry both the best result so far and the ongoing running sum.
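For comparison, here is a minimal sketch (not from the original post) of the same running-sum/best-so-far idea written with a strict left fold, Data.List.foldl'. The bang patterns are an assumption of GHC's BangPatterns extension; they keep both fields of the accumulator evaluated, so no chain of thunks builds up inside the pair.

{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

maxsub' :: [Int] -> Int
maxsub' = snd . foldl' step (0, 0)
  where
    -- bangs force both fields at each step, keeping the fold in constant space
    step (!rs, !best) e
      | rs + e <= 0 = (0, best)
      | otherwise   = (rs + e, max (rs + e) best)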

Well, it makes for a rough-and-ready solution, not involving too much knowledge of Haskell and making do with generic problem-solving skills as far as possible. It would still work well as long as one is properly cognizant of the mistakes usually made by beginners involving lazy evaluation and space leaks.

However, the techniques described by good books on Haskell approach problems differently. They tend to go strong on first defining the types involved, writing composable functions that each do one small thing and then composing them intelligently. It can make the solution difficult to read and understand for someone fed on imperative algorithms and leaves us wondering why the paraphernalia is needed.

Under strict evaluation, even in pure languages, one is forced to use explicit recursion to define functions for reasons of efficiency. Strictness does not allow composing program logic out of simple library functions like folds, maps and filters without losing that efficiency.
Example,

-- With eager evaluation, code to select the first element of a list that passes a filter would be -
first :: (b -> Bool) -> (a -> b) -> [a] -> b
first p f xs
  | null xs   = error "Empty List"
  | p x       = x
  | otherwise = first p f (tail xs)
  where x = f (head xs)

-- However, under lazy evaluation, happiness looks like ...
first :: (b -> Bool) -> (a -> b) -> [a] -> b
first p f = head . filter p . map f
-- Under eager evaluation, head . filter p . map f would make us
-- evaluate the entire list instead of stopping as soon as
-- the filter has produced the single item that head really needs.

Now, an example where seemingly innocuous code can deliver a painful bite when you least expect it,

sum = foldl (+) 0

-- If sum is defined as above,
sum [1..10000]
-- builds a chain of 10000 unevaluated (+) thunks before it actually adds anything at all


If you are coming from a Scala background, foldl is known to be much more efficient than foldr: it eagerly applies the passed function to the list elements starting at the beginning, so it does not need to consume extra space. Tail-recursively implemented functions were supposed to be memory efficient in a guaranteed manner; foldr was supposed to be the problem, since it requires evaluating the list from the rightmost/last element and thus consumes stack space.
In Haskell, even foldl doesn't cure all ills.
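A minimal sketch of the usual remedy (assuming GHC's standard libraries): the strict left fold foldl' from Data.List forces the accumulator at every step, so the sum above runs in constant space.

import Data.List (foldl')

-- The running total is forced at each step, so no chain of thunks builds up.
sum' :: [Int] -> Int
sum' = foldl' (+) 0

-- sum' [1..10000] adds as it goes instead of piling up 10000 pending additions.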

Space Leak ?
Confusing the term 'space leak' with the usual 'memory leak' leaves most C++ programmers dumbfounded: why would automatically garbage-collected Haskell programs have a leak problem at all? Actually, a space leak refers to the fact that badly written Haskell code can end up consuming far more memory than it should need. Lazy evaluation can blow up memory requirements and can make what should be simple 'bread and butter' code really tricky to get right. What is worse, the problem doesn't show up until you pass a really large input dataset and track the memory consumption closely. Runtime failures such as stack overflows can easily be triggered.
Here is how it can hurt,



-- We want to compute the simple average of a list of numbers ...
average :: [Double] -> Double
average [] = 0
average xs = sum xs / fromIntegral (length xs)
-- Has a rookie mistake, consumes too much memory when it doesn't need to
-- and has unnecessary list traversals.


Of course, the summing and the length calculation traverse the list twice, so we try to fix it using a single fold that computes both together. As explained here, we use an eager version of the left fold (foldl'), defined as follows,


foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f acc [] = acc
foldl' f acc (x:xs) = x `seq` acc `seq` foldl' f (f acc x) xs

sumlen :: [Int] -> (Int, Int)
sumlen = foldl' g (0, 0)
  where g (s, n) x = s `seq` n `seq` (s + x, n + 1)

-- a fresh definition of average ...
mean :: [Int] -> Double
mean [] = 0
mean xs = fromIntegral s / fromIntegral n
  where (s, n) = sumlen xs



The above piece of code to calculate just an average contains a good number of concepts to grok. First, there is the specially made eager version of foldl, which is memory efficient and uses the primitive function 'seq'. seq evaluates its left argument (to weak head normal form) and returns the right one as it is.
foldl' is then used by another function, sumlen, so that the sum of the input numbers and the length of the list can be calculated in one pass and in a constant amount of memory.
Just knowing about the happy lever named 'seq' that turns on eager evaluation is not enough! seq needs to be applied thoughtfully while implementing sumlen to take care that both 's' and 'n' are eagerly evaluated.
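To see why those inner seq calls matter, here is a hedged sketch (not from the book) of what happens when they are left out. foldl' only forces the accumulator pair to weak head normal form; the fields inside the pair stay lazy, so the running sums still pile up as thunks.

-- Leaky variant: the pair constructor is forced, but its fields are not,
-- so (s + x) and (n + 1) accumulate as unevaluated thunks on a large input.
sumlenLeaky :: [Int] -> (Int, Int)
sumlenLeaky = foldl' g (0, 0)
  where g (s, n) x = (s + x, n + 1)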

Purity does benefit Haskell code too, sometimes. A value that does not depend on any function arguments, such as a top-level list, is computed at most once and then shared across every use.
In the following example, the needed prefix of the list of prime numbers is calculated during the first invocation, and running the same code again simply reuses it,



sumOfPrimes :: Int -> Int
sumOfPrimes n = sum (take n primes)

primes :: [Int]
primes = [x | x <- [2..], divisors x == [x]]

divisors :: Int -> [Int]
divisors x = [d | d <- [2..x], x `mod` d == 0]

-- the following blazes through...
-- map (\x -> x + sumOfPrimes 1000) [1..1000]

-- but the following code is unable to re-use the calculation of primes and crawls interminably,
sumOfPrimes2 :: Int -> Int
sumOfPrimes2 n = sum (take n primes2)
  where primes2     = [x | x <- [2..], divisors2 x == [x]]
        divisors2 x = [d | d <- [2..x], x `mod` d == 0]

-- map (\x -> x + sumOfPrimes2 1000) [1..1000]



Finally, read and decide which of the following two versions of a function is more efficient.
The function calculates all possible subsequences of a list, so for any list of length N it has exponential complexity. But still, one of these functions does the same job better -



subseq1 :: [a] -> [[a]]
subseq1 [] = [[]]
subseq1 (x:xs) = subseq1 xs ++ map (x:) (subseq1 xs)

subseq2 :: [a] -> [[a]]
subseq2 [] = [[]]
subseq2 (x:xs) = sxss ++ map (x:) sxss
  where sxss = subseq2 xs

subseq2 is more efficient because it avoids recomputing subseq2 xs twice at every level of the recursion, by sharing the common result sxss.
I learnt most of what I know about efficiency in Haskell programming from this excellent resource - Thinking Functionally with Haskell by Richard Bird. The writing is comprehensive in coverage, precise and systematic. I encourage you all to go ahead and read it.

Sunday, April 26, 2020

How a simplistic pattern retains hold

It looks like a virus is going to help retain a commonly believed indicator of a recession.
As per this report of exactly a year ago, the US bond yield curve had inverted. Basically that meant that investors preferred to invest money with a 10 year maturity (a longer term bond) instead of a shorter term bond (say, a 3 month one). Investors would do that when they find that in the short term the uncertainties and risks are too high. So longer term investment in an instrument considered universally safe, such as US treasury, is better.
In each of the last 11 instances of yield curve inversion, a recession has followed within a few months to a year. The coronavirus-battered world is indeed staring at a recession now, exactly a year after the yield curve inversion. Several years later, will we say that this recession was indeed the 12th success in 12 for the yield curve recession indicator? We know that correlation does not imply causation, but in this truly black swan case of the virus, even counting the correlation as the 12th success seems unwarranted. There is no common cause for the inverted yield curve and the COVID-19 pandemic either.

Saturday, April 25, 2020

Folds unfolded


I wish that we Haskell programmers never fall into the ditch that Alan Perlis had imagined when he said that a functional programmer was someone who knew the value of everything and the cost of nothing. Haskell is a compiled language and can run at speeds approaching C++. I did not want to take up learning a new language merely for the promise of elegant and concise expression, but for comparable runtime performance too. The ability to write concurrent processing code with no more difficulty than a single-threaded program should be great for improving runtime performance. Over and above that, it should also help deliver software in less development time and with definitely fewer bugs. Haskell, as a pure functional programming language, promises all of that.

However, I would be far less obsessed with Haskell if I couldn't find a way to create software with superb runtime performance. Now, Haskell can show surprising runtime behaviour and can shoot you in one foot and blow up the other. For larger pieces of code, it can be tricky to predict runtime performance because of lazy evaluation and non-strict function arguments. I ran into such problems while trying to understand a very basic Haskell facility that helps 'reduce' a list or some complex structure into a single value - "fold" and its variants foldl and foldr. An excellent introduction to the folds, as well as comments on efficiency, is written here. This writeup adds to the explanation in that wiki. Lazy evaluation is known to consume a lot of memory while holding references to pending computations, so understanding how to use folds is critical.

To understand folds, it is necessary to also understand the two concepts of lazy evaluation and non-strict function arguments. Both of them are quite unusual for any programmer born and bred in C++ or Java.

Lazy evaluation - Haskell lazily evaluates expressions, implying that it will do no processing until and unless the programmer needs to print the result to the console, save it to the database or some such thing. Further, Haskell will evaluate only to the extent the printing requires and no more. So it is possible to declare and manipulate an infinitely sized list of objects in Haskell, as long as you require processing only a finite number of them -

sumOf10Primes = sum (take 10 primes)
  where primes     = [x | x <- [2..], divisors x == [x]]
        divisors x = [d | d <- [2..x], x `mod` d == 0]
    -- Dutifully prints 129

    last [1..]
    -- gone forever !
Non-strict functions - The lazy evaluation described previously is made possible by the non-strictness of some of the functions involved. The function 'take' did not touch its second argument any further the moment it was done returning 10 numbers. 'take' does not evaluate its second argument strictly, though it is strict about the first one. Basically, a function that is passed 'undefined' as one of its arguments and still, at least sometimes, completes successfully is non-strict in that argument.
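A couple of quick GHCi-style checks (a sketch, assuming only the standard Prelude) make that definition concrete:

    -- 'const' never looks at its second argument, so passing undefined is fine.
    const 1 undefined
    -- 1

    -- 'take 0' returns [] without inspecting the list at all.
    take 0 undefined
    -- []

    -- But 'take' is strict in its first argument:
    -- take undefined [1, 2, 3] blows up.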
Folds - Finally, coming to the pitfalls awaiting a novice programmer using foldl or the foldr variant. The following piece of code, taken from this wiki, exhausts available RAM
-
foldl (+) 0 [1..100000000]
  -- Sum to 100M, are you crazy ? At least on my Ubuntu with GHCi ver 7.10.3
  -- bloats memory, freezes GHCi for several seconds even after a ^C...

  foldr (+) 0 [1..100000000]
  -- foldr doesn't help either ! 
  -- A stack overflow exception is seen fairly quickly. 

Well, both foldl and foldr failed, but in apparently different ways. foldl tried to allocate memory for its so-called thunks - an unevaluated chain of function applications piled up in memory. foldr exhausted the stack while doing constructor replacement, not being tail recursive. A different implementation of foldl described in the wiki works well, in under 2 seconds even for 100M numbers
-
foldl' f z []     = z
foldl' f z (x:xs) = let z' = z `f` x in seq z' $ foldl' f z' xs

sum' = foldl' (+) 0
Here, foldl' worked by suppressing lazy evaluation (or, if you prefer, by forcing eager evaluation) through 'seq', a primitive function. Otherwise, foldl' is very much like foldl. What about foldr? It works from the opposite end of the list and maintains the initial order of the list in case the output of the fold is another list. Also, foldr can short-circuit evaluation under certain conditions, but foldl is condemned to process the entire list before returning anything, always. This means foldr may be used to process an infinite list but foldl never can be. There are some intricacies in how the 'seq' function works - it won't behave as expected if used incorrectly - but that is not the topic for now.
Here is an example where the short circuiting while using foldr makes a helpful difference.
foldr (\e a -> if mod e 10 == 0 then 0 else (mod e 10) * a) 1 [1..10000000]
  -- returns 0 instantly.
  -- Short circuiting based on non-strict evaluation of the second argument 
  -- of the lambda function. 
  -- It works only because of the way the lambda is implemented.

  foldr (\e a -> (mod e 10) * a) 1 [1..10^7]
  -- Even foldr does NOT work well here. 
  -- The lambda "uses" both its arguments.
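And as a quick illustration of the earlier claim that foldr can work on an infinite list while foldl cannot (a sketch, not taken from the wiki):

  -- The folding function conses lazily, so only the demanded prefix is ever built.
  take 5 (foldr (\x acc -> 2 * x : acc) [] [1..])
  -- [2,4,6,8,10]

  -- foldl (+) 0 [1..] can never return: it must reach the end of the list first.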
Recommendations - So we are left with three possible choices whenever we want to fold or reduce a list or a more complex structure.

Use foldl' when - performing a constant-memory operation on inputs where short circuiting is not going to help. For possibly infinite inputs, consider using foldr instead.

Use foldr when - almost always, if foldl' is not usable. If the input may be infinite, or short circuiting will lead to better performance, this is the only option.

Use foldl when - almost never! It may be needed if there is some possibility that parts of the input are undefined and their evaluation can be skipped thanks to the lazy behaviour of the function passed to foldl.

Consider this concocted example -
mult _ 0 = 0
  mult x y = x * y

  foldl mult 1 [2, 3, undefined, 5, 0]
  -- works and prints 0

  foldl' mult 1 [2, 3, undefined, 5, 0]
  -- blows up for working on undefined.
Haskell being lazy and garbage collected, and I being someone from a C++ background, I personally prefer functions that have predictable and low memory consumption. The strictly evaluated foldl' serves that purpose well - unless, of course, there is a good reason to use foldr or even the god-forsaken foldl.

Tuesday, November 5, 2019

State of the Indian Economy

This is about the state of the Indian economy, not so much the global one. October of 2019 has passed with a daily barrage of content talking about the bad health of the Indian economy. Presumably it means that the growth of the Indian economy is stalling - maybe not stalling completely, but definitely not growing as fast as they expect it should. So yes, the GDP numbers confirm the story. We know it by looking at the newspapers. Not sure if we can tell anecdotally, though: do we see scores of unemployed people on the streets, or posh restaurants and shops deserted, in a way that wasn't the same earlier? Not really. I keep seeing scores of unemployed people on the streets every day, and I saw them even when the economy was supposed to be going great guns. And as far as expensive shopping and entertainment are concerned, Indians seem to have a lot of money to burn, if you listened to the firecrackers this Diwali or during other festivals. The restaurants are occupied enough to make any reduction in food bills unlikely. So where is the slowdown evident? The newspapers and other media aren't lying, though! Now, after the elections, it's even in the government numbers. Just 5% it said for the last quarter. Way less than comfortable for us in India, when more than 5% is achieved by China, whose economy is more than 3 times the size of India's. Even Uncle Sam gets to 3% frequently. Why are we Indians going so slow?

I know many people who just blame this government. They have complaints about demonetization, GST and the fact that the government has cracked down on black money transactions. Blast them. It's remarkable that so many people believe that it's impossible to run the country without using unaccounted wealth. The complainypants even have a problem with Jan Dhan accounts, because they say the rural/illiterate folk don't deserve banking services and won't use them properly anyway. Who cares about them? Who cares about teaching them to operate banking services? Mind you, those having a grudge against Jan Dhan accounts are actually employers who are saying that they are better off handing their workers cash as salary instead of depositing money into employees' bank accounts. You know why, right? Yes, demonetization hurt those same people too. GST is making it slightly more difficult to hide tax payables. So here we go - those crooks are now out to complain about the slowdown in the economy. Yes, the Indian economic system is stunned into inaction; it has gotten a serious shock from a different kind of electricity, named honesty.

There is now an abysmally low amount of trust left in the Indian financial system. Nobody trusts the company books and nobody trusts the company auditors. Companies with AAA credit ratings are going bust in a few weeks, so nobody trusts the rating agencies. Nobody trusts an NBFC when it spells out its NPAs, thinking that there must be a lot more. Nobody trusts the banks with their NPA numbers either. The situation was much the same earlier, as far as I know - when I used to mention fundamental analysis of stocks, the wise folks used to say that *all* companies cook their books, so fundamental analysis is a fundamentally hopeless adventure. Repaying loans is just not a culture here. Now that the banks are getting serious about loan recovery whenever they lend, the rich dudes just don't want to invest anymore! Even with RERA, the real estate industry still resembles a crime scene. The lobbyists are out to dilute RERA clauses at the individual state level.
Now, where do we go from here ? Who is going to invest in this economy ? So far, the deep pocketed crooks used to game the system, taxation had so many holes that for the rich, stealing tax was a given. The rich knew that they could get away with sub-standard delivery of goods/infrastructure and mint money from the government contracts. Recovering money for a finished task from the government or any private entity is incredibly tough in India. The rich now balk at investing in India because the screws seem to be tightening for them, as far as taxation and loan repayment is concerned. The foreign investors are coming to their senses about the reality in India now too.

Why is Indian labour so unproductive? Facebook and other social media are a good enough explanation for employees in the IT industry, I know. But the simple work ethic is poor among all classes of workers here. The Chinese have 996 (9 to 9, 6 days a week - 12 hours of work), and they have built their own manufacturing industry through so much hard work, relentless sacrifice and ingenuity. Where do I see such a parallel in India? Indian manufacturing taking off is a pipe dream - Indian workers are simply not working as hard or as efficiently as is required to make manufacturing competitive. Land prices in India are sky high wherever you go. Wherever you go in India, you see crowds. Tiny villages with barely any electricity look like towns, towns look like cities with non-existent or dust roads, and cities look like a never-ending expanse of slums and limitless illegal construction.

Talk about the competitiveness of Indian industry now. Indian traditional goods involve manual labour and are works of art. Mass-produced goods are made so inefficiently that export competitiveness is poor. See how the local industry captains howled in terror when the government announced the possibility of joining RCEP? The Amul top management is terrified that exports from New Zealand will overpower the sales of dairy in India! Do we wonder how dairy products from New Zealand, manufactured at developed-country costs, then exported 10k miles and brought to India with added shipping costs and import duties, still manage to beat Amul dairy products on quality or even price? What the hell is going on? Amul says that farmers in New Zealand are more mechanized and run their work like an established industry, while the Indian farmers run it like a family cooperative at a smaller scale, so we can't compete. What a pity. When the government announced a deadline for gasoline vehicles and the introduction of electric vehicles, the auto industry captains were up in arms. They could not imagine getting to produce an electric scooter or a car even in five years' time. Bajaj and TVS keep pointing to the negligible infrastructure for charging EVs. But I don't see why they act so incapable. Tesla has been running the EV show for a decade now in the US. Why can't the local auto industry manage to manufacture EVs in five more years' time? To be fair, the government has backed off from setting a deadline in stone, and Bajaj seems to have announced an electric scooter named Chetak recently. Who knows how much they plan on producing, though. The Maruti boss is crying hoarse, wanting GST reductions, and going about shutting plants and laying off workers in panic mode. Maruti seems to have 40 thousand crores in cash reserves. The auto sales slowdown is barely 20% over the last year. Why can't Maruti manage to use its reserves to support its distributors and retailers in such times? Is a 20% drop in sales life threatening to a company as established as Maruti, one that has been running stellar profits for the last two decades? How are these companies being managed? The industry captains cheered the recent steep corporate tax rate cut to match global levels. I am afraid the industry has just been let off the hook in the short term. Now the companies can simply project an inflated profit margin for free - no incentive to gain efficiencies anymore?

The 5 trillion dollar economy target is unravelling fast, looking like a pipe dream now. I don't know what the government has in mind, but at least it is responsive to industry concerns and is coming up with as many tax cuts and minor adjustments as it can. But there are no major reforms - no land acquisition smoothening, no labour reforms, no judicial reforms that are truly epochal. It is still damn difficult to run even a proper legit business in India. Is this government capable of taking any so-called "courageous" decisions any more, even with the brute parliamentary majority that it enjoys? They seem to be in election mode continuously. Even state/municipal elections seem to impact national-level policy announcements, mostly resulting in delays in getting real reforms done. Is it really possible that a country in this twenty-first century, which is still on edge over a Mandir-Masjid construction, can ever get its priorities sorted enough to begin the monumental nation building that becoming a 5 trillion dollar economy is going to require? The Supreme Court of this land has spent years deliberating on the existence of a mythological figure, after all.

It is going to take time for us Indians to get out of the jugaad mentality and take an honest look at ourselves in the mirror. To stop taking short-cuts. To be diligent. We need a complete moral, ethical and cultural change.

Tuesday, October 1, 2019

Greta Thunberg - Crusader daring

Greta Thunberg is the name on everyone's lips, and her speech is all over the news. It's truly remarkable that a sixteen-year-old has so much awareness of industrial pollution and concern about the environment and our collective future. Very few sixteen-year-olds can match that indeed! I hope her parents or other adults are not pushing her into this cause, even though it is such a noble, even Nobel-worthy, cause.

But I feel her anger is misdirected. Listening to her, doesn't it lead you into thinking that "if only the world leaders paid heed, we would all be saved"? Is it just the obstinacy of presidents and prime ministers that is getting us into trouble? Clearly, getting those leaders to sign some document won't avert this deadly threat of environmental destruction. And to solve this problem, a government diktat is definitely not the right way to go.

Greta is making a fervent, emotional appeal, all right. But governments make a move based on logical, scientific and popular appeals, not emotional ones. Industries make a move based on scientific and economic data, not emotional outbursts. So Greta's thoughts should really be redirected to those who really need to listen - the people.

The unthinking people, really. Those jet-setting all over the world, the blind consumerists who generate enormous mounds of plastic waste, who can't stay indoors without air-conditioning, nor move outdoors without gas-guzzling SUVs. Those who order Californian apples and almonds into India and then travel all the way to California to eat idli-sambar in Indian restaurants! When are they planning to reform?

Friday, June 21, 2019

Elon Musk by Ashlee Vance

The extraordinary Elon Musk.


The world knows a lot more about the Zuckerbergs and the Jobses than about Elon Musk. That is a problem, and Ashlee Vance eliminates it with this gripping biography. You won't turn a page without exclaiming "That was crazy, Elon!".

Childhood

He had an exceptional lineage and a well-travelled, highly educated family that valued adventure and creativity. He did not have a supportive environment at home, and yet he excelled in math and science. Even as a child of 10, he had a strong sense of purpose and deemed the study of some school subjects, such as the Afrikaans language, pointless, partly because he did not see any future in staying in his home country of South Africa. But when he realized that not learning them well enough would keep him from moving to the next grade, he could turn things around and do very well in all such subjects. He was bullied at school quite badly, enough for him to end up at the doctor with blood all over his face. Then, very strangely, he holds up that bullying as having had a positive moulding effect on his personality, and says that his own kids, who will never face such cruelty, will miss out on that. Being a smart parent even as a super busy entrepreneur, he restricts his kids' cartoon watching, steers them away from games that amount to nothing more than pressing buttons to make cute sounds, and encourages problem solving. At that age, Musk would often go into his "zone" - lost in his thoughts and completely unaffected by anything else happening around him. Was it that most over-diagnosed Silicon Valley disease - autism? Not likely, and even if it was, it would be so incredibly high functioning that it should render everybody else as diseased.

Early startups

His early startups, including PayPal, were true successes of his innate adventurous spirit and academic skills. They fully leveraged his natural strengths. He caught on to the idea of a fintech company and a digital payments system before anyone else did, while everyone else was still building an 'unprofitable-cutesy-site.com'. This is also the time when Musk got habituated to a 100-hour work week. Musk was always adventurous and risk-taking, and such qualities truly came into full play from here on.

SpaceX

The work culture in Silicon Valley and other software industries is well known and well criticised. Impossible-sounding deadlines are set; employees are stressed and overworked. Never mind what the deadlines are, they are not met anyway, and there is a mad scramble to deliver something, somehow. It encourages short-term fixes to basic problems encountered during testing. Ugly remnants of past mistakes are left behind in the product, and software engineers even have a name for it - technical debt.
Now, can any company in the business of aerospace technology operate in that kind of work culture ? Mistakes in the end product are prohibitively costly. A single snag costing a human life means curtains drawn on the company itself.
Well, yes! Musk and his team operated in that high-octane environment for years together, and he could extract the most out of his employees not only in terms of effort but also in terms of innovative output. A Boeing or some other established aerospace player tends to indulge in much paperwork, with apparently slow and wasteful procedures, to make a simple device or change something that failed. SpaceX employees usually made things in-house with a tenth of the budget and time. Nevertheless, the magnitude of the requirement was such that SpaceX pushed Musk to the limit, nearly bankrupting him, down to his last 100 grand. Don't forget that he started with around 100M out of his early ventures.
There are several innovations that emerged out of SpaceX, and some of them sound quite fundamental and impactful. Not so long ago, none of the engineers and scientists in the business of making rockets reckoned that re-usable rockets were easy or even possible - the stress and thrust of a launch was thought to be too much for any material to withstand. Musk made re-usable rockets his key to achieving economies in rocket launching and thus making his company a profit out of it. Welding together large sheets of metal could be done in a much better way with friction stir welding, which yielded lighter artifacts.
Most remarkable is Musk's ability to get down to fundamental physics with someone while reasoning and arguing about what can be done and what can't be done. Some employees find him aggressive, brusque and demanding, so be it. He has a vision and he is shooting for it.

Electric cars

On developing electric cars, Musk continued with the Silicon Valley style of project management and work culture, with apparently disastrous consequences. The project had massive cost overruns and delays, and would not have remained a viable show had Musk not pumped his personal wealth into keeping it going. One must say, though, that if there was anyone who could build an electric car that captured the imagination of ordinary folks and inspired the rich to buy a sports electric car, it had to be Musk. Musk didn't want to just build a longer-range electric car; he also wanted the car to go from 0 to 60 in 4 seconds. Anyone else would never have begun something so ambitious. Musk is a visionary too, and was the first to realize that improvements in the lithium-ion battery were good enough to make a full-range electric car possible. Conventional car companies of the time, and even today, would be hard pressed to come up with innovative ideas of the calibre that Musk did, such as free charging stations for his electric cars, touch-screen controls and automatically extending door handles.

Overall ...

Musk has set for himself the loftiest goals, ones that most people would consider borderline insane. He wants to save humanity by accomplishing the staggeringly ambitious feat of interplanetary colonization, particularly of Mars. Note that while planning to save humanity, he didn't think of making efficient seawater desalination, or controlling global warming, or making vaccines, or building sanitation for all. He goes all out for the fantastic. And he keeps calculating the number of rocket trips it might take to send people to Mars. And he could explain all this to someone while finishing dinner in a cafe, with a dollop of dessert stuck on his chin.
But in aiming that high, he has already accomplished so much that was previously thought impossible. It might have taken more time and more money than he imagined but he has clearly done what he declared as a near term goal !


Does this world need more Zuckerbergs and Bezoses and Jobses? Yes.
But do we need more Elon Musks? Bloody sure, yes!

Sunday, March 4, 2018

Book Review: How Fund Managers are making you rich - by Pravin Palande


If you have been investing, or even thinking of investing, in mutual funds in India, you should be really curious about how the folks supposed to handle your hard-earned money work. Their background, habits, methods; where and what have they studied, or smoked?
Finally, here is a book, with a slightly presumptuous title, that clarifies most of those things.

Before moving to individual fund managers, Palande provides a good overview of the Indian capital markets. It starts with the period that most of us don't remember fondly - the dot-com bust of 2000. He quotes prominent fund managers of those times who saved their fund value by not falling for the hype and madness surrounding the IT stocks: the steadfast fund managers who refused to invest in fluff companies whose business they did not understand while the markets were on steroids from unreal valuations of software companies, who took the brickbats when their fund returns lagged the benchmarks, and whose smiles were back after the crash.

Palande also brings out the small and big innovations that have come out of Indian fund managers, such as the low cost Benchmark mutual fund ETFs and distributor-less Quantum equity fund.

In the second half of the book, each chapter is devoted to an individual fund manager. It covers how the manager started his career, moved between fund houses, evolved his own thinking about the markets, his predilections and his methodologies. Each one is a fundamentalist - someone doing fundamental analysis of stocks - but each one still has some uniqueness, so much so that each chapter can be tagged by the manager's most noticeable idiosyncrasy. Each chapter gives us a view of the nature of the fund manager's job and their uniform assertion of humble respect towards the unpredictable beast that is the stock market.

I would have liked some more description of the software systems the managers use and the kind of data they analyze. Some details about how they decide how much of a cash ratio to maintain, and about stock allocation. Do the fund managers really have some secret weapon at their disposal that puts them at a distinct advantage over the ordinary retail investor (apart from their superior knowledge and sense of the stock market, of course)? They do have a research team and can afford full-time attention to detail. Still, is there anything else?

Palande ends the book with a chapter that is bound to raise the hackles of the mutual fund gurus in India. He makes a case for investing in low-cost index funds. A majority of mutual fund managers have similar kinds of data gathering and analysis mechanisms at their disposal. Palande reasons that due to a very secular availability of information and the maturing of the Indian capital markets, there is progressively lower variance in the returns that mutual funds provide. Indeed, it is getting very difficult for mutual fund managers to make their funds return a premium over each other or even over the index. Further, the trend is set to harden, so why put your money in an active fund that charges 2.5% plus commissions versus an index fund that does it for less than 1%? The Indian mutual fund industry counters by saying that such a case for index funds is already ripe in the American markets, but the Indian mutual fund manager can still beat the index by enough to comfortably justify the higher expense ratio. This is probably the point on which the CEO of Birla Sun Life AMC does not agree with Palande, a disagreement he mentions in the foreword.
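To put a rough number on the expense-ratio argument (a hypothetical illustration with assumed figures, not from the book): take Rs 1 lakh compounding for 20 years at an assumed 12% gross return, net of the two expense ratios.

-- Hypothetical: Rs 1,00,000 at an assumed 12% gross annual return for 20 years,
-- net of a given expense ratio.
netValue :: Double -> Double
netValue expenseRatio = 100000 * (1 + 0.12 - expenseRatio) ** 20

-- netValue 0.025  ~ Rs 6.1 lakh  (active fund charging 2.5%)
-- netValue 0.010  ~ Rs 8.1 lakh  (index fund charging 1.0%)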

So please go ahead with choosing quality funds and fund managers for now... Yes, but they can only make you rich if you are into mutual funds or the stock markets in the first place!

Sunday, June 4, 2017

More code optimization notes


Continuing from my earlier post about code optimization techniques here.

This is all about making the code more data oriented, which means that the code is written such that its data is organized intelligently in memory, considering CPU cache access patterns and data sizes.

As it is, extracting benefits out of any of these techniques is a black art. Some may backfire in specific circumstances. So, benchmark... be absolutely sure to measure (twice) before cutting (once).
Time to consider a few more ways to make code run faster...


  1. Inlining functions: Yes, nothing too special about this but worth noting. Take care about its impact on size of the executables.
  2. Mark functions as constexpr: The newer C++ standards make it simple to go further than inlining. If the function can be computed statically for its inputs, make it constexpr so that the computation can finish at the compilation stage itself. Of course, the inputs that the callers pass to such a function must also be fully known at compile time. Marking a function inline gives no guarantee of actual inlining by the compiler, but if we make the function constexpr and call it with compile-time constants in a context that requires a constant expression, we can be sure there is no runtime cost for that call at all.
  3. Provide specialized versions of a function per data type. This is not just about your function taking a polymorphic base type as an argument; C++ allows overloaded function names. It is sometimes possible to write a function slightly differently for a particular data type so that it is more optimal. Perhaps it saves you a remote call or two. But at a very minimum, if it helps avoid runtime conditional checks, it could be worth doing.
  4. Unrolling loops: Doing more actions in one iteration of a loop.
    for(int i = 0; i < 4*x; ++i){
                // process(arr[i])
              }
        
    changes to...
    for(int i = 0; i < 4*x; i += 4){
                   process(arr[i]);
                   process(arr[i + 1]);
                   process(arr[i + 2]);
                   process(arr[i + 3]);
              }
        
    which just saves some book-keeping conditional checks and increments in the for-loop that were happening every iteration.
  5. Pass values to rule out aliasing: Sometimes you have to assure the compiler that some value is certainly not going to change during every iteration of some big loop. Then the compiler will not add a fresh load for that variable.
    void func(int arr[N], int* p){ // 'p' passed as pointer. Not sure if it is actually aliasing some arr[i].
                   for(int i = 0; i < N; ++i){
                       update(arr[i]);
                       // also uses 'p' 
                      // load 'p' again to ensure it is latest.
                   }
              }
        
    Here, we could pass the pointed-to int by value; that is good enough for the compiler to know no aliasing of arr is possible.
  6. Avoid guessing for branch predictions: Sometimes, calculating all conditional paths in a piece of code and then deciding at the very end what to return could help save from a bad branch prediction cost.
  7. Process large arrays in parallel. Batch them and take care to avoid false sharing, i.e. threads writing to distinct data that happens to sit on the same cache line, so the line keeps bouncing between cores. Make sure each thread gets a full cache line to itself.
  8. If you process something in parallel as in the previous point, you might have to reduce at the end to get a combined result. Reduce pair-wise instead of making all threads block on a single shared data.
  9. Unroll loops based on latency of inner operations and order the operations so that maximum registers/cache is used at each point within the iteration.
  10. Separate fast and slow code. Remove exception handling out of tight code.


That's it for now.
Hoping to be mindful of such finer points.

Thursday, January 21, 2016

Find the permutation !

I first thought that solving this problem would require a much longer piece of code! I urge you to have a go at it first.

The first thought is to check whether each inversion sequence corresponds to exactly one permutation. Given a permutation, its inversion sequence is obviously unique. The other way around: reconstruct the permutation from the inversion sequence by placing numbers from the largest down to the smallest. At every step, all numbers still to be placed are smaller than the current one, so the current number's inversion value dictates exactly how many free positions must remain to its right, which fixes its position. No step involves a choice, so the permutation is unique too.

The brutest of brute-force ideas for this problem would be to just generate all permutations of the input length and see if the criterion is met.

So the clear next approach is to try generating the permutation by considering the positions in the reverse order. Since the inversion sequence (input) only worries about numbers lesser than the current appearing to the right, we use the reverse ordering while writing out the output permutation. So for the permutation of numbers from 1-8, we should be able to write out the answer by deciding the position of '8' first then descending towards '1'.

At each step, we keep track of how many numbers lie to the right of the current position. And here is the core problem - for a permutation of 'N' numbers ('N' in the 10^5 range), how do we efficiently maintain the relative position-from-the-right for each number while positions keep getting filled iteratively? O(N) numbers at O(N) work per number means an overall O(N^2), which is not an option with that large an 'N'.

The next option is maintaining an augmented balanced R-B Tree wherein each node keeps the count of nodes under it. Too long and complex to code...

Another data structure that comes to mind is a Binary Indexed Tree - BIT . This will help fill up a position and then update the relative positions appropriately in O(N.log N) time. This would require referring to some known implementations.

Somehow I felt that this should require something less than a BIT in full generality. Basically I got this -
1. Start with an array of size 'N' filled with values equal to the initial relative positions from the right. So the values are in descending order from '(N - 1)' down to '0'.
2. Going iteratively in reverse order as thought earlier, do a modified binary search for the right position.
3. During the binary search, whenever going right, mark the current node to say that its relative position will reduce by '1', since we are on the way to fill up a position towards its right.
4. When traversing to the left, increment the position that is sought since we know that those many positions to the right have been occupied !
5. When the correct position is found (sought position matches the calculated position), mark that position as deleted but nothing to delete from the array or shift values.

This takes O(log N) time for each number being placed, for an overall O(N log N)...

Here is the "accepted" code I got,
#include <iostream>
#include <vector>

class Position{
  public:
    int pos;
    int change;
    bool deleted;
    Position():change(0), deleted(false){
    }
};

int bin_search(std::vector<Position>& vecpos, int seek, int left, int right){
    if(right <= left){
        vecpos[left].deleted = true;
        ++vecpos[left].change;
        return left;
    }
    int mid = left + (right - left) / 2;
    int curr = vecpos[mid].pos - vecpos[mid].change;
    if(vecpos[mid].deleted == false && seek == curr){
        vecpos[mid].deleted = true;
        ++vecpos[mid].change;
        return mid;
    }
    if(seek > curr){
        return bin_search(vecpos, seek + vecpos[mid].change, left, mid - 1);
    }
    else{
        ++vecpos[mid].change;
        return bin_search(vecpos, seek, mid + 1, right);
    }
}

int main(){
    int N;
    std::cin >> N;

    std::vector<int> inv(N);
    std::vector<Position> vecpos(N);
    std::vector<int> ans(N);
    for(int i = 0; i < N; ++i){
        std::cin >> inv[i];
        vecpos[i].pos = N - i - 1;
    }

    for(int i = N - 1; i >= 0; --i){
        int pos = bin_search(vecpos, inv[i], 0, N - 1);
        ans[pos] = i + 1;
    }

    for(int i = 0; i < N; ++i){
        std::cout << ans[i];
        if(i < N - 1)
                std::cout << " ";
    }
    std::cout << "\n";
    return 0;
}


This works because for any 'leaf' node position, we always use the same binary search traversal !

It's always interesting to see patterns linking linear structures like arrays with hierarchical ones like trees.
The connection almost always requires something novel.

Sunday, December 13, 2015

Comments on : Functional programming in java script


I highly recommend a wonderful set of interactive exercises here !

It starts with a set of exercises on functional programming, but later on takes an unexpected turn and goes on to demonstrate features of Microsoft's Reactive Extensions library for JavaScript. Nevertheless, I jumped into it initially being curious about functional programming techniques common in JavaScript.

Starting with functional programming, there are at least 25 exercises that show how defining a small set of simple abstractions can lead to short and powerful programs. I could work out defining and using the five primary functions (map, reduce, filter, zip and concatAll).

Totally convincing exposition of the power of functional programming - the way that truly complex business logic can be described by surprisingly few lines of code involving just intelligent application of those 5 primary functions ! Basically this was the most 'fun' part of this tutorial.

Later on, just when I was wondering how asynchronous coding and functional programming techniques are related, the Reactive Extensions library is introduced. It was striking to learn about and find similarities between traversing arrays (using the functional techniques) and processing events (using special objects called 'Observables'). Observables are supposed to be (possibly infinite) sequences of events. The sequences relate to arrays, and thus we can use functional programming primitives to process them. Being observable, though, allows the object and the underlying sequence to change over time. Whenever the observable changes, the listeners can run code to process the event.

So the crux of the tutorial is to show how the same kind of programming techniques can help process any kind of data collections or events. It is a unified way of dealing with synchronous data like arrays as well as asynchronous objects.

I am quite used to multi-threaded programming with C++ but not so much to writing the asynchronous JavaScript code commonly involved in event handling. So even though I have not faced the difficulties of handling asynchronous actions first hand, I can still appreciate that asynchronicity is bound to add considerable complexity to writing efficient and correct code. The tutorial also comes with examples showing how complex the code can get once several asynchronous calls are made in parallel, each with its own success and failure callbacks. If some activity must happen only once *all* the parallel tasks complete, then we have to track the status of each request and act accordingly.

I still found the narrative promising - the unified programming model really does help manage that complexity. Functional programming strikes one as hard; at times I wondered why one should bother with the functional style when the equivalent procedural code is so easy and mechanical to write (even if it is 20 times the size). One has to get over a significant learning curve to understand why.

But that, it seems, is how programming has evolved over the years : having pursued procedural code for its lower learning curve, we now realize that some things cannot be made too simple.



Monday, May 11, 2015

How hard is web development ?

I have trained and worked for most of my career as a back-end engineer and I am used to struggling with C++, algorithms and distributed computing. Recently, I have taken up some assignments in web development as part of my work. So what does front-end development with the ubiquitous web technologies feel like, and how hard can it be to do it well ?

The web technologies (JavaScript, HTML, CSS, HTTP and their ilk) are quite inviting to newcomers. Creating your first web page and showing it in a browser is made deliberately easy. With just a couple of days of learning, you can achieve surprisingly significant work. Very unlike C++/algorithms/multi-threading, these technologies put up no such barriers to entry.

But it is a costly mistake to go any further (than developing an HTML "Hello World") without a sound understanding of the concepts. There is a vast technology stack that helps the web do what it can do and it should be systematically learnt.

JavaScript, the currency of today's web development, is said to be the world's most widely misunderstood and abused language. It works like a charm in small doses, but any piece of code beyond 2K lines can become a maintenance nightmare with numerous security vulnerabilities, unless developed correctly.

Web security is a large topic by itself. A lot of serious vulnerabilities have surfaced because of a rush to introduce new functionality, with security coming in only as an afterthought or after the damage was done. Why does plain HTTP even exist when the technical chops for HTTPS were already available ?

It's not just the depth but also the breadth of the technologies that complicates matters. We know that anything particularly impressive made with CSS wizardry can break badly on some other browser. There is a whole browser-compatibility matrix even for plain HTML support.

Anyway, here is what I had to do as a first assignment : find a way to download/upload files from/to a web server. I struggled to find a comprehensive resource that covered all the relevant issues. I found myself looking at scattered bits of code all over StackOverflow that claimed to do what was required, but each one had something missing. Very soon, I was reduced to testing and searching for what just "worked" instead of knowing what should work and why. Ext JS, the client-side JavaScript framework in use, had broken backward compatibility in whether (and how) it allowed setting HTTP headers for form submits. Server-side requirements were completely unspecified. Finally I got the following code working; notice how the code for downloads looks completely different from that for uploads. It uses Ext JS as the client framework, so some framework-specific API appears. The server is assumed to expose a couple of REST APIs that manage the persistence of the files being transacted, check the CSRF token and perform some useful validations on allowable file sizes and types.

Downloads :

Ext.define(
 'Ext.form.action.StandardSubmit', 
 {
  extend:'Ext.form.action.Submit', 
  alias: 'formaction.standardsubmit', 
  doSubmit: function() {
   var formInfo = this.buildForm(); 
   formInfo.formEl.submit(); 
   this.cleanup(formInfo);
  } 
 }
);
methodURL = 'SomeURL' + 'fetchAttachment'; // 'fetchAttachment' is a particular REST API in my server.
methodParamsObject = { // this is an object required by the server code in my case
     user: Ext.encode({
      'userId':'1059',
      'roleId':'1058',
      'roleName':'Administrator',
      'userName':'admin'
      })
    };

// Insert your CSRF token if used to the methodParamsObject here ...

fileDownloadForm = Ext.create(
     'Ext.form.Panel', 
     {
      standardSubmit: true, 
      url: methodURL, 
      method: 'POST'
     }
   );


fileDownloadForm.submit({params: methodParamsObject});
fileDownloadForm.close();


Uploads :

var file_content;
var new_form;
var file_name;

var fileChangeCallback = function (e) {
 var file = file_content.files[0];
 var reader = new FileReader();
 alert("called");
 reader.onloadend = function (e) {
  alert("called CB loadend");
  if (e.target.readyState == FileReader.DONE) {
   var l_data = reader.result;
   l_data = e.target.result;
   new_form.submit();
  } else {
   alert("Failed to read file");
  }
 }
 alert("called CB");
 file_name.value = file.name;
 reader.readAsArrayBuffer(file);
};

new_form = document.createElement("form");
new_form.name = 'file_upload';
new_form.method = 'POST';
new_form.action = 'some URL' + 'addAttachment'; // Again, one REST API in my server implementation.
new_form.enctype = 'multipart/form-data'; // required for correct file handling on server.

//
var csrf_dummy = {
                 };

// Insert the CSRF token into the csrf_dummy object here, if used. In my case, CSRF was checked by the server, so passing a token string under the key 'CSRFTOKEN' was essential.

var csrf_token = document.createElement('input');
csrf_token.type = 'hidden';
csrf_token.name = 'CSRFTOKEN';
csrf_token.value = csrf_dummy.CSRFTOKEN;

new_form.appendChild(csrf_token);
//

var user = document.createElement('input');
user.type = 'hidden';
user.name = 'user';
user.value = "{'userId':'1059','roleId':'1058','roleName':'Administrator','userName':'admin'}";
new_form.appendChild(user);
//


file_content = document.createElement('input');
file_content.type = 'file';
file_content.name = 'fileContents';
file_content.addEventListener('change', fileChangeCallback);
new_form.appendChild(file_content);
//

document.body.appendChild(new_form);

Sunday, August 31, 2014

Code optimization patterns - my quick notes

These patterns are based on what I read about here. I am reproducing (dumping !) them here, with as much fidelity as I can manage, primarily because these are patterns I actually use in assorted situations, and also because I want to stay in close touch with all of them and keep applying them wherever I can. These are just my notes.

Some of these ideas directly conflict with Object Orientation principles - breaking encapsulation and trashing inheritance in favor of composition. It really is a "black art" to know the appropriate situations in which to apply them.

Here they are :

1/ Data locality -
a/ Convert a list of pointers into a list of offsets (4-byte integers) into a contiguously allocated array of objects. Iterating over an array of pointers is bad - it makes memory accesses at random locations - arrays of pointers "thrash the cache" (too bad, it rhymes !). Allocating the objects themselves in contiguous memory makes it likely that successive objects will be "found" in the cache.
b/ Hot-cold separation - Perhaps the object is large and an array of such objects will not fit in the cache anyway. Perhaps, for most iterations, only a few specific attributes of each object are processed. Split the larger object into smaller pieces so that the more frequently accessed ('hot') parts sit together. This might even let the entire array of hot parts fit in the cache (a small sketch of this appears after the list).
c/ Reduce branch mis-prediction by
c1/ sorting a list of objects on some criteria so that the iteration only runs over a relevant subset, without an 'if' condition inside the loop
c2/ creating separate lists of objects based on the same criteria, so that sorting one big list is avoided
c3/ just placing the iteration loop inside the 'if' condition, not the other way around - simple enough
c4/ not mixing different types of objects together in an array that is iterated over, even if they share a base class - keep a separate array per type.

2/ Dirty flag - I think of it as the poor man's lazy evaluation. Basic dirty flagging is to keep an additional boolean flag telling whether the object / cache is stale and requires refreshing. It helps avoid computation until it is required, and only if it is required (a tiny sketch of this also appears after the list). Another flavor of the pattern is time-stamping or sequence numbering - we keep a number or time-stamp that only ever increments, without any danger of repeating, and use it to mark the 'version' of the object. Whenever the sequence number falls behind some 'master' number, we refresh the cache.

3/ Object pooling - This is basically re-using allocations of some sort - resources like database connections, threads, plain objects or even raw memory. We want to avoid creating and destroying objects when we know they will be required again in some other context. Low and high watermarks can be set so that minimum and maximum limits on the number of pooled objects are enforced - not consuming too much memory, while staying ready for bursts of requests. In multi-threading scenarios, each thread dynamically allocating memory from the operating system is known to be slow, because of the locking required around the common heap. If each thread uses its own pool of memory, no locking is required, re-use is achieved, and the thread-specific pool does not end up fragmented.

4/ Spatial partitioning - Organize objects by "closeness" so that, given an object, its "neighbours" can be located and processed efficiently. This is obviously helpful in real geometric spaces, such as checking nearby particles for collisions. However, distance as a metric can be extended in other ways - in a tree-like structure, the distance between two nodes may be expressed in terms of the number of ancestors they share. This helps partition the nodes so that queries like counting the children of a given node become more efficient; the partitioning may also be limited to nodes at the same level of the tree. The pattern does present difficulties when nodes can change their distances to other nodes, and we have to find efficient ways of updating the partitioning depending on the problem context.
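To make 1/b concrete, here is a small sketch of hot-cold separation - my own illustration, not code from the original article. The 'hot' fields touched on every iteration live in one tight array, while the rarely used 'cold' fields sit in a parallel array indexed the same way :

#include <cstddef>
#include <vector>

/* Hot part : touched every frame; kept small and contiguous so it streams through the cache. */
struct ParticleHot {
    float x, y, z;
    float vx, vy, vz;
};

/* Cold part : rarely touched; kept in a parallel array with the same indexing. */
struct ParticleCold {
    char name[32];
    int  spawn_frame;
    int  owner_id;
};

void advance(std::vector<ParticleHot>& hot, float dt){
    /* Only the hot array is walked here - far fewer cache misses than iterating
       over one big Particle struct (or worse, an array of pointers to them). */
    for(std::size_t i = 0; i < hot.size(); ++i){
        hot[i].x += hot[i].vx * dt;
        hot[i].y += hot[i].vy * dt;
        hot[i].z += hot[i].vz * dt;
    }
}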
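And a tiny sketch of the basic dirty flag from 2/, again only my own illustration :

#include <numeric>
#include <vector>

/* The expensive aggregate is recomputed only when the underlying data has changed. */
class Samples {
public:
    void add(double v){ values.push_back(v); dirty = true; }
    double sum(){
        if(dirty){                     /* lazy refresh, only when stale */
            cached_sum = std::accumulate(values.begin(), values.end(), 0.0);
            dirty = false;
        }
        return cached_sum;             /* otherwise, serve the cached value */
    }
private:
    std::vector<double> values;
    double cached_sum = 0.0;
    bool dirty = false;
};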

Monday, August 11, 2014

USA Computing Olympiad Training Problem 3.3.3 (Camelot)

This is about an algorithmic coding problem that I saw on the USA Computing Olympiad training pages. There were several others that I solved earlier to reach this stage but this one is special ... named Camelot, in section 3.3.3. It seems to have been first posed in the IOI 1998.

Attempting to solve it swept me through the full range of emotions - from cocky over-confidence, to being nastily surprised, to bewilderment, despair and frustration, and finally to the immense relief of getting rid of it by solving it ! There was absolutely no elation on solving it, just massive relief, that's all !

The statement goes like this :

Centuries ago, King Arthur and the Knights of the Round Table used to meet every year on New Year's Day to celebrate their fellowship. 
In remembrance of these events, we consider a board game for one player, on which one chess piece king and several knight pieces are placed on squares, no two knights on the same square.

During the play, the player can place more than one piece in the same square. 
The board squares are assumed big enough so that a piece is never an obstacle for any other piece to move freely.

The player's goal is to move the pieces so as to gather them all in the same square - in the minimal number of moves. 
To achieve this, he must move the pieces as prescribed above. Additionally, whenever the king and one or more 
knights are placed in the same square, the player may choose to move the king and one of the knights together 
from that point on, as a single knight, up to the final gathering point. Moving the knight together with the king 
counts as a single move.

Write a program to compute the minimum number of moves the player must perform to produce the gathering. The pieces
 can gather on any square, of course.

PROGRAM NAME: camelot

INPUT FORMAT

Line 1: Two space-separated integers: R,C, the number of rows and columns on the board. There will be no more than 26 
columns and no more than 30 rows.
Line 2..end: The input file contains a sequence of space-separated letter/digit pairs, 1 or more per line. The first pair 
represents the board position of the king; subsequent pairs represent positions of knights. There might be 0 knights or the 
knights might fill the board. Rows are numbered starting at 1; columns are specified as upper case characters starting with `A'.
SAMPLE INPUT (file camelot.in)

8 8
D 4
A 3 A 8
H 1 H 8

The king is positioned at D4. There are four knights, positioned at A3, A8, H1, and H8.
OUTPUT FORMAT

A single line with the number of moves to aggregate the pieces.
SAMPLE OUTPUT (file camelot.out)

10 
SAMPLE OUTPUT ELABORATION

They gather at B5. 
Knight 1: A3 - B5 (1 move) 
Knight 2: A8 - C7 - B5 (2 moves) 
Knight 3: H1 - G3 - F5 - D4 (picking up king) - B5 (4 moves) 
Knight 4: H8 - F7 - D6 - B5 (3 moves) 
1 + 2 + 4 + 3 = 10 moves.

You can assume that the King and Knight pieces move just like in standard chess. The board co-ordinate notation is exactly the same too : the columns (A-Z) go from left to right, while the rows (1-30) go from bottom to top.

Basically it requires full appreciation of breadth first search. To solve it, you need to use breadth first search in novel ways, tweaking it to suit the fact that the King and the Knight pieces can join/merge while moving and adjust shortest distances accordingly.

I struggled with the problem since initially I (wrongly) assumed that two or more Knights could also join, just like a King and a Knight. That would make the problem even tougher. But as stated, a join only happens between the King and a single Knight.

The input limits on the problem are delicately adjusted to time out any solution that has even one excessive nested iteration. For 'R' rows, 'C' columns and 'P' pieces, the expected time complexity is around O(R*C*P) ... For the maximum input dimensions, that works out to roughly 600K operations, which should be comfortable within a time limit of 1 second. My initial solution was working, but had an extra iteration over the number of pieces in the innermost loop, and that was timing it out.

Finally, I got the code below, that does these things :
1. Records the distance to each square for the King.
2. For each square on the board, records the shortest path to every other square, using breadth first search, and saves all of it.
3. Tweaks breadth first search to record the shortest path from each Knight to every other square, assuming that this Knight picks up the King along the way. This uses the information stored in (1).
4. Iterates over the entire board, checking whether each square can be the final gathering point. In the innermost loop, each Knight is examined to see whether that Knight picking up the King gives the cheapest way to reach the candidate gathering square.

#include <fstream>
#include <vector>
#include <queue>
#include <math.h>

const int INF = 10000000;

int min(int a, int b){if(a < b) return a; return b;}

int max(int a, int b){if(a > b) return a; return b;}

int kr[8] = {-2, -1, 1, 2, 2, 1, -1, -2};
int kc[8] = {1, 2, 2, 1, -1, -2, -2, -1};

class co_ords{
public:
 int r; int c;
};

class Knight_King{
public:
 int for_knight;
 int for_king;
 int r;
 int c;
};

void init(Knight_King dists[30][26], int rows, int cols, int king_row, int king_col){
 for(int rr = 0; rr < rows; ++rr){
  for(int cc = 0; cc < cols; ++cc){
   dists[rr][cc].for_king = max(abs(rr - king_row), abs(cc - king_col));
   dists[rr][cc].for_knight = INF;
  }
 }
}

class Knight_Cost{
public:
 Knight_King KK[30][26];
};

bool valid(int r, int c, int rows, int cols){
 if(r >= 0 && c >= 0 && r < rows && c < cols)
  return true;
 return false;
}
void reset(int arr[30][26], int rows, int cols){
 for(int x = 0; x < rows; ++x){
  for(int y = 0; y < cols; ++y){
   arr[x][y] = INF;
  }
 }
}
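/* Step 2 : plain breadth first search over knight moves from square (r, c),
   recording the shortest knight-distance to every other square. */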
void knights_shortest(int dists[30][26], int rows, int cols, int r, int c, std::queue<co_ords>& q){
 dists[r][c] = 0;
 co_ords rc; rc.r = r; rc.c = c;
 q.push(rc);
 int rr, cc;
 while(!q.empty()){
  co_ords rc = q.front();
  q.pop();
  for(int i = 0; i < 8; ++i){
   rr  = rc.r + kr[i];
   cc = rc.c + kc[i];
   if(valid(rr, cc, rows, cols) && dists[rr][cc] == INF){
    co_ords rrcc; rrcc.r = rr; rrcc.c = cc;
    dists[rr][cc] = min(dists[rr][cc], dists[rc.r][rc.c] + 1);
    q.push(rrcc);
   }
  }
 }
}

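/* Step 3 : the tweaked breadth first search. Starting from one Knight's square, each
   board square accumulates a pair of costs - moves made by the Knight so far, and the
   cheapest number of King moves needed to meet that Knight somewhere along its path -
   so the total accounts for the Knight detouring to pick up the King. */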
void knights_paths(Knight_King costs[30][26], int rows, int cols, int r, int c, std::queue<Knight_King>& q){
 
 costs[r][c].for_knight = 0;

 Knight_King rc; rc.r = r; rc.c = c; rc.for_king = costs[r][c].for_king; rc.for_knight = costs[r][c].for_knight;
 q.push(rc);
 int rr, cc;
 while(!q.empty()){
  Knight_King rc = q.front();
  q.pop();
  for(int i = 0; i < 8; ++i){
   rr  = rc.r + kr[i];
   cc = rc.c + kc[i];
   if(valid(rr, cc, rows, cols)){
    bool enq = ((min(rc.for_king, costs[rr][cc].for_king) + rc.for_knight + 1) < (costs[rr][cc].for_king + costs[rr][cc].for_knight));
    if(enq){
     Knight_King rrcc; rrcc.r = rr; rrcc.c = cc; rrcc.for_king = min(rc.for_king, costs[rr][cc].for_king); rrcc.for_knight = rc.for_knight + 1;
     costs[rr][cc].for_king = min(rc.for_king, costs[rr][cc].for_king);
     costs[rr][cc].for_knight = rrcc.for_knight;
     q.push(rrcc);
    }
   }
  }
 }
}

int get_sum(int dists[30][26], const std::vector<co_ords>& knights, int num_knights){
 int ret = 0;
 for(int i = 0; i < num_knights; ++i){
  ret += dists[knights[i].r][knights[i].c];
 }
 return ret;
}

class square{
public:
 int dists[30][26];
 square(int rows, int cols){
  for(int i = 0; i < rows; ++i){
   for(int j = 0; j < cols; ++j){
    dists[i][j] = INF;
   }
  }
 }
};

int main(){
 int rows, cols;
 char col;
 std::vector<co_ords> knights;
 std::queue<Knight_King> qkk;
 std::queue<co_ords> q;

 int row, num_knights = 0;
 int king_row, king_col;
 std::ifstream ifs("camelot.in");
 ifs >> rows >> cols;

 ifs >> col >> row;
 king_col = (col - 'A');
 king_row = (rows - row);
 while(ifs >> col){
  ifs >> row;
  ++num_knights;
  co_ords rc;
  rc.r = rows - row;
  rc.c = col - 'A';
  knights.push_back(rc);
 }
 ifs.close();
 std::ofstream ofs("camelot.out");
 if(num_knights == 0){
  ofs << "0\n";
  ofs.close();
  return 0;
 }

 std::vector<square> board[30];
 for(int i = 0; i < 30; ++i){
  for(int j = 0; j < 26; ++j){
   square s(rows, cols);
   board[i].push_back(s);
   knights_shortest(board[i][j].dists, rows, cols, i, j, q);
  }
 }

 std::vector<Knight_Cost> cost(num_knights);

 for(int x = 0; x < num_knights; ++x){
  init(cost[x].KK, rows, cols, king_row, king_col);
  knights_paths(cost[x].KK, rows, cols, knights[x].r, knights[x].c, qkk);
 }

 int min_so_far = INF;
 int sum = INF, tmp_sum, sim_king_dist;

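 /* Step 4 : try every square as the gathering point. 'sum' starts as the cost with the
    King walking there on its own; then, for each Knight in turn, swap in the cost of
    that Knight fetching the King instead, and keep the minimum over all choices. */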
 for(int r = 0; r < rows; ++r){
  for(int c = 0; c < cols; ++c){
   
   sim_king_dist = max(abs(king_row - r), abs(king_col - c));
   sum = get_sum(board[r][c].dists, knights, num_knights) + sim_king_dist;

   for(int k = 0; k < num_knights; ++k){
    tmp_sum = sum;

    sum -= (sim_king_dist + board[r][c].dists[knights[k].r][knights[k].c]);

    sum += (cost[k].KK[r][c].for_king + cost[k].KK[r][c].for_knight);

    if(sum < min_so_far)
     min_so_far = sum;
    sum = tmp_sum;
   }
  }
 }

 ofs << min_so_far << "\n";
 ofs.close();
 return 0;
}



A result of a rivetting journey over the weekend...

Sunday, July 27, 2014

A Red-Black Tree - slightly unusual

The Red-Black Tree is not something I can implement without referring to a standard textbook - its implementation is just way too intricate. Even so, when you read about it, everything conceptually clicks somehow, but you wonder how anyone ever got around to actually inventing this beast !

The Red-Black Tree is the data structure of choice when worst-case O(log N) complexity is desired. AVL trees preceded it and make the same guarantee, but they seem to have lost out to Red-Black Trees in recent implementations. The AVL tree is too eager to re-balance the heights of its sub-trees, resulting in a larger constant factor in typical usage. Red-Black Trees care less about imbalance - they only guarantee that no root-to-leaf path is more than twice as long as any other path from the same node.

So we see the Red-Black Tree incorporated in some very commonly used C++ STL structures (std::map, for instance).

I have implemented the tree now - it is a standard implementation, but with a few twists in how it is used. See my GitHub source shared here...

The Red-Black tree in my implementation has the following features :

1. It has a C++ interface, but there are no pointers in the interface or in the internal code. There are only 4-byte unsigned integers serving as offsets into a linear structure, such as a vector or array (a small sketch of this layout appears after the list).

2. Tree operations, even insertions, do not require copying an element or compound object. The objects are maintained in a vector by the client code and the tree only stores offsets into the client's vector. It goes without saying that the objects to be inserted need not have a default constructor either.

3. Comparison function (actually a class) is supplied by the client code and invoked by the tree. Nothing new here though ...

4. Removing elements from the tree is complemented by a free-list feature. The free-list holds a list of freed elements so that subsequent insertions can re-use memory from removed elements.
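To make features 1 and 4 concrete, here is roughly the spirit of the node layout and free-list - a simplified sketch, not the exact code from the repository :

#include <cstdint>
#include <vector>

/* Every 'link' is a 32-bit offset into the nodes vector; NIL plays the role of a null pointer. */
const std::uint32_t NIL = 0xFFFFFFFFu;

struct RBNode {
    std::uint32_t parent, left, right;
    std::uint32_t value_index;   /* offset into the client's own vector of objects */
    bool          red;
};

class RBTreeStorage {
public:
    /* Re-use a slot from the free-list if one exists, otherwise grow the vector. */
    std::uint32_t allocate_node(std::uint32_t value_index){
        std::uint32_t n;
        if(free_head != NIL){
            n = free_head;
            free_head = nodes[n].right;          /* the free-list is threaded through 'right' links */
        } else {
            n = static_cast<std::uint32_t>(nodes.size());
            nodes.push_back(RBNode());
        }
        nodes[n].parent = nodes[n].left = nodes[n].right = NIL;
        nodes[n].value_index = value_index;
        nodes[n].red = true;
        return n;
    }

    /* A removed node goes back onto the free-list so a later insertion can re-use its slot. */
    void free_node(std::uint32_t n){
        nodes[n].right = free_head;
        free_head = n;
    }

private:
    std::vector<RBNode> nodes;      /* one contiguous allocation holds all nodes */
    std::uint32_t root = NIL;
    std::uint32_t free_head = NIL;
};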

I hope to derive these advantages from these characteristics :

1. Interleaved insertions-deletions should behave better, with reduced need to allocate-release memory, maybe to the OS or to a shared memory pool.

2. Not using any pointers brings multiple benefits - the memory consumed by the tree will not change just by switching between a 32-bit and a 64-bit OS ! Since the 4-byte unsigned integral offsets all refer to memory within a single contiguously allocated array, the memory references should be more compact and localized. Cache hits should be better than if memory for each node were allocated individually.

3. No copying of objects is involved. A default constructor on the elements to be inserted is not required. Only 4-byte offsets referring into the container are stored. Unfortunately, such references mean that the offsets of existing elements must not change, which could happen if elements were deleted from the client's container.

Work on this is continuing - I am planning a companion container for this tree that can support deletions as well. Also to come are a few other data structures that are not commonly found in libraries...

Saturday, July 26, 2014

About API design and copy semantics

This is a tale of how a combination of counter-intuitive copy semantics and API design kept me debugging a piece of code late into the night, for several hours... Late night coding might be fun, but there is never any fun in working in crisis mode, under time pressure and against a bug.

I suspect that the reasons behind my ordeal, are also the reasons why most programmers tend to vastly under-estimate the time it will take them to finish a programming task.

Sequences, or contiguously allocated arrays, usually offer some facility to copy their internal representation in a way that is efficient and shallow - certainly more efficient than iterating over the entire sequence and copying each object by value. Typically such a sequence has a raw buffer where the data is managed, a 4 or 8 byte 'length' recording the size of that buffer, and a flag saying whether the sequence 'owns' the buffer or not.

Ownership of the raw buffer is the "hot potato" : the sequence that ends up with it should be the one to release the buffer's memory. The consequences are ghastly if both of them release it - or if both fail to.

When a copy of a sequence into another sequence is required, the pointer to the raw buffer is assigned, the 'length' is copied, and ownership is relinquished by the original sequence (or not), depending on what the caller wants.

Consider such a feature on a CORBA sequence, with an implementation from 'Orbacus',
typedef OB::FixSeq< ::CORBA::ULong, int > MSeq; /* CORBA's fixed sequence of integers */

MSeq source;
source.length(2); /* Fill some data to test copy */
source[0] = 1;
source[1] = 2;

MSeq target; /* Empty sequence */

bool takeOwnership = true; /* Let's have target sequence acquire ownership, and source sequence relinquish it too ! */

/* Now actual call to copy sequence into target  */
target.replace(source.maximum(), 
               source.length(), 
               source.get_buffer(takeOwnership), 
               takeOwnership);


I noted the following painful aspects of the 'replace()' call, that is supposed to do the efficient copy :

1. The name 'replace' suggests an action that 'changes the guts' of the target sequence, but it implies nothing about the fact that even the source sequence can be totally 'gutted', by passing a particular choice of ownership options.

2. How much grunt work is expected from the caller ! The function takes 4 arguments including the length, the maximum allocation of the source sequence and the buffer - all internal details of the source sequence... Why can it not do with just a reference to the source sequence and an ownership option, and pick up all those internal details from the source itself ? After all, 'replace()' is a member function.

3. The boolean 'takeOwnership' option is accepted twice in the same call - first to tell the source that it should relinquish ownership, and second to have the target sequence acquire it. I cannot think of a situation where the caller would want to pass 'true' for the first and 'false' for the second, or the other way around. In fact, passing different values for the two is exactly what caused the crash I was debugging. The offending code was deep inside the implementation, and the results are totally counter-intuitive.

/* This code crashes  */
/* We are copying between 3 sequences : original -> intermediate -> final */
#include "OB/CORBA.h"
#include <iostream>

int main(){

typedef OB::FixSeq< ::CORBA::ULong, int > MSeq;

MSeq original;
original.length(2);
original[0] = 1;
original[1] = 2;


MSeq intermediate; 
intermediate.replace(original.maximum(), 
                     original.length(), 
                     original.get_buffer(false), 
                     true);

std::cout << intermediate.length() << " " << intermediate[0] << " " << intermediate[1] << "\n";


MSeq final;
final.replace(intermediate.maximum(), 
              intermediate.length(), 
              intermediate.get_buffer(false), 
              true);

std::cout << final.length() << "\n"; /* Length is as expected ==> 2 */

std::cout << final[0] << " " << final[1] << "\n"; // Still ... Crashes here...

return 0;

}

/*
To build :
g++ -I  -L  -lOB -lJTC -ldl -lpthread 

*/


What is certainly bad is that the length of the 'final' sequence is 2 but the internal buffer is just not there, any access causes a crash. The resulting sequence is simply not internally consistent.

The 'replace()' API is designed to accept too many arguments that could have been deduced from a reference to the source sequence, and that makes it easy for a caller to mis-use. In that sense, it fails to have several of the characteristics mentioned in an excellent piece by Joshua Bloch here !
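Just to make that point concrete, here is a rough sketch - entirely my own invention, not the Orbacus or CORBA API - of how I would have preferred such a call to look : one reference to the source, one ownership flag, and all the internal bookkeeping deduced inside the member function :

#include <cstddef>

/* A toy sequence, only to illustrate the API shape being argued for. */
template <typename T>
class Seq {
public:
    Seq() : buf(0), len(0), cap(0), owns(false) {}
    ~Seq(){ release(); }

    /* Shallow-copy 'src' into this sequence. If take_ownership is true, the raw buffer
       changes hands exactly once : 'src' relinquishes it and we acquire it, so the two
       halves of the transfer can never disagree. */
    void assign_from(Seq& src, bool take_ownership){
        release();                       /* drop whatever we currently hold */
        buf = src.buf;
        len = src.len;
        cap = src.cap;
        owns = take_ownership;
        if(take_ownership){
            src.owns = false;            /* the hot potato has moved on */
        }
    }

private:
    void release(){ if(owns) delete[] buf; buf = 0; len = cap = 0; owns = false; }
    Seq(const Seq&);                     /* copying disabled in this sketch */
    Seq& operator=(const Seq&);

    T*          buf;
    std::size_t len, cap;
    bool        owns;
};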

Saturday, January 18, 2014

Beginning with Haskell

I just gave myself permission to spend some time learning a new programming language, and here are my notes on the initial effort. I bet anyone who has read this far has, somewhere on their radar, the idea of learning a functional programming language. I chose Haskell. Judging by the experiences of others who attempted the same thing with vastly more neural firepower than me, maybe that is a brave decision !

So, yes, Haskell is daunting from the outside, but I am first just checking whether it is really possible to make any progress without being an expert in Category Theory or Type Systems ... After all, Haskell is supposed to be very general purpose. Can it give me a leg up by letting a humble "Hello World" come out easy ?

Here is one of the first programs I attempted : calculating the 'n'th Fibonacci number.

With Haskell, this is a first attempt,

fib n
       | n == 0 = 0
       | n == 1 = 1
       | n > 1 = fib(n - 1) + fib(n - 2)


So terse and clean. I could challenge someone to write a specification of the Fibonacci program in plain English in fewer characters. Even without any introduction to the syntax, we can see through this affair easily, as if someone had just written the pseudo code. Of course, it performs as horribly as a naive C++ program following the same pattern ! Calculating the 30th Fibonacci number takes upwards of 5 seconds, and if someone asks for the 60th, this program won't return before Halley's Comet does - and perhaps much later. Exponential algorithmic complexity bites hard, even with Haskell.

So here is a result after more hours of reading this fantastic introductory online book named "Learn You a Haskell" :

fib_improved n = fst $ fib_helper n (0, 1)

fib_helper 0 (a, b) = (a, b)
fib_helper n (a, b) = fib_helper (n - 1) (b, a + b)


Somehow, this is equally short as the earlier version but is super fast. Asking for the 100th Fibonacci instantly gets me this result : "354224848179261915075" !

The last line of this code tells the whole story : recursion is used to iterate over a "tuple" repeatedly, transforming the tuple along the way. This textual explanation probably does a far worse job (at explaining) than the single line of Haskell makes obvious. The first two lines are just the presentation and the base case.

The patterns I see so far are twofold :
1. Recursion is cool, but the algorithmic considerations are paramount. Haskell, with its principles about immutable data, presents additional twists to the efficient coding game. We need to worry about unnecessary copying (to implement data immutability), and how not to blow up the stack even while using recursion.
2. One of the techniques to make things efficient in Haskell is to pass additional arguments to function calls and "build" an answer through them, instead of first calling the function and then processing the returned data from a function. That is what the second version of the Fibonacci program does differently from the first.

Encouraging !


Tuesday, June 19, 2012

Money matters

I never thought that any development in Graph Theory would have anything to do with war between countries. But it seems it does. Sometime during the cold war era, the Americans were solving the problem of how to break up the extensive Soviet railroad network with the least effort. Which are the places to bomb, and which railway lines to destroy, so that there is maximum disconnection between railway stations ? You just find a "min-cut" of the graph of towns and connecting railway lines. The same min-cut helps in efficiently knocking out oil pipelines and electricity supply as well. And so, in due course of time, a few more pages were added to the book of the Theory of Graphs.

Well, the deeper point is to notice what motivates innovation. During times such as the cold war, innovation in America was driven by a country-wide feeling of competition, threat, fear or maybe mass hysteria. That can lead people to work harder and think harder. There was rapid development not just in missile technologies but also in space missions, electronics, energy and, of course, computing. Now the cold war is over and the emotions have cooled down. But people are still doing smart things.

My guess is this. The collective public mood can only partly explain the rapid developments in that period, or in any period. For any creative effort to truly succeed, someone has to finance it. In that period, governments were rather generous in funding research into all areas that were deemed important. So a lot of money went into missions to the moon, and it bore spectacular results. Today the money is flowing into different areas, and that is where all the action is happening. The things that people value change from time to time, but the research projects that click are the ones where the funding is active. Of course, people can build great things without much money (software) or without much secrecy (Open Source Software). But they won't be able to send a man to Mars that way, and certainly cannot build a controlled nuclear fusion reactor. My point is that development is not stalled by the limits of human intellect, but because nobody really values it enough to spend hard-earned money on it.

An article by the astrophysicist Neil Tyson here makes the same point very elegantly. Elected governments may say that their first priority is to build "clean and green" renewable energy projects, but how can we tell whether the powerful folks are really serious about it ? I'd simply look at the kind of money they are spending on it.

Monday, February 13, 2012

Monstrosity as in C++

I am writing this piece out of a renewed terror of C++, particularly after reading some of the items from "Effective C++/More Effective C++" by Scott Meyers (http://aristeia.com/books.html) and now moving on to "Modern C++ Design" by Andrei Alexandrescu (http://www.moderncppdesign.com).
I have been using C++ for most of the last 10 years, but it still succeeds in throwing up dark corner cases and quirks. Particularly jarring are those that go against my intuition of how the code should behave. My experience is in the same vein as here and here. More criticism of C++ is easily found all over the web, some of it coming from star programmers like Linus Torvalds. I am only ever comfortable using and reading C++ code when it sticks to the small sub-set of C++ that I use and know. And everyone seems to use their own favorite sub-set of C++.
Take, for example, the items in Meyers' books. Many of them are very informative, and they never feel like a drab listing, thanks to Meyers' lively writing. Most appreciable are the items about class design and efficiency. In contrast, the most troubling items are those that attempt to explain, cope with, or "engineer" around C++ syntax. Operator overloading still tops my list of nonsensical and unnecessary features. If it is just syntactic sugar, it is making everyone a diabetic.
To be fair, C++ has had to support new features while maintaining backward compatibility with C. It is feature-rich and multi-paradigm. Individual features of C++ like classes, inheritance, templates and exception handling are very useful and well thought out. However, when these facilities combine in real-world code, they can have a devastating effect. Explanations of smart "techniques" to somehow handle each and every combination increasingly venture into the territory of the obscure and the ugly. The lack of garbage collection and other memory/system issues only worsen the situation. So when Meyers works through such complex issues, it strikes me that such situations should simply be avoided at all costs. There should be simpler ways of doing these things. Knowing C++, it is apparent that there are many ways of doing the same thing - some simple, others not so.
I hope to write more over the coming weeks about C++ - particularly about "simpler" ways of doing things and anything truly useful, like compile-time checks.