Tuesday, November 15, 2016

The Myth of Machine Learning

Recently there has been a surge in the number of companies claiming machine learning capabilities, from startups to large organisations. The media is full of "machine learning" claims, and investors seem to be pouring large amounts into startups claiming to have "machine learning" or "artificial intelligence" capabilities. From logo design companies to delivery companies, they all claim to have implemented "machine learning". I recently saw a press release for a US-based app development company that had raised a significant seven-figure sum, claiming they had developed "Human assisted machine learning"! One has to ask, what is that? Any machine "learning" has to be assisted by humans anyway: who else would configure the algorithms, applications and hardware, not to mention the training?

Neural Networks and genetic algorithms have been around for a long time. So why now? The answer is data (and to some extent computing power). In order to even attempt anything smart with "learning algorithms" one must have large sets of data. Since the rise of "Analytics" we have that data: sensors and devices connected to the Internet, and mini-computers in our pockets that are always connected. So gathering data is not a problem. The problem is what to do with the data.

Fundamentally (despite outlandish claims by the media), Neural Networks can be "trained" to classify sets of input data into categories.


The diagram above shows a plane in 3D space that separates two sets of data (classification). If we project that plane into 2D space we have a non-linear equation.




If there is a means of determining whether the classification was successful or not, the classifier can "learn" as the input evolves. Most current claims of "machine learning" are really simple rules that are able to cluster input data; there is nothing much to these claims. These are more like the old "expert systems", which have a pre-programmed set of rules that help the computation determine some output result. You can imagine this as a large set of IF THEN statements. It is a myth that these systems can learn or that there is any form of "intelligence" in them. Somehow we have jumped from these very limited capabilities to machines running the Earth. I guess the one thing we can learn from the media of late (Brexit and Trump) is that you can't believe what you read!
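To make the contrast concrete, here is a minimal sketch in Python. It sets a fixed IF THEN rule next to a textbook perceptron, which adjusts a separating line whenever it is told that a classification was wrong. The data points, function names and threshold are all invented for illustration; this is not any particular company's system.

```python
# Hypothetical toy data: (x, y) points, each with a 0/1 category label.
SAMPLES = [
    ((0.0, 0.0), 0), ((0.0, 1.0), 0), ((1.0, 0.0), 0), ((0.5, 0.25), 0),
    ((2.0, 2.0), 1), ((2.0, 3.0), 1), ((3.0, 2.0), 1), ((2.5, 2.5), 1),
]


def rule_based(point):
    """Expert-system style: the threshold is hard-coded and never changes."""
    x, y = point
    if x + y > 2.5:
        return 1
    return 0


def predict(w, b, point):
    """Classify a point by which side of the line w[0]*x + w[1]*y + b = 0 it is on."""
    x, y = point
    return 1 if w[0] * x + w[1] * y + b > 0 else 0


def train_perceptron(samples, max_epochs=1000, lr=1.0):
    """Learn w and b from feedback: nudge the line after every wrong answer."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for (x, y), label in samples:
            error = label - predict(w, b, (x, y))  # 0 when the answer was right
            if error != 0:
                w[0] += lr * error * x
                w[1] += lr * error * y
                b += lr * error
                mistakes += 1
        if mistakes == 0:  # every training point is now classified correctly
            break
    return w, b


w, b = train_perceptron(SAMPLES)
print("learned weights:", w, "bias:", b)
print("learned:", predict(w, b, (2.25, 2.25)), "rule:", rule_based((2.25, 2.25)))
```

The rule only ever does what its author wrote down. The perceptron finds its own boundary, but only because a human supplied labelled examples and a way of telling right from wrong.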

Whilst I am positive that there are some very clever people out there working on all sorts of clever algorithms, I feel that, propelled by media hype, there is a big myth surrounding Machine Learning. In some ways it is in the interests of academics, business leaders, marketeers etc. to promote this myth, as this fuels their funding.



Thursday, June 16, 2016

The CTO journey: 5 Things I have learnt

I remember hearing a quote from Herb Cohen, the respected US negotiator. It went something like "I never get called when everything is going well, I dunno who gets those calls, but it ain't me!". This is true of many startup CTOs. Usually there is some failed outsourced relationship, or a technical co-founder who takes the product to a point and then gets stuck.

I have received many calls over the years and the problems are usually similar. This was indeed the case at my current position. Here are some things I have learnt ....


  1. Focus on the core product: There will be a lot of noise around the product: features that are not essential, wish lists, dreams. Stick to the basics! Unless you get the fundamentals sorted out there is no future for the product. This requires single-mindedness, and possibly a few "debates", as the founders will want to get as many features into the product as possible. Don't give in! The main role initially is knowing what NOT to do ...
  2. Mostly use tried and tested technology: When you are launching an MVP, or you have to deliver a core product quickly, use technology that is well established and well documented. The experimental, innovative stuff can come later. There is still plenty of room for innovation within the confines of established technology.
  3. Solve first - automate later: Often there are great ideas/features that will automate some process or workflow. Often these can be done manually, or with the help of some simple scripts. Get these "manual" or "semi-automated" ways of doing things working first before jumping in and automating. Initially there is often no need and no scale, and it does not make sense to automate processes that are constantly changing or evolving. This can come later.
  4. Protect your developers: Whether you have an outsourced relationship or an in-house development team, developers, designers etc. get fatigued by uncoordinated feedback. Make sure that feedback (bug reports etc.) is handled in a coordinated fashion. The quickest way to destroy job satisfaction is by allowing everyone in the team to point out deficiencies. Also, if you are outsourcing development, you want to keep the relationship sound. Protect this relationship.
  5. Manage expectations: It is always good to under-promise and over-deliver. Be realistic and communicate this at all times. So many times problems are caused by engineers saying what they think they should say, as opposed to what they know to be the case.

Writing this today, I feel as if yet another journey has been undertaken and I start the next phase with our team at Knowledgemotion. We have secured investment from Ingram and signed a major deal with Pearson. The next phase will be taking the core product further, and hopefully beyond any expectation!

Sunday, May 15, 2016

Should Cloud Computing Costs be Regulated?

I am not one for bureaucracy; quite the contrary. I cringe at unnecessary bureaucracy and regulation. The fact is that we are becoming a more highly regulated society. That being said, I do feel there are some areas where we need regulation to protect the consumer. One such area is the regulation of costs charged by commodity suppliers: telecommunications, energy, fuel etc.

As CTO of an emerging start-up I am responsible for ensuring that our systems run on the latest and most secure technology. Cloud services provide that scalable infrastructure and so naturally I use them extensively. However, let me tell you a story of one warm Friday night last July.

I was just about to go on paternity leave and was settling down in my lovely new bath for a Friday night soak when the phone rang. A chap was phoning from Amazon Web Services (AWS). He politely enquired if I was aware that our company had just incurred over $20,000 of charges transferring some data from Glacier to S3 storage. My heart skipped a few beats. At that stage in our company's life the $20K would almost certainly bankrupt us and leave me without a job on the eve of the birth of my son! Needless to say, I expressed my horror at this news and began the long-winded process of composing various emails to Amazon to recoup the money.

Earlier in the day, I had read the AWS pricing guides relating to Glacier. I found them to be rather complex but I ploughed through. I have some ability in mathematics and two engineering degrees, which I thought might assist me in understanding how to calculate the cost of transferring some data. I took out a pen and paper and after a long-winded calculation came up with a figure of around $300. I double-checked this and, when I was certain, instructed my team to make the transfer. Obviously I got the calculation slightly wrong, by a factor of about 100!

This is not an isolated case; I have heard of many very qualified engineers tripping up when calculating cloud computing costs. This is because the pricing always involves a number of very elusive quantities: data transfer per unit of time, CPU minutes used, etc. These quantities are very hard to calculate a priori and have never previously formed part of any computational analysis.
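To illustrate how a rate-based quantity can wreck an estimate, here is a rough sketch in Python. It assumes a billing model in which the charge follows the peak hourly retrieval rate, billed as though that rate had been sustained for a whole month, instead of following the total volume. The function names, prices and data volume are all invented for the example; they are not AWS's actual price list or the figures from my own transfer.

```python
def flat_cost(total_gb, price_per_gb):
    """Naive estimate: every gigabyte retrieved is billed at one flat rate."""
    return total_gb * price_per_gb


def peak_rate_cost(total_gb, hours_to_retrieve, price_per_gb, hours_in_month=720):
    """Assumed model: the peak hourly rate is billed as if it ran all month."""
    peak_gb_per_hour = total_gb / hours_to_retrieve
    return peak_gb_per_hour * price_per_gb * hours_in_month


# Hypothetical numbers, chosen only for illustration.
TOTAL_GB = 10_000
PRICE_PER_GB = 0.01

naive = flat_cost(TOTAL_GB, PRICE_PER_GB)
fast = peak_rate_cost(TOTAL_GB, hours_to_retrieve=4, price_per_gb=PRICE_PER_GB)
slow = peak_rate_cost(TOTAL_GB, hours_to_retrieve=720, price_per_gb=PRICE_PER_GB)

print(f"flat estimate:             ${naive:,.0f}")
print(f"retrieved in 4 hours:      ${fast:,.0f}")
print(f"retrieved over 720 hours:  ${slow:,.0f}")
```

With these made-up numbers, pulling the data out in four hours costs 180 times the flat estimate, while spreading the same retrieval over the whole month costs the same as the flat estimate. The total volume never changed; only the rate did, and that is exactly the kind of quantity that is hard to pin down in advance.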

I think it is time that our regulators take a look at these "Cloud services" and set some guidelines on fair and understandable pricing. After all, Cloud computing services are just another commodity supply that is crucial to the way we live and do business. We cannot be (a) held to ransom and (b) confused to the point where we are unable to accurately calculate future costs.

The story ends well: AWS refunded the money. I have been slowly transferring my data ever since.