ECVictor

E-Commerce Mobile | Platform Design Montreal

Blog

How to Lose $440 Million in 45 Minutes

Post from http://dougseven.com/2014/04/17/knightmare-a-devops-cautionary-tale/

This is the story of how a company with nearly $400 million in assets went bankrupt in 45 minutes because of a failed deployment.

Background

Knight Capital Group is an American global financial services firm engaging in market making, electronic execution, and institutional sales and trading. In 2012 Knight was the largest trader in US equities, with a market share of around 17% on each of the NYSE and NASDAQ. Knight’s Electronic Trading Group (ETG) managed an average daily trading volume of more than 3.3 billion trades, worth over 21 billion dollars…daily. That’s no joke!

On July 31, 2012 Knight had approximately $365 million in cash and equivalents.

The NYSE was planning to launch a new Retail Liquidity Program (a program meant to provide improved pricing to retail investors through retail brokers, like Knight) on August 1, 2012. In preparation for this event Knight updated SMARS, its automated, high-speed, algorithmic router that sends orders into the market for execution. One of the core functions of SMARS is to receive orders from other components of Knight’s trading platform (“parent” orders) and then send one or more “child” orders out for execution. In other words, SMARS would receive large orders from the trading platform and break them up into multiple smaller orders in order to find a buyer/seller match for the volume of shares. The larger the parent order, the more child orders would be generated.

The update to SMARS was intended to replace old, unused code referred to as “Power Peg” – functionality that Knight hadn’t used in 8 years (why code that had been dead for 8 years was still present in the code base is a mystery, but that’s not the point). The updated code repurposed an old flag that had been used to activate the Power Peg functionality. The code was thoroughly tested and proven to work correctly and reliably. What could possibly go wrong?

What Could Possibly Go Wrong? Indeed!

Between July 27, 2012 and July 31, 2012 Knight manually deployed the new software to a limited number of servers per day – eight (8) servers in all. This is what the SEC filing says about the manual deployment process (BTW – if there is an SEC filing about your deployment something may have gone terribly wrong).

“During the deployment of the new code, however, one of Knight’s technicians did not copy the new code to one of the eight SMARS computer servers. Knight did not have a second technician review this deployment and no one at Knight realized that the Power Peg code had not been removed from the eighth server, nor the new RLP code added. Knight had no written procedures that required such a review.”
SEC Filing | Release No. 70694 | October 16, 2013

At 9:30 AM Eastern Time on August 1, 2012 the markets opened and Knight began processing orders from broker-dealers on behalf of their customers for the new Retail Liquidity Program. The seven (7) servers that had the correct SMARS deployment began processing these orders correctly. Orders sent to the eighth server triggered the supposedly repurposed flag and brought the old Power Peg code back from the dead.

Attack of the Killer Code Zombies

It’s important to understand what the “dead” Power Peg code was meant to do. This functionality was meant to count the shares bought/sold against a parent order as child orders were executed. Power Peg would instruct the system to stop routing child orders once the parent order was fulfilled. Basically, Power Peg would keep track of the child orders and stop them once the parent order was completed. In 2005 Knight moved this cumulative tracking functionality to an earlier stage in the code execution (thus removing the count tracking from the Power Peg functionality).

When the Power Peg flag on the eighth server was activated, the Power Peg functionality began routing child orders for execution, but it wasn’t tracking the number of shares filled against the parent order – somewhat like an endless loop.
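To make the failure mode concrete, here is a toy Python sketch of a router loop with and without the cumulative fill count. It is not Knight's code: the function name, the quantities and the `max_children` safety cap are all assumptions made up for this illustration.

```python
def route_child_orders(parent_qty, child_qty, track_fills, max_children=1000):
    """Send child orders for one parent order and return the total shares sent.

    track_fills=True  -> stop once the cumulative quantity covers the parent.
    track_fills=False -> the fill counter is never updated, so only the
                         max_children cap (added for this demo) ends the loop.
    """
    filled = 0    # shares counted against the parent order
    sent = 0      # shares actually sent to the market
    children = 0  # number of child orders routed so far
    while filled < parent_qty and children < max_children:
        if track_fills:
            qty = min(child_qty, parent_qty - filled)
            filled += qty  # the step that had been moved out of Power Peg
        else:
            qty = child_qty
        sent += qty
        children += 1
    return sent


with_tracking = route_child_orders(1000, 100, track_fills=True)
without_tracking = route_child_orders(1000, 100, track_fills=False)
print(with_tracking)     # 1000
print(without_tracking)  # 100000
```

Without the counter, the loop condition never changes, so the only thing that stops the toy version is the artificial cap. The real system had no such cap.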

45 Minutes of Hell

Imagine what would happen if you had a system capable of sending automated, high-speed orders into the market without any tracking to see if enough orders had been executed. Yes, it was that bad.

When the market opened at 9:30 AM people quickly knew something was wrong. By 9:31 AM it was evident to many people on Wall Street that something serious was happening. The market was being flooded with orders far outside the regular trading volumes for certain stocks. By 9:32 AM many people on Wall Street were wondering why it hadn’t stopped. This was an eternity in high-speed trading terms. Why hadn’t someone hit the kill switch on whatever system was doing this? As it turns out, there was no kill switch. During the first 45 minutes of trading Knight’s executions constituted more than 50% of the trading volume, driving certain stocks up by more than 10%. As a result other stocks decreased in value in response to the erroneous trades.

To make things worse, Knight’s system began sending automated email messages earlier in the day – as early as 8:01 AM (when SMARS had processed orders eligible for pre-market trading). The email messages referenced SMARS and identified an error as “Power Peg disabled.” Between 8:01 AM and 9:30 AM, 97 of these emails were sent to Knight personnel. Of course these emails were not designed as system alerts and therefore no one looked at them right away. Oops.

During the 45 minutes of Hell that Knight experienced they attempted several countermeasures to try to stop the erroneous trades. There was no kill switch (and no documented procedure for how to react), so they were left trying to diagnose the issue in a live trading environment where 8 million shares were being traded every minute. Since they were unable to determine what was causing the erroneous orders, they reacted by uninstalling the new code from the servers it had been deployed to correctly. In other words, they removed the working code and left the broken code. This only amplified the problem, causing additional parent orders to activate the Power Peg code on all servers, not just the one that had missed the deployment. Eventually they were able to stop the system – after 45 minutes of trading.

In the first 45 minutes the market was open the Power Peg code received and processed 212 parent orders. As a result SMARS sent millions of child orders into the market, resulting in 4 million transactions against 154 stocks for more than 397 million shares. For you stock market junkies, this meant that Knight assumed approximately $3.5 billion of net long positions in 80 stocks and $3.15 billion of net short positions in 74 stocks. In layman’s terms, Knight Capital Group realized a $460 million loss in 45 minutes. Remember, Knight only had $365 million in cash and equivalents. In 45 minutes Knight went from being the largest trader in US equities and a major market maker in the NYSE and NASDAQ to bankrupt. They had 48 hours to raise the capital necessary to cover their losses (which they managed to do with a $400 million investment from around a half-dozen investors). Knight Capital Group was eventually acquired by Getco LLC (December 2012) and the merged company is now called KCG Holdings.

A Lesson to Learn

The events of August 1, 2012 should be a lesson to all development and operations teams. It is not enough to build great software and test it; you also have to ensure it is delivered to market correctly so that your customers get the value you are delivering (and so you don’t bankrupt your company). The engineer(s) who deployed SMARS are not solely to blame here – the process Knight had set up was not appropriate for the risk they were exposed to. Additionally, their process (or lack thereof) was inherently prone to error. Any time your deployment process relies on humans reading and following instructions, you are exposing yourself to risk. Humans make mistakes. The mistakes could be in the instructions, in the interpretation of the instructions, or in the execution of the instructions.

Deployments need to be automated, repeatable and as free from potential human error as possible. Had Knight implemented an automated deployment system – complete with configuration, deployment and test automation – the error that caused the Knightmare would very likely have been avoided.

A couple of the principles for Continuous Delivery apply here (even if you are not implementing a full Continuous Delivery process):

  • Releasing software should be a repeatable, reliable process.
  • Automate as much as is reasonable.
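One small piece of such automation is a post-deployment check that every server runs the same build. The Python sketch below is only an illustration under assumed names: the server names, the byte strings standing in for build artifacts and the `find_stale_servers` helper are hypothetical, and a real check would read the artifacts from the servers themselves.

```python
import hashlib


def fingerprint(artifact: bytes) -> str:
    """Return the SHA-256 hex digest of a deployed artifact."""
    return hashlib.sha256(artifact).hexdigest()


def find_stale_servers(expected: bytes, deployed: dict) -> list:
    """Return the names of servers whose artifact differs from the expected build."""
    want = fingerprint(expected)
    return sorted(name for name, blob in deployed.items()
                  if fingerprint(blob) != want)


# Hypothetical stand-ins for the new and old builds.
new_build = b"smars-with-rlp"
old_build = b"smars-with-power-peg"

# Seven servers received the new build; the eighth was missed.
servers = {f"server-{i}": new_build for i in range(1, 8)}
servers["server-8"] = old_build

stale = find_stale_servers(new_build, servers)
if stale:
    print("Deployment incomplete, stale servers:", stale)
```

A check like this takes seconds to run and does not depend on anyone remembering to do a manual review.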

How to rename a MySQL database

MySQL Workbench does not provide an easy way to rename a database.

However, you can dump the old database, create the new one and import the dump with these commands in the terminal:

mysqldump -u username -p -v olddatabase > olddbdump.sql
mysqladmin -u username -p create newdatabase
mysql -u username -p newdatabase < olddbdump.sql

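Or, to move the tables without a dump, generate the RENAME TABLE statements with this query and then run the statements it prints: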

SELECT concat('RENAME TABLE $1.',table_name, ' TO $2.',table_name, ';')
FROM information_schema.TABLES 
WHERE table_schema='$1';

($1 and $2 are the source and target database names respectively; the target database must already exist)
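The same statements can be built in a script once you have the list of table names. This is a minimal Python sketch; the function name and the example table names are made up for illustration, and it only prints the SQL rather than connecting to a server.

```python
def rename_statements(source_db, target_db, table_names):
    """Build one RENAME TABLE statement per table, moving it to target_db."""
    return [
        f"RENAME TABLE `{source_db}`.`{name}` TO `{target_db}`.`{name}`;"
        for name in table_names
    ]


# Hypothetical table names; in practice take them from information_schema.TABLES.
statements = rename_statements("olddatabase", "newdatabase", ["users", "orders"])
for line in statements:
    print(line)
```

Review the printed statements before running them against a real database.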

 

CSV – Excel – Saving Accented Characters


Problem:

I need to import my modified Excel file containing accented characters (French, German, etc) into my database system.

Solution:

If you’re using Microsoft Excel to create your CSV file, you may run into an issue:

When you’re using accented characters (such as é, ç, ü, etc.), Microsoft Excel does not generate a UTF-8 compliant CSV file. To resolve this issue, please do the following:

  • Save the Excel file as a CSV file.
  • Open the CSV file generated by Excel in another program that can save UTF-8 compliant CSV files.

Windows

  • Open the file using Notepad.
  • Click “File > Save As”.
  • In the dialog window that appears – select “UTF-8” from the “Encoding” field. Then click “Save”.

Mac

  • Use either the “Numbers” application, or the free LibreOffice, instead of Excel.
  • Click “File > Save As”.
  • In the dialog window that appears – select “UTF-8” from the “Encoding” field. Then click “Save”.

Your saved file will be ready to import into your UTF-8 compliant database system.
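If you would rather script the conversion, the re-encoding step can be sketched in Python as below. The sketch assumes that Excel wrote the file in the Windows-1252 code page (`cp1252`), which is common on Western European and North American Windows setups but is not guaranteed; the function names and the sample rows are made up for illustration.

```python
from pathlib import Path


def reencode_csv_bytes(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode CSV bytes from the assumed source encoding and re-encode as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


def reencode_csv_file(src, dst, source_encoding="cp1252"):
    """Read src, convert it to UTF-8 and write the result to dst."""
    Path(dst).write_bytes(reencode_csv_bytes(Path(src).read_bytes(), source_encoding))


# Sample data standing in for a file saved by Excel.
excel_bytes = "nom,ville\nHélène,Montréal\n".encode("cp1252")
utf8_bytes = reencode_csv_bytes(excel_bytes)
print(utf8_bytes.decode("utf-8"))
```

If the accents still look wrong after conversion, the file was probably saved in a different code page, so try another `source_encoding`.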

From CSV – Excel – Saving Accented Characters

Downgrade WordPress from 4.4 to 4.2

If you find that some plugins do not work with your current WordPress version, you might want to downgrade from the latest version to an older one.

Here is how it works:

  1. Back up your current site, especially wp-content, .htaccess and wp-config.php.
  2. Download the older version from https://wordpress.org/download/release-archive/
  3. Unzip the older version's zip file and delete its wp-content folder.
  4. DELETE all the current files in your hosting WordPress folder:
       – / (everything except .htaccess and wp-config.php)
       – /wp-admin
       – /wp-includes
     Be sure to KEEP all of /wp-content.
  5. Upload the older version to your hosting WordPress folder.
  6. If /wp-content was removed or changed along the way, copy it back from your backup.

Done. Your site should now be running the older version.
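Because the delete step is the risky one, it can help to list what would be removed before touching anything. The Python sketch below is a dry run only: it works on a list of top-level names and deletes nothing. The `KEEP` set follows the steps above, while the function name and the sample listing are assumptions for illustration.

```python
# Top-level entries that must survive the downgrade, per the steps above.
KEEP = {".htaccess", "wp-config.php", "wp-content"}


def entries_to_delete(top_level_entries):
    """Return the top-level names that the downgrade would remove (dry run)."""
    return sorted(name for name in top_level_entries if name not in KEEP)


# Hypothetical listing of a WordPress folder.
listing = [".htaccess", "index.php", "wp-admin", "wp-config.php",
           "wp-content", "wp-includes", "wp-login.php"]
print(entries_to_delete(listing))
# ['index.php', 'wp-admin', 'wp-includes', 'wp-login.php']
```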

 

Improve your website speed

PageSpeed Insights (https://developers.google.com/speed/pagespeed/insights) is very useful for testing your website speed. It reports the reasons why your website is slow.

The easiest way to make your site faster is to compress your images.

Optimize images

Using smaller images can greatly reduce download time.

Gzip compression

Enabling Gzip compression is the first task to consider; see https://developers.google.com/speed/docs/insights/EnableCompression.
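To get a feel for how much Gzip can save on text content, here is a small Python sketch using the standard `gzip` module. The repeated HTML snippet is made-up sample data, and real pages will compress by different amounts; in production the web server does this compression for you once it is enabled.

```python
import gzip

# Made-up, repetitive markup standing in for a real HTML page.
html = ("<div class='product'><span>Item</span></div>\n" * 200).encode("utf-8")

compressed = gzip.compress(html)
ratio = len(compressed) / len(html)

print("original bytes:  ", len(html))
print("compressed bytes:", len(compressed))
print(f"compressed size is {ratio:.1%} of the original")
```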

Using min.js instead of the full version of a JS package

The minified version of a library usually covers everything you need; the full version takes longer to download and adds load on your server. Serving the library from a CDN is also a good way to reduce traffic on your own server.

Optimize your website

Problem

Many of our team members have found our website really slow lately, and they asked us to investigate the issue and optimize the site.

Benefits

Improving our website's performance will greatly improve our work efficiency and our customers' experience. While investigating, I also found that it can improve the site's SEO ranking.

Testing tool

We ran the online tool http://www.webpagetest.org to test our website’s performance and got the following result:

[Screenshots of the WebPageTest results]

What is Time to First Byte?

First Byte Time

Applicable objects: Time to First Byte for the page (back-end processing + redirects)
What is checked: The target time is the time needed for the DNS, socket and SSL negotiations + 100ms. A single letter grade will be deducted for every 100ms beyond the target.
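The grading rule can be worked through in a few lines of Python. This is only one reading of the rule quoted above, not WebPageTest's actual implementation: the A, B, C, D, F scale and the choice to deduct one grade per full 100 ms over the target are assumptions, as are the function name and the sample timings.

```python
GRADES = "ABCDF"  # assumed letter scale, best to worst


def first_byte_grade(ttfb_ms, dns_ms, socket_ms, ssl_ms):
    """Grade a Time to First Byte against the target described above."""
    target = dns_ms + socket_ms + ssl_ms + 100  # negotiation time + 100 ms
    over = max(0, ttfb_ms - target)             # milliseconds beyond the target
    steps = int(over // 100)                    # assumed: one grade per full 100 ms
    return GRADES[min(steps, len(GRADES) - 1)]


# Sample timings (ms): DNS 40, socket 40, SSL 60, so the target is 240 ms.
print(first_byte_grade(250, 40, 40, 60))  # A (10 ms over)
print(first_byte_grade(450, 40, 40, 60))  # C (210 ms over)
print(first_byte_grade(900, 40, 40, 60))  # F (660 ms over)
```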

 

Solutions

  1. Disable all unused plugins. Some WordPress plugins are CPU intensive, so it is better to disable those you don’t need.
  2. Use WP Super Cache. In the settings, set Caching to Caching On. In the CDN settings, also turn on Enable CDN Support.

After these two steps, I got:

[Screenshot of the WebPageTest result after the changes]

The time to first byte issue was resolved.
