Kromco network failure highlights need for fault-tolerant design

“Rather than just offering support, Clarotech recognised the impact of our problem and responded like a business partner with an equal stake in its resolution.”

Rupert Swanepoel

Manager Information Systems, Kromco

Industry:

Packing

Offering:

The packing and marketing of apples and pears for both local and international markets

Problem:

Catastrophic network failure

Solution:

Total network redesign by Clarotech with consideration for stability and business needs

Services: 

Network support services, disaster recovery services, network design and engineering services, consulting, project management, and professional implementation

Result:

A fault-tolerant network that minimises risk and isolates failures while providing flexibility for growth

Clarotech qualifications

Clarotech is an information, communications and telephony company offering consulting, products and support services to businesses of all sizes in South Africa and beyond. Operating since 2001, the company satisfies the need for advice, solutions and ongoing service from a team of experienced consultants and support engineers that can be trusted.

Clarotech prides itself on making IT and telephony simple and effective for its clients. Colin Fair, Managing Director of Clarotech, states, “We strive to ensure we remain relentlessly relevant, offer superior value and continue to provide sustainable solutions through proven end-to-end methodologies. This ensures our clients’ investments are future-proofed.”

Colin went on to add, “It is very simple: we are our clients’ trusted technology advisors and we take this responsibility seriously. By proactively taking ownership of what happens to our clients’ systems, we take on their worries so that they can get on with running their businesses, undistracted.”

Clarotech holds key competencies in the following areas: Computing, Communication, Call Centre and Endpoint Management.

 

About Kromco

Kromco is one of the largest deciduous fruit packing facilities in South Africa’s Western Cape Province, specialising in the packing and marketing of apples and pears for local and international markets. To maintain its excellent reputation for quality and prompt supply, the company employs the latest packing and storage technologies.

 

Network reliance

Kromco’s business complex spans 3 km² and is located far from the metropolitan area. The company therefore depends heavily on its network for transferring and processing data across departments, facilitating operational continuity, and both internal and external communication. Many of its operations are heavily automated and highly sophisticated, with state-of-the-art production line equipment running 24/7. Kromco’s network is therefore a business-critical enabler that keeps the entire company functioning.

 

Clarotech’s role

Kromco has used Clarotech as its IT infrastructure partner since 1991. Currently, Clarotech mainly manages the administration and packing house sections of the plant because of the high concentration of networking components in those areas. The Clarotech team is involved in the VoIP PBX implementation as well as the ongoing support of the fibre connectivity and internet services. It also provides technical input and other support services where required.

 

Network failure

In August 2016, the Kromco network suffered severe connectivity problems over a period of three days, which eventually led to total network failure. At that point, all production, logistics and administration activities at Kromco ceased, leaving the company unable to continue operating.

Clarotech network engineers were invited to troubleshoot the issue on site. They took three days and two nights to discover the source of the problem, restore connectivity and allow production at the company to resume. Even after the problem had been found, Clarotech staff continued to monitor the network for errors.

“Clarotech didn’t only fix the problem. They worked towards a 100% stable environment.”

Rupert Swanepoel

Manager Information Systems, Kromco

Cause and consequences

The investigation team identified a network switch not managed by Clarotech (and therefore not visible to its managed services sensors) as the cause of the failure. The appliance was a temporary link that had never been replaced and, due to incorrect configuration, had started flooding the network with error broadcasts until the network became overloaded and unable to service normal requests.
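The flooding described above can be illustrated with a minimal sketch of a broadcast-rate check. It assumes that broadcast frame counters can be read from the ports of the managed switches at two points in time. The port names, counter values and threshold are hypothetical and are not taken from the Kromco network.

```python
from dataclasses import dataclass


@dataclass
class PortSample:
    """Broadcast frame counters for one switch port, read at two points in time."""
    port: str
    broadcasts_start: int
    broadcasts_end: int
    interval_seconds: float


def broadcast_rate(sample: PortSample) -> float:
    """Return broadcast frames per second over the sampling interval."""
    return (sample.broadcasts_end - sample.broadcasts_start) / sample.interval_seconds


def flooding_ports(samples: list[PortSample], max_rate: float) -> list[str]:
    """Return the ports whose broadcast rate exceeds the allowed threshold."""
    return [s.port for s in samples if broadcast_rate(s) > max_rate]


# Hypothetical counters: the uplink towards an unmanaged switch is flooding.
samples = [
    PortSample("admin-uplink", 10_000, 10_600, 60.0),
    PortSample("packhouse-uplink", 50_000, 50_900, 60.0),
    PortSample("temporary-link", 5_000, 3_005_000, 60.0),
]
print(flooding_ports(samples, max_rate=1_000.0))
```

A check like this only sees ports on devices that are actually monitored, which is why an unmanaged switch can go unnoticed until the traffic it generates reaches a managed port.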

As a result, Kromco could not do business for three full days and its staff, unable to work, were sent home. The company incurred financial losses and compromised its reputation for reliability and prompt delivery.

Disaster review

Because of the severity of the network outage and its impact on the business, Clarotech defined the situation at Kromco as a disaster. This led the team to follow a predefined set of protocols to deal with and recover from such an event.

Although the network failure was simple to correct, the extensive damage it caused prompted Kromco management to reassess the stability of its network and acknowledge the need for disaster recovery planning. A team of IT engineers and management staff from Clarotech and Kromco was assembled and began the process by:

  • Investigating all possible recovery options thoroughly
  • Redesigning the entire Kromco network based on best-practice principles
  • Standardising on HP Aruba Switch equipment for core network infrastructure
  • Developing complete network redundancy for the entire site

Network redesign

Kromco’s network has to support eight different departments scattered across its 3 km² site. The redesigned network therefore involved the installation of a 1 Gb fibre backbone with redundant links to ensure that if one department experiences problems, the rest remain unaffected. The new design also proposed a second redundant data centre.
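The value of redundant links can be shown with a small sketch that checks a topology for single points of failure. The department names and both topologies below are hypothetical and serve only to contrast a ring with a star; they do not describe the actual Kromco backbone.

```python
from collections import deque

Link = tuple[str, str]


def reachable(start: str, links: list[Link]) -> set[str]:
    """Return every node reachable from start over the given undirected links."""
    neighbours: dict[str, set[str]] = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, set()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def single_points_of_failure(core: str, nodes: set[str], links: list[Link]) -> list[Link]:
    """Return each link whose loss would cut at least one node off from the core."""
    critical = []
    for i, link in enumerate(links):
        remaining = links[:i] + links[i + 1:]
        if not nodes <= reachable(core, remaining):
            critical.append(link)
    return critical


# Hypothetical topologies: a ring gives every department two paths to the core,
# while a star leaves each department dependent on one link.
nodes = {"core", "admin", "packhouse", "cold-store"}
ring = [("core", "admin"), ("admin", "packhouse"),
        ("packhouse", "cold-store"), ("cold-store", "core")]
star = [("core", "admin"), ("core", "packhouse"), ("core", "cold-store")]

print(single_points_of_failure("core", nodes, ring))
print(single_points_of_failure("core", nodes, star))
```

In the ring no single link is critical, whereas in the star every link is. The sketch considers link failures only; a failed switch or data centre would need a similar check on nodes.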

During the Project Definition Workshop, a detailed plan was developed on how to execute the new network design. Each step of the network implementation was documented and tasks were allocated to the project team members. Timelines, checkpoints and testing scenarios were written into the scope of the project.

The network switches can segment the network virtually, a key factor in the design of the new topology at Kromco. This functionality allows for the creation of separate VLANs for independent services like telephony, messaging, industrial control and more. Each virtual network is also simpler to manage and control. Together, the increased security, improved network management and redundancy have drastically improved network performance and ultimately allow the company to operate at its full potential.
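As a rough illustration of why segmentation isolates faults, the sketch below models which devices receive a broadcast when each service has its own VLAN. The service names follow the text, but the VLAN IDs and device names are hypothetical.

```python
# Hypothetical VLAN plan: IDs are made up for illustration.
VLANS = {10: "telephony", 20: "messaging", 30: "industrial-control"}

# Hypothetical devices mapped to the VLAN ID they are assigned to.
DEVICE_VLAN = {
    "desk-phone-1": 10,
    "desk-phone-2": 10,
    "mail-server": 20,
    "sorter-plc": 30,
    "conveyor-plc": 30,
}


def broadcast_domain(sender: str) -> set[str]:
    """Return the other devices that would receive a broadcast from sender."""
    vlan = DEVICE_VLAN[sender]
    return {d for d, v in DEVICE_VLAN.items() if v == vlan and d != sender}


sender = "desk-phone-1"
print(VLANS[DEVICE_VLAN[sender]], sorted(broadcast_domain(sender)))
```

A misbehaving telephony device in this model reaches only the other telephony devices, so the industrial control equipment is not exposed to its broadcasts.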

Most of the new network was implemented within a year, and it is now fully operational and functioning well, providing Kromco with the following benefits:

  • Because of network segmentation, devices can be easily added when required, an important requirement due to Kromco’s high level of automation
  • Fewer network disruptions are experienced, leading to less downtime, increased production time and higher revenue
  • IT staff have higher visibility of the network and are able to proactively manage and make changes to it without adversely affecting existing devices
  • The network is flexible and easy to expand
  • IT staff can limit maintenance downtime to individual departments

“Regardless of a global IT skills shortage, Clarotech retains a complement of highly qualified and competent staff.”

Rupert Swanepoel

Manager Information Systems, Kromco

Challenges and solutions

A major challenge to the network restructure project was that Kromco operates 24/7 and can’t afford extended downtime. Clarotech worked closely with Kromco’s IT and management staff to plan the implementation of the new network in phases and at the least disruptive times.

Key success factors for this type of project are:

  • Engage a suitable IT vendor with high-quality network design and implementation experience
  • Ensure all key stakeholders are involved in the planning and design stage of the project
  • Understand and plan around the implementation constraints of industrial 24/7 operations
  • Collaborate closely with the IT vendor over the entire deployment and testing cycle to ensure consistency and compliance with standards
  • Understand the impact of arbitrary network changes
  • Set up clear accountability
  • Ensure the IT staff maintaining the network are thoroughly trained on its layout and components
  • Select the test platform together with the IT vendor
  • Avoid making assumptions and follow strict testing procedures
  • Document all aspects of the project thoroughly

 

Outcomes

The new network ensures that future faults are isolated to their network segment and are easier to detect and correct. This means that total network failure is unlikely and Kromco is assured of continuous operations.

Kromco can now focus on other business-related projects that have come to the fore. Applications in the area of robotics and communications are being investigated. There is confidence in the network to support these new projects, which will ultimately lead to better business processes, improved production capabilities and a greater competitive advantage.

During the network restructure project, Kromco and Clarotech worked together to put the company in the best possible position for the future. Clarotech’s approach to the project was methodical and considered, and the proposed solution was optimal for Kromco’s business and financial objectives. An even higher level of respect developed between Kromco and Clarotech during the project, helping to ensure a successful outcome. The next phase of the project is underway and focuses on disaster simulation to evaluate the effectiveness of redundancy. Clarotech and Kromco are true partners in ensuring the ongoing success of the business.

Fault-tolerant network design

Fault tolerance enables a system to continue operating despite failure in one or more of its components.

For most modern companies, the network is a vital component of the business. Without it, operations can grind to a halt, and the price paid in lost business and opportunity costs can be devastating. Companies should therefore do everything in their power to protect the integrity of their networks. However, many networks grow organically to provide capacity for an ever-expanding set of data processing requirements. Because of this, critical design principles like fault tolerance are neglected, setting the organisation up for inevitable disaster.

An experienced and trusted technology partner like Clarotech, with high-quality network design, engineering, management and support resources, is essential to the health of your network and business continuity. With Clarotech as its IT infrastructure provider, a company can turn its unstable network into a sustainable business platform that aligns with its strategic needs and enables future expansion.