Archive for June, 2009
Book Notes -> Cloud Computing: Web-Based Applications That Change the Way You Work and Collaborate Online
Posted by Will Eatherton in Uncategorized on June 7th, 2009
Cloud Computing: Web-Based Applications That Change the Way You Work and Collaborate Online
by Michael Miller
Publisher: Que
Pub Date: August 11, 2008
Print ISBN-10: 0-7897-3803-1
Print ISBN-13: 978-0-7897-3803-5
Web ISBN-10: 0-7686-8622-9
Web ISBN-13: 978-0-7686-8622-7
Pages: 312

Review :
Overall, I didn’t care for this book. Trying to merge the business aspects of cloud computing from an IT perspective with a guide to using cloud computing at home with your family (e.g. family email) seems silly. A couple of very small sections, captured below, were directly useful, but the rest was a waste of time.
What I took away from the book :
Book Notes :
Discovering Cloud Services Development Services and Tools
So let’s settle back and take a look at who is offering what in terms of cloud service development. It’s an interesting mix of companies and services.
Amazon
The service in question is called the Elastic Compute Cloud, also known as EC2. This is a commercial web service that allows developers and companies to rent capacity on Amazon’s proprietary cloud of servers—which happens to be one of the biggest server farms in the world. EC2 enables scalable deployment of applications by letting customers request a set number of virtual machines, onto which they can load any application of their choice. Thus, customers can create, launch, and terminate server instances on demand, creating a truly “elastic” operation.
Amazon’s service lets customers choose from three sizes of virtual servers:
* Small, which offers the equivalent of a system with 1.7GB of memory, 160GB of storage, and one virtual 32-bit core processor
* Large, which offers the equivalent of a system with 7.5GB of memory, 850GB of storage, and two 64-bit virtual core processors
* Extra large, which offers the equivalent of a system with 15GB of memory, 1.7TB of storage, and four virtual 64-bit core processors
In other words, you pick the size and power you want for your virtual server, and Amazon does the rest.
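As a rough illustration of "pick the size you want", the three sizes above can be written down as data and searched for the smallest one that fits a workload. This is only a sketch: the names `InstanceSize`, `SIZES`, and `smallest_fit` are made up for these notes and are not part of any Amazon API, and the figures are simply copied from the list above (1.7TB written as 1700GB).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceSize:
    name: str
    memory_gb: float
    storage_gb: float
    virtual_cores: int
    bits: int


# Figures copied from the three sizes listed in the notes above.
SIZES = [
    InstanceSize("small", 1.7, 160, 1, 32),
    InstanceSize("large", 7.5, 850, 2, 64),
    InstanceSize("extra large", 15, 1700, 4, 64),
]


def smallest_fit(memory_gb: float, storage_gb: float, cores: int) -> Optional[InstanceSize]:
    """Return the smallest listed size meeting every requirement, or None."""
    for size in SIZES:  # SIZES is ordered smallest to largest
        if (size.memory_gb >= memory_gb
                and size.storage_gb >= storage_gb
                and size.virtual_cores >= cores):
            return size
    return None
```

For example, `smallest_fit(4, 200, 2)` selects the large size, while a request for 32GB of memory returns `None` because none of the three sizes covers it.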
EC2 is just part of Amazon’s Web Services (AWS) set of offerings, which provides developers with direct access to Amazon’s software and machines. By tapping into the computing power that Amazon has already constructed, developers can build reliable, powerful, and low-cost web-based applications. Amazon provides the cloud (and access to it), and developers provide the rest. They pay only for the computing power that they use.
AWS is perhaps the most popular cloud computing service to date. Amazon claims a market of more than 330,000 customers—a combination of developers, start-ups, and established companies.
Note
For more information about Amazon Web Services, go to aws.amazon.com.
Google App Engine
Google is a leader in web-based applications, so it’s not surprising that the company also offers cloud development services. These services come in the form of the Google App Engine, which enables developers to build their own web applications utilizing the same infrastructure that powers Google’s powerful applications.
The Google App Engine provides a fully integrated application environment. Using Google’s development tools and computing cloud, App Engine applications are easy to build, easy to maintain, and easy to scale. All you have to do is develop your application (using Google’s APIs and the Python programming language) and upload it to the App Engine cloud; from there, it’s ready to serve your users.
As you might suspect, Google offers a robust cloud development environment. It includes the following features:
* Dynamic web serving
* Full support for all common web technologies
* Persistent storage with queries, sorting, and transactions
* Automatic scaling and load balancing
* APIs for authenticating users and sending email using Google Accounts
In addition, Google provides a fully featured local development environment that simulates the Google App Engine on any desktop computer.
And here’s one of the best things about Google’s offering: Unlike most other cloud hosting solutions, Google App Engine is completely free to use—at a basic level, anyway. A free App Engine account gets up to 500MB of storage and enough CPU strength and bandwidth for about 5 million page views a month. If you need more storage, power, or capacity, Google intends to offer additional resources (for a charge) in the near future.
Note
For more information about the Google App Engine, go to code.google.com/appengine/.
IBM
It’s not surprising, given the company’s strength in enterprise-level computer hardware, that IBM is offering a cloud computing solution. The company is targeting small- and medium-sized businesses with a suite of cloud-based on-demand services via its Blue Cloud initiative.
Blue Cloud is a series of cloud computing offerings that enables enterprises to distribute their computing needs across a globally accessible resource grid. One such offering is the Express Advantage suite, which includes data backup and recovery, email continuity and archiving, and data security functionality—some of the more data-intensive processes handled by a typical IT department.
To manage its cloud hardware, IBM provides open source workload-scheduling software called Hadoop, which is based on the MapReduce software used by Google in its offerings. Also included are PowerVM and Xen virtualization tools, along with IBM’s Tivoli data center management software.
Note
For more information about IBM’s Blue Cloud initiative, go to www.ibm.com.
Salesforce.com
Salesforce.com is probably best known for its sales management SaaS, but it’s also a leader in cloud computing development. The company’s cloud computing architecture is dubbed Force.com. The platform as a service is entirely on-demand, running across the Internet. Salesforce provides its own Force.com API and developer’s toolkit. Pricing is on a per log-in basis.
Supplementing Force.com is AppExchange, a directory of web-based applications. Developers can use AppExchange applications uploaded by others, share their own applications in the directory, or publish private applications accessible only by authorized companies or clients. Many applications in the AppExchange library are free, and others can be purchased or licensed from the original developers.
Not unexpectedly, most existing AppExchange applications are sales related—sales analysis tools, email marketing systems, financial analysis apps, and so forth. But companies can use the Force.com platform to develop any type of application. In fact, many small businesses have already jumped on the Force.com bandwagon.
For example, an April 2008 article in PC World magazine quoted Jonathan Snyder, CTO of Dreambuilder Investments, a 10-person mortgage investment company in New York. “We’re a small company,” Snyder said, “we don’t have the resources to focus on buying servers and developing from scratch. For us, Force.com was really a jump-start.”
Note
For more information about Force.com and AppExchange, go to www.salesforce.com.
Other Cloud Services Development Tools
Amazon, Google, IBM, and Salesforce.com aren’t the only companies offering tools for cloud services developers. There are also a number of smaller companies working in this space that developers should evaluate, and that end users may eventually become familiar with. These companies include the following:
* 3tera (www.3tera.com), which offers the AppLogic grid operating system and Cloudware architecture for on-demand computing.
* 10gen (www.10gen.com), which provides a platform for developers to build scalable web-based applications.
* Cohesive Flexible Technologies (www.cohesiveft.com), which offers the Elastic Server On-Demand virtual server platform.
* Joyent (www.joyent.com), which delivers the Accelerator scalable on-demand infrastructure for web application developers, as well as the Connector suite of easy-to-use web applications for small businesses.
* Mosso (www.mosso.com), which provides an enterprise-level cloud hosting service with automatic scaling.
* Nirvanix (www.nirvanix.com), which offers a cloud storage platform for developers, as well as Nirvanix Web Services, which provides file management and other common operations via a standards-based API.
* Skytap (www.skytap.com), which provides the Virtual Lab on-demand web-based automation solution that enables developers to build and configure lab environments using pre-configured virtual machines.
* StrikeIron (www.strikeiron.com), which offers the IronCloud cloud-based platform for the delivery of web services, along with various Live Data services that developers can integrate into their own applications.
In addition, Sun Microsystems has an R&D project, dubbed Project Caroline (www.projectcaroline.net), that provides an open source hosting platform for the development and delivery of web-based applications. Access to Project Caroline’s grid is free to the general public.
The Maturity Level of Cloud Services
To understand where the web-based applications we call cloud services stand in the evolution of hosted computer software, we turn to our good friends at Microsoft, who defined four primary maturity levels.
The first level of maturity defines the traditional application service provider (ASP) model of software delivery, and dates back to the 1990s. At this level, each user has his own customized version of the hosted application and runs his own instance of the application on the host server.
The second level of maturity occurs when the vendor hosts a separate instance of the application for each customer. At this level, all instances use the same implementation; the code is not customized for each user, as it is in a level-one application. Instead, user personalization is provided by detailed configuration options within the application itself.
The third level of maturity signals a major change in how the application is hosted. At this level, the vendor runs a single instance of the application that serves every user. A unique user experience is provided via configurable metadata, and authorization and security policies ensure that each user’s data is kept separate from that of other users.
At the fourth and final level of maturity, the vendor hosts multiple users on a load-balanced farm of identical instances. Because the number of servers (and instances) can be increased or decreased as necessary to match demand, this type of system is scalable to a large number of users. In addition, patches and upgrades can be rolled out to the entire user base as easily as to a single user. It is to this level that cloud services aspire.
Book Notes -> Beautiful Teams
Posted by Will Eatherton in Uncategorized on June 7th, 2009

Beautiful Teams
by Andrew Stellman; Jennifer Greene
Publisher: O’Reilly Media, Inc.
Pub Date: March 27, 2009
Print ISBN-13: 978-0-596-51802-8
Pages: 512
Review :
This is a book about software teams, written as a set of essays by different authors. I recognize the first chapter’s author (O’Reilly), and another author, Scott Berkun, who wrote an easy-to-read book about software program management.
Overall a decent book (it didn’t feel great to me). Very specific examples for software programmers, mostly in small groups within large companies (which of course covers 100k’s of folks, if not millions, on the planet at this point).
What I took away from the book :
I wouldn’t mind reading through some of the stories in more detail when I have time. More of a relaxation type of thing than a note-taking type of activity.
– would make a good audio book to listen to at some point
Book Notes :




Introduction - Leadership - Tim O’Reilly
– quote “The skill of mgmt is achieving your objectives through efforts of others”
– talks about how some devices have ‘truths’ in them that are the right way it should be. Touch screen on iPod, EVDO in Kindle. Can’t imagine it any other way
– talks about open source quite a bit
– having systems architected right, don’t want team overly dependent on a single vision or leader
– calls out importance of aesthetic in business and in software
+ goal of science/religion is to build vision of an aesthetic people can believe in
+ gives examples of relationships with people that have common views of truths and aesthetics and how it helps build you up since you share accomplishments
Chapter 2 - Why Ugly Teams Win - Scott Berkun
– “Team comprising only of 4.0 GPA prodigies will never get ugly. Will never take big risks, never make big mistakes. Deep personal trust cannot grow without mistakes, risks. For a team to make something beautiful there must be ugliness along the way. A team of beautiful perfect people, when faced with pressure, their selfish drives will tear the team apart.”
– Only use of beauty applied to teams that makes sense is the Japanese concept of Wabi-sabi // special beauty found in things that have been used
Chapter 5 - What Makes Developers Tick - Andy Lester (interview format)
– on Perl developer team
– “geeks don’t gauge their audience well generally”
Part III - Practices
– Two big risks that teams face when trying to adopt a new practice
1) practice is stupid, but person pushing it doesn’t know it, e.g. useless status meetings
2) practice is good, but team doesn’t get it, e.g. code reviews
– to put a new practice in place, need to convince everyone it is worth doing, and those that have to ‘pay’ (in $$, in time, or otherwise) need to see the value.
Talks about extreme programming and agile programming
Chapter 21 - Teams and Tools - Karl Fogel
– impact of good tools on team’s ability to collaborate
– example tool for quickly detailing the contributions of an individual on open source projects - automated dashboard
– example showing that adding a few seconds overhead to a common task is enough to make the task uncommon
+ in code review email, have everything needed to minimize effort
Chapter on Google
Chapter on Boeing Software design
Book Notes -> Interconnecting Data Centers Using VPLS
Posted by Will Eatherton in Uncategorized on June 7th, 2009
Interconnecting Data Centers Using VPLS (Ensure Business Continuance on Virtualized Networks by Implementing Layer 2 Connectivity Across Layer 3)

By: Nash Darukhanawalla - CCIE No. 10332; Patrice Bellagamba
Last Updated on Safari: 2009/06/10
Publisher: Cisco Press
Pub Date: June 24, 2009 (estimated)
Pages: 384
Review :
Overall this was a good book for running through a detailed view of the issues with data center interconnect (DCI). DCI does indeed have a different set of issues and challenges than simple enterprise-to-enterprise or hub-and-spoke enterprise arrangements.
I skipped a lot of the intermediate chapters, as they seemed to deviate from the core topics I wanted to pull from this book, which was to look at DCI technology considerations at the top level. I liked the last chapter on the future evolution of DCI.
What I took away from the book :
Chapter 1 : Data Center Layer 2 Interconnect
Q. What is a quorum disk? “mechanisms such as the quorum disk avoid a split-brain condition”
Basic chapter summary is spanning tree protocols don’t scale well for large distributed interconnected data centers
Chapter 2. Appraising Virtual Private LAN Service
This chapter explains that L2 VPNs don’t scale well for large interconnected data centers, and that L3 VPNs can be unacceptable due to control issues between enterprises and SPs. So one answer is Virtual Private LAN Service (VPLS).
Chapter 3. High Availability for Extended Layer 2 Networks
Chapter 6
Chapter 13. Evolution of Data-Center Interconnect
– TRILL and Cisco Data Center Ethernet (DCE) aim to use L3 routing protocols with L2 data-layer bridging. This involves having a hop count at Layer 2.
– as part of TRILL/DCE there is a routing bridge (RBridge)
– MAC-in-MAC is used
– there are edge RBridges and core RBridges (core RBridges never see edge MACs)
– IS-IS is used to create paths between core nodes
In future expect to have routing protocols populating mac tables instead of automatic learning.
Book Notes :
Nash Darukhanawalla, CCIE No. 10332, has more than 25 years of internetworking experience. He has held a wide variety of consulting, technical, product development, customer support, and management positions. Nash’s technical expertise includes extensive experience in designing and supporting complex networks with a strong background in configuring, troubleshooting, and analyzing network systems.
Nash has been with Cisco for more than 10 years and is currently an engineering manager in the Enhanced Customer Aligned Testing Services (ECATS) group in the Advanced Services organization.
Patrice Bellagamba has been in the networking industry for more than 25 years and has spent more than ten years in engineering development. He is a consulting engineer and is a recognized expert in IP and MPLS technologies. He is one of the influencers in the development of MPLS and has led MPLS techtorials at Networkers in Europe since its inception.
He is the inventor of the EEM semaphore concept and is the designer of the VPLS-based solutions this book describes.
Table of Contents
* Chapter 1, “Data Center Layer 2 Interconnect”: This chapter provides an overview of high availability clusters. In addition, this chapter also explains DCI legacy deployment models and problems associated with extending Layer 2 networks.
* Chapter 2, “Appraising Virtual Private LAN Service”: This chapter discusses Layer 2 and Layer 3 VPN technologies and provides introduction to VPLS, Pseudowires, Embedded Event Manager, and MPLS.
* Chapter 3, “High Availability for Extended Layer 2 Networks”: This chapter focuses on design components like MTU, core routing protocol, and convergence optimization techniques to achieve high availability.
* Chapter 4, “MPLS Traffic-Engineering”: This chapter covers the implementation of MPLS-TE for load repartition of L2VPN traffic over parallel links. In addition, this chapter also introduces FRR for faster convergence.
* Chapter 5, “Data Center Interconnect: Architecture Alternatives”: This chapter highlights several options for implementing data center interconnect. In addition, this chapter provides the positioning of each solution to assist organizations in selecting an appropriate solution based on their requirements like scalability, ease of implementation, data center STP type, and so on.
* Chapter 6, “Case Studies for Data Center Interconnect”: This chapter provides real case studies of data center interconnect solutions described in this book.
* Chapter 7, “Data Center Multilayer Infrastructure Design”: This chapter highlights Cisco’s data center multitier model and provides network topology, hardware, software, and the traffic profiles used for validating designs described in this book.
* Chapter 8, “MST-Based Deployment Models”: This chapter covers “MST in Pseudowire” and “Isolated MST in N-PE” solutions and provides configuration details for implementing both these solutions.
* Chapter 9, “EEM-Based Deployment Models”: This chapter explains the “EEM semaphore protocol” developed to achieve N-PE redundancy in the absence of ICCP. In addition, this chapter describes various EEM-based VPLS and H-VPLS solutions, provides in-depth theory of operation of each solution, and provides configuration details.
* Chapter 10, “GRE-Based Deployment Models”: This chapter provides VPLS and H-VPLS solutions over IP network using VPLSoGRE.
* Chapter 11, “Additional Design Considerations”: This chapter introduces various other technologies or issues that should be considered while designing data center interconnect solutions.
* Chapter 12, “VPLS PE Redundancy using Inter-Chassis Communication Protocol”: This chapter introduces ICCP protocols, provides different redundancy mechanisms and sample configuration.
* Chapter 13, “Evolution of Data Center Interconnect”: This chapter provides a brief overview of the emerging technologies and the future of DCI.
* Glossary: This element provides definitions for some commonly used terms associated with DCI and the various deployment models discussed in the book.
Overview of High Availability Clusters
High availability (HA) clusters operate by using redundant computers or nodes that provide services when system components fail. Normally, if a server with a particular application crashes, the application is unavailable until the problem is resolved. HA clustering remedies this situation by detecting hardware/software faults, and immediately providing access to the application on another system without requiring administrative intervention.
HA clusters usually are built with two separate networks:
* The Public Network, used to access the active node of the cluster from outside the data center
* The Private Network, used to interconnect the nodes of the cluster for private communications within the data center and to monitor the health and status of each node in the cluster

* The private network is a non-routed network that shares the same Layer 2 VLAN between the nodes of the cluster even when extended between multiple sites.
* Some HA cluster vendors recommend disabling the Spanning Tree Protocol for the private interconnect infrastructure, but such a drastic measure is neither necessary nor recommended when using Cisco Catalyst switches.
Problems Associated with Extended Layer 2 Networks
A common practice is to add redundancy when interconnecting two data centers, to avoid split-subnet scenarios and interruption of the communication between servers.
The split-subnet is not necessarily a problem if the routing metric makes one site preferred over the other. Also, if the servers at each site are part of a cluster and the communication is lost, mechanisms such as the quorum disk avoid a split-brain condition.

There are problems associated with an extended Layer 2 network.
Spanning Tree Protocol (STP) operates at Layer 2 of the Open System Interconnection (OSI) model and the primary function of the Spanning-Tree Algorithm (STA) is to prevent loops that redundant links create in bridge networks. By exchanging bridge protocol data units (BPDUs) between bridges, the STP elects the ports that eventually forward or block traffic.
The conservative default values for the STP timers impose a maximum network diameter of seven. Therefore two bridges cannot be more than seven hops away from each other.
When a BPDU propagates from the root bridge toward the leaves of the tree, the age field increments each time the BPDU goes through a bridge. Eventually, the bridge discards the BPDU when the age field goes beyond maximum age. Therefore, convergence of spanning tree is affected if the root bridge is too far away from some bridges in the network.
An aggressive value for the max-age parameter and the forward delay can lead to an unstable STP topology.
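The aging rule described above can be sketched in a few lines of Python. This is a toy model only: the function name `bpdu_survives` is made up for these notes, and the default maximum age of 20 and the increment of one unit per bridge are assumed illustrative values, not figures taken from the book. It models just the discard rule, not how the diameter of seven is derived from the timers.

```python
def bpdu_survives(hops_from_root: int, max_age: int = 20, age_step: int = 1) -> bool:
    """Return True if a BPDU is still valid after crossing the given number of bridges.

    max_age and age_step are assumed example values for illustration.
    """
    message_age = 0
    for _ in range(hops_from_root):
        message_age += age_step      # each bridge increments the age field
        if message_age > max_age:    # beyond maximum age: the bridge discards the BPDU
            return False
    return True
```

With the assumed defaults a bridge seven hops from the root still receives the BPDU, but lowering `max_age` aggressively (say to 6) makes that same bridge drop it, which is one way an over-tuned timer can leave distant bridges without valid root information.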
2. Appraising Virtual Private LAN Service
Key objectives for using Layer 3 routing instead of Layer 2 switching to interconnect data centers include the following:
* Resolve core links quality problems using L3 routing technology
* Remove data center interdependence using spanning-tree isolation techniques
* Storm propagation control
* Allow safe sharing of infrastructure links for both L2 core and L3 core traffic
* Adapt to any data center design such as RSTP/RPVST+ / MST / VSS / Nexus / Blade switches
* Allow VLAN overlapping and therefore virtualization in multi-tenants or service-provider designs
The most common L3 VPN technology is MPLS L3 VPN, which delivers multipoint services over a Layer 3 network architecture.
However, MPLS Layer 3 VPNs do impose certain provisioning requirements on both service providers and enterprises that sometimes are unacceptable to one or the other party. Some enterprises are reluctant to relinquish control of their network to their service provider. Similarly, some service providers are uncomfortable with provisioning and managing services based on Layer 3 network parameters, as is required with MPLS L3 VPNs.
Layer 2 Virtual Private Networks
For multipoint L2 VPN services, service providers have frequently deployed Ethernet switching technology using techniques such as 802.1Q tunneling (also referred to as tag stacking or Q-in-Q) as the foundation for their Metro Ethernet network architecture.
Switched Ethernet network architectures have proven to be successful in delivering high-performance, low-cost L2 VPN multipoint services by service providers in many countries. However, as the size of these switched Ethernet networks has grown, the scalability limitations of this architecture have become increasingly apparent:
* Limited VLAN address space per switched Ethernet domain
* Scalability of spanning tree protocols (IEEE 802.1d) for network redundancy and traffic engineering
* Ethernet MAC address learning rate, which is important to minimize broadcast traffic resulting from unknown MAC addresses.
These limitations, which are inherent in Ethernet switching protocols, preclude the use of Ethernet switching architectures to build L2 VPN services that scale beyond a metropolitan area network domain.
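To put a number on the first limitation, here is a small arithmetic sketch. It relies on two facts about 802.1Q that are not spelled out in the notes: the VLAN ID field is 12 bits wide, and IDs 0 and 4095 are reserved. The constant and function names are made up for illustration.

```python
VLAN_ID_BITS = 12   # width of the 802.1Q VLAN ID field
RESERVED_IDS = 2    # IDs 0 and 4095 are reserved

# Usable VLAN IDs in one switched Ethernet domain.
usable_vlans = 2 ** VLAN_ID_BITS - RESERVED_IDS


def qinq_combinations(outer: int = usable_vlans, inner: int = usable_vlans) -> int:
    """Outer (provider) tag x inner (customer) tag combinations with Q-in-Q."""
    return outer * inner


print(usable_vlans)         # 4094
print(qinq_combinations())  # 16760836
```

Even with tag stacking, the outer tag still caps a single domain at 4094 customer services, which is the kind of ceiling the first bullet refers to.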
To address the limitations of both MPLS L3 VPNs and Ethernet switching, and because classical Layer 2 switching was not built for extended, large-scale networks, innovations in network technology have led to the development of Virtual Private LAN Service (VPLS).
VPLS Overview
Key Summary :: Virtual Private LAN Service (VPLS) is an architecture that provides multipoint Ethernet LAN services, often referred to as Transparent LAN Services (TLS) across geographically dispersed locations, using MPLS as transport.
VPLS is being adopted by Enterprises on a self-managed MPLS-based metropolitan area network (MAN) to provide high-speed any-to-any forwarding at Layer 2 without the need to rely on spanning tree to keep the physical topology loop free. The MPLS core uses a full mesh of pseudowires and split-horizon to avoid loops.
IETF VPLS drafts describe the concept of linking virtual Ethernet bridges by using MPLS PseudoWires (PWs). At a basic level, VPLS is a group of Virtual Switch Instances (VSIs) that are interconnected by using EoMPLS circuits in a full mesh topology to form a single, logical bridge. In concept, a VSI is similar to the bridging function found in IEEE 802.1q bridges.
* a frame is switched based upon the destination MAC address and membership in a Layer 2 VPN (a virtual LAN or VLAN).
* VPLS forwards Ethernet frames at Layer 2, dynamically learns source MAC address to port associations, and forwards frames based upon the destination MAC address
* The virtual forwarding instance (VFI) identifies a group of pseudowires that are associated with a VSI.
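The learn-and-forward behavior in the bullets above can be sketched as a toy bridge in Python. This is a simplified model under stated assumptions, not Cisco's implementation: the class name `VirtualSwitchInstance`, the port names, and the MAC strings are all hypothetical, and the sketch deliberately ignores VLAN membership, MAC aging, and split horizon.

```python
from typing import Dict, List


class VirtualSwitchInstance:
    """Toy Layer 2 bridge: learns source MACs, forwards on destination MAC."""

    def __init__(self, ports: List[str]) -> None:
        self.ports = list(ports)              # attachment circuits and pseudowires
        self.mac_table: Dict[str, str] = {}   # MAC address -> port it was learned on

    def receive(self, in_port: str, src_mac: str, dst_mac: str) -> List[str]:
        """Return the list of ports the frame is sent out of."""
        self.mac_table[src_mac] = in_port     # learn source MAC to port association
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the arrival port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []                         # destination sits behind the arrival port
        return [out_port]
```

A frame to an unknown destination is flooded; once the reply has been seen, later frames go out of a single port:

```python
vsi = VirtualSwitchInstance(["ac1", "pw-siteB", "pw-siteC"])
vsi.receive("ac1", "aa:aa", "bb:bb")       # flooded to pw-siteB and pw-siteC
vsi.receive("pw-siteB", "bb:bb", "aa:aa")  # sent to ac1 only
vsi.receive("ac1", "aa:aa", "bb:bb")       # now sent to pw-siteB only
```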
Glossary : N-PE Network-facing Provider Edge router, acts as a gateway between the MPLS core and edge domain
VPLS service consists of three primary components:
* Attachment Circuits: Connection to customer edge (CE) or aggregation switches, usually Ethernet but can be ATM or Frame Relay
* Virtual Circuits (VC or pseudowire): Connections between N-PEs across MPLS network based on draft-martini-l2circuit-trans-mpls-11
* Virtual Switch Instances (VSI): A virtual Layer 2 bridge instance that connects attachment circuits to VCs.
The VPLS specification outlines five components specific to its operation:
* VPLS Architecture ( VPLS and H-VPLS): The Cisco 7600 series routers support flat VPLS architectures and Ethernet edge H-VPLS topologies.
* Auto-discovery of PEs associated with a particular VPLS instance: The Cisco 7600 series routers support manual neighbor configuration and BGP-Autodiscovery.
* Signaling of Pseudowires (PWs) to interconnect VPLS VSIs: The Cisco 7600 series routers use LDP to signal draft-martini pseudowires (VC).
* Forwarding of Frames: The Cisco 7600 series routers forward frames based on destination MAC address.
* Flushing of MAC addresses due to topology change: From IOS release 12.2(33)SRC1, Cisco 7600 series routers support MAC address flushing and allow configuration of the MAC address aging timer.
What is a Pseudowire?
A pseudowire is a point-to-point connection between pairs of PE routers.
– emulates services such as Ethernet, ATM, Frame Relay or TDM over an underlying core MPLS network through encapsulation into a common MPLS format.
– Pseudowire is a mechanism that carries the elements of an emulated service from one PE router to one or more PEs over a packet switched network (PSN).
When used as a point-to-point (P2P) Ethernet circuit connection, the pseudowire is known as EoMPLS. This P2P connection can be used in the following modes:
* xconnect ports (EoMPLS port-mode) with whatever frame format on the port (that is, with or without dot1Q header). In this mode, the xconnected port does not participate in any local switching or routing.
* xconnect port VLAN (also known as sub-interfaces) selectively extracts a VLAN based on its dot1Q tag and cross-connects its packets point to point. Local VLAN significance usually is required in this mode and is supported today only on ES modules facing the edge (also known as facing the aggregation).
* xconnect SVI (interface VLAN) in a point-to-point fashion. This approach requires SIP or ES module facing core (also called “facing the MPLS network”).
Pseudowires are the connection mechanisms between switching interfaces (VFI) when used in a multi-point environment (VPLS). This approach allows xconnecting SVI in a multipoint mode by using one pseudowire per destination VFI with MAC-address learning over these pseudowires.
Pseudowires are dynamically created using targeted-LDP based on either a “neighbor” statement configured in the VFI or automatically discovered based on MP-BGP announcements. This feature is beyond the scope of this book because it requires BGP to be enabled in the network.
In Cisco IOS Software Releases 12.2(33)SRB1 and 12.2(33)SXI, the targeted-LDP IP address for the pseudowire can be independent from the LDP router-ID, which is required for scalability and load-balancing.
VPLS to Scale STP Domain for Layer 2 Interconnection
EoMPLS requires that STP be enabled from site to site to provide a redundant path, so it is not a viable option.
Because VPLS is natively built with an internal mechanism known as split horizon, the core network does not require STP to prevent Layer 2 loops.
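The split-horizon rule is simple enough to sketch: a frame that arrives on a pseudowire from the core is never forwarded onto another pseudowire, only toward local attachment circuits. With a full mesh of pseudowires every remote site is already reached directly, so this rule keeps the core loop free. The Python below is a minimal illustration with hypothetical function and port names; it is not taken from the book or from IOS.

```python
from typing import List, Set


def flood_targets(in_port: str, attachment_circuits: Set[str], pseudowires: Set[str]) -> List[str]:
    """Ports a flooded frame is copied to, honoring split horizon."""
    if in_port in pseudowires:
        # Split horizon: frames received from the core never go back into the core.
        targets = set(attachment_circuits)
    else:
        # Frames from a local attachment circuit go everywhere except where they came from.
        targets = (attachment_circuits | pseudowires) - {in_port}
    return sorted(targets)


def full_mesh_pseudowires(n_pe_count: int) -> int:
    """Number of pseudowires needed to fully mesh the given number of N-PEs."""
    return n_pe_count * (n_pe_count - 1) // 2
```

The cost of relying on a full mesh is that the pseudowire count grows quadratically: four N-PEs need six pseudowires, ten need forty-five.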
Even though VPLS bridge is inherently protected against Layer 2 loops, loop prevention protocol must still be used against local Layer 2 loops in the access layer of the data centers where cluster nodes are connected. Therefore, for each solution described in this book, VPLS is deployed in conjunction with Embedded Event Manager (EEM) to ensure loop prevention in the core due to full mesh of pseudowires and redundant N-PEs in each location. Edge node or edge link failure is protected using VPLS and EEM to customize the solution behavior based on network events as they occur.
Embedded Event Manager (EEM)
The Cisco IOS Embedded Event Manager (EEM) is a unique subsystem within Cisco IOS Software. EEM is a powerful and flexible tool to automate tasks and customize the behavior of Cisco IOS and the operation of the device. EEM consists of Event Detectors, the Event Manager, and an Event Manager Policy Engine.
You can use EEM to create and run programs or scripts directly on a router or switch. The scripts are called EEM policies and can be programmed with a simple CLI or by using a scripting language called Tool Command Language (Tcl). Policies can be defined to take specific actions when the Cisco IOS software recognizes certain events through the Event Detectors. The result is an extremely powerful set of tools to automate many network management tasks and direct the operation of Cisco IOS to increase availability, collect information, and notify external systems or personnel about critical events.
EEM helps businesses harness the network intelligence intrinsic to Cisco IOS software and gives them the ability to customize behavior based on network events as they happen, respond to real-time events, automate tasks, create customer commands and take local automated action based on conditions detected by the Cisco IOS software.
EEM is a low-priority process within IOS. This matters when EEM runs on systems exposed to environments in which higher-priority processes may monopolize the router's CPU resources. Care should be taken to protect the router's CPU from being hogged by higher-priority tasks, such as those triggered by broadcast storms. It is recommended to allocate more time to low-priority processes by using the IOS command process-max-time 50. In addition, Control Plane Policing (CoPP), storm control, and event dampening should be deployed to prevent CPU hogging.
3. High Availability for Extended Layer 2 Networks
Pure IP Core
The deployment of MPLS in the core network might not always be the best option, particularly in the following cases:
* Intersite links leased from service providers that provide only IP connectivity
* The existing IP core network is too complex for integration of MPLS
* Data center traffic exiting the Enterprise network must be encrypted
In these situations, a better solution is a direct encapsulation of VPLS or EoMPLS traffic into Generic Routing Encapsulation (GRE), or MPLS encapsulated at the data center egress point via MPLSoGRE. Also, any crypto engine can be deployed if the IP traffic—L3VPN or L2VPN over GRE—needs to be encrypted.
In the pure IP core model, all Layer 2 traffic is tunneled using MPLSoGRE. Figure 3-5 illustrates the pure IP core model.

Some of the possible solutions mentioned earlier have the following characteristics:
* EoMPLSoGRE approach:
o When using a Cisco 7600 series router as the N-PE, EoMPLSoGRE is supported only in port mode or sub-interface mode, which allows the point-to-point cross-connect of interconnect edge devices.
o When using a Cisco Catalyst 6500 series switch as the N-PE, EoMPLSoGRE can interconnect switched virtual interfaces (SVIs) in addition to supporting port and sub-interface mode. When interconnecting SVIs, the N-PE can participate in local Spanning Tree Protocol (STP) or execute QinQ and tunnel all incoming VLANs to a QinQ VLAN.
* VPLSoGRE approach:
o As of this writing, VPLSoGRE is supported only on Cisco Catalyst 6500 switches with the SIP-400 module. One GRE tunnel is configured on the N-PE for each N-PE in remote data centers for encapsulation of VPLS traffic. MPLS must be enabled on the tunnel end-points.
* MPLSoGRE approach:
o When all the Layer 2 and Layer 3 traffic must be encapsulated into IP, a solution is to use a dedicated router to perform encapsulation and another router as the N-PE for standard L2VPN implementation.
o These successive encapsulations increase the packet size. The GRE header is 24 bytes and the L2VPN header can be up to 30 bytes, which leads to a 1554-byte MTU for a standard 1500-byte payload. This total does not include the additional core link header.
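The MTU figure quoted above is simple addition, shown here as a small Python check. The 1500-byte payload is an assumption (the standard Ethernet payload size); the two header sizes are the ones given in the text, and the function name is hypothetical.

```python
PAYLOAD_MTU = 1500    # assumed standard Ethernet payload size, in bytes
GRE_OVERHEAD = 24     # GRE header size quoted in the text
L2VPN_OVERHEAD = 30   # maximum L2VPN header size quoted in the text


def required_mtu(payload: int, overheads: list) -> int:
    """Return the MTU needed to carry the payload plus every encapsulation header."""
    return payload + sum(overheads)


total = required_mtu(PAYLOAD_MTU, [GRE_OVERHEAD, L2VPN_OVERHEAD])
print(total)  # 1554
```

Any additional core link header would have to be added to the overhead list in the same way.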
These approaches, in which MPLS traffic is encapsulated in GRE, are less efficient than plain MPLS imposition. In addition, if the IP traffic, L3VPN or L2VPN over GRE, must be encrypted, deployment of an IP crypto engine should be evaluated.
6. Case Studies for Data Center L2 Interconnect
Data Center Interconnect (DCI) is attracting attention because it is one of the pillars of generalizing the concept of virtualization across data centers. Both new server middleware usages, which offer clustering for high availability or flexibility for servers’ virtual organization (VMotion), and more traditional requirements for physical server migration increasingly require the ability to extend VLANs (Layer 2 bridging, L2) across data center sites.
GOV is a large government organization that provides networking solutions and data center infrastructure to several other government entities.
Challenges
To increase the availability of key applications, GOV’s IT department decided several years ago to implement a server clusters strategy. This strategy provided good application redundancy and scalability and significantly improved GOV’s ability to recover from server and operating system failures.
However, to benefit from new networking features, the implementations required cluster members to reside in the same network subnet. In addition, clusters relied on heartbeats that must run in a dedicated VLAN. To take advantage of current cluster technologies, GOV had to extend most VLANs within each data center.
Furthermore, GOV needed to improve its high-availability capabilities. In addition to handling server and operating system failures, clustering had to provide solutions for situations such as partial data-center power failures or site inaccessibility. Addressing these requirements meant extending clusters across multiple sites.
Like many other data centers, GOV’s data centers also began to encounter physical constraints. Insufficient power, limited space, and inadequate cooling posed unsolvable issues for the physical organization and operation of servers, to the point that GOV could not even install a new cluster member when application performance required it.
Solution
To address these issues, GOV determined that it required a solution that included a multi-site VLAN extension.
The initial solution was a spanning tree protocol (STP) design that controlled four data centers in a global switching domain. GOV carefully followed best practices for L2 design, but the optical topology of the sites’ interconnection was unable to match standard STP recommendations; dual hub and spoke topology and Dense Wave Division Multiplexing (DWDM) protection are considered a must for STP. In addition, the size of the STP domain started to increase above any common implementation.
GOV operated its networks using this design for one year. During this time, several small failures occurred, which led to unpredicted results. In one instance, for example, a link failure did not report the loss of signal, which caused STP to converge slowly. Every heartbeat over every data center timed out; consequently, all clusters experienced a split-brain situation. Resynchronization took more than one hour; during this time, all critical applications stopped operating. Other small failures had similar catastrophic effects. As a result, GOV contacted Cisco for recommendations about how to strengthen its network.
Working in partnership with the GOV networking team, the server cluster vendor, and an external consulting team, Cisco recommended a VPLS solution as described in this book.
The solution team also decided to provide Multiprotocol Label Switching (MPLS) features such as VRF (Virtual Routing & Forwarding), for user-group security segmentation, and traffic engineering, to better manage link loads.
After thorough testing and a pilot phase, the solution was deployed in three GOV data centers. A fourth data center was added soon afterward.
To build the L3-VPN network and the 10Gbps MPLS core, GOV selected a Cisco Catalyst 6500 switch with a 67xx line card. This approach allows the easy deployment of VRF within the aggregation layer. L3-VPN extends to all data centers and toward user sites.
To enable the MPLS Traffic-Engineering (TE) feature, the routing protocol had to be link-state based, so the choice was reduced to either Open Shortest Path First (OSPF) or Intermediate System to Intermediate System (IS-IS) routing. In a network of this size, IS-IS and OSPF offer quite similar capabilities, but IS-IS allows a clear demarcation from the existing OSPF routing, which simplifies deployment. GOV decided to select IS-IS as its MPLS core routing protocol.
Routing fast convergence is set with a target of a few hundred milliseconds. Bidirectional Forwarding Detection (BFD) is used to detect long-distance link failures, which allows the system to react in approximately half a second to any non-forwarding link. (GOV plans to include the MPLS Fast-ReRoute (FRR) function in future implementations, with the objective of achieving even faster convergence on clear link failures.)
To implement the VLAN extension design, the most advanced N-PE was Cisco 7600. Because Ethernet Service (ES) cards were not yet available at that time, GOV selected a SIP-600 card to provide 10Gbps. (An ES card would be the right choice now.)
GOV selected H-VPLS (Hierarchical-VPLS) with Embedded Event Manager (EEM) scripts to provide STP isolation and long-distance link protection.
Four data centers required a VLAN extension to allow cluster extension. The solution included the Cisco 7600 N-PE on a stick. Figure 6-2 illustrates the concept of a “node on a stick.”
![]()
In the figure above, the blue boxes carrying the red IP traffic are Catalyst 6500 (cat6k) switches, and the N-PE is a Cisco 7600.
VPLS technology was quite recent at the time of implementation. The “on a stick” design allowed GOV to avoid the insertion of new devices with the new Cisco IOS Software and new features along the existing L3 path. In this way, VPLS failure would affect only L2 traffic, not IP traffic.
L2 traffic first passes in a bridge fashion through the aggregation Cisco Catalyst switch, then is encapsulated in VPLS by the Cisco 7600 N-PE and pushed back to the Cisco Catalyst switch via an MPLS L3 port. The traffic then flows to the MPLS core.
The Cisco 7600 N-PE uses the 67xx LAN card toward the edge. Each ingress port is then encapsulated into a dual VLAN tag using the QinQ feature before being forwarded to VPLS. This QinQ encapsulation enables scalability to any number of VLANs. However, QinQ requires the careful management of overlapping inter-VLAN MAC addresses; this issue is analyzed in depth in Chapter 12, “Additional Design Considerations.”
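The dual-tag idea can be illustrated by building a QinQ frame by hand in Python. This is a sketch under assumptions: the outer tag uses the IEEE 802.1ad type 0x88A8 (some platforms use 0x8100 for the outer tag instead), and the MAC addresses, VLAN IDs, and function names are hypothetical.

```python
import struct

TPID_8021Q = 0x8100   # customer (inner) tag type
TPID_8021AD = 0x88A8  # service (outer) tag type per IEEE 802.1ad; an assumption here


def vlan_tag(tpid: int, vlan_id: int, priority: int = 0) -> bytes:
    """Pack a 4-byte VLAN tag: 16-bit TPID, then 3-bit PCP, 1-bit DEI, 12-bit VLAN ID."""
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    tci = (priority << 13) | vlan_id
    return struct.pack("!HH", tpid, tci)


def push_outer_tag(frame: bytes, outer_vlan: int) -> bytes:
    """Insert an outer tag right after the destination and source MACs (first 12 bytes)."""
    return frame[:12] + vlan_tag(TPID_8021AD, outer_vlan) + frame[12:]


dst = bytes.fromhex("ffffffffffff")
src = bytes.fromhex("020000000001")
# Customer frame already tagged with (hypothetical) VLAN 100, carrying an IPv4 EtherType.
customer_frame = dst + src + vlan_tag(TPID_8021Q, 100) + struct.pack("!H", 0x0800) + b"payload"

# The ingress port maps to (hypothetical) outer VLAN 2000.
qinq_frame = push_outer_tag(customer_frame, 2000)
print(qinq_frame[12:20].hex())  # 88a807d081000064
```

Because the customer tag is left untouched inside the outer tag, one outer VLAN per ingress port can carry any number of customer VLANs.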
Enterprises should avoid extending network services such as firewalls or load balancers across data centers. In addition, good data center design uses different Hot Standby Router Protocol (HSRP) groups in each data center. These rules were implemented with GOV, where VLAN extension is strictly reserved for issues with multiple data center clusters and not used for other requirements.
In addition, LAN ports are protected from a data-plane storm using “Storm-control for broadcast and Multicast,” which allows deployments to avoid the propagation of flooding across sites. This issue is also analyzed in depth in the “specific design consideration” chapter.
To enable N-PE backup, GOV deployed EEM scripting. The deployment did not include the Ethernet Virtual Circuit (EVC) feature because LAN port types do not allow it.
VLAN load repartition is performed at the edge by using two 10-Gbps edge ports, with per VLAN cost balancing.
To manage core load repartition over multiple paths, MPLS traffic engineering was deployed, with each virtual forwarding instance (VFI) targeted to a different path.
Chapter 13. Evolution of Data-Center Interconnect
One of the main topics of discussion is the need for VLAN extension across multiple sites. This requirement is being driven by the application and server side of the IT community.
The key question is whether this requirement will last. Or will server-to-server applications evolve and integrate IP routing for their control-plane traffic? Predicting the future is difficult, especially in the world of virtualization, which is still in its early stages.
Another networking domain that is going toward L2 bridging is the Service Provider Wide Area Networking aggregation layer that is now even called “Carrier Ethernet.” During the past years, a lot of networking software has been developed to handle this huge market, and DCI is benefitting from it.
Networking Technology: Research Directions
The networking community is working hard on several paths to resolve VLAN extension across multiple sites. Depending on their background, researchers may focus on hardening L2 bridging, increasing IP flexibility, or a mix of both worlds via either a tunneling approach or a fusion of concepts. To discuss future developments, we will consider three main approaches:
* Improve legacy L2 bridging with spanning tree
* Create a new concept with L2 bridging Data-plane and Routed control-plane
* Offer an L2 service over L3 transport with an overlay approach
Improve Legacy L2 Bridging
A lot of development will still take place to improve the reliability of standard L2 bridging with Spanning Tree Protocol (STP). One direction is toward STP scaling with innovations such as Reverse Layer 2 Gateway Protocol (R-L2GP), which allows the building of STP domains interconnected via gateways. Another direction is automatic STP failure detection and correction; Guard and Bridging Assurance are typical features in this domain. Last but not least is the increasing usage of Multi-Chassis EtherChannel (MEC), in multiple implementations such as Virtual Switch System (VSS) and virtual Port Channel (vPC). The idea is to use a channeling protocol such as Link Aggregation Control Protocol (LACP) to perform local repairs, thus providing link and node redundancy, and to keep these local repairs transparent to STP.
In spite of these improvements in L2 bridging, which may, in fact, be adequate for the needs of most medium-sized data centers, Spanning Tree is losing favor when extending its control beyond the access layer. We could argue forever about the reason behind this sentiment, but the fact remains: a new approach is required to increase the high availability and efficiency of bridging domains.
New Concepts in L2 Bridging
The effort of the networking community seems to be split in two main directions. One group is working on intra-DC scalability and availability, while the other is focusing on Data Center Interconnection (DCI).
The networking community is investing a lot to make Ethernet scalable, lossless, and efficient. To do so, it is creating a new concept for Ethernet L2 bridging and working on this concept from multiple directions: data-plane flow control and priority SAN traffic integration, data-plane encapsulation for scalability, data-plane loop avoidance using a hop count field, and a routed control plane that allows loop-free multipath transmission of unicast, unicast flooding, multicast, and broadcast traffic.
Several initiatives are ongoing with Cisco Data Center Ethernet (DCE), Converged Enhanced Ethernet, IEEE 802.1 DCB WG (Data Center Bridging working group), the IETF TRILL (Transparent Interconnection of Lots of Links) proposal and many other proposals, such as IEEE Shortest Path Bridging 802.1aq, Congestion notification 802.1Qau, and Enhanced Transmission Selection 802.1Qaz. It is still unclear which initiative will be the final one; most probably a merged approach will succeed.
This book does not approach the issue of convergence on a single network infrastructure for various types of traffic such as Local Area Network (LAN) and Storage Area Network (SAN), as this approach is more of an intra-DC issue. However, it’s worthwhile to pause for a more in-depth analysis of the potential replacements for STP.
The DCE and TRILL initiatives mostly revolve around the concept of the Routing-Bridge (Rbridge). A good reference for understanding the control-plane approach is the TRILL working group at IETF.org.
These new approaches rely on the following ideas:
* The ingress (customer) bridge frame should be encapsulated in a backbone MAC frame so that core switches do not have to bridge on the customer MAC. This approach is the “MAC in MAC” concept; however, diverse approaches to this concept may propose quite different encapsulation formats.
* An edge Rbridge acts like a classic learning bridge and learns every MAC address in the domain, but core Rbridges never see any edge MAC.
* The backbone MAC frame should include a hop count field that is decremented at each hop; the frame is dropped when the count reaches zero. With such a field, inherited from IP, no storm can occur if the control plane creates permanent or transient loops.
* The Intermediate System to Intermediate System protocol (IS-IS) should be used instead of STP to create paths between core nodes.
* IS-IS creates loop-free unicast paths between edge nodes, the multicast replication tree, and the broadcast tree.
* Switching benefits from IS-IS Equal Cost Multi-Path (ECMP) technology, which allows the network to efficiently balance traffic over multiple paths.
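The shortest-path and ECMP ideas in the list above can be sketched with a small Dijkstra variant in Python. This is not IS-IS itself, only the path computation a link-state protocol relies on. The topology, the link costs, and the function names are hypothetical, link costs are assumed to be positive, and the CRC-based flow hash is just one possible way to spread flows.

```python
import heapq
import zlib
from collections import defaultdict


def equal_cost_paths(graph, source, target):
    """Return every minimum-cost path from source to target (positive costs assumed)."""
    dist = {source: 0}
    preds = defaultdict(list)  # node -> predecessors on some shortest path
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, weight in graph[node].items():
            new_cost = cost + weight
            best = dist.get(neighbor, float("inf"))
            if new_cost < best:
                dist[neighbor] = new_cost
                preds[neighbor] = [node]
                heapq.heappush(heap, (new_cost, neighbor))
            elif new_cost == best:
                preds[neighbor].append(node)  # another equal-cost way in

    def expand(node):
        if node == source:
            return [[source]]
        return [path + [node] for p in preds[node] for path in expand(p)]

    return sorted(expand(target)) if target in dist else []


def pick_path(paths, src_mac, dst_mac):
    """Choose one path per flow with a deterministic hash, so a flow stays on one path."""
    index = zlib.crc32(f"{src_mac}-{dst_mac}".encode()) % len(paths)
    return paths[index]


# Hypothetical topology: two edge Rbridges joined through two core nodes.
topology = {
    "edge-A": {"core-1": 10, "core-2": 10},
    "core-1": {"edge-A": 10, "edge-B": 10},
    "core-2": {"edge-A": 10, "edge-B": 10},
    "edge-B": {"core-1": 10, "core-2": 10},
}

paths = equal_cost_paths(topology, "edge-A", "edge-B")
print(paths)
print(pick_path(paths, "02:00:00:00:00:01", "02:00:00:00:00:02"))
```

Both paths have the same cost, so both are usable at once, which STP cannot offer because it blocks redundant links.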
In short, the concept of Rbridging allows the network to benefit from the IP control-plane in conjunction with the data-plane hop count storm-breaker concept without having to encapsulate L2 frames in IP. The IP control-plane replaces STP but the data plane still bridges.
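The storm-breaker effect of the hop count can be seen in a small Python simulation. It is a toy model with hypothetical names (`Frame`, `circulate`, the bridge names) and an arbitrary starting hop count: a frame is sent around a ring that represents a forwarding loop, and each bridge decrements the field before forwarding.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    payload: str
    hop_count: int


def circulate(frame: Frame, ring: list) -> list:
    """Send a frame around a looped ring; return the bridges that forwarded it."""
    forwarded_by = []
    position = 0
    while True:
        bridge = ring[position % len(ring)]
        frame.hop_count -= 1           # every bridge decrements the hop count
        if frame.hop_count <= 0:
            return forwarded_by        # count exhausted: the frame is dropped here
        forwarded_by.append(bridge)    # otherwise the bridge forwards the frame
        position += 1


ring = ["rb-1", "rb-2", "rb-3"]        # hypothetical Rbridges forming a loop
frame = Frame(payload="broadcast", hop_count=5)
forwarded_by = circulate(frame, ring)
print(forwarded_by)  # ['rb-1', 'rb-2', 'rb-3', 'rb-1']
```

Without the hop count the loop in `circulate` would never end, which is the data-plane equivalent of a broadcast storm.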
With the TRILL/DCE approach, the switch learns through the data plane which site, and therefore which Rbridge, each remote MAC address belongs to. An initial unknown frame is flooded via a safe tree, and the table is populated with the reply packet.
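A minimal Python sketch of this data-plane learning follows. The `EdgeRbridge` class, the bridge names, and the MAC addresses are hypothetical, and the model ignores aging and the encapsulation format. It shows only the flood-then-learn sequence.

```python
class EdgeRbridge:
    """Toy edge Rbridge that learns which remote Rbridge each MAC sits behind."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.mac_table = {}  # customer MAC -> remote Rbridge name

    def receive(self, src_mac: str, ingress_rbridge: str) -> None:
        """Learn from a received frame: the source MAC lives behind the ingress Rbridge."""
        self.mac_table[src_mac] = ingress_rbridge

    def lookup(self, dst_mac: str) -> str:
        """Return the egress Rbridge for a destination, or 'FLOOD' if unknown."""
        return self.mac_table.get(dst_mac, "FLOOD")


rb_a = EdgeRbridge("rb-a")
rb_b = EdgeRbridge("rb-b")
host_a, host_b = "02:00:00:00:00:0a", "02:00:00:00:00:0b"

first_lookup = rb_a.lookup(host_b)   # unknown destination, so the frame is flooded
rb_b.receive(host_a, rb_a.name)      # rb-b learns host A from the flooded frame
reply_lookup = rb_b.lookup(host_a)   # the reply can go straight to rb-a
rb_a.receive(host_b, rb_b.name)      # rb-a learns host B from the reply
second_lookup = rb_a.lookup(host_b)  # later frames are sent as unicast

print(first_lookup, reply_lookup, second_lookup)  # FLOOD rb-a rb-b
```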
Another emerging idea is to avoid even the concept of learning bridges. A MAC routing approach will probably emerge in the near future. The concept is to populate the ingress device’s MAC address table via IS-IS, using the control-plane instead of the data-plane.
L2 Service over L3 Transport: MPLS or IP? Battle or Co-Existence?
With this evolution in L2 bridging, the choice to encapsulate frames via MAC in MAC implies that core nodes must be L2 switches. In several cases, the interconnection of data center sites may need to rely on a Layer 3 (L3) protocol, which means relying on either IP or MPLS (Multiprotocol Label Switching).
MPLS already offers such a transport: Virtual Private LAN Service (VPLS), the main subject of this book. MPLS offers the maximum toolset; it not only solves the L2 extension with the VPLS option, but also offers L3 virtualization with its extended VRF concept. As elaborated in the Service Provider domain, MPLS is enriched with powerful tools such as the ability to engineer traffic and reserve resources with local repair via Fast-ReRouting. In addition, MPLS is clearly the solution chosen by Service Providers to solve L2 aggregation of their Wide Area Networks, using Ethernet over MPLS (EoMPLS) and MPLS Transport Profile (MPLS-TP). Although MPLS is very promising, it is just emerging in the multicast world.
Many ongoing developments allow the same or an equivalent L2 transport approach over an IP network. A simple IP-based solution is required, because even though MPLS is well accepted by Service Providers, very large enterprises, and public organizations, it is not accepted by most enterprises. For these organizations, two solutions will probably arise. One approach is simply to encapsulate MPLS over IP with configuration simplification. (Chapter 10 is an example of such a notion.) Another approach is to natively encapsulate L2 frames over IP in an Over-the-Top approach, where the IP tunnel transports L2-unicast traffic over IP-unicast, but uses native IP-multicast to transport L2-multicast and broadcast.
Conclusion
To sum up, the networking community is slowly moving away from STP for data centers. The emerging model is based on the IS-IS Shortest-Path concept. But to implement this new model, Ethernet frame headers will need to be modified to incorporate hop count loop breakers. Multiple concurrent or complementary encapsulation formats have been proposed to transport bridged frames over the core; this idea can scale from a simple MAC in MAC encapsulation up to an IP or an MPLS encapsulation. In the world of intra-DC scalability, no one is thinking about IP or MPLS, and so approaches like TRILL and DCE will probably succeed.
For DCI, the MPLS approach is by far the most ready and is gaining momentum in spite of its complexity, but in simple situations TRILL and DCE could be adopted. Most of the enterprise community expects IP to step up to the plate, and if IP fulfills this expectation it will probably become the main approach.