NFV vs SDN

As we run into more and more SDN and NFV terms in telecom networking these days, I thought I would discuss them here and share my understanding of these technologies.

Communication service providers (CSPs) like BT and AT&T currently face numerous challenges from OTT (Over the Top) players like Netflix, YouTube, Hulu etc. CSPs don't earn any revenue when subscribers like us use these OTT services, yet the infrastructure that carries all this growing data traffic has to keep expanding to meet capacity and customer requirements. As a result, infrastructure costs are growing faster than subscriber revenue.

Network Functions Virtualization (NFV) offers a new way to design, deploy and manage networking services. NFV decouples network functions, such as network address translation (NAT), firewall, domain name service (DNS), caching, etc., from proprietary hardware appliances so they can run in software. (Think of GNS3 if you have used it on your laptop; NFV is that idea taken much further.)

You must have heard that hardware from Cisco, Juniper or any other vendor can cost hundreds of thousands of pounds, and you can't use a Juniper line card in an Alcatel or Cisco chassis or vice versa. This is a challenge for service providers. Previously Cisco, or any hardware vendor for that matter, sold products based on the traffic capacity they could handle, like 1 Gbps or 10 Gbps. Now standard Dell or HP servers can meet many of those requirements without you having to buy proprietary hardware from vendors like Cisco. All you need to do is take a server and run software on top of it that acts as a firewall, DNS server and so on. NFV uses standard IT virtualization technologies running on high-volume server, switch and storage hardware to virtualize network functions.
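To make the "network function as software" idea concrete, here is a minimal sketch in Python. It is only an illustration, not how any real NFV platform works: the `Packet` class, the `firewall` and `nat` functions, the blocked port and the address mapping are all made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Packet:
    src: str
    dst: str
    dst_port: int


# A virtual network function (VNF) takes a packet and returns it
# (possibly modified), or None if the packet is dropped.
VNF = Callable[[Packet], Optional[Packet]]


def firewall(pkt: Packet) -> Optional[Packet]:
    """Hypothetical firewall: drop Telnet (port 23), pass everything else."""
    blocked_ports = {23}
    return None if pkt.dst_port in blocked_ports else pkt


def nat(pkt: Packet) -> Optional[Packet]:
    """Hypothetical NAT: rewrite a private source address to a public one."""
    private_to_public = {"192.168.1.10": "203.0.113.5"}
    return Packet(private_to_public.get(pkt.src, pkt.src), pkt.dst, pkt.dst_port)


def run_chain(pkt: Packet, chain: List[VNF]) -> Optional[Packet]:
    """Pass the packet through each function in order; stop if one drops it."""
    current: Optional[Packet] = pkt
    for vnf in chain:
        current = vnf(current)
        if current is None:
            return None
    return current


web = run_chain(Packet("192.168.1.10", "198.51.100.7", 443), [firewall, nat])
telnet = run_chain(Packet("192.168.1.10", "198.51.100.7", 23), [firewall, nat])
```

The point of the sketch is that the firewall and the NAT are just pieces of software on a generic server, so adding or reordering functions is a software change and not a hardware purchase.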

This will surely put a dent in hardware vendors' profits, but if they want to keep up with client expectations they have to take this turn. Of course there are limitations in using a server instead of a dedicated vendor router, but then service providers are not going to replace their core MPLS routers with NFV. NFV is still new to the market, and it is too early to know all of its use cases.

[Figure: traditional-nfv]

Picture courtesy: http://www.moorinsightsstrategy.com

SDN (Software-Defined Networking), on the other hand, is a concept related to NFV, but the two refer to different domains. If you know how a router works, you will understand it very quickly. Every router has 3 different planes: the Management Plane, the Control Plane and the Forwarding Plane. The Management Plane delivers management functions like SSH, TACACS etc. The Control Plane is where routing protocols like OSPF, BGP, RIP etc. are processed. The Forwarding Plane is what the router uses to send and receive the actual traffic on its interfaces.

The job of SDN is to separate the Control Plane from the router or other network component and provide a centralized place to control the whole network topology. In areas like the internal data centres of organizations, where not much changes in the Control Plane, you can move this functionality out of the servers/network components and use them purely for forwarding traffic as fast as possible. There are a number of tools that help provide this functionality, and with time I think we will get to know more about them.
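The idea of a central control point can be sketched in a few lines of Python. This is a toy model under my own assumptions: the four switch names and the `compute_forwarding_tables` function are hypothetical, and a real controller would push the result to devices with a protocol such as OpenFlow instead of keeping it in a dictionary.

```python
from collections import deque


def compute_forwarding_tables(links):
    """Central 'controller' logic: for every switch, work out the first hop
    towards every other switch using a breadth-first search of the topology.

    links: dict mapping a switch name to the list of its neighbours.
    Returns: dict switch -> {destination: next_hop}.
    """
    tables = {}
    for src in links:
        first_hop = {}
        visited = {src}
        queue = deque()
        # Directly connected neighbours are their own first hop.
        for nbr in links[src]:
            if nbr not in visited:
                visited.add(nbr)
                first_hop[nbr] = nbr
                queue.append(nbr)
        # Anything further away inherits the first hop of the node we came through.
        while queue:
            node = queue.popleft()
            for nbr in links[node]:
                if nbr not in visited:
                    visited.add(nbr)
                    first_hop[nbr] = first_hop[node]
                    queue.append(nbr)
        tables[src] = first_hop
    return tables


# Hypothetical topology: s1 - s2 - s3 - s4 in a line.
topology = {
    "s1": ["s2"],
    "s2": ["s1", "s3"],
    "s3": ["s2", "s4"],
    "s4": ["s3"],
}
tables = compute_forwarding_tables(topology)
```

Here the switches run no routing protocol at all. They only receive a finished table from the controller, which is the separation of control and forwarding described above.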

However, SDN as a concept is not just OpenFlow switches using the OpenFlow protocol. Other vendors implement it as an automatic provisioning tool using totally different concepts and still call it SDN, because that is what it is: you are using software to influence networks.

As you can see above, NFV and SDN are somewhat different concepts and can operate independently; however, they are generally implemented together and can act as a powerful tool in today’s network environments.

That’s all for this blog. I will discuss more on these topics in later blogs. Do let me know your comments or feedback and what you think of these technologies!

 

Regards

Mohit Mittal

Netflow!!

Service providers these days continuously face one challenge: someone intruding into their network. Suspicious access from unknown IPs, hacking etc. put pressure on the service provider’s environment and their customers’ networks, and put a dent in their resources and revenues.

On the other hand, companies also spend quite a lot of money on understanding users’ traffic patterns, monitoring network bandwidth utilization and WAN traffic, and performance monitoring. Whatever the motive, some sort of protocol is needed to do all this, as the traditional method of monitoring via SNMP is just not enough, and this gave rise to the Cisco network protocol called “NetFlow”.

“NetFlow” is a network protocol developed by Cisco for collecting IP traffic information and monitoring network traffic. By analyzing flow data, a picture of network traffic flow and volume can be built. Using a NetFlow collector and analyzer, you can see where network traffic is coming from and going to and how much traffic is being generated.

While the term NetFlow is mostly used by Cisco, many other network hardware manufacturers support similar flow technologies:

  • Juniper (J-Flow)
  • 3Com/HP, Dell (sFlow)
  • Huawei (NetStream)
  • Alcatel-Lucent (Cflowd)

Routers and switches that support NetFlow collect IP traffic statistics on all interfaces where NetFlow is enabled, and later export those statistics as NetFlow records towards at least one NetFlow collector. A NetFlow collector is typically a server that does the actual traffic analysis: it processes the exported data and presents the results in a user-friendly format. NetFlow collectors can be hardware based or software based.
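As a rough idea of what the collector does with the records it receives, here is a minimal Python sketch that adds up bytes per source address to find the "top talkers". The `FlowRecord` fields and the sample values are assumptions made for this example; real NetFlow records carry more fields than this.

```python
from collections import Counter, namedtuple

# Simplified, hypothetical flow record: just enough fields for the example.
FlowRecord = namedtuple("FlowRecord", "src dst proto src_port dst_port bytes")


def top_talkers(records, n=3):
    """Sum the bytes sent by each source address and return the n largest."""
    totals = Counter()
    for rec in records:
        totals[rec.src] += rec.bytes
    return totals.most_common(n)


records = [
    FlowRecord("10.0.0.1", "10.0.1.1", 6, 51000, 443, 1200),
    FlowRecord("10.0.0.2", "10.0.1.1", 6, 51001, 443, 300),
    FlowRecord("10.0.0.1", "10.0.1.2", 17, 53000, 53, 800),
]
busiest = top_talkers(records, 2)
```

The same grouping can be done per destination, per protocol or per port, which is how a collector answers "where is the traffic coming from and going to".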

NetFlow v1 was originally introduced around 1996 and has since evolved to NetFlow version 9. Today, the most common versions are v5 and v9. The major difference between them is that v5 is restricted to IPv4 flows, whereas v9 can also report flows such as IPv6, MPLS, or plain IPv4 with BGP next hop.
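For the curious, a v5 export datagram starts with a fixed 24-byte header, which makes it easy to inspect. The Python sketch below decodes only that header, using the commonly documented v5 field layout; the function name and the sample values are my own, and a real collector would go on to decode the flow records that follow the header.

```python
import struct

# NetFlow v5 header, network byte order, 24 bytes in total:
# version, count, sys_uptime, unix_secs, unix_nsecs,
# flow_sequence, engine_type, engine_id, sampling_interval
V5_HEADER = struct.Struct("!HHIIIIBBH")


def parse_v5_header(datagram: bytes) -> dict:
    """Decode the fixed header of a NetFlow v5 export datagram."""
    if len(datagram) < V5_HEADER.size:
        raise ValueError("datagram shorter than a NetFlow v5 header")
    (version, count, sys_uptime, unix_secs, unix_nsecs,
     flow_sequence, engine_type, engine_id, sampling) = V5_HEADER.unpack_from(datagram)
    if version != 5:
        raise ValueError(f"not a NetFlow v5 datagram (version={version})")
    return {
        "version": version,
        "count": count,              # number of flow records that follow
        "sys_uptime_ms": sys_uptime,
        "unix_secs": unix_secs,
        "unix_nsecs": unix_nsecs,
        "flow_sequence": flow_sequence,
        "engine_type": engine_type,
        "engine_id": engine_id,
        "sampling_interval": sampling,
    }


# Made-up header for demonstration: version 5, announcing 2 flow records.
sample = V5_HEADER.pack(5, 2, 1000, 1700000000, 0, 42, 0, 0, 0)
header = parse_v5_header(sample)
```

Version 9 cannot be decoded with one fixed layout like this, because its records are described by templates sent by the exporter, which is what lets it carry IPv6 or MPLS fields.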

Monitoring IP traffic flows ensures that resources are used appropriately in support of organizational goals. It helps IT determine where to apply Quality of Service (QoS) and plays a vital role in network security by detecting Denial-of-Service (DoS) attacks and other undesirable network events.

One last thing: NetFlow is not a standardized protocol. It was developed by Cisco, although other vendors use the same concept in their routers/switches. The IETF took NetFlow v9 and standardized it, with some additional changes, as “IPFIX” (IP Flow Information Export), which vendors are implementing these days to get a consistent view and avoid interoperability issues.

We can go through IPFIX in another blog, but for now I hope you have understood the usage of NetFlow :).

Thanks

Mohit Mittal

 

OSPF Special Area Types!!

The topic I have chosen for today is special area types in OSPF. I have seen that people (I was one of them ;)) find it hard to grasp these area types.

We know that OSPF is a link-state Interior Gateway Protocol which works by advertising Link State Advertisements (LSAs) to its neighbours. LSAs are nothing but the state of a router's interfaces. More routers in the autonomous system means more LSAs to share with your neighbours, and more CPU and memory needed on the routers to process those incoming LSAs.

What is the solution?

Divide the whole autonomous system into different areas, with Area 0 (the backbone area) at the centre; in other words, all other areas connect to Area 0.

What is the benefit?

With the autonomous system divided into areas, a router has to flood its LSAs only to the routers inside its own area and not to all OSPF routers in the autonomous system.

Then how does the information flow outside the area?

This happens with the help of OSPF Area Border Routers (ABRs), which summarize the LSAs from one area and send them into another area as Type 3 LSAs, also called Summary LSAs.

Up to this point we have not introduced any special area types. Everything I have mentioned so far is about Area 0 (the backbone area) and any other area connected to Area 0. Let’s call that other area Area 1; the common point between the two areas is the ABR, which sits at the border between them.

Before learning the special area types you have to understand one type of LSA called the “External LSA” or Type 5 LSA. These LSAs advertise external connectivity. External connectivity could come from some other autonomous system, or from redistribution of another IGP or static routes into OSPF.

Now, Special Areas are listed as:

1) Stub Area

2) Totally Stubby Area

3) Not-so-Stubby Area

4) Totally Not-so Stubby Area

:S… What is this all about?    Ok, I will try to explain in simple terms  😀

As I discussed above, we divide the autonomous system into areas to restrict the flow of LSAs. However, there can be situations where some router in your network has very little memory, or is a very old legacy router that can’t hold all the routes in its routing table, but instead of replacing it you want to keep it in your network and serve customers through it. You don’t want to bombard that router with external LSAs for all those external prefixes; instead you can configure it as a stub router, i.e. place it in a Stub Area.

With stub configuration all External LSAs are suppressed. But then how does that router reach external prefixes? Via a default route. As soon as you configure the ABR and the internal routers of the area as stub, the ABR automatically advertises a default route towards the internal stub routers, which is the only information they need in order to reach external prefixes.

In a Stub Area, routers still have Type 3 Summary LSAs, Type 1 Router LSAs and the default route in their database. But why would you even want Type 3 Summary LSAs when you have a default route? You can get rid of them by configuring the area as a Totally Stubby Area, where even the Summary LSAs are suppressed (apart from the default route itself).

People think that apart from the Stub Area, all other area types are Cisco proprietary. However, the Not-so-Stubby Area is defined in an IETF RFC of its own (RFC 3101, which replaced RFC 1587), so it’s not a Cisco proprietary feature and has been implemented by other vendors. Totally Stubby Area and Totally Not-so-Stubby Area are not defined under those names in the RFCs.

There are many instances in which a company wants to connect to another company, or 2 companies merge, and they want to share information with each other but in a limited fashion. This can be achieved with an NSSA (Not-So-Stubby Area), in which the router that connects to the other company’s network becomes the NSSA ASBR (Autonomous System Boundary Router) and the router that connects to the backbone Area 0 becomes the NSSA ABR.

Redistribution into an NSSA creates a special type of link-state advertisement (LSA) known as the Type 7 LSA, which can only exist in an NSSA. An NSSA autonomous system boundary router (ASBR) generates this LSA and an NSSA area border router (ABR) translates it into a Type 5 LSA, which gets propagated into the rest of the OSPF domain, i.e. Area 0. However, one thing to note is that external routes, i.e. Type 5 LSAs, arriving at the NSSA ABR are still not allowed into the NSSA, as the normal Stub Area rules still apply.

Just as the Stub Area has the Totally Stubby Area, the NSSA has the Totally NSSA. As before, an NSSA doesn’t allow Type 5 LSAs but does allow Type 3 Summary LSAs; with Totally NSSA, we suppress even these Summary LSAs.. 🙂

Note: When you configure an area as NSSA, by default the NSSA ABR does not generate a default summary route. In the case of a stub area or a totally NSSA area, the ABR does generate a default summary route.
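The rules for all the area types discussed above can be summed up in a small lookup table. The Python sketch below is only a memory aid, not router code. It also lists LSA Types 2 and 4, which this post has not covered (Network LSAs and ASBR Summary LSAs), and the names `AREA_RULES` and `lsa_allowed` are invented for the example.

```python
# Which LSA types may exist inside each kind of area, and whether the ABR
# injects a default route by default. The default route itself travels as a
# single Type 3 LSA; it is modelled separately here to keep the table simple.
AREA_RULES = {
    "normal":         {"lsa_types": {1, 2, 3, 4, 5}, "abr_default_route": False},
    "stub":           {"lsa_types": {1, 2, 3},       "abr_default_route": True},
    "totally-stubby": {"lsa_types": {1, 2},          "abr_default_route": True},
    "nssa":           {"lsa_types": {1, 2, 3, 7},    "abr_default_route": False},
    "totally-nssa":   {"lsa_types": {1, 2, 7},       "abr_default_route": True},
}


def lsa_allowed(area_type: str, lsa_type: int) -> bool:
    """Return True if this LSA type is flooded inside the given area type."""
    return lsa_type in AREA_RULES[area_type]["lsa_types"]


# A few checks that mirror the text:
stub_blocks_external = not lsa_allowed("stub", 5)
nssa_allows_type7 = lsa_allowed("nssa", 7)
totally_nssa_blocks_summary = not lsa_allowed("totally-nssa", 3)
```

Reading the table row by row: every special area blocks Type 5, the two "totally" variants also block Type 3, and only the two NSSA variants carry Type 7.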

I know the text has become lengthy, but I didn’t want to stop abruptly and add more confusion 🙂 . If you have any doubts, please let me know.

Regards

Mohit

Ping vs Extended Ping!!!

I hope that, as a network engineer, you have used the ping functionality (ping <ip address>) on routers to check connectivity to destinations, and you must have been relieved after seeing the 5 ‘!’ signs in the output. 🙂

Apart from this, you might have used extended ping as well, which is like normal ping but with some more options like datagram size, Don’t Fragment bit, ToS etc.

Do you think there is any difference in how Cisco routers process ping and extended ping??

*****************
Normal Ping
*****************

#ping 10.213.124.65
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.213.124.65, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/1 ms

*****************
Extended Ping
*****************

#ping
Protocol [ipv4]:
Target IP address: 10.213.124.65
Repeat count [5]: 10
Datagram size [100]:
Timeout in seconds [2]:
Extended commands? [no]: yes
Source address or interface:
Type of service [0]:
Set DF bit in IP header? [no]:

Ping is the most used functionality on any system or router and a network engineer’s most powerful tool for testing the connectivity of a circuit. Ping uses ICMP (Internet Control Message Protocol) encapsulated in IP (Internet Protocol).

You know that whenever a packet comes to a router, the router looks up the destination IP address in its routing table and sends the packet to the next hop accordingly. However, there is more to this internally.

Whenever a packet comes to a router, the router CPU gets interrupted and has to allocate CPU cycles to process that particular packet. More packets mean more interrupts and more CPU time. Every time a packet arrives, the router checks its routing table to find the next hop and then finds the MAC address of that next hop in order to build the Layer 2 frame and send it onto the wire. This is a very CPU-intensive process, and it is called process switching.

Now, to avoid the above issue, one way is to send fewer packets through the router, which I don’t think is a sensible solution ;)… The other way is what is called CEF switching. CEF (Cisco Express Forwarding) is the default mode these days on high-end routers, and on other routers it is highly recommended to enable it if available. On hardware platforms, CEF switching is performed in the router’s ASICs (application-specific integrated circuits). In CEF switching, as soon as the routing table is built, the router calculates the next hop and MAC address for each destination in the routing table and adds all this info to the CEF table. When a packet arrives, the router just has to consult the CEF table for all the info it needs to build the Layer 2 frame and switch it. There is no need to interrupt the CPU, and CPU power can be used for the other enhanced features of the router.
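The "calculate once, look up many times" idea behind CEF can be sketched in Python. This is a simplification under my own assumptions: the routes, next hops and MAC addresses are made up, a sorted list stands in for the optimized lookup structure a real router uses, and the names `build_cef_table` and `cef_lookup` are hypothetical.

```python
import ipaddress

# Hypothetical routing table (prefix -> next hop) and ARP table (next hop -> MAC).
ROUTES = {
    "10.1.0.0/16": "192.0.2.1",
    "10.1.2.0/24": "192.0.2.2",
    "0.0.0.0/0": "192.0.2.254",
}
ARP = {
    "192.0.2.1": "aa:aa:aa:aa:aa:01",
    "192.0.2.2": "aa:aa:aa:aa:aa:02",
    "192.0.2.254": "aa:aa:aa:aa:aa:fe",
}


def build_cef_table(routes, arp):
    """Resolve next hop AND next-hop MAC for every prefix ahead of time."""
    table = [
        (ipaddress.ip_network(prefix), next_hop, arp[next_hop])
        for prefix, next_hop in routes.items()
    ]
    # Longest prefixes first, so the first match is the most specific one.
    table.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
    return table


def cef_lookup(table, destination):
    """Per-packet work: a single lookup, no separate routing and ARP steps."""
    addr = ipaddress.ip_address(destination)
    for network, next_hop, mac in table:
        if addr in network:
            return next_hop, mac
    return None


cef_table = build_cef_table(ROUTES, ARP)
specific = cef_lookup(cef_table, "10.1.2.9")
general = cef_lookup(cef_table, "10.1.9.9")
default = cef_lookup(cef_table, "8.8.8.8")
```

In process switching the routing lookup and the MAC resolution would both happen for every packet; here they are done once in `build_cef_table`, and each packet only pays for the lookup.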

There is another variant of CEF called “Distributed CEF”. It works just like CEF, but in this case the whole CEF table is copied to the individual line cards of the router, which makes switching even faster. An incoming packet doesn’t have to query the main processor or routing table to get the next-hop information; instead, switching is performed on the line card itself. Distributed CEF is available on platforms like CRS-1, CRS-3, ASR, GSR etc.

Now you must be thinking that I started with ping and now I am on CEF, so what is the relation between the two? OK, I am coming to that point now.

Suppose we have 4 Routers in sequence,

R1 — R2 — R3 — R4

If you do a normal ping from R1 to R4, on R2 it will be CEF switched because it is transit traffic for R2; the packet is not destined for R2 itself. The same goes for R3. However, R4 is the destination of the packet, so there it will be process switched: the CPU or route processor card has to look into the packet.

In the same situation, if you do an extended ping from R1 to R4 with options set (in particular IP header options such as Record Route or Timestamp), the packet is process switched on each hop and no CEF switching is involved. This makes extended ping a powerful tool for isolating issues related to internal hardware or CEF-related bugs.
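The two cases can be written down as a tiny Python function. It is only a summary of the behaviour described in this post, not something measured on a router, and the function and parameter names are my own.

```python
def switching_path(routers, destination, extended_with_options=False):
    """Return how each router on the path handles the echo request.

    routers: routers the packet reaches after leaving the source, in order.
    destination: the router that owns the target address.
    extended_with_options: True for an extended ping that forces every hop
    to look inside the packet, as described in the text.
    """
    result = {}
    for router in routers:
        if router == destination or extended_with_options:
            result[router] = "process"   # handed to the CPU / route processor
        else:
            result[router] = "cef"       # transit traffic, switched by CEF
    return result


path = ["R2", "R3", "R4"]
normal = switching_path(path, "R4")
extended = switching_path(path, "R4", extended_with_options=True)
```

So if a normal ping fails but the extended ping works, the difference points at the CEF path on one of the transit routers.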

The above process is for Cisco routers, as CEF is Cisco proprietary. However, other vendors like Juniper implement the same idea by switching in ASICs (hardware) rather than in software (process based).

So that was for Ping and Extended Ping.. I hope you will like this Blog on how Cisco routers process both types of Pings and do let me know if you have any queries related to this.

Regards

Mohit Mittal

4 Byte AS Number

You already know how IPv4 addresses are being depleted and how all telecom providers are looking to the next-generation IP addressing scheme, i.e. IPv6, for rescue. However, there is one more resource which is depleting rapidly, and that is the AS number (Autonomous System Number), specifically the 2-byte AS number.

As per the official definition, “An Autonomous System (AS) is a collection of connected Internet Protocol (IP) routing prefixes under the control of one or more network operators that presents a common, clearly defined routing policy to the Internet.” In other words, each service provider or enterprise network has its own AS number, within which it can apply its own routing policies and from which it connects to other ASes using BGP (eBGP).

A 16-bit number (i.e. 2 bytes) gives 65,536 possible values (2^16), so AS numbers 0 – 65535. Out of these, the IANA reserves 1,026: 64512 – 65534 for private, reusable ASNs (similar to private RFC1918 IPv4 addresses) and a few others such as 0, 65535 and 23456. I will come back to AS number 23456 in a short while. Of the 65,536 total, around 63,000 have already been allocated, 1,026 are reserved and around 1,500 remain for public allocation. So you can estimate for yourself how important this resource is and that something needs to be done very quickly.
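The counts in the paragraph above are easy to verify. The short Python check below uses only the ranges quoted in the text; the variable names are mine.

```python
# Size of the 2-byte and 4-byte AS number spaces.
TOTAL_2BYTE = 2 ** 16
TOTAL_4BYTE = 2 ** 32

# Private range 64512..65534 inclusive (range() excludes its upper bound).
PRIVATE_ASNS = range(64512, 65535)

# The other reserved values mentioned in the text.
SPECIAL_ASNS = {0, 23456, 65535}

reserved = len(PRIVATE_ASNS) + len(SPECIAL_ASNS)
usable_public = TOTAL_2BYTE - reserved

print(TOTAL_2BYTE, len(PRIVATE_ASNS), reserved, usable_public)
```

The private range on its own contains 1,023 numbers; adding the three special values gives the 1,026 reserved ASNs mentioned above.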

Fortunately, we have the new 4-byte AS number to the rescue, and this is the topic of my blog.

4-byte (32bit) AS Number provides 2^32 or 4,294,967,296 autonomous system numbers ranging from 0 to 4294967295. The first thing to notice about these numbers is that they include all of the older 2-byte ASNs, 0 through 65535. That greatly helps with interoperability between autonomous systems using 2-byte ASNs and those using 4-byte ASNs.

Now the main thing about the 4-byte AS number is representation: how do you write these lengthy AS numbers in a meaningful way (just as we have some tricks for IPv6 addresses)? Unlike IPv6, however, AS number representation is not complex and is easy to understand.

  1. asplain –> asplain is a simple decimal representation of the ASN, from 0 to 4294967295.
  2. asdot –> in asdot, any ASN in the 2-byte range, i.e. 0 – 65535, is written as in asplain (so 65535 is written as “65535”), but any ASN above that range is written in a different format. For example 65536, which is just outside the 2-byte range, is represented as 1.0; 65537 is 1.1, 65680 is 1.144, and so on. As you may have guessed, we are dividing the asplain value by 65536: the high-order value is the number of multiples of 65536 and the low-order value is the remainder. So 134576 can be represented as 2.3504 because 134576 = 2*65536 + 3504.
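The conversion between the two notations is a single division with remainder. Here is a minimal Python sketch of it; the function names are my own and are not taken from any router software.

```python
def asplain_to_asdot(asn: int) -> str:
    """Convert an AS number to asdot notation."""
    if not 0 <= asn <= 0xFFFFFFFF:
        raise ValueError("AS number must fit in 32 bits")
    if asn <= 0xFFFF:
        return str(asn)                  # 2-byte ASNs are written as plain decimals
    high, low = divmod(asn, 65536)       # multiples of 65536, and the remainder
    return f"{high}.{low}"


def asdot_to_asplain(text: str) -> int:
    """Convert asdot notation (e.g. '2.3504') back to a plain integer."""
    if "." not in text:
        return int(text)
    high, low = (int(part) for part in text.split("."))
    return high * 65536 + low


examples = {asn: asplain_to_asdot(asn) for asn in (65535, 65536, 65537, 65680, 134576)}
```

Running the same function on 131283 gives 2.211, since 131283 = 2*65536 + 211.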

HDFC Bank in India has one 4 Byte AS number allocated to it and it is:

AS131283 –> HDFC Bank

I hope you know that in BGP, the AS path is used to determine the shortest path to the destination and also as a loop avoidance mechanism. So how does this new AS number format work in an environment where both types of AS numbers exist, i.e. 2-byte and 4-byte?

Ok, so let’s define the BGP implementations supporting 4-byte ASNs as BGP-New, and legacy BGP implementations that only support 2-byte ASNs as BGP-Old.

The first requirement for a BGP-New implementation is to discover whether a neighbor is BGP-New or BGP-Old. It does this by using the BGP Capability Advertisement when starting a BGP session. In addition to advertising itself as BGP-New, it includes its 4-byte ASN in the Capability advertisement.

If a neighbor responds that it also is a BGP-NEW speaker, the neighbor includes its 4-byte ASN in its own Capability advertisement. Thus two BGP-New neighbors can inform each other of their 4-byte ASNs without using the 2-byte Autonomous System field in the Open message.

If a neighbor is BGP-Old, it either responds that it does not support the 4-byte ASN capability or does not respond to the Capability advertisement at all. In this case, the BGP-New neighbor can still bring up a session with the BGP-Old neighbor, but cannot advertise its 4-byte ASN, because the neighbor wouldn’t understand it. Instead, BGP-New uses the reserved 2-byte ASN I mentioned earlier, 23456, called AS_TRANS. A router configured with a 4-byte AS number sends its BGP Open message with AS number 23456 so that the neighbor router can understand it. Because AS_TRANS is reserved, no BGP-Old speaker can use it as its own ASN; only BGP-New speakers can use it.

Interoperable peering, then, is achieved because the BGP-New speaker “knows” its neighbor is a BGP-Old speaker and adapts to it; the BGP-Old speaker simply continues using legacy BGP rules.
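The negotiation described above boils down to two small decisions, sketched here in Python. This is a simplified model of the behaviour described in this post and not a BGP implementation; the function names are hypothetical.

```python
AS_TRANS = 23456  # reserved 2-byte ASN that stands in for a 4-byte ASN


def open_my_as_field(local_asn: int) -> int:
    """Value placed in the 2-byte Autonomous System field of the Open message.

    A 4-byte ASN does not fit in that field, so AS_TRANS is sent instead.
    """
    return local_asn if local_asn <= 0xFFFF else AS_TRANS


def asn_seen_by_peer(local_asn: int, peer_is_bgp_new: bool) -> int:
    """Which ASN the neighbor ends up associating with us.

    A BGP-New peer reads the real ASN from the Capability advertisement;
    a BGP-Old peer only sees the 2-byte field of the Open message.
    """
    if peer_is_bgp_new:
        return local_asn
    return open_my_as_field(local_asn)


seen_by_old_peer = asn_seen_by_peer(131283, peer_is_bgp_new=False)
seen_by_new_peer = asn_seen_by_peer(131283, peer_is_bgp_new=True)
```

A router that still has a 2-byte ASN is unaffected either way: for example `asn_seen_by_peer(64500, False)` returns 64500.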

Cisco has included this functionality since IOS-XR 3.4 and Juniper Networks since Junos 9.1.

There is much more to 4 byte AS but I hope you will get some idea from this blog  🙂

Regards

Mohit Mittal

NSF, GR or NSR??

There are a number of terms we use in today’s high-availability networks, like NSF (Non-Stop Forwarding), GR (Graceful Restart) and NSR (Non-Stop Routing). Companies these days want 99.5% availability of their networks, and these high-availability features play a vital role in that. However, have you ever wondered what the difference between all these terms is?? Adding to our confusion are the different vendors and their usage of the terms.

Let’s try to understand what these terms basically mean and whether there is any commonality between the terms used by different vendors!!!! We will compare Cisco and Juniper here.

Modern high-performance routers physically separate the forwarding plane and the control plane, and each has its own memory and processors. The control plane runs the routing protocols and derives a forwarding table (FIB). The FIB is handed to the forwarding plane, which is then responsible for actually forwarding packets through the router. The advantage of this physical separation is that when there is congestion, i.e. huge traffic flowing through the router, the forwarding plane becomes very busy but this doesn’t affect the control plane’s ability to process new routing information. Similarly, if the router’s control plane becomes clogged due to route flapping or any other issue, the forwarding plane can continue forwarding packets because it has a copy of the FIB which it previously got from the control plane. This is called Non-Stop Forwarding (NSF).

Now you must be thinking that this is not a good architecture, as the router may be forwarding on a path which is stale or not optimal at that moment; there might be a better path somewhere which the router is not using. So why do I need NSF?

Well, you need NSF so that routers can use redundant control planes. Cisco calls its control planes Route Processors and Juniper calls them Routing Engines. With 2 route processors or routing engines, NSF lets the router switch from the primary to the backup control plane without disrupting forwarding. The FIB could still become invalid during the period between the primary control plane going down and the backup taking over, but this is acceptable for the time being 😉
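Here is a toy Python model of why forwarding survives a control-plane failure. Everything in it is invented for illustration (the class names, the single route, the names RP0 and RP1); the only point it makes is that the forwarding plane works from its own copy of the FIB.

```python
class ControlPlane:
    """Runs the 'routing protocols' and derives a FIB from what it has learned."""

    def __init__(self, name, routes):
        self.name = name
        self.routes = dict(routes)
        self.up = True

    def derive_fib(self):
        if not self.up:
            raise RuntimeError(f"{self.name} is down")
        return dict(self.routes)


class ForwardingPlane:
    """Forwards packets using its own copy of the FIB."""

    def __init__(self):
        self.fib = {}

    def install_fib(self, fib):
        self.fib = dict(fib)   # a copy, so it outlives the control plane that built it

    def forward(self, prefix):
        return self.fib.get(prefix)


primary = ControlPlane("RP0", {"10.0.0.0/8": "192.0.2.1"})
backup = ControlPlane("RP1", {"10.0.0.0/8": "192.0.2.1"})
forwarding = ForwardingPlane()

forwarding.install_fib(primary.derive_fib())
primary.up = False                                    # primary control plane fails
during_switchover = forwarding.forward("10.0.0.0/8")  # forwarding still works
forwarding.install_fib(backup.derive_fib())           # backup takes over
after_switchover = forwarding.forward("10.0.0.0/8")
```

Between the failure and the takeover the FIB is frozen, which is exactly the window in which it "could still become invalid" as mentioned above.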

So the problem now is how to make this switchover from primary to backup control plane shorter, so that the FIB is less prone to invalid information. Routers do this by maintaining a copy of the active configuration on the backup processor/routing engine as well. Cisco calls this process Stateful Switchover (SSO) and Juniper calls it Graceful Routing Engine Switchover (GRES).

So what is Non-Stop Routing (NSR) then?

OK, as I stated above, the control plane has Stateful Switchover at its disposal to decrease the switchover time. The problem is that once the router does the switchover, all the routing protocol adjacencies like OSPF, LDP, IS-IS etc. go down. When an adjacency goes down, the neighboring routers by principle inform their own neighbors of the mishap, and those routers in turn update other routers down the chain. This whole process destabilizes the network and increases CPU processing on all routers. The same happens again when the backup control plane comes up. So you guessed it right: the use of NSR is to minimize this instability.

Initially, to control this instability, the GR (Graceful Restart) principle was proposed: on a router’s control plane switchover, its neighbors don’t tear down the adjacency and report the failure immediately; instead they wait for a certain period of time (called the grace interval), and this saves the network from the impact. However, for GR to work, all the neighbors must support it, which may not be the case everywhere, for example on small routers in enterprise networks. So NSR was proposed..
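The neighbor's side of Graceful Restart can be summarized in one small Python function. It is a simplification of the behaviour described above; the function name, the return strings and the 120-second example value are made up for illustration.

```python
def neighbor_reaction(neighbor_supports_gr, seconds_since_restart, grace_interval):
    """What a neighbor does after the restarting router's control plane goes away.

    With GR support, the neighbor keeps the routes until the grace interval
    runs out. Without it, the routes are withdrawn straight away.
    """
    if neighbor_supports_gr and seconds_since_restart <= grace_interval:
        return "keep-routes"
    return "withdraw-routes"


within_grace = neighbor_reaction(True, seconds_since_restart=30, grace_interval=120)
after_grace = neighbor_reaction(True, seconds_since_restart=200, grace_interval=120)
no_gr_support = neighbor_reaction(False, seconds_since_restart=30, grace_interval=120)
```

The third case is the weakness of GR: a single neighbor without GR support reacts as if the router had failed outright.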

In NSR, the router’s backup routing engine/processor also keeps the routing protocol state, i.e. OSPF, LDP, IS-IS state, and since this information is already on the backup processor, the switchover is transparent to neighbors. So why doesn’t this affect small routers? Because NSR is vendor specific and internal to the router; unlike GR, the neighboring routers don’t have to support it.

Different vendors use all these terms differently. Juniper, for example, calls its graceful restart implementation Graceful Restart, whereas Cisco calls its implementation Non-Stop Forwarding Awareness. Also, people consider Juniper’s GRES and GR to be the same thing; however, as you read above, they are two different things.

So, that’s all for NSF, GR and NSR. I hope you find this information useful and that I was able to lessen your confusion. If you still have any questions, please let me know. 🙂

Thanks

Regards

Mohit Mittal
