BestianEN

Sunday, January 10, 2016

Building a fault tolerant network infrastructure for your company.

Part 3: Additional tuning for boundary and user ports on Cisco and HP ProCurve.

Part 1: "MSTP on HP ProCurve"

Part 2: " Connect Cisco and HP ProCurve using MSTP"

Today we look at what we can do to protect our STP topology on access ports and at the borders with other networks. Let's start with the most basic settings: PortFast, BPDUGuard, BPDUFilter, and a ProCurve-specific option, PVST-Filter.

PortFast

PortFast is intended to speed up the initialization of ports that are connected to servers or user workstations. Normally, without PortFast, a port with active Spanning Tree goes through several fairly long phases before it starts forwarding data: blocking, listening, learning. Only after 30-50 seconds (with the default timers) does the port switch to forwarding and begin to transmit data. This delay gives Spanning Tree time to discover a loop or a duplicate link during initialization, so that the network has no problems even for a short time. If a workstation or a server is connected to the port, we do not need such a long procedure, and it may even be harmful. For example, a computer may not have time to get an address from DHCP if it boots faster than the port initializes.
So, when we activate PortFast, the port skips the listening and learning phases and goes straight to the forwarding state. Turning on PortFast does not turn off Spanning Tree on these ports: the port continues to send and receive BPDUs. Moreover, if a port with PortFast enabled receives a BPDU, it immediately loses its PortFast status, regardless of the settings. Nevertheless, it is not recommended to enable PortFast on ports connected to other network equipment, because it may lead to short-lived switching loops.
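To put numbers on the delay described above, here is a minimal sketch. It assumes the classic Spanning Tree defaults (forward delay 15 seconds, max age 20 seconds); the function name is my own invention for illustration, and the model is deliberately simplified.

```python
# Classic Spanning Tree timers, in seconds.
# Assumed defaults: forward delay 15, max age 20.
FORWARD_DELAY = 15
MAX_AGE = 20


def seconds_to_forwarding(portfast: bool, wait_for_max_age: bool = False) -> int:
    """Return how long a port waits before forwarding (simplified model)."""
    if portfast:
        # Listening and learning are skipped entirely.
        return 0
    # Listening and learning each last one forward delay.
    delay = 2 * FORWARD_DELAY
    if wait_for_max_age:
        # Stale topology information has to age out first.
        delay += MAX_AGE
    return delay


if __name__ == "__main__":
    print(seconds_to_forwarding(portfast=False))
    print(seconds_to_forwarding(portfast=False, wait_for_max_age=True))
    print(seconds_to_forwarding(portfast=True))
```

With these assumed defaults a normal port needs 30 seconds, or 50 seconds when old information has to expire first, while a PortFast port forwards immediately.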

Cisco:
On Cisco there are two ways to activate this mode: locally on specific ports, or globally in configuration mode.
Global:
spanning-tree portfast default
This command enables PortFast on all ports that are in access mode.

Local:

interface GigabitEthernet0/1

spanning-tree portfast

Or, despite an active global setting, you can disable PortFast on the ports where it is not needed:

interface GigabitEthernet0/1

spanning-tree portfast disable

Also, in some cases you may need to enable PortFast on trunks. For example, this may be necessary if the trunk is connected to a server that needs to receive several VLANs from the trunk.

interface GigabitEthernet0/1

spanning-tree portfast trunk

Checking PortFast state on interface:

sh spanning-tree int gi0/1 portfast

MST0 enabled

MST2 enabled

Also:

sh spanning-tree int gi0/1 detail

Port 1 (GigabitEthernet0/1) of MST0 is designated forwarding

Port path cost 20000, Port priority 128, Port Identifier 128.1.

Designated root has priority 4096, address 001b.3fc1.a800

Designated bridge has priority 32768, address 001d.e691.2800

Designated port id is 128.1, designated path cost 0

Timers: message age 0, forward delay 0, hold 0

Number of transitions to forwarding state: 3

The port is in the portfast mode by default

Link type is point-to-point by default, Internal

Bpdu guard is enabled by default

Loop guard is enabled by default on the port

BPDU: sent 174100, received 0

Port 1 (GigabitEthernet0/1) of MST2 is designated forwarding

Port path cost 20000, Port priority 128, Port Identifier 128.1.

Designated root has priority 2, address 0018.71b6.a000

Designated bridge has priority 32770, address 001d.e691.2800

Designated port id is 128.1, designated path cost 22000

Timers: message age 0, forward delay 0, hold 0

Number of transitions to forwarding state: 2

The port is in the portfast mode by default

Link type is point-to-point by default, Internal

Bpdu guard is enabled by default

Loop guard is enabled by default on the port

BPDU: sent 174100, received 0

Here you can see whether the port has lost its PortFast status. To restore PortFast mode, you need to disable and re-enable the interface:
interface GigabitEthernet0/1

shutdown

no shutdown

HP ProCurve:

ProCurve has an analog of PortFast called "edge-port". To manually set a port to "EDGE" mode, use the "admin-edge-port" option. By default all ports have "auto-edge-port" mode active, which determines automatically whether the port is an edge (PortFast) port or not. I should note that this automatic feature works very well on ProCurve! A port turns into EDGE mode very quickly when needed, and just as quickly loses that status if it receives a BPDU.

How to enable:
spanning-tree A1-A2 admin-edge-port

sh run:
spanning-tree A1 admin-edge-port
spanning-tree A2 admin-edge-port

Force-disable automatic detection of edge ports:

no spanning-tree A1-A2 auto-edge-port

sh run:
no spanning-tree A1 auto-edge-port
no spanning-tree A2 auto-edge-port

Return to auto-edge-port mode and disable admin-edge-port:

spanning-tree A1-A2 auto-edge-port

and

no spanning-tree A1-A2 admin-edge-port

Checking edge port status:
sh spanning-tree A1
Port   Type       | Cost      Priority State      | Bridge        Time PtP Edge
------ ---------- + --------- -------- ---------- + ------------- ---- --- ----
A1     100/1000T  | 20000     128      Forwarding | 001b3f-582100 2    Yes No

and also
sh spanning-tree A1 detail
AdminEdgePort : No
Auto Edge Port : Yes

OperEdgePort : No

BPDUFilter

This feature disables the reception and transmission of BPDUs on the port. It can be used for different purposes, for example to protect against a foreign Spanning Tree on the link to a provider or to other neighbors with whom you have only a single link, or to protect against BPDU-based attacks and the resulting instability of your topology.

Cisco:

On Cisco, BPDUFilter, like PortFast, can be enabled in two ways: globally in configuration mode, or locally on the interface. But unlike PortFast, BPDUFilter behaves differently depending on how it was activated.

Local:

interface GigabitEthernet 0/1

spanning-tree bpdufilter enable / disable

As with PortFast, "disable" overrides the global setting for a particular port.
Global (automatically activates BPDUFilter on all PortFast interfaces):

spanning-tree portfast bpdufilter default

When you turn on BPDUFilter locally on the interface, it works as a "wall": it is equivalent to disabling Spanning Tree on the interface. BPDUs do not pass in either direction. In some cases this is dangerous. For example, if you connect a port with BPDUFilter enabled to a neighboring port, you get a loop that Spanning Tree will not eliminate, and if you have no other loop protection, your network will collapse. So be careful when using this function locally on the interface.

If the feature is enabled globally, it is not dangerous, because it is activated only on PortFast ports and blocks only the sending of BPDUs, not their reception. Global BPDUFilter is disabled on a port as soon as the port receives the first BPDU from outside. This mode is better suited to hiding your BPDUs from foreign neighbors than to blocking theirs. As soon as your interface receives the first BPDU, you start working with the neighbor in Spanning Tree.

There is another interesting detail when BPDUFilter is enabled globally: during initialization the port sends several BPDUs to make sure that no STP devices are connected to it. When BPDUFilter is enabled locally on the port, all BPDUs are blocked, both sent and received.
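The difference between the two activation methods is easy to mix up, so here is a toy model of it. This is only a sketch of the behavior described above, not of how IOS is implemented: the class, the mode labels ("off", "global", "interface") and the return strings are all hypothetical, and the few BPDUs sent during port initialization in global mode are ignored.

```python
from dataclasses import dataclass


@dataclass
class Port:
    """Toy model of BPDUFilter on one port (labels are hypothetical)."""
    filter_mode: str            # "off", "global" or "interface"
    filter_active: bool = True  # only meaningful in "global" mode

    def sends_bpdu(self) -> bool:
        if self.filter_mode == "off":
            return True
        if self.filter_mode == "interface":
            # The "wall": nothing is ever sent.
            return False
        # Global mode: silent only while the filter is still active.
        return not self.filter_active

    def receive_bpdu(self) -> str:
        if self.filter_mode == "interface":
            # The "wall": incoming BPDUs are ignored as well.
            return "dropped"
        if self.filter_mode == "global" and self.filter_active:
            # The first BPDU from outside switches the filter off.
            self.filter_active = False
            return "processed, filter switched off"
        return "processed"


if __name__ == "__main__":
    local = Port("interface")
    print(local.sends_bpdu(), local.receive_bpdu())

    glob = Port("global")
    print(glob.sends_bpdu(), glob.receive_bpdu(), glob.sends_bpdu())
```

The point of the model: in interface mode nothing ever changes the port's silence, while in global mode a single received BPDU brings the port back into normal Spanning Tree operation.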

Checking BPDUFilter state:

sh spanning-tree summary
Switch is in mst mode (IEEE Standard)
Root bridge for: none
Extended system ID is enabled
Portfast Default is enabled
PortFast BPDU Guard Default is enabled
Portfast BPDU Filter Default is disabled
Loopguard Default is enabled
EtherChannel misconfig guard is enabled
UplinkFast is disabled
BackboneFast is disabled

Checking BPDUFilter state on interface:

sh spanning-tree int gi0/1 detail

Port 1 (GigabitEthernet0/1) of MST0 is designated forwarding

Port path cost 20000, Port priority 128, Port Identifier 128.1.

Designated root has priority 4096, address 001b.3fc1.a800

Designated bridge has priority 32768, address 001d.e691.2800

Designated port id is 128.1, designated path cost 0

Timers: message age 0, forward delay 0, hold 0

Number of transitions to forwarding state: 3

The port is in the portfast mode by default

Link type is point-to-point by default, Internal

Bpdu guard is enabled by default

Bpdu filter is enabled

Loop guard is enabled by default on the port

BPDU: sent 174100, received 0

Port 1 (GigabitEthernet0/1) of MST2 is designated forwarding

Port path cost 20000, Port priority 128, Port Identifier 128.1.

Designated root has priority 2, address 0018.71b6.a000

Designated bridge has priority 32770, address 001d.e691.2800

Designated port id is 128.1, designated path cost 22000

Timers: message age 0, forward delay 0, hold 0

Number of transitions to forwarding state: 2

The port is in the portfast mode by default

Link type is point-to-point by default, Internal

Bpdu guard is enabled by default
Bpdu filter is enabled

Loop guard is enabled by default on the port

BPDU: sent 174100, received 0

HP ProCurve:

On ProCurve, BPDUFilter must be enabled on each port:
spanning-tree A1-A2 bpdu-filter
sh run:
spanning-tree A1 loop-guard bpdu-filter
spanning-tree A2 loop-guard bpdu-filter

Show ports with activated BPDUFilter:
sh spanning-tree | i Filter
BPDU Filtered Ports : A1-A2

also:

sh spanning-tree A1 detail

BPDU Filtering : Yes

Also, HP has another nice feature that is useful when using MSTP and may help in the case described in Part 2: "Connect Cisco and HP ProCurve using MSTP". That article describes how a Cisco in MST mode goes crazy when it receives PVST messages that have flown through the ProCurve topology.

PVST-Filter! It is enabled the same way as BPDUFilter:

spanning-tree A1-A2 pvst-filter

sh run:

spanning-tree A1 pvst-filter

spanning-tree A2 pvst-filter

Show PVST-Filter status:

sh spanning-tree | i PVST

PVST Filtered Ports : A1-A2

also:

sh spanning-tree a1 det

PVST Filtering : Yes

If you are using MSTP, it probably makes sense to activate this function on all ports, because nothing good comes from receiving PVST. MSTP and PVST are at least inconsistent with each other, which can cause many problems like those in my previous article: Part 2: "Connect Cisco and HP ProCurve using MSTP"...

BPDUGuard

On Cisco, BPDUGuard can be enabled locally on the interface or globally. Regardless of the method of activation, when a port with BPDUGuard enabled receives an incoming BPDU, the port is blocked (put into the err-disabled state).

Cisco:
Local:
interface GigabitEthernet 0/1
spanning-tree bpduguard enable / disable
As with PortFast, "disable" overrides the global setting for a particular port.

Global (automatically activates BPDUGuard on all PortFast interfaces):
spanning-tree portfast bpduguard default

If the port is blocked, you need to disable and re-enable the interface:

interface GigabitEthernet0/1

shutdown

no shutdown

Or set up automatic re-enabling of blocked ports using errdisable recovery:

errdisable recovery cause bpduguard

errdisable recovery interval 60

Checking BPDUGuard state:

sh spanning-tree summary

PortFast BPDU Guard Default is enabled

Checking BPDUGuard state on interface:

sh spanning-tree int gi0/1 detail

Port 1 (GigabitEthernet0/1) of MST0 is designated forwarding

Port path cost 20000, Port priority 128, Port Identifier 128.1.

Designated root has priority 4096, address 001b.3fc1.a800

Designated bridge has priority 32768, address 001d.e691.2800

Designated port id is 128.1, designated path cost 0

Timers: message age 0, forward delay 0, hold 0

Number of transitions to forwarding state: 3

The port is in the portfast mode by default

Link type is point-to-point by default, Internal

Bpdu guard is enabled by default

Loop guard is enabled by default on the port

BPDU: sent 174100, received 0

Port 1 (GigabitEthernet0/1) of MST2 is designated forwarding

Port path cost 20000, Port priority 128, Port Identifier 128.1.

Designated root has priority 2, address 0018.71b6.a000

Designated bridge has priority 32770, address 001d.e691.2800

Designated port id is 128.1, designated path cost 22000

Timers: message age 0, forward delay 0, hold 0

Number of transitions to forwarding state: 2

The port is in the portfast mode by default

Link type is point-to-point by default, Internal

Bpdu guard is enabled by default

Loop guard is enabled by default on the port

BPDU: sent 174100, received 0

HP ProCurve:

On ProCurve, BPDUGuard is called BPDU-Protection:

spanning-tree A1-A2 bpdu-protection

sh run:

spanning-tree A1 loop-guard bpdu-protection

spanning-tree A2 loop-guard bpdu-protection

Show ports with activated BPDU-Protection:

sh spanning-tree | i Protect

BPDU Protected Ports : A1-A16,A18-A24,B1-B24,C1-C24,D1-D24,G...

sh spanning-tree a1 detail | i Protect

BPDU Protection : Yes

In my opinion BPDUGuard is very useful: it protects against unauthorized connection of network equipment, and it instantly blocks the majority of loops. But BPDUGuard can also cause problems, for example if you activate it on an interface connected to a provider. In my experience, STP frames often arrive from providers, and they immediately lead to the port being blocked. I am very cautious, but on interfaces to service providers with whom we have only a single connection I typically activate BPDUFilter :)

In the next article I will add information about loop-guard, loop-protect and UDLD. I will also gather in one article everything about fighting loops on managed and unmanaged hardware. It will be the last article about Spanning Tree and LAN topology. Then we will turn to fault-tolerant internet access and cluster technologies :)

Sorry for my English...

Monday, January 4, 2016

Building a fault tolerant network infrastructure for your company.

Part 2: Connect Cisco and HP ProCurve using MSTP.

Part 1: "MSTP on HP ProCurve"
Part 3: "Additional tuning for boundary and user ports on Cisco and HP ProCurve"

In Russian

Continuing the series for "novice professionals" :) Last time we built a single network based on HP ProCurve, and today we will connect Cisco equipment to it, as in this scheme:

We connect our Cisco 1, 2, 3 and 4 as shown in the scheme, but one of the links in each group of two switches stays turned off to prevent a loop until Spanning Tree is activated.

We brought up MSTP on the ProCurve, so it would be very good to bring up MSTP on Cisco too. There is a fair amount of information on the Internet about linking ProCurve and Cisco, and even official documentation from HP and from Cisco. But unfortunately...

Setup MSTP on Cisco.

For starters, if you have not read Part 1: "MSTP on HP ProCurve", I advise you to read it! There I described the basic principles of setting up MSTP on switches, which are common to HP and Cisco. Here is a brief list of recommendations from Cisco and from common sense for configuring MSTP:

Use the same "region" in all your network.
Minimum number of instances.
Set up the priorities for the "root bridge".
Divide the entire possible VLAN range among the "instances" in advance.
All instances on all switches in the region must contain the same list of VLANs! On Cisco, VTP v3 helps here: it can distribute not only VLANs but also MSTP instances. HP ProCurve has only GVRP, which is similar to VTP v1/v2 but does not work with MSTP, so instance settings have to be synchronized between switches by hand.
The region name and config revision number must be the same throughout the network!
Allow an identical list of VLANs on all trunks between switches!

Also there are rules for connecting ProCurve and Cisco:

Cisco has supported standard 802.1s MSTP only since 2005, so make sure your IOS is newer than that. In general, updating firmware on all equipment is a good idea :)
Do not confuse pre-standard MST with MSTP: they are not compatible.
Verify that native/untagged VLAN 1 is set up on the trunks between Cisco and HP.

Following these rules will let your Spanning Tree topology work stably, balance load between links, and avoid recalculating without a critical need and dropping your network :)

Check that the trunk between Cisco and ProCurve permits all the necessary VLANs, and then proceed:
conf t
spanning-tree mst configuration
Do everything as on the ProCurve. The region name is the same as on ProCurve, the same as in our whole region:
name H2SO4
The config revision number must be the same across the whole network:
revision 1
Divide all VLANs into two "instances" according to the load on them in my network. In our case, as on ProCurve:
instance 1 vlan 1-35,101,111-500,1001-4094
instance 2 vlan 36-100,102-110,501-1000
Without leaving the MST configuration mode, check the resulting configuration:
show pending
Pending MST configuration
Name [H2SO4]
Revision 1 Instances configured 3

Instance Vlans mapped
-------- ---------------------------------------------------------------------
0 none
1 1-35,101,111-500,1001-4094
2 36-100,102-110,501-1000
-------------------------------------------------------------------------------
Type "exit" or press CTRL-Z to exit and apply the configuration:
exit

If Cisco is the root in the MSTP region, or your network is Cisco-only, specify that our switches are root for specific instances in the MSTP region, as we did on HP in the last article, and set the priorities for the instances:

Cisco 1/3:
conf t
spanning-tree mst 1 root primary
spanning-tree mst 0-1 priority 0
spanning-tree mst 2 priority 4096
Cisco 2/4:
conf t
spanning-tree mst 2 root primary
spanning-tree mst 0-1 priority 4096
spanning-tree mst 2 priority 0

In our configuration with Cisco and HP there is no need to set these priorities on Cisco, because we use ProCurve as the root bridge, and all the best priorities are on the ProCurve!

When we have finished configuring MSTP on all the Cisco switches of the group, we need to activate MSTP:
spanning-tree mode mst

Everything should work if you followed the instructions and your network is not hiding an unexpected surprise, as happened in mine :)
I spent a few days trying to understand why MSTP between Cisco and HP did not work! In the end, a Cisco 3560 that everyone had forgotten about was found in a dark corner, working in PVST mode... I never imagined that BPDUs could fly through the entire network topology and spoil my life. In short, BPDUs were flying through the network and drove the Cisco crazy when I enabled MSTP:

000068: 00:49:18: %SPANTREE-2-PVSTSIM_FAIL: Blocking root port Gi0/8: Inconsistent inferior PVST BPDU received on VLAN 110, claiming root 32878:0015.c6d7.9900

Gi0/7 Mstr BKN*20000 128.7 P2p Bound(PVST) *PVST_Inc
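The odd-looking root priority 32878 in the message above can be decoded. As far as I understand, with the extended system ID enabled the 16-bit priority field carries the configured priority (a multiple of 4096) in the upper 4 bits and the VLAN or instance number in the lower 12 bits. The helper below is only a sketch under that assumption; its name is made up for illustration.

```python
from typing import Tuple


def split_bridge_priority(value: int) -> Tuple[int, int]:
    """Split a 16-bit bridge priority into (configured priority, system ID extension).

    Assumes the extended system ID layout: upper 4 bits hold the priority,
    lower 12 bits hold the VLAN or MST instance number.
    """
    if not 0 <= value <= 0xFFFF:
        raise ValueError("bridge priority is a 16-bit field")
    return value & 0xF000, value & 0x0FFF


if __name__ == "__main__":
    # Priority claimed by the foreign PVST root on VLAN 110 in the log above.
    print(split_bridge_priority(32878))
    # Priorities seen for instance 2 in the interface detail output.
    print(split_bridge_priority(32770))
    print(split_bridge_priority(2))
```

So 32878 is simply the default priority 32768 plus VLAN 110, which is one more hint that the sender is a switch running per-VLAN Spanning Tree with default settings.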

By the way, ProCurve has a pvst-filter option; perhaps it can help without hunting for the source of the wrong BPDUs.

So, what I did... I turned on debugging of received BPDUs on the Cisco:

term mon
debug spanning-tree bpdu receive

And I saw this:

002836: Jan 5 16:36:57.625: STP: MST0 rx BPDU: config protocol = mstp, packet from GigabitEthernet0/8 , linktype IEEE_SPANNING , enctype 2, encsize 17
002837: Jan 5 16:36:57.625: STP: enc 01 80 C2 00 00 00 00 1B 3F 58 31 EF 00 89 42 42 03
002838: Jan 5 16:36:57.625: STP: Data 000003023C1000001B3FC1A800000000001000001B3FC1A80080110000140002000F00
002839: Jan 5 16:36:57.634: STP: MST0 Gi0/8:0000 03 02 3C 1000001B3FC1A800 00000000 1000001B3FC1A800 8011 0000 1400 0200 0F00
002840: Jan 5 16:36:58.238: STP: MST0 rx BPDU: config protocol = mstp, packet from GigabitEthernet0/8 , linktype SSTP , enctype 3, encsize 22
002841: Jan 5 16:36:58.238: STP: enc 01 00 0C CC CC CD 9C 4E 20 B2 2E 98 00 32 AA AA 03 00 00 0C 01 0B
002842: Jan 5 16:36:58.238: STP: Data 000000000080649C4E20B22E800000000080649C4E20B22E8080180000140002000F00
002843: Jan 5 16:36:58.238: STP: MST0 Gi0/8:0000 00 00 00 80649C4E20B22E80 00000000 80649C4E20B22E80 8018 0000 1400 0200 0F00

IEEE_SPANNING is normal.
But SSTP is abnormal. It is not our MSTP, it is a foreign Spanning Tree, and the source of this packet has to be found!
Finding the source of the packet is very easy. The first 6 bytes in the "enc" line are the destination address, which is always the same: 01 00 0C CC CC CD. The next 6 bytes are the sender's MAC address: 9C 4E 20 B2 2E 98. In my case it was the Cisco 3560. After I disabled STP on it, everything returned to normal and MSTP started working.
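If there are many such lines in the debug output, the addresses can be pulled out with a few lines of Python. This is only a sketch: the function and dictionary names are my own, and the classification covers just the two destination addresses that appear in the debug output above.

```python
# Destination MAC addresses seen in the debug output above.
KNOWN_DESTINATIONS = {
    "01:80:c2:00:00:00": "IEEE Spanning Tree (normal for MSTP)",
    "01:00:0c:cc:cc:cd": "Cisco SSTP / PVST+ (foreign to our MSTP region)",
}


def parse_enc(enc: str) -> dict:
    """Extract destination and source MAC from the hex bytes of an 'enc' line."""
    octets = enc.lower().split()
    if len(octets) < 12:
        raise ValueError("need at least 12 octets (two MAC addresses)")
    destination = ":".join(octets[:6])
    source = ":".join(octets[6:12])
    return {
        "destination": destination,
        "source": source,
        "kind": KNOWN_DESTINATIONS.get(destination, "unknown"),
    }


if __name__ == "__main__":
    sstp = "01 00 0C CC CC CD 9C 4E 20 B2 2E 98 00 32 AA AA 03 00 00 0C 01 0B"
    print(parse_enc(sstp))
```

The "source" value is the address to look for in the MAC address tables of your switches, port by port, until you reach the forgotten device.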

Several commands for checking the MSTP configuration on Cisco and HP:
HP:

sh spanning-tree mst-config

MST Configuration Identifier Information

MST Configuration Name : H2SO4

MST Configuration Revision : 1

MST Configuration Digest : 0xF1AD53AD5D69827DFCB5C5B5D00F6D88

IST Mapped VLANs :

Instance ID Mapped VLANs

----------- ---------------------------------------------------------

1 1-35,101,111-500,1001-4094

2 36-100,102-110,501-1000

The MST Configuration Digest is a checksum; it must match across all configurations on HP and Cisco. If it differs, look for differences. As long as the digest differs, MSTP will work through instance 0, and all the ports will be Boundary.

Cisco:

sh spanning-tree mst configuration

Name [H2SO4]

Revision 1 Instances configured 3

Instance Vlans mapped

-------- ---------------------------------------------------------------------

0 none

1 1-35,101,111-500,1001-4094

2 36-100,102-110,501-1000

-------------------------------------------------------------------------------

sh spanning-tree mst configuration digest

Name [H2SO4]

Revision 1 Instances configured 3

Digest 0xF1AD53AD5D69827DFCB5C5B5D00F6D88

Pre-std Digest 0x79EA425B9595B8B88B3E715854CC0CC8
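Comparing 32 hex digits by eye across many switches is tiresome, so the check can be scripted. A minimal sketch, assuming you have already saved the text of the two "show" commands above into strings; the function name is hypothetical, and the regular expression is written only against the two output formats shown in this article.

```python
import re

# Matches "MST Configuration Digest : 0x..." (HP) and "Digest 0x..." (Cisco),
# but not the "Pre-std Digest" line, because the match must start the line.
DIGEST_RE = re.compile(
    r"^[ \t]*(?:MST Configuration )?Digest[ \t]*:?[ \t]*(0x[0-9A-Fa-f]{32})[ \t]*$",
    re.MULTILINE,
)


def extract_digest(show_output: str) -> str:
    """Return the MST configuration digest from saved command output, lowercased."""
    match = DIGEST_RE.search(show_output)
    if match is None:
        raise ValueError("no MST digest found in the output")
    return match.group(1).lower()


if __name__ == "__main__":
    hp = "MST Configuration Revision : 1\nMST Configuration Digest : 0xF1AD53AD5D69827DFCB5C5B5D00F6D88\n"
    cisco = (
        "Revision 1 Instances configured 3\n"
        "Digest 0xF1AD53AD5D69827DFCB5C5B5D00F6D88\n"
        "Pre-std Digest 0x79EA425B9595B8B88B3E715854CC0CC8\n"
    )
    print(extract_digest(hp) == extract_digest(cisco))
```

If the comparison fails for some switch, compare its region name, revision and VLAN-to-instance mapping with the others.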

In this case, MSTP blocked the less favorable path between the switches for all instances, keeping it in reserve in case of a failure.

Some statistics from Cisco:

sh spanning-tree mst 1

##### MST1 vlans mapped: 1-35,101,111-500,1001-4094
Interface Role Sts Cost Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Gi0/18 Desg FWD 20000 128.18 P2p
Gi0/20 Desg FWD 200000 128.20 P2p
Gi0/21 Desg FWD 200000 128.21 P2p
Gi0/22 Desg FWD 200000 128.22 P2p
Gi0/23 Desg FWD 200000 128.23 P2p
Gi0/24 Altn BLK 20000 128.24 P2p
Po1 Root FWD 20000 128.36 P2p

sh spanning-tree mst 2

##### MST2 vlans mapped: 36-100,102-110,501-1000
Interface Role Sts Cost Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Gi0/24 Altn BLK 20000 128.24 P2p
Po1 Root FWD 20000 128.36 P2p

Documentation for configure HP and Cisco equipment together: from HP and from Cisco

We have assembled the switches into a network. In the following articles we will talk about configuring ports for users, add protection against users and loops, and build fault-tolerant internet access and a fault-tolerant mail relay.

We will talk about BPDUGuard, BPDUFilter, PortFast, and some peculiarities of using these features on Cisco and HP ProCurve.
We will build a cluster of two servers based on FreeBSD + CARP to distribute Internet access to users.
We will build a cluster of two Debian / Ubuntu servers + UCARP for a mail relay that passes mail between the Internet and the company's mail server or cluster.
We will build a cluster of two Cisco devices + HSRP for two channels from different internet providers.

Sorry for my English...

Saturday, January 2, 2016

UnixDaemonReloader - Update 2016.01.03

In Russian

Updated version of the program. Added new features:

Delay before running the script;
The ability to run a "pre-app" script before the main script, to check the configuration or perform other actions, and, depending on the response returned by the "pre-app" script, either restart the service or run the "error-script". All these parameters are optional.

Full article about Unix Daemon Reloader

Configuration file:

Update from 2016.01.03:
Added the parameters "pre-app script", "result of pre-app script" and "error script" to the list of watched files. The "pre-app" script must return its result to stdout, for example: "OK" :) If the returned value equals the value from the configuration file, the script that restarts or reloads the service is run; otherwise, once the allowed number of "pre-app" attempts is used up, the "error script" is run. See README.md for the new WatchList syntax.
PS: You can add a syntax check and a backup of the configuration file to the "pre-app script". The "error script" may send an e-mail or SMS and restore the config from the backup copy.
Added the parameter UDR_ScriptsPath, which points to the path of the pre-app scripts.
Added the parameter UDR_PreAppAttempt, which sets how many times the "pre-app" script is executed before the "error-script" is run or the attempts stop.
Fixed the restart of all services after the first initialization of the file database.

Update from 2015.12.23:

UDR_PauseBefore - pause before running the script (seconds). This setting saves your daemons from "your own hands". If, while editing a configuration file, you accidentally save it with an error or unfinished, you have time to correct the error before the daemon is restarted.

The full description of the configuration file is in: Full article about Unix Daemon Reloader

Source codes:
UnixDaemonReloader Source Code

Compiled binary files for FreeBSD and Linux:
UnixDaemonReloader on SourceForge

UnixDaemonReloader on My Google Drive

Sunday, December 27, 2015

Building a fault tolerant network infrastructure for your company.

Part 1: Spanning Tree Protocol. MSTP on HP ProCurve

Part 2: " Connect Cisco and HP ProCurve using MSTP"

Part 3: "Additional tuning for boundary and user ports on Cisco and HP ProCurve"

In Russian

In the previous articles we solved the problems created by unmanaged network hardware. In this article we begin a series for "novice professionals" and build a fault-tolerant network based on expensive equipment with more nice features :)

The task is to create a network with no single point of failure. Every switch is connected by at least two links to two other switches, and every server is connected to two switches. For now the scheme does not include channels to the internet, VPN channels or telephony; redundancy of those channels will be considered later. To complicate our scheme, the network contains equipment of different brands. It is good equipment, but it is different and not fully compatible for some protocols. It will be Cisco and HP ProCurve. Sometimes it happens like this: you get ready for Cisco 6000, then you are told that it will be Huawei instead of Cisco, and in the end you get HP ProCurve... ;)
Using equipment from different brands in one network is not the best idea! We should avoid a zoo of network equipment. But if it has happened, you have to be able to configure it :) In fact, our scheme does not hide big problems. First, ProCurve is good equipment; the whole basic LAN is built entirely on HP, while the Cisco switches perform only additional functions and no key tasks. We should not have many problems during the project.
Of course, I am a devoted fan of Cisco. As is well known, those who love VTP, HSRP and EIGRP use GVRP, VRRP and OSPF with tears and pain :)

This is our network scheme:

The Cisco equipment is not young, but it is Gigabit. The HP ProCurve is new and modern, all with 10G links, and HP will be the main core of the network. Cisco is used as additional equipment for various servers that do not generate a lot of traffic. Cisco and HP are also connected through multiple trunks, though without excess.

We collect, assemble, connect, turn on, make the basic settings!
All additional links between switches must be physically disconnected until we have set up Spanning Tree. Otherwise multiple loops will kill our network.
Configure GVRP, and grieve without the beautiful VTP :)
Allow an identical list of VLANs on all trunks between switches! The lists must be identical so that STP does not paralyze our network by blocking a VLAN on the trunks where it is permitted and forwarding it on the trunks where it is not.
Configure a VRRP virtual IP on HP1 and HP2 for the user VLANs on the different floors. The default gateway for users must always be alive as long as at least one of HP1 and HP2 is working. Oh, where are you, HSRP? :)

Configure Spanning Tree on HP ProCurve.

We will configure MSTP, because it is the most logical choice for a large network with a large number of VLANs. In MSTP, VLANs can be divided among "instances", and there are as many spanning-tree processes as instances you define. In PVST the number of processes equals the number of VLANs in your network, so the more VLANs you have, the more memory is spent. For example, with 100 VLANs you will have 100 processes, which is a horror.

Although we are configuring ProCurve, we will use Cisco's recommendations for configuring MSTP. Cisco recommends using the same "region" across your whole network, the minimum number of instances, and explicitly set priorities for the "root bridge". If you do not know what "region", "root bridge" and "instances" are, first read this: Wikipedia: Spanning Tree Protocol - it will help you understand everything :)

I also highly recommend dividing the entire possible VLAN range among the "instances" in advance! Every later change to an "instance" (when you already have a working network) leads to a recalculation of the network topology and an unpleasant interruption of the network. And if the "instances" become different on different switches, everything falls apart: the topology is recalculated into the single "instance" 0 for communication between the switches whose settings do not match.
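Before typing a mapping into every switch, it is worth checking on paper that no VLAN landed in two instances and none was left behind in instance 0. A small sketch of such a check follows; the function names are hypothetical, the VLAN range is assumed to be 1-4094, and the parser accepts both the comma-separated lists used on Cisco and the space-separated lists used on ProCurve.

```python
def parse_vlans(spec: str) -> set:
    """Expand a VLAN list such as '1-35,101' or '1-35 101' into a set of IDs."""
    vlans = set()
    for part in spec.replace(",", " ").split():
        first, _, last = part.partition("-")
        # A single VLAN has no "-", so "last" is empty and we reuse "first".
        vlans.update(range(int(first), int(last or first) + 1))
    return vlans


def check_mapping(instances: dict) -> tuple:
    """Return (VLANs mapped to more than one instance, VLANs left in instance 0)."""
    seen, overlap = set(), set()
    for spec in instances.values():
        vlans = parse_vlans(spec)
        overlap |= seen & vlans
        seen |= vlans
    unmapped = set(range(1, 4095)) - seen  # assumed VLAN range: 1-4094
    return overlap, unmapped


if __name__ == "__main__":
    # The mapping used in this series of articles.
    mapping = {
        1: "1-35,101,111-500,1001-4094",
        2: "36-100,102-110,501-1000",
    }
    overlap, unmapped = check_mapping(mapping)
    print(len(overlap), len(unmapped))
```

For the mapping used in this series both sets come out empty: every VLAN belongs to exactly one of the two instances.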

So, let's start! We create two instances, because the main links between switches come in pairs, and having more instances than links makes no sense. We confirm that spanning tree is disabled and start the configuration:

Configure root HP1:
Set the name of the MSTP region. It must be the same throughout the network:
spanning-tree config-name "H2SO4"
The config revision number must be the same across the whole network:
spanning-tree config-revision 1
I divide all VLANs into two "instances" according to the load on them in my network, distributing the load evenly between the instances:
spanning-tree instance 1 vlan 1-35 101 111-500 1001-4094
spanning-tree instance 2 vlan 36-100 102-110 501-1000
This switch will be the root for instance 1:
spanning-tree instance 1 root primary
The overall priority of the switch; it is the root bridge of the spanning tree region:
spanning-tree priority 1

Configure root HP2:
spanning-tree config-name "H2SO4"
spanning-tree config-revision 1
spanning-tree instance 1 vlan 1-35 101 111-500 1001-4094
spanning-tree instance 2 vlan 36-100 102-110 501-1000
Everything here is the same as for HP1, but this switch is the root for instance 2, and its priority as root bridge in the region is lower: 2
spanning-tree instance 2 root primary
spanning-tree priority 2

Configuring the floor switches HP Ax:
spanning-tree config-name "H2SO4"
spanning-tree config-revision 1
spanning-tree instance 1 vlan 1-35 101 111-500 1001-4094
spanning-tree instance 2 vlan 36-100 102-110 501-1000
The configuration must be the same as for HP1 and HP2, but do not set any priorities. These are switches for users on the floors; they have low priority.

Enable spanning tree on all switches:
spanning-tree enable

After entering this command, you lose the connection to the switch until the Spanning Tree topology has been calculated and all ports are enabled. After that we can connect the additional links between the switches and wait for them to activate :)

All configuration on HP ProCurve is complete! We can now look at the statistics in the console of the switches :)

So, we look at HP1:
sh spanning-tree
Multiple Spanning Tree (MST) Information

STP Enabled : Yes
Force Version : MSTP-operation
IST Mapped VLANs : 1025-4094
Switch MAC Address : 001871-b6a000
Switch Priority : 32768
Max Age : 20
Max Hops : 20
Forward Delay : 15

Topology Change Count : 9
Time Since Last Change : 87 secs

CST Root MAC Address : 001871-b6a000
CST Root Priority : 32768
CST Root Path Cost : 0
CST Root Port : This switch is root

IST Regional Root MAC Address : 001871-b6a000
IST Regional Root Priority : 32768
IST Regional Root Path Cost : 0
IST Remaining Hops : 20

HP1, as configured, became the root of the MSTP region.

sh spanning-tree instance 1

E1 10GbE-SR 2000 128 Designated Forwarding 001b3f-c1a800
E2 10GbE-SR 2000 128 Designated Forwarding 001b3f-c1a800
E3 10GbE-SR 2000 128 Designated Forwarding 001b3f-c1a800
E4 Auto 128 Disabled Disabled
F1 10GbE-SR 2000 128 Designated Forwarding 001b3f-c1a800
F2 10GbE-SR 2000 128 Designated Forwarding 001b3f-c1a800
F3 Auto 128 Disabled Disabled
F4 Auto 128 Disabled Disabled

sh spanning-tree instance 2

E1 10GbE-SR 2000 128 Alternate Blocking 001b3f-582100
E2 10GbE-SR 2000 128 Alternate Blocking 001b3f-57c800
E3 10GbE-SR 2000 128 Alternate Blocking 0019bb-11ac00
E4 Auto 128 Disabled Disabled
F1 10GbE-SR 2000 128 Alternate Blocking   0019bb-0e2b00
F2 10GbE-SR 2000 128 Root Forwarding 001871-b6a000
F3 Auto 128 Disabled Disabled
F4 Auto 128 Disabled Disabled

In instance 2, all paths to the floor switches are marked as alternate and blocked.

Look at HP Ax located on the floors:

sh spanning-tree ins 1

L1 10GbE-SR 2000 128 Root Forwarding 001b3f-c1a800

L2 10GbE-SR 2000 128 Alternate Blocking 001871-b6a000

sh spanning-tree ins 2

L1 10GbE-SR 2000 128 Designated Forwarding 001b3f-582100

L2 10GbE-SR 2000 128 Root Forwarding 001871-b6a000

Result: instance 2 is blocked on HP1 and reaches the floor switches through HP2. The floor switches block instance 1 towards HP2 and receive it from HP1. The load is distributed across the two links, which is exactly what we wanted :) We also need a second trunk between HP1 and HP2 :)

Our scheme. Instance 1 is marked in blue, instance 2 in red:

Well, we have brought up MSTP between the HP switches. Next time we will try to connect a Cisco Catalyst to this scheme :)

Saturday, December 19, 2015

Broadcast storm or how to win "the small scoundrels" in your network. Part 2 Part 1

So, we continue to fight the storm. Today I will share my experience of using dedicated loop protection on Cisco and HP ProCurve.
Setting broadcast-limit on HP and storm-control on Cisco helped: the situation in the network became much better, but it did not return to normal completely. Both Cisco and HP have specialized tools for fighting loops, including loops on "unmanaged" equipment.

HP ProCurve.

On ProCurve this function is called loop-protect, and it relies on a simple and reliable technique: the switch sends a broadcast packet out of a port, and if that packet comes back, there is a loop behind the port. A switch forwards a broadcast packet to all ports except the one it was received on, so in a network with a correct topology the packet cannot return. This makes it a simple and reliable way to detect and block loops, even if the port is connected to a whole garland of unmanaged switches.
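To make the idea concrete, here is a toy Python model of the check described above. It is only a sketch of the principle and not HP's implementation: the class name Switch, the function unmanaged_switch and the integer probe IDs are all my own inventions for the illustration.

```python
import itertools


class Switch:
    """Toy model of a probe-based loop check (an illustration, not HP's code)."""

    def __init__(self, ports):
        self.enabled = {port: True for port in ports}
        self._ids = itertools.count(1)
        self.sent = {}  # probe id -> port it was sent from

    def send_probe(self, port):
        """Send a uniquely numbered detection packet out of a port."""
        probe_id = next(self._ids)
        self.sent[probe_id] = port
        return probe_id

    def receive(self, port, probe_id):
        """Disable the receiving port if one of our own probes comes back."""
        if probe_id in self.sent:
            self.enabled[port] = False
            return True
        return False


def unmanaged_switch(looped, probe_id):
    """A dumb switch floods the frame; with a loop it ends up back on the uplink."""
    return [probe_id] if looped else []


sw = Switch(["A5", "A6"])
for port, looped in (("A5", True), ("A6", False)):
    probe = sw.send_probe(port)
    for frame in unmanaged_switch(looped, probe):
        sw.receive(port, frame)

print(sw.enabled)  # A5 sits in front of the looped switch, A6 does not
```

In this model only the port in front of the looped device is disabled, while the healthy port stays up.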
Configuration of this feature is very simple:

loop-protect A1-A24,B1-B24

A1-A24,B1-B24 is the range of ports the setting is applied to. The function also has an additional parameter that describes what to do when a loop is detected:
... receiver-action send-disable - block the port that sent the detection packet,
... receiver-action send-recv-dis - block both the port that sent the detection packet and the port that received it.

Example: "loop-protect A1-A24,B1-B24 receiver-action send-disable" blocks the port when a loop is detected and does not restore it automatically. Automatic port recovery is probably not needed anyway, because you first have to find and remove the loop, and only then enable the port again: int A17 enable
You can also specify global settings for loop-protect, such as timers:

loop-protect disable-timer - how long after blocking a port the switch waits before trying to restore it,

loop-protect mode port / vlan - whether to work per port or per VLAN,

loop-protect transmit-interval 1-10 - the interval between sending the "detection packets" to a port,

trap, vlan and PORT-LIST - these are clear without explanation :)

Today I tested the loop-protect technology and it works perfectly! A port connected to a D-Link with a loop between two of its ports was blocked by the ProCurve almost immediately:

Cisco.

Spanning-tree loopguard will not help us. This technology is based on detecting the loss of BPDUs; it is meant for links between switches and does not protect against loops on unmanaged network hardware.

UDLD:

Enable it globally for all interfaces:

conf t

udld enable

Or on each interface separately:

conf t

interface GigabitEthernet0/1

udld port enable

This technology works well, but its interval is longer than on HP: 15 seconds by default against 5 on HP (on Cisco it can be changed with the global udld message time command). For this reason the network can shake a little before the port is blocked. UDLD also has two modes of operation, normal (just enable) and aggressive, but for our small task aggressive mode is not needed, so we will not review it :)

In general, it is a good idea to enable BPDU Guard on Cisco client access ports. BPDU Guard blocks the port when an incoming BPDU is seen, for example in the case of a simple loop. This is already fine tuning of STP; in some of the following articles I will describe STP in more detail. BPDU Guard, BPDU Filter and much more I can tell :)

For the tests I used an HP ProCurve 5412zl and a Cisco 2960G. As the unmanaged "small scoundrel" I used an 8-port unmanaged D-Link with a loop between its ports.

Friday, December 18, 2015

UnixDaemonReloader - restarting daemons after modification configuration files!
(Update 2016.01.03)

The series "my crafts", or "reinventing the wheel", still continues ;)

Recently I began building clusters for all the services of the company, such as mail, proxy, VoIP, etc. This is needed for load balancing and fault tolerance, and it is useful both for my own experience and knowledge and for my company :) I had the task of syncing configuration files between cluster nodes and also of reacting to changes in those files... For example, the config of the postfix mail server is changed on the master host and the file is replicated to the second node of the cluster. So, what is next? We must somehow restart postfix, or rather reload it. So I created some scenarios for synchronizing and comparing files to find differences, and I wrote many scripts and methods, but there was no unified scheme. One morning I woke up and decided to write a program that works as a daemon, rereads its own configuration before each work cycle, and performs the prescribed actions when the specified files change. No sooner said than done! The name of this service is UnixDaemonReloader :)

Configuration file:

The configuration file is very simple. First come the path to the Unix shell and the parameter that lets it execute an external command, like so: /bin/sh -c "ps ax". Then comes a list of entries describing the files and directories to track:

["/directory", "file", "action", "pre-app script", "result of pre-app script","error script"],
["/directory", "mask*of*the*files*", "action", "", "",""],
["/directory", "!all*files*except*this,!except*this,!and*except*this", "action"]

Update from 2016.01.03:
Added the parameters "pre-app script", "result of pre-app script" and "error script" to the list of watched files. The "pre-app" script must return its result to stdout, for example "OK" :) If the returned value equals the value from the configuration file, the script that restarts or reloads the service is run; otherwise, once the allowed number of "pre-app" attempts is used up, the "error script" is run. See README.md for the new WatchList syntax.
PS: You can put a syntax check and a backup of the configuration file into the "pre-app script". The "error script" may send an e-mail or SMS and restore the config from the backup copy.
Added the parameter UDR_ScriptsPath, which points to the directory with pre-app scripts.
Added the parameter UDR_PreAppAttempt, the number of times the "pre-app" script is executed before the "error script" is run or the attempts stop.
Fixed the restart of all services after the first initialization of the file database.
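The "pre-app" flow described in this update can be sketched roughly like this. The real program is written in Go; this is only a small Python illustration of the decision logic, and the function name run_with_pre_check, its parameters and the returned labels are hypothetical, not taken from the UnixDaemonReloader sources.

```python
import subprocess


def run_with_pre_check(pre_app, expected, action, error_script, attempts=3):
    """Run `action` only if `pre_app` prints `expected`; otherwise fall back.

    Returns "action", "error" or "gave up" so the caller can see what happened.
    """
    for _ in range(attempts):
        result = subprocess.run(["/bin/sh", "-c", pre_app],
                                capture_output=True, text=True)
        if result.stdout.strip() == expected:
            # The check passed, so it is safe to restart or reload the service.
            subprocess.run(["/bin/sh", "-c", action])
            return "action"
    if error_script:
        # All attempts failed: notify someone or restore a backup.
        subprocess.run(["/bin/sh", "-c", error_script])
        return "error"
    return "gave up"


# Harmless stand-ins for real scripts: `echo OK` plays the syntax check.
print(run_with_pre_check("echo OK", "OK", "true", "true"))
print(run_with_pre_check("echo BROKEN", "OK", "true", "true", attempts=2))
```

The design point is that the service is touched only after the check has printed exactly the expected value, so a config saved with a syntax error never reaches the restart step.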

Update from 2015.12.23:

UDR_PauseBefore - a pause (in seconds) before running the script. This setting saves your daemons from "your own hands": if, while editing a configuration file, you accidentally save it with an error or unfinished, you have time to correct the mistake before the daemon is restarted.

UDR_ScriptsPath - the path to the "pre-app" scripts,
UDR_PreAppAttempt - the number of attempts to execute the "pre-app" script,
UDR_PauseBefore - a pause (in seconds) before running the script. This setting saves your daemons from "your own hands": if, while editing a configuration file, you accidentally save it with an error or unfinished, you have time to correct the mistake before the daemon is restarted,
Sleep_Time - how long to sleep between file checks,
SQLite_DB - the path to the SQLite database that stores the checksums of the files.
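The checksum database can be pictured with the following Python sketch. It is not the code of UnixDaemonReloader (which is written in Go): the table name sums, the function names and the choice of SHA-256 are assumptions made only for this illustration.

```python
import hashlib
import sqlite3


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def changed_files(db, paths):
    """Return the paths whose checksum differs from the stored one.

    The new checksum is stored afterwards, so each change is reported once.
    """
    db.execute("CREATE TABLE IF NOT EXISTS sums "
               "(path TEXT PRIMARY KEY, digest TEXT)")
    changed = []
    for path in paths:
        digest = file_digest(path)
        row = db.execute("SELECT digest FROM sums WHERE path = ?",
                         (path,)).fetchone()
        # A file seen for the first time is only recorded, not reported,
        # so nothing is restarted while the database is being initialized.
        if row is not None and row[0] != digest:
            changed.append(path)
        db.execute("INSERT OR REPLACE INTO sums VALUES (?, ?)", (path, digest))
    db.commit()
    return changed
```

A daemon loop would call changed_files once per Sleep_Time interval and run the configured action for every path it returns.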

The actions may be different, not necessarily a restart, reload or "kill -HUP". For example, the program can send a message to the administrator :)
This program is not only for clusters but for any other system as well. It is a helper that saves you from manually restarting services whose configs change frequently :)

The program is written in Go. It can be compiled from source for Linux, BSD, Mac, Android and others.

Source code:
  UnixDaemonReloader Source Code

Compiled binaries for FreeBSD and Linux:
  UnixDaemonReloader on SourceForge
  UnixDaemonReloader on My Google Drive

Saturday, December 12, 2015

Broadcast storm or how to win "the small scoundrels" in your network. Part 1 Part 2

For many people a broadcast storm probably belongs to the realm of fiction, as unrealistic as the existence of aliens or someone hacking your network. But the hack and the storm, at least, are absolutely real, and they can happen at any time, even at what you would consider the worst possible moment :)

So, how does a switch work when all is well? The switch builds a table of MAC addresses in which each address corresponds to a port, so client traffic goes not to all ports, as on old hubs, but to a specific port according to that table. Before the switch has learned the client's port, the first frame for that client is sent to all ports; when the client replies, the switch puts the client's MAC address into the table. After that, the switch communicates with the client through the specific port.
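The learning behaviour described above fits into a few lines of Python. This is a toy model for illustration only: it has no ageing, no VLANs and no real frames, and the class name LearningSwitch and the host names are made up.

```python
class LearningSwitch:
    """Toy model of MAC learning and forwarding (no ageing, no VLANs)."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port

    def handle(self, in_port, src_mac, dst_mac):
        """Learn the sender's port and return the ports the frame leaves from."""
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the ingress one.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # The destination lives behind the ingress port: nothing to forward.
            return []
        return [out_port]


sw = LearningSwitch(["A1", "A2", "A3"])
print(sw.handle("A1", "pc-1", "pc-2"))  # pc-2 is unknown, so the frame is flooded
print(sw.handle("A2", "pc-2", "pc-1"))  # the reply goes straight to A1
print(sw.handle("A1", "pc-1", "pc-2"))  # now pc-2 is known as well
```

The flooding branch is the one that matters for the rest of this article: as long as the destination stays unknown, every copy of a frame is sent to all other ports.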
But one day an inattentive employee creates a loop in your network, or there is a network attack, or some crafty equipment failure, and everything goes bad... The switch sends a packet, but instead of an answer the packet comes back in several copies, and the switch, not knowing which port the MAC address belongs to, dutifully sends the packets out of all ports again and again gets many copies back... Packets multiply very quickly, traffic grows like an avalanche, ports are overloaded, trunks fall, CPUs are overloaded, Spanning Tree collapses after trying until that moment to block everything, and the avalanche smashes into adjacent segments and drowns them. Your network no longer exists...
In this situation there are not many options: your network is paralyzed, and there is simply nowhere to monitor from or search for the source of the storm, because there is no network :) One option I see is to shut down the network segments one by one, or the opposite: shut down all segments and reconnect them one by one, to find out which segment was the source of the chaos. Then search within the segment, then the links, then the ports. It is a very sad and very slow way to solve the problem, and the downtime can be just awful! So we need to prepare for possible malfunctions in order to minimize the harm from them!

To begin with, you should never switch off Spanning Tree on your equipment! If possible, tune the protocol options according to the topology of your network. If you have no experience with STP and your network is small, leave the default settings: STP copes with almost all problems! Why "almost"? Because STP cannot cope with a loop on unmanaged hardware that has no STP support...

It would seem, what could overload or even paralyze a network like this:

10 Gigabit links between switches, 1 Gigabit links to end users, plus a configured and tuned MSTP. There are no problems with loops on the main network equipment: MSTP blocks any double links or loops there. Moreover, all of the switches have double links between them for fault tolerance, and these work successfully through MSTP. I will not consider possible attacks and malfunctions here. I will consider the case where your network contains a "stupid" unmanaged switch like a D-Link... and an employee makes a loop on it... STP and MSTP are powerless in this situation :) This "small scoundrel" will bring down your network in a few minutes or even a few tens of seconds :)
It seems to me that there is only one way to protect yourself from unmanaged equipment 100%: ban its use in your network. But if banning it and throwing it in the trash is impossible for some reason, then you should configure your equipment to fight the storm.
Modern switches have tools such as storm-control and broadcast-limit, and perhaps some variations of the same functions on equipment from other manufacturers.
With broadcast-limit set to 10% on HP and storm-control broadcast level 10% on Cisco, my network stays about 80% alive with loops on the "stupid" equipment. The only thing that bothers me is that STP keeps misbehaving. It still blocks the wrong ports, and the storm from the blocked port continues anyway, though much weaker. Perhaps this is because BPDUs are still transmitted while the port is blocked by STP, so after the initiator is blocked the broadcast storm turns into a BPDU storm.
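To get a feeling for what a 10% limit means in frames, here is a rough back-of-the-envelope calculation in Python. It assumes that the limit is a percentage of the port bandwidth, that the storm consists of minimum-size 64-byte frames, and that each frame costs 20 extra bytes on the wire (preamble, start delimiter and inter-frame gap). The function name is my own.

```python
PREAMBLE_AND_GAP = 20  # bytes of preamble, start delimiter and inter-frame gap


def max_frames_per_second(link_bps, frame_bytes=64, limit_percent=100):
    """Frames per second that fit into `limit_percent` of the link bandwidth."""
    bits_on_wire = (frame_bytes + PREAMBLE_AND_GAP) * 8
    # Integer arithmetic only, rounded down to whole frames.
    return link_bps * limit_percent // (100 * bits_on_wire)


gigabit = 10**9
print("line rate:", max_frames_per_second(gigabit), "frames/s")
print("10% limit:", max_frames_per_second(gigabit, limit_percent=10), "frames/s")
```

Under these assumptions a 1 Gigabit port can still pass a considerable stream of small broadcast frames even with the limit in place, which fits the observation that the storm becomes weaker but does not disappear.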
Here's an example:

A loop appears on port A5, the port is blocked immediately, and after it port L2 is blocked as well. This does not paralyze the network, because there are additional links, but it is still not very nice, because in this way an important port can get blocked. Some problems remain, but thanks to "broadcast limit" the storm is very weak, and I can already say for sure that limits on broadcast traffic help! Not 100%, but they help a lot :) Next we will configure MSTP, but more on that next time :)

Here is an example of ping while the loop on the D-Link is active. Without these settings ping did not work at all. So, it is a preliminary victory :)