Channel: All Ethernet Switching posts

Switch in 4300-MP virtual chassis shuts down PPMD_PFE_SHUTDOWN


We have a fairly new EX4300-MP virtual chassis, running 18.4R2.7. No one is in the building most days at this point. We had a switch in that chassis disappear. Power and stack cables were all seated and good. It did not lose redundant power, it just took a nap. Doing a "request system reboot member 2" did nothing. We had a facilities person go into the IDF, unplug the switch, and cold boot it. The switch rejoined the community. Looking at the logs, we see this error from 4 days ago:

4300VC-IDF ppmd[6710]: PPMD_PFE_SHUTDWN: PPMD: Connection Shutdown/Closed with  PFE: fpc2

 

Anyone familiar with this type of error?  Appreciate any insight.

 


Re: Switch in 4300-MP virtual chassis shuts down PPMD_PFE_SHUTDOWN


Hi jmorrowCSTR,

The PPMD process on the routing engine (RE) maintains a TCP connection with the PPMAN process on the FPC's forwarding engine (PFE). PPMAN's purpose is to handle some low-level protocol packets that do not need the RE to process them, such as LACPDUs and STP BPDUs.

This log message indicates that this connection was lost, which caused the relevant FPC to go down. Please monitor the CPU usage of the RE, and look for anything unusual about how this particular FPC is used (number of LAGs, chance of an L2 loop or ARP flood) if the other FPCs were stable.

Commands:
RE CPU:
show chassis routing-engine
show system processes extensive | except 0.00

FPC:
show chassis fpc


Also, there's a PR that adds an enhancement in handling a jlock hog, which I believe should already be fixed in your Junos version, 18.4R2.7:
https://prsearch.juniper.net/InfoCenter/index?page=prcontent&id=PR1401507

Check these to see whether the symptoms match the PR:
a) Look for the syslog message "kernel: jlock hog detected" prior to the "PPMD_PFE_SHUTDWN" message.
and/or
b) Monitor whether the counter in this output is increasing:
start shell
sysctl -a | grep jlock | grep cnt

That said, if the symptoms match, it may be worthwhile running this case by JTAC, because a PR for this was already fixed. If the symptoms do not match, then it's better to check/monitor the RE/FPC CPU, find the culprit on either side, and fix or replace it.
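Check (a) above can be sketched in Python against a saved copy of the messages log. This is only a rough sketch: the function name, the `window` parameter, and the sample "jlock hog" line are illustrative assumptions, and only the PPMD line is taken from this thread.

```python
def jlock_hogs_before_shutdown(lines, window=500):
    """For each PPMD_PFE_SHUTDWN line, collect 'jlock hog detected'
    lines found within the preceding `window` lines of the log."""
    findings = []
    for i, line in enumerate(lines):
        if "PPMD_PFE_SHUTDWN" in line:
            earlier = lines[max(0, i - window):i]
            hogs = [entry for entry in earlier if "jlock hog detected" in entry]
            findings.append((line, hogs))
    return findings


# Hypothetical sample log: the first line is made up for illustration,
# the second is the message quoted in the original post.
sample = [
    "4300VC-IDF kernel: jlock hog detected",
    "4300VC-IDF ppmd[6710]: PPMD_PFE_SHUTDWN: PPMD: Connection Shutdown/Closed with  PFE: fpc2",
]

for shutdown, hogs in jlock_hogs_before_shutdown(sample):
    print(shutdown)
    print("  jlock hog lines before it:", len(hogs))
```

An empty list of hog lines for a shutdown entry would suggest the symptoms do not match the PR, at least within the chosen window.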

Hope this helps.

Regards,
-r.

--------------------------------------------------

If this solves your problem, please mark this post as "Accepted Solution."
Kudos are always appreciated.

Re: Switch in 4300-MP virtual chassis shuts down PPMD_PFE_SHUTDOWN


Also, the counter might be named slightly differently, something like the generic "net_jlock.hog.cnt" on this EX. In either case, it may be worth quoting this PR and reporting it to TAC in a case if the syslog message is seen.
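To see whether the counter is increasing, two saved samples of the sysctl output can be compared. A minimal sketch, assuming the output uses the usual "name: value" layout and that the counter is named as guessed above (the actual name on a given EX may differ); the function names and sample values are hypothetical:

```python
def parse_counters(text):
    """Parse 'name: value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def increased_counters(before, after):
    """Return counters whose value grew between the two samples."""
    old, new = parse_counters(before), parse_counters(after)
    return {name: new[name] - old[name]
            for name in new if name in old and new[name] > old[name]}


# Hypothetical samples taken some minutes apart.
before = "net_jlock.hog.cnt: 3"
after = "net_jlock.hog.cnt: 5"
print(increased_counters(before, after))
```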

 

Hope this helps.

 

Regards,
-r.


 

Re: EX4300: Framing error with macsec enabled


Hello,

 

I'm from ATAC. I did a quick search in our database and found a PR, currently under investigation, related to framing errors when MACsec is enabled. As per the description, the problem is seen when utilization on the interface is high, around 70% of the bandwidth.

 

I suggest you open a case to confirm whether it is the same problem.

 


 

 

 

 

Re: QFX VC suddenly stops handling packets

Hello ehsab,

Changing the MTU on an interface is a disruptive operation that causes the interface address to be reset, so I don't believe these log messages give much of a clue; they may be expected on any QFX on which you change the MTU. Also, check whether the interface itself flaps after the change. You may just be masking the problem, which is temporarily fixed by this reset, unless you have a technical reason that correlates with MTU, such as large packets getting dropped.

It's better to spend a few minutes troubleshooting while the issue is occurring to get to the bottom of this. Some things that come to mind:
a) Check whether there are protocol adjacencies over the impacted trunks and whether they remain stable.
b) Check whether all traffic is impacted or only some. When you say the VC stops handling packets, does it mean there's ingress but no egress?
c) Check and clear interface statistics, looking at broadcast packet counts and errors, if any.
d) Check whether MAC learning works as expected:
show ethernet-switching table
show ethernet-switching mac-learning-log <<<<<<<<<<<
e) Is STP enabled? Check whether there are frequent topology changes.
Etc.

Hope this helps.

Regards,
-r.


Re: EX2300 not seeing LLDP Neighbor


Hello,

 

The EX switch should be sending LLDP packets as long as the interface has a family configured; it shouldn't matter which VLANs are configured. Have you confirmed that the EX is receiving LLDP packets from the Cisco?

 

You can run these commands to see the LLDP packets. 

 

>monitor traffic interface ge-0/1/2 size 1500 no-resolve

>monitor traffic interface ge-1/1/2 size 1500 no-resolve

 


 

 

Re: QFX VC suddenly stops handling packets


Hi, and thanks for taking the time.

I'll share the information I gathered in both cases.

 

Case 1

> show configuration interfaces ge-0/0/24
mtu 1600;
unit 0 {
    family ethernet-switching {
        interface-mode trunk;
        vlan {
            members [ v804 v805 ];
        }
    }
}

Traffic stops for no obvious reason, no mac-addresses are learnt on that port. The interface has no errors and has not flapped.

> show ethernet-switching interface ge-0/0/24.0
Routing Instance Name : default-switch

Logical          Vlan          TAG     MAC         STP         Logical           Tagging
interface        members               limit       state       interface flags
ge-0/0/24.0                            294912                                     tagged
                 v804          804     294912      Forwarding                     tagged
                 v805          805     294912      Forwarding                     tagged

Looks normal, but no traffic or macs learnt.

Physical interface: ge-0/0/24, Enabled, Physical link is Up
  Interface index: 676, SNMP ifIndex: 573
  Description: 
  Link-level type: Ethernet, MTU: 1600, MRU: 0, Speed: 1000mbps, BPDU Error: None, MAC-REWRITE Error: None, Loopback: Disabled, Source filtering: Disabled,
  Flow control: Disabled, Auto-negotiation: Enabled, Remote fault: Online, Media type: Fiber
  Device flags   : Present Running
  Interface flags: SNMP-Traps Internal: 0x4000
  Link flags     : None
  CoS queues     : 12 supported, 12 maximum usable queues
  Current address: ec:3e:f7:97:8d:db, Hardware address: ec:3e:f7:97:8d:db
  Last flapped   : 2020-04-16 08:17:42 UTC (01:37:03 ago)
  Input rate     : 1304 bps (2 pps)
  Output rate    : 1040 bps (1 pps)
  Active alarms  : None
  Active defects : None
  Interface transmit statistics: Disabled
  Logical interface ge-0/0/24.0 (Index 591) (SNMP ifIndex 574)
    Flags: Up SNMP-Traps 0x24024000 Encapsulation: Ethernet-Bridge
    Input packets : 0
    Output packets: 0
    Protocol eth-switch, MTU: 1600
      Flags: Trunk-Mode

[Attached graphs for ge-0/0/24: bps, non-unicast, packets]

 

 

Case 2

Traffic drops at 20:48:37 (hh:mm:ss).

> show configuration interfaces ae0
description "Trunk to sw01";
aggregated-ether-options {
    minimum-links 1;
    lacp {
        active;
    }
}
unit 0 {
    family ethernet-switching {
        interface-mode trunk;
        vlan {
            members [ v357 v361 v362 v364 v365 v356 v159 ];
        }
    }
}
> show ethernet-switching interface ae0.0
Routing Instance Name : default-switch

Logical          Vlan          TAG     MAC         STP         Logical           Tagging
interface        members               limit       state       interface flags
ae0.0                                  294912                                     tagged
                 v159          159     294912      Forwarding                     tagged
                 v356          356     294912      Forwarding                     tagged
                 v357          357     294912      Forwarding                     tagged
                 v361          361     294912      Forwarding                     tagged
                 v362          362     294912      Forwarding                     tagged
                 v364          364     294912      Forwarding                     tagged
                 v365          365     294912      Forwarding                     tagged

There is OSPF/BGP running over VLAN 159 with an irb interface; I see no OSPF neighbor and the BGP session is down.

No mac-addresses are seen on the port.

Flapping the interface has no effect.

At 22:52:08 I change the MTU from 1514 to 1516 and do a "commit confirmed 4 sync". During those 4 minutes nothing changes, still no traffic. But when the VC rolls back the commit, the traffic flows fine again.

 

[Attached graphs for ae0: bps, non-unicast, packets]

 

STP is globally disabled.

Any ideas what could cause this, or what to look for when it happens? Is there any particular logfile that I should look at or search in?

 

Kind Regards

ehsab

Re: EX4300: Framing error with macsec enabled


Hi

This is actually our case: LLDP and LACP are excluded from MACsec encryption.

 

Thanks


Re: EX4300: Framing error with macsec enabled


Hi

With our configuration, if I SPAN the family ethernet-switching port (ae0), I see the traffic after decryption. I was expecting to see the traffic before decryption, which would have been useful to check whether I receive any control-protocol traffic from the carrier.

Our issue is not related to the mentioned PR, since even with 1% traffic on the link we see the framing error counter increment continuously (1 every 4-6 seconds).

Is there a way to dump the traffic on the EX4300 before decryption?

Thanks for your help

Re: EX4300: Framing error with macsec enabled


@rambo - did you try and filter out LLDP from the ISP or get them to stop?

 

To capture traffic before it hits the EX4300, I think the only method would be to add a tap or a switch/hub and mirror the traffic before it gets to the EX4300.

 

Otherwise turn off MACsec (both ends), mirror, and then re-enable MACsec...

Re: QFX VC suddenly stops handling packets


Hello ehsab,

 

You can keep "set system syslog file messages any any" to see if the device is logging anything at the time. However, from your notes, it seems as though traffic is not even coming in on this interface, is that right? If so, there are a lot of possibilities here. Without incoming traffic, the L3 protocols cannot be up and you won't learn any MAC addresses on the port. So it's better to take a copy of the MAC table ("show ethernet-switching table") in the working state and compare it with the non-working state.
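The MAC table comparison can be sketched in Python, assuming the output of "show ethernet-switching table" was saved to text in both states. The sketch does not rely on exact column positions; it only assumes that each entry line contains a MAC address and a logical interface name. The function names and sample rows are hypothetical.

```python
import re

# Assumption: MACs are printed as aa:bb:cc:dd:ee:ff and interfaces
# as ae<N>.<unit> or ge-/xe-/et-<fpc>/<pic>/<port>.<unit>.
MAC_RE = re.compile(r"\b(?:[0-9a-f]{2}:){5}[0-9a-f]{2}\b", re.I)
IFACE_RE = re.compile(r"\b(?:ae\d+|(?:ge|xe|et)-\d+/\d+/\d+)\.\d+\b")


def mac_entries(text):
    """Extract (mac, logical interface) pairs from saved table output."""
    entries = set()
    for line in text.splitlines():
        mac = MAC_RE.search(line)
        iface = IFACE_RE.search(line)
        if mac and iface:
            entries.add((mac.group(0).lower(), iface.group(0)))
    return entries


def missing_by_interface(working, broken):
    """Group entries present in the working state but absent when broken."""
    result = {}
    for mac, iface in sorted(mac_entries(working) - mac_entries(broken)):
        result.setdefault(iface, []).append(mac)
    return result


# Hypothetical sample rows, not real output from this VC.
working = """v804  00:11:22:33:44:55  D  -  ge-0/0/24.0
v159  00:11:22:33:44:66  D  -  ae1.0"""
broken = """v159  00:11:22:33:44:66  D  -  ae1.0"""

print(missing_by_interface(working, broken))
```

If the missing entries all sit on one logical interface, that points at the port rather than at MAC learning in general.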

 

Also, you've shown two different interfaces altogether (ae0 and ge-0/0/24). Does "ae0" have links in both FPCs, and are both expected to be uplinks, or do they connect to completely different network segments? As far as I can tell, the data so far doesn't give much to go on.

 

Hope this helps.

Regards,
-r.


Re: QFX VC suddenly stops handling packets


Hello Ehsab,

 

It would be good to open a case with JTAC. Those kinds of issues require some live troubleshooting. I suspect several things, but in the end a troubleshooting session is a better approach than just assuming things.

 

Those kinds of situations are most often related to memory leaks, high CPU, DDoS policers, file system corruption, or a misprogramming issue.

 

Possible workarounds could be a mastership change, a reboot of the affected member or all members in a virtual-chassis, an upgrade or a format install.  However, you will need some troubleshooting to determine the root cause.

 

 

Regards,

 

 

Randall

Re: EX2300 not seeing LLDP Neighbor


Hi , 

 

Did you make sure the date is properly configured?  > show system uptime

Can you try to disable it and re-enable it?

Make sure your LLDP statistics are increasing for IN/OUT.

 

Thanks!

Franky

Re: QFX VC suddenly stops handling packets


@mriyaz

I'm not sure if there is any traffic coming in to the interfaces when this happens. I will set up a mirror port to be better able to troubleshoot this and to see what is actually on the interface next time.

I did take a copy of the MAC table when the problem was active and compared it with the table once it was "fixed"; it was only missing the mac-addresses on that port.

ge-0/0/24 is connected to a different segment than ae0, and ae0 only has one member interface (on the same FPC as ge-0/0/24) at the moment.

There must be some logic to why this has happened to two different trunks/interfaces. There are lots of other interfaces and ae's that could be affected; I would like to narrow down what's causing this.

There surely must be some logfiles I could search to find more information?

What happens in the switch/interface when the MTU is changed?

Re: QFX VC suddenly stops handling packets


@randero

Hi and thanks for your reply.

I don't have a valid support contract on the switches, so I'm fairly sure that JTAC won't handle the case.

The workarounds you suggest are a last resort for me :)

 

Kind Regards

ehsab

 


Re: EX4300: Framing error with macsec enabled


It seems I was able to find some spurious traffic coming from 00:02:ab:0e:28:91, sent to the broadcast address and with an unknown ethertype, 0x1021, which shows as Private, Protocol unavailable.

 

The MAC address owner seems to be CTC Union Technologies Co.

 

I'm not sure how I can filter it, I guess only the carrier can do it.
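For anyone wanting to pick such frames out of a raw capture, the match described above (broadcast destination, that source MAC, ethertype 0x1021) can be sketched in Python. This is only an illustration: it assumes untagged Ethernet II frames available as raw bytes, and the function names and the sample frame are hypothetical.

```python
import struct


def parse_header(frame):
    """Return (dst, src, ethertype) from an untagged Ethernet II frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])

    def fmt(raw):
        return ":".join(f"{octet:02x}" for octet in raw)

    return fmt(dst), fmt(src), ethertype


def is_spurious(frame, src_mac="00:02:ab:0e:28:91", ethertype=0x1021):
    """True for broadcast frames from src_mac with the given ethertype."""
    dst, src, etype = parse_header(frame)
    return dst == "ff:ff:ff:ff:ff:ff" and src == src_mac and etype == ethertype


# Hypothetical frame built to match the description, padded to 60 bytes.
sample_frame = bytes.fromhex("ffffffffffff" "0002ab0e2891" "1021") + bytes(46)
print(is_spurious(sample_frame))
```

A frame carrying an 802.1Q tag would show 0x8100 in this position instead, so tagged captures would need the tag skipped first.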

Thanks

Re: QFX VC suddenly stops handling packets


Hi Ehsab,

 

There are some outputs you may check to find a root cause.

 

>show system processes extensive | except 0.00

>show chassis fpc 

>show system core-dumps

>request pfe execute command "show syslog messages" target fpc0 

>show ddos-protection protocols statistics terse

 

 

Not sure if those workarounds are an option for you, but you can try rebooting the devices, upgrading to another version, or a recovery install to refresh the file system.

 

Hope this helps :)

 

Randall

 

Re: Mirroring of firewall internal and external from core 4600 to IDS device connected to edge 4600


I wanted to update everyone on this: we were hitting an issue with our firmware (14.1X53D46.7), so this is most likely why we weren't able to make RSPAN work. JTAC also ran into this issue when they tried this in their lab. I guess this can be closed now. Thanks to everyone for all the suggestions and help on this.

Re: Switch in 4300-MP virtual chassis shuts down PPMD_PFE_SHUTDOWN


mriyaz -

Appreciate the insight.  I'm not sure if the PR is relevant, though.  There is one LAG on this particular switch, supporting an Aruba AP.  Also, this occurred on a Saturday afternoon with no one in the office.  Our monitoring software shows CPU was in the 4% range.  That VC has been up and stable since October 2019.  It's just strange that the switch disappeared and needed a cold boot to recover.  I did put the incident in Juniper's JTAC hands.

So thanks, and if nothing else, you taught me a few new commands, which I appreciate.

Stay safe.

Re: EX2300 not seeing LLDP Neighbor


Hi,

 

I would recommend making sure both are in the same VLAN, and also checking whether the Planet switches are sending the traffic untagged or tagged to your Juniper device, and that it is configured to match (with native-vlan-id, for example). Also, explicitly enable the interface under protocols lldp to see if it comes online.
