Saturday, June 12, 2021

Anywhere maintenance & troubleshooting cheat sheet

Useful Commands

1. Remove the node's logs

rm -rf /mnt/logs

2. Stream the content of the p2_ha log continuously

tail -f /mnt/logs/p2ha.log

3. Dump troubleshooting statistics for bug fixing

dumpStat.sh

4. Get the neighbors from the neighbor table

cat /sys/kernel/p2_nbr/nbr_tbl_dump

5. Restart syslogd when logs stop being written for some reason

/etc/init.d/syslogd restart

6. View p2 routing protocol information

p2rp_cli -u -D

7. Capture packets to and from the eth1 interface, filtered to port 16001 only

tcpdump -i eth1 port 16001

Useful Metadata

1. Log path of the launcher on Windows

C:\Program Files\Anywhere Node Manager Launcher\user_data\logdata


Troubleshooting Notes

Scenario 1: In A-NM, the node information of one or more nodes does not show up in the mesh topology after performing node recovery (given that the lost node is physically powered on)

Procedure

1. Check the HTTP response output in the console log and compare the result with the UI behavior

2. Check whether the node is in the managed device list (Note: a node recovery failure may cause the lost node to become unmanaged)

3. If step 1 returns a failure to retrieve, get the reason for the failure and try to map the failure object to the error object documented in "doc_ui_p2_controller". Then try to resolve it using the detailed error message returned from the controller

4. Try to isolate the problem from the UI using "controller_restful_tester". Run the get-nodeinfo command against the remote node from the controller. If it succeeds, the controller <--> node connection should be good

5. Try to further isolate the issue from the controller by retrieving the node information directly with the protobuf tool. Remember to run the meshTopology command first to obtain the access port of the remote node

& '<pythonEXE_dir>' .\cli\ha.py --pw mgnt_pwd mgnt_ip -p access_port 1 > mesh_topology.txt

The "1" is the request type; there is no need to type the whole action name

6. If step 5 still does not return a result, troubleshoot on the host node using tcpdump (a data-network packet analyzer) to at least confirm that packets are flowing normally between controller <--> host_node and host_node <--> remote_node on their respective interfaces and ports

e.g.:

Let's say we have a controller (10.240.2.34), a host node (10.240.222.224) and a remote node. 

First, we run "tcpdump -i eth1 port 16001" on the host node to confirm that packets reach the remote node through the host node (16001 is the NAT port of the remote node). If packets are passing through port 16001, it implies there is traffic from 10.240.222.224 to the remote node through the Ethernet port connected to the controller.

Then we run "tcpdump -i mesh0 port 12381" on the remote node to make sure packets are going to and from the remote node.

If the result is positive, we can be sure the layer 3 and layer 4 connectivity is working as expected.

7. Next, use command 4 (cat /sys/kernel/p2_nbr/nbr_tbl_dump) to make sure the neighbor link can be discovered on both nodes


Scenario 2: Capturing logs and output to send to other parties for bug fixing

1. Before capturing any logs, make sure the timestamps of 1) the node, 2) the controller and 3) the UI are in sync.

2. To sync time on the nodes, navigate to cluster configuration and set the timezone to Hong Kong, then reboot the nodes in the cluster to apply the change

3. Navigate to system settings and configure the NTP server (IP only, same L2 environment). This means we should first find a PC (preferably Linux based), install ntpd on it and configure it with the HK time server before proceeding to step 4

4. Log in to the cluster and run the "date" command on all nodes to make sure their time is in sync with the time server configured in step 3

5. Since the controller and the UI share the same time (Windows system time), it is enough to make sure they are in sync with the Windows time

6. Navigate to /mnt/logs and remove all logs from the target nodes

7. Remove all controller logs in "C:\Program Files\Anywhere Node Manager Launcher\user_data\logdata"

8. 

Miscellaneous Materials

1. SNAT, DNAT and masquerade

https://www.huaweicloud.com/articles/90a13a644803d0efcd024df76fb130ae.html

Saturday, June 5, 2021

Frequently used web programming / design techniques

1. Use "data-*" to pass data to event listener

We often need to pass parameters to event listeners, but may not want to use bind or an arrow function, because in React these can cause frequent unnecessary re-rendering. In this case we can add a "data-*" HTML attribute instead.

e.g.: 

<a href="#someLink" onClick={onClickHandler}>
  <span data-idx={valToOnclickListener}>someSpanText</span>
</a>

Then, in the "onClickHandler" function, we can use the event object "e" to retrieve the data-idx value for further processing:

const onClickHandler = (e) => {
  const valToOnclickListener = e.target.dataset.idx;

  // do anything here with valToOnclickListener

}

Sometimes e.target is a child element even though you put the "data-idx" attribute on the parent that owns the listener. In this case use e.currentTarget, which always refers to the element the listener is attached to.
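To make the difference concrete, here is a minimal sketch that runs outside the browser. Plain objects stand in for the DOM nodes and the event, and readIdx is a hypothetical helper name, not something from React or the DOM API. It assumes the data-idx attribute sits on the parent that owns the listener.

```javascript
// Hypothetical helper: read data-idx from the element that owns the listener.
function readIdx(e) {
  // currentTarget is the element the listener is attached to;
  // target is the innermost element that was actually clicked.
  return e.currentTarget.dataset.idx;
}

// Plain objects standing in for DOM nodes so the sketch runs without a browser.
const anchor = { dataset: { idx: '3' } }; // parent: carries data-idx and the listener
const span = { dataset: {} };             // child: the element that received the click

// Simulated click on the child, handled by the parent's listener.
const fakeEvent = { target: span, currentTarget: anchor };

const fromTarget = fakeEvent.target.dataset.idx; // the child has no data-idx
const fromCurrentTarget = readIdx(fakeEvent);    // read from the parent instead

console.log(fromTarget, fromCurrentTarget);
```

Note that values read from dataset are always strings, so a numeric index has to be converted (for example with Number) before it is used as a number.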


2. Select child DOM elements using JSS

Use > to select a direct child, combined with & to refer to the class name, e.g.:

{
  someClassName: {
    '& > :first-child': {
      borderLeft: '2px solid transparent',
    },
  },
}
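The same nesting syntax covers other child selectors as well. The sketch below assumes the nested plugin (jss-plugin-nested, included in the default JSS preset) is enabled. The variable name styles and the two extra rules with their property values are made up for illustration.

```javascript
// Style object in JSS syntax; '&' is replaced by the generated class selector.
const styles = {
  someClassName: {
    // First direct child only (the rule from the example above).
    '& > :first-child': {
      borderLeft: '2px solid transparent',
    },
    // Every direct child that is a <span> (illustrative rule).
    '& > span': {
      fontWeight: 'bold',
    },
    // Any <span> descendant at any depth, since there is no '>' (illustrative rule).
    '& span': {
      color: 'inherit',
    },
  },
};

console.log(Object.keys(styles.someClassName));
```

The only difference between the last two rules is the >: with it the selector matches direct children only, without it the selector matches descendants at any depth.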


References