Jira Slow Down And Inaccessible
Scenario
Received a ticket saying the application (Jira) server is down and was slow to respond. The proxy server reports a proxy error: received an invalid response from the upstream server. The browser can't open the webpage.
Determine
- Identify the impact of the application (internal team, clients, etc.)
- Identify which level the issue is on:
  - Network
    - Open a browser and access the webpage from the internet to confirm seeing the error mentioned in the ticket.
    - Make sure the server is accessible from the internet (a port-level check is sketched at the end of this sub-list):
      ping jira.xyz.com
    - Make sure the DNS record is correct:
      nslookup x.x.x.x
      dig jira.xyz.com
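    - A successful ping only proves ICMP reachability; to confirm the web port itself is open from the outside, a quick check such as the following can help (assuming the site is served over 443 and netcat is available; adjust to your setup):
      nc -vz jira.xyz.com 443
      curl -I https://jira.xyz.com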
  - System
    - Make sure the server is up:
      - Try to access the server (SSH)
      - View it from the management console
    - Check system resource usage (a quick CPU/load check is sketched at the end of this sub-list):
      - Check disk usage:
        df -h
      - Check memory usage:
        free -h
        (lsmem only lists installed memory ranges, not current usage)
      - Check other monitoring tools (Nagios, Observium, etc.)
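      - Check CPU and load average, which the list above does not cover; these are standard tools, nothing Jira-specific is assumed:
        uptime
        top -bn1 | head -n 15
        vmstat 1 5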
  - Application
    - Check the application service status (a listening-port check is sketched after this item):
      service [httpd | haproxy | nginx | etc.] status
      or systemctl status [httpd | haproxy | nginx]
      ps auxf | grep [service]
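    - Confirm the service is actually listening where the proxy expects it; a minimal sketch using ss (part of iproute2 on most modern distributions), with the ports taken from elsewhere in this document:
      ss -tlnp | grep -E ':(80|443|8080)'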
Assumption
- Webpage is unavailable
- Server is up and can be pinged
- All resource usage is normal (CPU, MEM, storage, etc.)
- Application (Jira) is running
- Proxy application (nginx/haproxy) is running
Diagnose
- Make sure the public-facing interface is receiving packets:
  tcpdump -vvv -s 1024 -A -l -i [interface-name]
- Make sure the application (Jira) is still functioning (a fuller local health check is sketched below):
  curl -v http://127.0.0.1:8080
  (assuming SSL/TLS is only applied on the proxy side)
  curl -kv https://127.0.0.1:8083
  (if HTTPS is enforced on the application itself)
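  A slightly fuller local health check, assuming the backend listens on 8080 as above:
  curl -s -o /dev/null -w '%{http_code}\n' http://127.0.0.1:8080
  (prints only the HTTP status code; anything in the 2xx/3xx range means the backend is answering)
  curl -s http://127.0.0.1:8080/status
  (recent Jira versions expose a /status endpoint that returns a small JSON state such as {"state":"RUNNING"}; treat this as an assumption if your version predates it)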
- Search for the application (Jira) logs; the same approach applies to other applications.
  - Log files can be found under /opt/atlassian/jira/logs
  - I usually search for the application process, which usually reveals the config file path. From there, read the config file and search for the keyword "log". Otherwise, I use whereis to find the path. As a last resort, I google for information. (A short sketch of this method follows below.)
    ps auxf | grep [service]
    whereis [service]
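    A minimal sketch of the process-to-config-to-log method above, using nginx as the example proxy (the config path is a typical default and may differ on your system):
    ps auxf | grep -i '[n]ginx'
    (the command line shows the binary and, if started with -c, the config path)
    nginx -T 2>/dev/null | grep -i 'log'
    (dumps the effective configuration and filters for access_log/error_log directives)
    grep -i 'log' /etc/nginx/nginx.conf
    whereis nginx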
  - Check error_log and ssl_error_log first:
    tail [log_file]
    (the latest errors usually appear at the end)
    If nothing helpful or no error appears, move on to the actual application log.
  - Take a sneak peek at the log file, or filter it by keyword (a few concrete filters are sketched after this item):
    tail /opt/atlassian/jira/logs/catalina_log.xxxx-xx-xx.log
    (check the last portion of the log)
    cat /opt/atlassian/jira/logs/catalina_log.xxxx-xx-xx.log | less
    (view the whole log file; navigation works similarly to vi: / to search, G to go to the bottom, gg to go to the top)
    - Look for timestamps that are close to, or just before, the time the ticket was created.
    grep 'keyword' [log_file]
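    A few concrete filter examples against the Jira log mentioned above (plain grep; the keywords are generic rather than Jira-specific):
    grep -iE 'severe|error|exception' /opt/atlassian/jira/logs/catalina_log.xxxx-xx-xx.log | tail -n 50
    (show only the last 50 matching lines)
    grep '23-Jan-2019 10:0' /opt/atlassian/jira/logs/catalina_log.xxxx-xx-xx.log
    (narrow the output to the ten-minute window around the ticket's creation time; adjust the timestamp to match the log's actual format)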
At this point, if the error has been found and can be easily resolved (10-15 min), document the error message and resolve the issue, recording notes while resolving.
However, if no useful information is found: since Jira is a ticketing application, temporarily bringing it down and restarting it will not cause too much impact to the team, and since no one is able to access Jira at the moment anyway, I will restart Jira immediately and check the result (a restart sketch follows below).
Document the date, the time, and the issue. If a similar problem occurs frequently, review all logs and perform an in-depth root cause analysis.
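A minimal restart sketch, assuming a standard Atlassian installer layout under /opt/atlassian/jira (if Jira was registered as a system service instead, restart it through systemctl/service with the name used on your host):
/opt/atlassian/jira/bin/stop-jira.sh
/opt/atlassian/jira/bin/start-jira.sh
tail -f /opt/atlassian/jira/logs/catalina.out
(watch the startup output for errors, then re-test through the proxy with curl or a browser)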
Error
Output from catalina.log:
23-Jan-2019 10:01:55.997 SEVERE [ContainerBackgroundProcessor[StandardEngine[Catalina]]] org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run Unexpected death of background thread ContainerBackgroundProcessor[StandardEngine[Catalina]]
java.lang.OutOfMemoryError: GC overhead limit exceeded
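To judge whether this is a one-off or a recurring condition (relevant to the root-cause note above), a quick per-file count across the retained catalina logs:
grep -c 'OutOfMemoryError' /opt/atlassian/jira/logs/catalina_log.*.log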
Solution
Increase the Jira Tomcat server memory:
vi /opt/atlassian/jira/bin/setenv.sh
Set JVM_MINIMUM_MEMORY and JVM_MAXIMUM_MEMORY to the desired capacity:
JVM_MINIMUM_MEMORY="512m"
JVM_MAXIMUM_MEMORY="1024m"
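The change only takes effect after a restart (see the restart sketch in the Diagnose section above). To verify that the running JVM picked up the new heap settings, note that setenv.sh typically turns these variables into -Xms/-Xmx flags on the Java command line:
ps aux | grep -i '[j]ira' | tr ' ' '\n' | grep -E '^-Xm[sx]'
(should print -Xms512m and -Xmx1024m for the values above)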