Server Troubleshooting

From Multiverse

Check server log files

A great troubleshooting resource is the servers' log files.

The servers save log files to mv_home/logs/worldname. So, if you installed the servers to c:/multiverse, the logs for sampleworld will be in c:/multiverse/logs/sampleworld. Each server saves a separate log file, as described in Server Logging.

Can't run Java

If you see an error (using the Windows batch file) such as:

'java' is not recognized as an internal or external command, operable program or batch file.

or (using Cygwin):

java: command not found

Then you can't run the Java virtual machine (VM).

First, make sure you have installed the appropriate version of Java. Look at your C:\Program Files\Java folder and make sure there is a sub-directory jdk1.5xxx or jdk1.6xxx. If you only see a jrexxx sub-directory, then you need to install the JDK, because the JRE does not include the server Java VM, which the servers require. If you only see a jdk1.4.xxx sub-directory, then you need to install a more recent version of Java.

If you confirmed that you installed Java, but are getting this error, then the Java VM isn't registered in your Windows system path.

To add it to your system path:

  1. Write down the absolute path to the installed JDK's bin folder.
  2. Click on Start.
  3. Right click on My Computer.
  4. Select Properties.
  5. Click on the Advanced tab.
  6. Click on the Environment Variables button.

If you have a PATH variable in the top list:

  1. Click on the PATH variable.
  2. Click the Edit button.
  3. In the line Variable Value, click anywhere in that window.
  4. Hit the End key so that you go to the end of the line.
  5. Add a semicolon, then type in the path you wrote down that points to the JDK bin folder.
  6. Click OK three times.

If you don't have a PATH variable:

  1. Click the New button.
  2. In the Variable Name field, type in: PATH
  3. In the Variable Value field, type in: %PATH%;
  4. After the semicolon, type in the path to your JDK bin folder (that you wrote down earlier).
  5. Click OK three times.

Restart your computer. After the restart, entering java at a command prompt should print the Java help information instead of the "command not found" error message.
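To check the result without reading the PATH value by eye, the lookup that the command prompt performs can be reproduced with a short Python sketch. The helper name find_java is made up for this page; it only wraps the standard shutil.which function and assumes Python is available on the machine.

```python
import shutil


def find_java(path_value=None):
    """Return the full path of the java executable found on PATH, or None.

    path_value is a PATH-style string (directories separated by os.pathsep).
    When it is None, the PATH of the current process is searched.
    """
    return shutil.which("java", path=path_value)


if __name__ == "__main__":
    location = find_java()
    if location is None:
        print("java is not on PATH; add the JDK bin folder to PATH")
    else:
        print("java found at", location)
```

If the sketch reports a location inside a jre directory rather than a jdk directory, the JRE is the one being picked up first.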

Error: No Server Java VM

The Multiverse servers require the "server Java Virtual Machine", which is only available with the Java Development Kit (JDK), not the Java Runtime Environment (JRE). For information on the differences between client and server VMs, see Sun's document Java Virtual Machines.

If you try to run the Multiverse servers with the JRE, you will get startup errors like the following:

Starting domain server: Error: no `server' JVM at 
   `c:\Program Files\Java\jre1.5.0_08\bin\server\jvm.dll'.
FAILED
Starting combat server: Error: no `server' JVM at 
   `c:\Program Files\Java\jre1.5.0_08\bin\server\jvm.dll'.
FAILED
...

To run the servers, you must install and configure the JDK. In theory, you can make Java use this runtime instead of a previously-installed JRE by setting the PATH and JAVA_HOME environment variables to the JDK installation directory, not the JRE installation directory. However, this procedure is problematic; simply copying the JDK server directory to the JRE directory as described below is more reliable.

To correct the above error and enable the server JVM, follow these steps:

  1. Download and install the Sun JDK.
  2. At the command prompt (Windows or Cygwin/Linux), enter this command: java -version
  3. This will return something like: java version "1.6.0_03" Java(TM) 2 Runtime Environment, Standard Edition (build jre1.6.0_03-b03) Java HotSpot(TM) Client VM (build jre1.6.0_03-b03, mixed mode)
  4. This means that your Java runtime (JRE) directory is C:\Program Files\Java\jre1.6.0_03\bin. If you look in this directory, you will notice there is a \client sub-directory for the client JVM, but not a \server sub-directory for the server JVM.
  5. Now, find the directory where you just installed the JDK, for example C:\Program Files\Java\jdk1.6.0. Copy the sub-directory jre\bin\server and all its contents from the JDK to the JRE directory. For example, copy C:\Program Files\Java\jdk1.6.0\jre\bin\server to C:\Program Files\Java\jre1.6.0_03\bin.

When you are done, your JRE directory should contain a bin\server directory, for example C:\Program Files\Java\jre1.6.0_03\bin\server, containing several files. To confirm you can run the server JVM, enter the command:

java -server -version

This command should return version information, not an error message.
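The copy step can also be scripted. The following is a minimal sketch rather than an official tool: the function name copy_server_jvm is hypothetical, and it assumes the directory layout described in the steps above (jre\bin\server inside the JDK, bin inside the JRE).

```python
import shutil
from pathlib import Path


def copy_server_jvm(jdk_dir, jre_dir):
    """Copy <jdk_dir>/jre/bin/server to <jre_dir>/bin/server.

    Returns the destination path. Raises FileNotFoundError if the JDK has no
    server directory, and leaves an existing destination untouched.
    """
    source = Path(jdk_dir) / "jre" / "bin" / "server"
    target = Path(jre_dir) / "bin" / "server"
    if not source.is_dir():
        raise FileNotFoundError(f"no server JVM directory at {source}")
    if not target.exists():
        # copytree copies the directory and everything inside it.
        shutil.copytree(source, target)
    return target
```

With the example directories used on this page, the call would be copy_server_jvm(r"C:\Program Files\Java\jdk1.6.0", r"C:\Program Files\Java\jre1.6.0_03"). Writing under Program Files may require administrator rights.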

Confirm all processes are running

Because all Multiverse Java processes run in the background, if one crashes, you won't necessarily get an error message or notification. On Linux or on Windows with Cygwin, to confirm all processes are running, use these commands:

cd mv_home/bin
./multiverse.sh status

If you are not using Cygwin, then use:

status-multiverse.bat

These commands both list the server processes and whether they are running or not.

Stopping servers

Starting the servers twice in a row, or similarly unfriendly acts, can cause a process to hang in the running state in the background even if the script thinks it's stopped.

If this happens, one or more of your processes won't start up. The error log will contain a message to the effect that a script can't be started. A hung process isn't the only cause of this message, so seeing it does not prove that this is the problem.

To deal with this problem, kill the Java processes as follows.

Windows

Use this command to kill all the running Java processes:

stop-multiverse.bat

This is basically foolproof, since it terminates all java.exe processes. But be careful that you don't terminate some other Java process that you want to keep running.

Linux

On Linux, or Windows/Cygwin, use this command to get the process ID of all Java processes:

cd mv_home/bin
ps -ef | grep java

This command will give results such as:

MyUserName    2240       1 con  11:12:15 /cygdrive/c/java/jre1.5.0_08/bin/java
MyUserName    4008       1 con  11:12:17 /cygdrive/c/java/jre1.5.0_08/bin/java
MyUserName    3920       1 con  11:12:18 /cygdrive/c/java/jre1.5.0_08/bin/java

The second column lists the process ID (pid) of each Java process. Kill them manually with the command:

kill pid

For example,

kill 2240
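When several processes are listed, picking the pids out by hand gets tedious. The sketch below shows one way to extract the second column from ps output in Python; the function name java_pids is hypothetical, and the sample text is the listing shown above. It deliberately does not kill anything.

```python
def java_pids(ps_output):
    """Return the process IDs (second column) of lines that run java."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip short lines and the header row, whose second field is not a number.
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        # Skip unrelated processes and the grep command itself.
        if "java" not in line or "grep" in line:
            continue
        pids.append(int(fields[1]))
    return pids


sample = """\
MyUserName    2240       1 con  11:12:15 /cygdrive/c/java/jre1.5.0_08/bin/java
MyUserName    4008       1 con  11:12:17 /cygdrive/c/java/jre1.5.0_08/bin/java
MyUserName    3920       1 con  11:12:18 /cygdrive/c/java/jre1.5.0_08/bin/java
"""
print(java_pids(sample))  # prints [2240, 4008, 3920]
```

Each returned number can then be passed to kill as shown above. Check the list first, because it includes every Java process, not only the Multiverse servers.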

Problems starting the servers

If you have problems starting the servers, check the following possible causes.

Check Firewall Settings

One common cause of errors is a local firewall. See Configure firewall for more information.

Examine log files to find errors

Each server process writes error messages and other useful information to a log file in the mv_home/logs/worldname directory. Each server process creates one log file. Examine these files with your favorite text editor (or cat or less on Linux/Cygwin).

If there is a failure in the execution of one of the scripts, you will probably find the string "ERROR" in one or more of the log files.

Linux or Windows/Cygwin

To quickly find errors, use the grep command on your log files. For example, the following command searches all the log files for the string "ERROR":

cd mv_home/logs/sampleworld
grep ERROR *

The result might be something like this:

mobserver.out:--ERROR--: [Thu Oct 19 10:55:13 PDT 2006] 
{Thread[main,5,main]} could not execute script- skipping: Traceback (innermost last):
proxy.out:--ERROR--: [Thu Oct 19 10:55:05 PDT 2006] 
{Thread[main,5,main]} could not execute script- skipping: Traceback (innermost last):
worldreader.out:--ERROR--: [Thu Oct 19 10:55:08 PDT 2006] 
{Thread[main,5,main]} could not execute script- skipping: Traceback (innermost last):

Then, when you examine proxy.out, you will see

...
debug: [Thu Oct 19 10:55:05 PDT 2006] {Thread[main,5,main]} Executing script file:
c:\multiverse/config/sampleworld/global_props.py
debug: [Thu Oct 19 10:55:05 PDT 2006] {Thread[main,5,main]} runPYFile: 
file=c:\multiverse/config/sampleworld/global_props.py
--ERROR--: [Thu Oct 19 10:55:05 PDT 2006] {Thread[main,5,main]} could not execute script- skipping: 
Traceback (innermost last):
  File "<iostream>", line 26, in ?
NameError: red
...

This indicates that there was an error in line 26 of global_props.py.

Windows

In the mv_home/logs/worldname directory, use the following command to find errors in the log files:

find "ERROR" *.out

This command examines every log file and displays any lines containing the string "ERROR". If a file does not contain any errors, the command displays only the file name. For example, the output might look like this:

---------- ANIM.OUT

---------- COMBAT.OUT

---------- LOGIN_MANAGER.OUT

ERROR [2007-07-27 16:36:21,906] main      
Engine.main: could not create session java.net.ConnectException: Connection refused:   
connect java.net.ConnectException: Connection refused: connect

This indicates that there is an error with the login manager.
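If you prefer one tool that behaves the same on Linux, Cygwin, and plain Windows, the search can be written as a small Python sketch. The function name find_errors is hypothetical; it simply reports every line containing "ERROR" in every file of a log directory, which is what the grep and find commands above do.

```python
from pathlib import Path


def find_errors(log_dir, needle="ERROR"):
    """Map each log file name to the list of its lines containing needle."""
    hits = {}
    for path in sorted(Path(log_dir).iterdir()):
        if not path.is_file():
            continue
        # Log files may contain stray bytes, so decode leniently.
        text = path.read_text(encoding="utf-8", errors="replace")
        matches = [line for line in text.splitlines() if needle in line]
        if matches:
            hits[path.name] = matches
    return hits
```

For example, find_errors("c:/multiverse/logs/sampleworld") returns a dictionary keyed by file name, and files without errors are left out.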


Make sure MySQL is running

On Windows, use Task Manager to display all the running processes. You should see "mysqld-nt.exe" in the list, which is the process for MySQL.

On Linux, use the command

ps -ef | grep mysql

You should see one or more "mysqld" processes running.

Check your MySQL/J JDBC Connector

If you see an error such as the following, it indicates that the server is not finding the MySQL/J JDBC driver:

ENGINE: could not find class: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
could not find class: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver 

Make sure you have installed the JDBC driver JAR file and you have set the value of the multiverse.jdbcJarPath property in the property file to the path of the JAR file. The line looks like this:

multiverse.jdbcJarPath=c:\\mysql-connector-java-3.1.14\\mysql-connector-java-3.1.14-bin.jar

Set the value to the exact path and name of the JDBC driver JAR file. If you are on Linux, use a Unix-style path with forward slashes.
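A frequent mistake is a path that looks right but does not match the file on disk. The sketch below reads the property value so that it can be checked; the function name jdbc_jar_path is hypothetical, and it handles only the simple key=value form shown above, not the full Java property file syntax.

```python
def jdbc_jar_path(properties_text):
    """Return the value of multiverse.jdbcJarPath, or None if it is not set.

    A doubled backslash in the value is reduced to a single one, because the
    backslash is an escape character in Java property files.
    """
    key = "multiverse.jdbcJarPath"
    for line in properties_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == key:
            return value.strip().replace("\\\\", "\\")
    return None
```

Passing the returned value to os.path.isfile tells you whether the JAR file really exists at that location.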

Running servers on Ubuntu Linux

You must take some special steps to run the Multiverse servers on Ubuntu Linux. See Running Multiverse servers on Ubuntu Linux.

Diagnosing startup errors

One or more of the servers may not start, for a variety of reasons, including errors in Python scripts. To diagnose such problems, use the status command to determine which servers are running.

On Linux or with Cygwin:

multiverse.sh status

On Windows (without Cygwin):

status-multiverse.bat

If one or more of the servers is not running, then you may have a Python script error.

How to track down a script error:

  1. Stop the servers.
  2. Go to your logs directory, <mv-home>/logs/worldname, and search the log files for errors.
    • On Linux/Cygwin, enter: grep ERROR *
    • On Windows, enter: find "ERROR" *
    This will identify one or more log files that contain errors. If there is more than one, it is usually best to start with the server that starts earliest in the startup sequence, because when one server fails it often causes the servers that start later to fail as well. The startup sequence is:
    1. Message domain server (domain.out)
    2. Combat server (combat.out)
    3. Object manager (objmgr.out)
    4. Login manager (login_manager.out)
    5. World manager (wmgr1.out)
    6. Proxy server (proxy.out)
    7. World reader (worldreader.out)
    8. Mob server (mobserver.out)
    NOTE: On Windows, there will also be .err files (of the same base file name) that contain certain irregular stacktraces. You may need to look in these files, too.
  3. Once you have identified the problematic server, edit the corresponding log file (as noted above). Then:
    • Search for the string "ERROR". This line will also give you some information about what the error is. In some cases, it may give you the line of code in question.
    • From the "ERROR" line, search backwards for the string "Executing script file". This will tell you the name and location of the script file. In some cases, it may tell you the line number of the error as well.
  4. Examine the indicated line and script file.
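The procedure above can be sketched in Python. Both function names are hypothetical, and the code assumes the log format shown in the proxy.out example earlier on this page, where the script path may appear on the line after "Executing script file:".

```python
STARTUP_ORDER = [
    "domain.out", "combat.out", "objmgr.out", "login_manager.out",
    "wmgr1.out", "proxy.out", "worldreader.out", "mobserver.out",
]


def earliest_failed(log_names):
    """Pick the log that comes first in the startup sequence."""
    known = [name for name in log_names if name in STARTUP_ORDER]
    return min(known, key=STARTUP_ORDER.index) if known else None


def script_before_error(log_text):
    """Return the script named before the first ERROR line, or None."""
    lines = log_text.splitlines()
    marker = "Executing script file:"
    for index, line in enumerate(lines):
        if "ERROR" not in line:
            continue
        # Walk backwards from the ERROR line to the nearest marker.
        for back in range(index - 1, -1, -1):
            if marker in lines[back]:
                rest = lines[back].split(marker, 1)[1].strip()
                # The path may be wrapped onto the following line.
                if not rest and back + 1 < len(lines):
                    rest = lines[back + 1].strip()
                return rest or None
        return None
    return None
```

Given the three files from the grep example (mobserver.out, proxy.out, worldreader.out), earliest_failed selects proxy.out, which matches the advice to start with the server that starts first.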

java.nio.channels.ClosedChannelException

If you see an error stacktrace that looks like this:

RuntimeException: MessageServer.onMessage
       at multiverse.msgsvr.MessageServer$OnMessageHandler.run(MessageServer.java:145)
       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
       at java.lang.Thread.run(Unknown Source)
Caused by: java.nio.channels.ClosedChannelException
...

Or

... Engine.onMessage exception:  ioerror ioerror
      at multiverse.server.engine.EnginePlugin$PluginStateMessageHook.processMessage(EnginePlugin.java:316)
      at multiverse.server.engine.EnginePlugin.onMessage(EnginePlugin.java:430)
...
      at java.lang.Thread.run(Thread.java:595)
Caused by: java.nio.channels.ClosedChannelException
...

Then there is another instance of the servers running. Be sure to stop all running server instances before trying to start the servers. Use the command

multiverse.sh -v stop

or

stop-multiverse.bat

to stop the servers. Then try starting them again.

NOTE: You can run two instances of the servers on a system, but you have to configure them to use different ports. You cannot run two identically-configured sets of servers on the same system. See Running multiple servers on one system for more information.

SyntaxError: inconsistent dedent

Python is very picky about things such as indentation. If you get an error such as:

SyntaxError: ('inconsistent dedent', ('<iostream>', 82, 9, .... ))

Then you have one or more lines of Python that are not indented properly. In this case, you likely need to delete one or more spaces at the beginning of the line. You may see a slightly different error if you need to add one or more spaces. In general, leading spaces ARE significant in Python.
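Indentation problems can sometimes be found before starting the servers by compiling the script with a standalone Python interpreter. The sketch below is only an approximation: the function name is hypothetical, the servers run scripts under Jython, and a current CPython may reject other constructs that Jython accepts, in which case the function returns None. CPython also words the message differently ("unindent does not match any outer indentation level").

```python
def first_indentation_error(source, filename="<script>"):
    """Return (line, message) for the first indentation error, else None."""
    try:
        compile(source, filename, "exec")
    except IndentationError as err:  # TabError is a subclass, so it is covered
        return err.lineno, err.msg
    except SyntaxError:
        return None  # a different kind of syntax problem
    return None


# The last line is dedented to a column that matches no enclosing block.
bad = "if True:\n        x = 1\n    y = 2\n"
print(first_indentation_error(bad))
```

The reported line number is the place to start comparing leading spaces with the lines around it.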

Error: "can't create package cache dir"

The Multiverse servers use a Java implementation of Python called Jython. If you see startup errors like:

*sys-package-mgr*: can't create package cache dir, 'cachedir\packages' 

or

ImportError: no module named multiverse 

Then you need to check your system for existing Python settings.

If you have a previous Python installation, it may cause problems with Jython, specifically with some of the paths that Jython uses. See The Jython Registry for more information.

Error: "can't write cache file..."

If you see errors like

can't write cache file for 'C:\Program Files\Java\jre1.6.0_02\lib\ext\sunpkcs11.jar'
...

You need to make sure that write permissions are enabled for the Jython cache directory. Generally, the Jython cache directory will be mv-home\bin\cachedir\packages, unless you have previously installed Python (in which case, Jython may use the "standard" Python cache directory).

See the Jython FAQ for more information.

Changing the directory of the server files can cause this error on Windows. Shut down the server, delete all of the files in the mv-home\bin\cachedir\packages directory, and restart. The error should go away.
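The cleanup described above can be scripted as follows. This is a sketch under the assumption that the cache lives in mv-home\bin\cachedir\packages as stated; the function name clear_jython_cache is hypothetical. Shut down the servers before running it.

```python
import os
from pathlib import Path


def clear_jython_cache(mv_home):
    """Delete the files in <mv_home>/bin/cachedir/packages.

    Returns (removed_count, writable). The directory itself is kept, and
    writable reports whether the current user may write to it.
    """
    cache_dir = Path(mv_home) / "bin" / "cachedir" / "packages"
    if not cache_dir.is_dir():
        return 0, False
    removed = 0
    for entry in cache_dir.iterdir():
        if entry.is_file():
            entry.unlink()
            removed += 1
    return removed, os.access(cache_dir, os.W_OK)
```

If the second value returned is False, fix the directory permissions before restarting, otherwise the "can't write cache file" errors will come back.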

Error: classes not found in gnu.gcj.runtime.SystemClassLoader

If you see errors on startup like this:

Exception in thread "main" java.lang.NoClassDefFoundError: multiverse.msgsvr.MessageServer
 at java.lang.Class.initializeClass(libgcj.so.70)
Caused by: java.lang.ClassNotFoundException: 
java.util.concurrent.locks.Lock not found in gnu.gcj.runtime.SystemClassLoader
...

Then you are using a non-standard version of Java (GNU Java) that will not work with Multiverse. Some operating systems, such as Ubuntu Linux, come preinstalled with GNU Java.

So, you need to:

  1. Install the standard Sun Java distribution (see Getting Started ).
  2. Make sure that the "java" command invokes it and not GNU Java. You can generally confirm this by entering the command java -version. This should yield the version of Sun Java you downloaded, not Java 1.4.2, which is apparently the default that comes with Ubuntu.

For more information, see https://help.ubuntu.com/community/Java#head-fef9352fb26820bb774df978180c9dd3a60e777b.
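The version check can be automated with a rough heuristic. The sketch below assumes that GNU Java identifies itself with the words gcj, gij, or GNU in its version text; that assumption, the function name is_gnu_java, and the sample GNU string are illustrative and not taken from Multiverse documentation.

```python
def is_gnu_java(version_output):
    """Guess whether `java -version` output came from GNU Java (gcj/gij)."""
    text = version_output.lower()
    return any(marker in text for marker in ("gcj", "gij", "gnu"))


# Sun output as quoted earlier on this page, and a hypothetical GNU example.
sun_sample = 'java version "1.5.0_08"\nJava HotSpot(TM) Client VM'
gnu_sample = 'java version "1.4.2"\ngij (GNU libgcj) version 4.1.2'
print(is_gnu_java(sun_sample), is_gnu_java(gnu_sample))  # prints False True
```

Note that java -version writes to standard error, so when capturing it with subprocess.run(["java", "-version"], capture_output=True, text=True), read the stderr attribute of the result.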

ImportError: no module named java

If you see an error like this:

ERROR [2007-12-19 15:34:11,774] main                 Engine.processPreScripts: got exception running 
script '../config/common/mvmessages.py' Traceback (innermost last):
  File "<iostream>", line 2, in ?
ImportError: no module named java
 Traceback (innermost last):
  File "<iostream>", line 2, in ?
ImportError: no module named java
...
(stack trace)
...

You have encountered a known problem with Jython. The first time you run a fresh server install, Jython may encounter synchronization problems with class caching.

Stop and restart the servers and this problem should not re-occur.

Problems connecting to the servers

Configure or disable firewall software

One of the most common reasons for not being able to connect to the servers is a firewall. If you have a network firewall on the server machine, you must either disable the firewall or configure it to open up the necessary ports.

For the servers, set up port forwarding as follows:

  • TCP port 5040 is the default for the world manager, as specified by the multiverse.worldmgrport property.
  • UDP port 5050 is the default used by the proxy server, as specified by the multiverse.proxyport property.

The Multiverse Client connects to the master server (run by Multiverse), which listens on these ports:

  • TCP port 9005
  • UDP port 9010

If you set up a web server for your asset repository, then you need to open a port for it to use, typically port 80, but it can be anything.

Also make sure the multiverse.proxyserver property is set to the externally-accessible DNS name or IP address of the server (or localhost if everything is running on the same system).
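To test from a client machine whether a TCP port such as the world manager's 5040 is reachable, a connection attempt is usually enough. The following is a minimal sketch; the function name tcp_port_open is hypothetical. It cannot test the UDP proxy port, because UDP has no connection handshake to observe.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or the name did not resolve.
        return False
```

For example, tcp_port_open("localhost", 5040) run on the server itself shows whether the world manager is listening, while the same call with the external name from another machine shows whether the firewall lets the connection through.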

If you are running Fedora Core with SELinux, you may need to take some additional steps; see Installing the Servers on Linux for more information.

Further information

If you are not familiar with network administration, here are some basic references:

Remote connections: change Multiverse proxy server hostname setting

If your client cannot connect to a server running on a different system, make sure you have set the multiverse.proxyserver property to the server hostname or IP address. By default, it is set to "localhost" in the multiverse.properties file.

Change router NAT settings

NOTE: Developer eavabfreelight created a video tutorial Servers, Routers, and You, a guide to bypassing firewalls and getting your server running with remote connections. Check it out if you are having router / firewall issues.

If you can connect to your servers from within your firewall, but not from outside (or vice-versa), you may be encountering a NAT issue. If your NAT won't forward connections from inside the firewall to its own external IP address, then you need to use names rather than IP addresses, and map those names differently for hosts inside and outside the firewall. Make the mapping in each client's hosts file, located by default in %SystemRoot%\system32\drivers\etc, typically C:\Windows\system32\drivers\etc. For more information on the hosts file, see Wikipedia.

The following diagram illustrates the situation. Notice that the client outside the firewall and the client inside the firewall have different mappings for server1 in their hosts file.

Image:Nat-router1.jpg
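The effect of the two hosts files can be illustrated with a short Python sketch. The addresses are made up for illustration (a private LAN address for the inside client, a documentation-range address standing in for the NAT's external address), and the function name resolve_from_hosts is hypothetical. Only the name server1 comes from the diagram.

```python
def resolve_from_hosts(hosts_text, name):
    """Return the address that a hosts file maps name to, or None."""
    for line in hosts_text.splitlines():
        entry = line.split("#", 1)[0].split()  # drop comments, then split
        # A hosts line is an address followed by one or more names.
        if len(entry) >= 2 and name in entry[1:]:
            return entry[0]
    return None


# Hypothetical hosts file contents for the two clients.
inside_hosts = "192.168.1.10   server1\n"
outside_hosts = "203.0.113.5    server1\n"
print(resolve_from_hosts(inside_hosts, "server1"))   # prints 192.168.1.10
print(resolve_from_hosts(outside_hosts, "server1"))  # prints 203.0.113.5
```

Both clients use the same name, server1, but each reaches the server through the address that works from its side of the firewall.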

If you are behind a NAT you cannot control

Sometimes you will have the following situation:

  1. A Linux server running the MV server software.
  2. A Windows machine running the MV client.

Both are behind a NAT that you cannot control, because it is not your own router but a provider-level NAT. Some ISPs in some countries assign RFC 1918 private IP addresses even on "unlimited" Internet plans, presumably to discourage eMule and torrent traffic.

In this case, one possible solution is to use a commercial VPN provider (such as StrongVPN.com; there are others, but choose one with an "unlimited" plan). Such providers forward all packets addressed to your dedicated IP address to your end of an OpenVPN tunnel. To make your world accessible both from inside your internal network and from the outside world:

  1. Add ifconfig <your_ext_ip> up to the Linux startup scripts, so that the MV server treats <your_ext_ip> as a local address.
  2. Add a route to that address on the Windows machine; otherwise packets will travel through your provider twice.
  3. Register the world using this IP address (or register a DNS name that points to it).

I have to use this configuration myself sometimes, except that in my case the Linux server runs BIND, set up to resolve names differently depending on whether the connection comes from inside or outside.
