Quick Guide for Packmol's Input
Contents
1. What do you need?
2. How to compile Packmol.
3. Running Packmol.
4. Basic input structure.
5. More types of molecules.
6. Atom selections.
7. Types of constraints.
8. Periodic boundary conditions.
9. Solvating a large molecule automatically.
10. Parallel version.
11. Controlling residue numbering in PDB files.
12. Convergence problems: what to try.
13. Additional input options and keywords.

Important: always download the latest version of Packmol so that all features are available.
What do you need?
You need coordinate files for each type of molecule you want your simulation to have. For example, if you are going to simulate a solution of water and ions, you will need a coordinate file for one water molecule, and separate coordinate files for each of the ions. These coordinate files may be in the PDB, TINKER, MOLDEN or MOLDY format.

Of course, you also need the Packmol package, which you can get from

http://www.ime.unicamp.br/~martinez/packmol

by clicking on the Download link. By following this link you will download the file packmol.tar.gz which contains the whole source code of Packmol. If you are a Windows user, download the packmol.zip file instead.

If you plan to use MOLDY as your MD package, read THIS
How to compile Packmol
Once you have downloaded the packmol.tar.gz file from the home-page, you need to expand the files and compile the package. This is done by:

Expanding the files:

tar -xvzf packmol.tar.gz

This will create a directory called packmol inside which you can find the source code. You can build the executable by:

cd packmol
make


That's it. If no error was reported, the packmol executable was built.

-----

If you have problems, let the configure script find a suitable compiler for you:

chmod +x ./configure (this makes the script executable)

./configure (this executes the script)

If the script was not able to find a suitable compiler, then you can manually set the compiler by:

./configure /path/to/your/compiler/yourcompiler

Then, run the "make" command again:

make

If no error was detected, an executable called packmol is now ready.

Windows users:

Although we highly recommend the use of Linux and Free Software in general, you can use packmol within Windows, but it is not as simple. You will need to install a Fortran77 compatible compiler for Windows. For instance, it can be "gfortran", which can be found at:

http://gcc.gnu.org/wiki/GFortran

Extract the packmol.zip file that was downloaded from the Packmol site. This will create a directory called packmol. Open this folder and delete these two files:

feasyparallel.f and geasyparallel.f

Using the DOS prompt, go to the directory that contains the extracted files and, if you installed gfortran, use the following command to compile:

gfortran -O3 *.f -o packmol.exe

The executable packmol.exe should be ready.
Running Packmol
Once you have compiled Packmol and built your input file, run Packmol with

packmol < packmol.inp

where packmol.inp is the input file (you can obtain example files by clicking on the 'Input examples' link on the left).

A successful packing will end with something like

------------------------------

Success!
Final objective function value: .25587E-05
Minimum distance between atoms: 2.19
Maximum violation of the constraints: .25359E-05

------------------------------

where the minimum distance must be greater than or equal to the tolerance, to within a precision of 10^-2, and the maximum violation of the constraints must not be greater than 10^-2.
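
If you run many packings, the two acceptance criteria above can be checked automatically. The following is a minimal Python sketch, assuming the summary lines look exactly like the ones shown above; the function names parse_summary and packing_is_acceptable are hypothetical helpers, not part of Packmol.

```python
import re

def parse_summary(output_text):
    """Extract the three numbers of the final summary block (format assumed as shown above)."""
    patterns = {
        "objective": r"Final objective function value:\s*(\S+)",
        "min_distance": r"Minimum distance between atoms:\s*(\S+)",
        "max_violation": r"Maximum violation of the constraints:\s*(\S+)",
    }
    values = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, output_text)
        if match is None:
            raise ValueError(f"summary line for {key!r} not found")
        values[key] = float(match.group(1))  # float() accepts forms like .25587E-05
    return values

def packing_is_acceptable(summary, tolerance, precision=1.0e-2):
    """Minimum distance >= tolerance (within precision) and violation <= precision."""
    distance_ok = summary["min_distance"] >= tolerance - precision
    violation_ok = summary["max_violation"] <= precision
    return distance_ok and violation_ok

example = """
Success!
Final objective function value: .25587E-05
Minimum distance between atoms: 2.19
Maximum violation of the constraints: .25359E-05
"""
summary = parse_summary(example)
print(packing_is_acceptable(summary, tolerance=2.0))
```

To use it on a real run, redirect the Packmol output to a file and pass the file contents to parse_summary.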

A good idea is to check if your constraints are correct by using the "check" keyword in the input file. With this option a rough initial approximation will be built but no actual packing will be performed. You can look at the output to see if the molecules are within the desired regions (but do not expect a good structure at this point!). Just add the word "check" to any line of your input file (available since 28 Feb 2008).

A common issue: If you get "Command not found" when running Packmol, use
./packmol < packmol.inp
(with a "./" before "packmol") or add the directory where the packmol executable is located to your path.
Basic input structure
The minimal input file must contain the distance tolerance required (for systems at room temperature and pressure and coordinates in Angstroms, 2.0 Å is a good value). This is specified with

tolerance 2.0

The file must also contain the name of the output file to be created, specified with

output test.pdb

and the file type (pdb, tinker, xyz or moldy; pdb is the default value),

filetype pdb

At least one type of molecule must be present. This is set by the structure ... end structure section. For example, if water.pdb is the file containing the coordinates of a single water molecule, you could add to your input file something like

structure water.pdb
  number 2000
  inside cube 0. 0. 0. 40.
end structure


This section specifies that 2000 molecules of the water.pdb type will be placed inside a cube with minimum coordinates (x,y,z) = (0,0,0) and maximum coordinates (40,40,40). The complete minimal input file is therefore:

tolerance 2.0
output test.pdb
filetype pdb
structure water.pdb
  number 2000
  inside cube 0. 0. 0. 40.
end structure


Running Packmol with this input file will fill a cube of side 40.0 Å with 2000 water molecules. Every pair of atoms of different molecules will be separated by at least 2.0 Å, and the molecules will be randomly distributed inside the cube.
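
A quick way to choose the number of molecules for a given box is to compute how many fit at the target density. The Python sketch below is only an estimate under stated assumptions: liquid water at 1.0 g/mL with a molar mass of 18.015 g/mol, and a hypothetical helper name molecules_for_cube. For the 40 Å cube it gives a little over 2100 molecules, so the 2000 used above is slightly below bulk density.

```python
AVOGADRO = 6.02214076e23    # molecules per mol
WATER_MOLAR_MASS = 18.015   # g/mol (assumed value for water)
CM3_PER_A3 = 1.0e-24        # one cubic Angstrom expressed in cm^3

def molecules_for_cube(side, density=1.0, molar_mass=WATER_MOLAR_MASS):
    """Number of molecules filling a cube of the given side (Angstroms) at density in g/mL."""
    volume_cm3 = side ** 3 * CM3_PER_A3
    moles = density * volume_cm3 / molar_mass
    return round(moles * AVOGADRO)

print(molecules_for_cube(40.0))
```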
More types of molecules
You can add more types of molecules to the same region, or to different regions of space, simply by adding further structure ... end structure sections to the input file.
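
When an input has several molecule types, it can be convenient to generate the file from a list. This Python sketch only assembles text with the keywords described in this guide; the helper names (structure_block, build_input) and the coordinate files sodium.pdb and chloride.pdb are hypothetical.

```python
def structure_block(filename, number, constraint):
    """One structure ... end structure section with a single constraint line."""
    return "\n".join([
        f"structure {filename}",
        f"  number {number}",
        f"  {constraint}",
        "end structure",
    ])

def build_input(output, structures, tolerance=2.0, filetype="pdb"):
    """Header lines followed by one section per (filename, number, constraint) tuple."""
    header = [f"tolerance {tolerance}", f"output {output}", f"filetype {filetype}"]
    blocks = [structure_block(*entry) for entry in structures]
    return "\n".join(header + blocks) + "\n"

text = build_input("solution.pdb", [
    ("water.pdb", 1900, "inside cube 0. 0. 0. 40."),
    ("sodium.pdb", 20, "inside cube 0. 0. 0. 40."),    # hypothetical file
    ("chloride.pdb", 20, "inside cube 0. 0. 0. 40."),  # hypothetical file
])
print(text)
```

Write the returned text to a file such as packmol.inp and run Packmol on it as usual.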
Atom selections
Suppose the coordinate file of a single molecule contains, for example, 10 atoms. You can restrain a part of the molecule to be in a specified region of space. This is useful, for example, for building vesicles, where the hydrophilic part of the surfactants must point toward the aqueous environment. For the 10-atom molecule, this is done by using the keyword atoms, as in

structure molecule.pdb
  inside cube 0. 0. 0. 20.
  atoms 9 10
    inside box 0. 0. 15. 20. 20. 20.
  end atoms
end structure


In this case, all the atoms of the molecule will be put inside the defined cube, but atoms 9 and 10 will be restrained to be inside the box.
Types of constraints
There are several types of constraints that can be applied either to whole molecules or to parts of a molecule. These constraints define the region of space in which the molecules must lie at the solution. Very ordered systems can be built in this way. The constraints are:
1. fixed
Usage: fixed x y z a b g

This option holds the molecule fixed in the position specified by the parameters x, y, z, a, b, g, which are six real numbers. The first three determine the translation of the molecule relative to its position in the coordinate file. The last three parameters are rotation angles (in radians). This option requires that only one molecule is set. It may be accompanied by the keyword center. If this keyword is present, the first three numbers are the position of the barycenter (not really the center of mass, because we suppose that all atoms have the same mass). Therefore this keyword must be used in the following context:

structure molecule.pdb
  number 1
  center
  fixed 0. 0. 0. 0. 0. 0.
end structure


In this example, the molecule will be fixed with its center at the origin and no rotation.
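
To see what the center keyword refers to, here is a small Python sketch of the unweighted barycenter and of the translation that moves it to a target point. It covers only the translation part (all rotation angles zero); the water coordinates and the function names are made up for illustration.

```python
def barycenter(coords):
    """Unweighted mean position: every atom counts the same, whatever its mass."""
    n = len(coords)
    return tuple(sum(atom[axis] for atom in coords) / n for axis in range(3))

def move_barycenter_to(coords, target):
    """Translate all atoms so that the barycenter lands on target."""
    center = barycenter(coords)
    shift = tuple(target[axis] - center[axis] for axis in range(3))
    return [tuple(atom[axis] + shift[axis] for axis in range(3)) for atom in coords]

# Illustrative coordinates (Angstroms) for one water molecule: O, H, H
water = [(0.00, 0.00, 0.00), (0.96, 0.00, 0.00), (-0.24, 0.93, 0.00)]
centered = move_barycenter_to(water, (0.0, 0.0, 0.0))
print(barycenter(water))
print(barycenter(centered))
```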
2. inside cube
Usage: inside cube xmin  ymin  zmin  d

xmin , ymin , zmin and d are four real numbers. The coordinates (x,y,z) of the atoms restrained by this option will satisfy, at the solution:


xmin < x < xmin + d
ymin < y < ymin + d
zmin < z < zmin + d
3. outside cube
Usage: outside cube xmin  ymin  zmin  d

xmin , ymin , zmin and d are four real numbers. The coordinates (x,y,z) of the atoms restrained by this option will satisfy, at the solution, at least one of the following conditions (that is, the atom lies outside the cube):


x < xmin or x > xmin + d
y < ymin or y > ymin + d
z < zmin or z > zmin + d
4. inside box
Usage: inside box  xmin  ymin  zmin  xmax  ymax  zmax

xmin , ymin , zmin , xmax , ymax  and zmax are six real numbers. The coordinates (x,y,z) of the atoms restrained by this option will satisfy, at the solution:

xmin < x < xmax
ymin < y < ymax
zmin < z < zmax
5. outside box
Usage: outside box  xmin  ymin  zmin  xmax  ymax  zmax

xmin , ymin , zmin , xmax , ymax  and zmax are six real numbers. The coordinates (x,y,z) of the atoms restrained by this option will satisfy, at the solution, at least one of the following conditions (that is, the atom lies outside the box):

x < xmin or x > xmax
y < ymin or y > ymax
z < zmin or z > zmax
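
If you want to verify a finished structure against cube and box constraints, the inequalities above translate directly into code. This Python sketch reads the outside conditions as "at least one coordinate lies beyond the limits", which is the geometric meaning of being outside the region; the function names are hypothetical.

```python
def inside_box(p, xmin, ymin, zmin, xmax, ymax, zmax):
    """True if point p = (x, y, z) satisfies all three box inequalities."""
    x, y, z = p
    return xmin < x < xmax and ymin < y < ymax and zmin < z < zmax

def outside_box(p, xmin, ymin, zmin, xmax, ymax, zmax):
    """True if at least one coordinate of p lies beyond the box limits."""
    x, y, z = p
    return (x < xmin or x > xmax or
            y < ymin or y > ymax or
            z < zmin or z > zmax)

def inside_cube(p, xmin, ymin, zmin, d):
    """A cube is a box whose maximum coordinates are the minimum ones plus d."""
    return inside_box(p, xmin, ymin, zmin, xmin + d, ymin + d, zmin + d)

def outside_cube(p, xmin, ymin, zmin, d):
    return outside_box(p, xmin, ymin, zmin, xmin + d, ymin + d, zmin + d)

print(inside_cube((20.0, 20.0, 20.0), 0.0, 0.0, 0.0, 40.0))
print(outside_cube((50.0, 20.0, 20.0), 0.0, 0.0, 0.0, 40.0))
```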
6. inside (or outside) sphere
Spheres are defined by equations of the general form

(x-a)^2 + (y-b)^2 + (z-c)^2 - d^2 = 0

and, therefore, you must provide four real parameters a, b, c and d in order to define it. The input syntax is, for example,

inside sphere 2.30 3.40 4.50 8.0

and therefore the coordinates of the atoms will satisfy the equation

(x-2.30)^2 + (y-3.40)^2 + (z-4.50)^2 - 8.0^2 ≤ 0

Another input alternative would be:

outside sphere 2.30 3.40 4.50 8.0

The outside parameter is similar to the inside parameter, but the equation above uses ≥ instead of ≤ and, therefore, the atoms will be placed outside the defined sphere.
7. inside (or outside) ellipsoid
Ellipsoids are defined by the general equation

(x-a1)^2/a2^2 + (y-b1)^2/b2^2 + (z-c1)^2/c2^2 - d^2 = 0

The parameters must be given as in the sphere example, but now there are seven of them, and they must be entered in the following order:

inside ellipsoid   a1  b1  c1  a2  b2  c2  d

The coordinates (a1,b1,c1) will define the center of the ellipsoid, the coordinates (a2,b2,c2) will define the relative size of the axes and d will define the volume of the ellipsoid. Of course, the commands

outside ellipsoid   a1  b1  c1  a2  b2  c2  d

can also be used in the same manner as for spheres. Note that the case a2 = b2 = c2 = 1.0 gives exactly the same result as the sphere parameter. The parameters for the ellipsoid are not normalized. Therefore, if a2, b2 and c2 are large, the ellipsoid will be large, even for a small d.
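
The remark about non-normalized parameters is easier to see numerically. The Python sketch below assumes the ellipsoid equation has the form (x-a1)^2/a2^2 + (y-b1)^2/b2^2 + (z-c1)^2/c2^2 - d^2 = 0, which reduces to the sphere when a2 = b2 = c2 = 1; under that assumption the semi-axes are a2*d, b2*d and c2*d. The function names are hypothetical.

```python
def ellipsoid_value(p, a1, b1, c1, a2, b2, c2, d):
    """Left-hand side of the assumed ellipsoid equation; <= 0 means inside."""
    x, y, z = p
    return (((x - a1) / a2) ** 2 + ((y - b1) / b2) ** 2
            + ((z - c1) / c2) ** 2 - d ** 2)

def inside_ellipsoid(p, a1, b1, c1, a2, b2, c2, d):
    return ellipsoid_value(p, a1, b1, c1, a2, b2, c2, d) <= 0.0

def inside_sphere(p, a, b, c, d):
    """The sphere is the special case a2 = b2 = c2 = 1."""
    return inside_ellipsoid(p, a, b, c, 1.0, 1.0, 1.0, d)

# With a2 = 2 and d = 5 the ellipsoid reaches x = 10, but only y = 5 and z = 5.
print(inside_ellipsoid((9.0, 0.0, 0.0), 0.0, 0.0, 0.0, 2.0, 1.0, 1.0, 5.0))
print(inside_ellipsoid((0.0, 6.0, 0.0), 0.0, 0.0, 0.0, 2.0, 1.0, 1.0, 5.0))
print(inside_sphere((2.30, 3.40, 4.50), 2.30, 3.40, 4.50, 8.0))
```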
8. over (or below) plane
The planes are defined by the general equation

ax + by + cz - d = 0

And it is possible to restrict atoms to be over or below the plane. The syntax is

over plane 2.5 3.2 1.2 6.2

below plane 2.5 3.2 1.2 6.2

where the over keyword will make the atoms satisfy the condition

2.5x + 3.2y + 1.2z - 6.2 ≥ 0

the below keyword will make the atoms satisfy

2.5x + 3.2y + 1.2z - 6.2 ≤ 0
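
A plane constraint can be tested by evaluating the plane equation at an atom position. This Python sketch assumes that over means ax + by + cz - d ≥ 0 and below means ax + by + cz - d ≤ 0; the function names are hypothetical.

```python
def plane_value(p, a, b, c, d):
    """Evaluate ax + by + cz - d at point p = (x, y, z)."""
    x, y, z = p
    return a * x + b * y + c * z - d

def over_plane(p, a, b, c, d):
    return plane_value(p, a, b, c, d) >= 0.0

def below_plane(p, a, b, c, d):
    return plane_value(p, a, b, c, d) <= 0.0

# The plane 0x + 0y + 1z - 10 = 0 is the horizontal plane z = 10.
print(over_plane((0.0, 0.0, 12.0), 0.0, 0.0, 1.0, 10.0))
# The origin lies below the plane used in the example above.
print(below_plane((0.0, 0.0, 0.0), 2.5, 3.2, 1.2, 6.2))
```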

9. inside (or outside) cylinder
In order to define a cylinder, it is necessary first to define a line oriented in space. This line is defined in Packmol by the parametric equation

p = ( a1 , b1 , c1 ) + t ( a2 , b2 , c2)

where t is the independent parameter. The vector (a2 , b2 , c2) defines the direction of the line. The cylinder is therefore defined by the distance to this line, d, and a length l. Therefore, the usage must be:

inside cylinder  a1  b1  c1  a2  b2  c2  d  l

outside cylinder  a1  b1  c1  a2  b2  c2  d  l

Here, the first three parameters define the point where the cylinder starts, and l defines the length of the cylinder. d defines the radius of the cylinder. The simplest example is a cylinder oriented along the x axis and starting at the origin, such as

inside cylinder 0. 0. 0. 1. 0. 0. 10. 20.

This cylinder is specified by the points that lie within a distance of 10. of the x axis (the cylinder has a radius of 10.). It starts at the origin, so no atom restricted by this cylinder will have an x coordinate less than 0. It has a length of 20. and, as such, no atom will have an x coordinate greater than 20. The orientation of the cylinder, parallel to the x axis, is defined by the director vector (1,0,0), given by the fourth, fifth and sixth parameters. Cylinders can be oriented in space in any way.
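
For an arbitrarily oriented cylinder the test is less obvious than for the x-axis example, so a sketch may help. This Python function projects the atom position onto the axis to get the position along the cylinder and the distance to the line. It assumes the direction vector is normalized before use, so that l is a true length; whether Packmol normalizes it internally is not stated here. The function names are hypothetical.

```python
import math

def inside_cylinder(p, a1, b1, c1, a2, b2, c2, d, l):
    """True if p is within radius d of the axis and between 0 and l along it."""
    norm = math.sqrt(a2 * a2 + b2 * b2 + c2 * c2)
    ux, uy, uz = a2 / norm, b2 / norm, c2 / norm     # unit direction (assumption)
    wx, wy, wz = p[0] - a1, p[1] - b1, p[2] - c1     # vector from the start point
    t = wx * ux + wy * uy + wz * uz                  # position along the axis
    radial_sq = wx * wx + wy * wy + wz * wz - t * t  # squared distance to the line
    return 0.0 <= t <= l and radial_sq <= d * d

def outside_cylinder(p, a1, b1, c1, a2, b2, c2, d, l):
    return not inside_cylinder(p, a1, b1, c1, a2, b2, c2, d, l)

# The example above: inside cylinder 0. 0. 0. 1. 0. 0. 10. 20.
print(inside_cylinder((10.0, 5.0, 5.0), 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 10.0, 20.0))
print(inside_cylinder((21.0, 0.0, 0.0), 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 10.0, 20.0))
```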
Periodic Boundary Conditions
Periodic Boundary Conditions for cubic and rectangular boxes are often requested by users. We aim to implement that in the future. In the meantime, there is a simple workaround we suggest: if your system will be simulated in, for example, a 100 Å box, define your cubic constraints such that the packing is done in a 98 Å box. That way, when the actual simulation box with PBC is built, periodic images will be 2 Å apart from each other, as illustrated in the figure below. There will be an empty space at the boundary, but it will readily disappear with energy minimization and equilibration.
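
The arithmetic of this workaround is easy to script when you generate inputs for several box sizes. A minimal Python sketch, assuming a cubic box and a gap equal to the distance tolerance; packing_cube is a hypothetical helper name.

```python
def packing_cube(box_side, tolerance=2.0, origin=(0.0, 0.0, 0.0)):
    """Constraint line for packing in a cube smaller than the simulation box by the tolerance."""
    side = box_side - tolerance
    x, y, z = origin
    return f"inside cube {x} {y} {z} {side}"

print(packing_cube(100.0))  # inside cube 0.0 0.0 0.0 98.0
```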



Solvating large molecules automatically
The Packmol distribution includes the solvate.tcl script, which is used to solvate large molecules, usually proteins, with water and ions (Na+ and Cl-). Given the PDB file of the biomolecule, just run the script with:

solvate.tcl PROTEIN.pdb

The script will create an input file for packmol called packmol_input.inp. With this file, run Packmol with

packmol < packmol_input.inp

Your large molecule will then be solvated by a 15 Å shell of water, with ions added to keep the system neutral at a physiological NaCl concentration of 0.16 M. The script usually makes reasonable choices for every parameter (number of water molecules, number of ions, etc.), but these may be controlled manually with additional options, as described below:

solvate.tcl structure.pdb -shell 15. -charge +5 -density 1.0 -i pack.inp -o solvated.pdb

Where: structure.pdb is the pdb file to be solvated (usually a protein)

"15." is the size of the solvation shell. This is an optional parameter. If not set, 15. will be used.

+5 is the total charge of the system, to be neutralized. This is also an optional parameter. If it is not used, the package considers histidine residues as neutral, Arg and Lys as +1 and Glu and Asp as -1. The Na+ and Cl- concentrations are set as close as possible to 0.16 M, approximately the physiological concentration. Alternatively, use the -noions option to add no ions, just water.

1.0 is the desired density. Optional. If not set, the density will be set to 1.0 g/ml.

solvated.pdb is the (optional) name for the solvated system output file. If this argument is not provided, the default name solvated.pdb will be used.

pack.inp: is the (optional) name for the packmol input file that will be generated. If not provided, packmol_input.inp will be used.
All these options are output when running the "solvate.tcl" script without any parameter. The script also outputs the size of the box and the suggested periodic boundary condition dimensions to be used.
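
To get a feeling for the numbers involved, the ion counts for a given solvent volume and net charge can be estimated by hand. The Python sketch below is not the algorithm used by solvate.tcl; it is a back-of-the-envelope estimate under the assumptions that the whole volume is available to the solvent and that the net charge is neutralized by adding extra counter-ions on top of the 0.16 M salt. The function names are hypothetical.

```python
AVOGADRO = 6.02214076e23   # particles per mol
LITERS_PER_A3 = 1.0e-27    # one cubic Angstrom expressed in liters

def ion_pairs(volume_a3, molarity=0.16):
    """Number of NaCl pairs giving the requested molarity in the given volume."""
    return round(molarity * AVOGADRO * volume_a3 * LITERS_PER_A3)

def neutralizing_ions(net_charge, pairs):
    """Return (n_sodium, n_chloride) that cancel net_charge on top of the salt pairs."""
    n_sodium = pairs + max(-net_charge, 0)
    n_chloride = pairs + max(net_charge, 0)
    return n_sodium, n_chloride

pairs = ion_pairs(80.0 ** 3)         # an 80 Angstrom cube of solvent
print(pairs)
print(neutralizing_ions(+5, pairs))  # a +5 solute needs five extra Cl-
```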
Parallel version.
Note: Some people have experienced crashes with the parallel version on new gfortran compilers. The issues are being investigated. If you have problems using this version, please use the serial version for the moment.

The parallel version might be useful for very large systems, or if you want just to save some time. The function value calculation is parallelized. That means that it will be useful if each iteration of packmol takes a lot of time (not necessarily if it does many iterations). The parallel version is distributed along with the serial version. To use it, you must first compile it:

1. You must have gfortran > 4.2 installed.
2. Change the FORTRAN variable in the Makefile to the command of gfortran.
3. Compile the package with: make parallel

To run the parallel version on multiple processors, use the provided ppackmol script, for example:

ppackmol 4 inputfile.inp

Where "4" is the number of threads and inputfile.inp is the usual input file.

Important note: When the parallel version is compiled, packmol must be run with the ppackmol script. If you want to run the serial version instead, you will need to recompile the package.
Controlling residue numbering in PDB files.
Since Packmol will create one or more copies of your molecules in a new PDB file, there are some options for how residue numbers are assigned to these new molecules. There are three options, which are set with the resnumbers keyword. This keyword may assume three values, 0, 1 or 2, and may be inserted within the structure ... end structure section of each type of molecule. The options are:

resnumbers 0

In this case the residue numbers of all residues will correspond to the molecule number in the final pdb file, independently of the residue numbering of the original pdb file. This means that if you pack 10 proteins of 5 residues, all 5 residues of a given protein will share the same number, and the numbering will run from 1 to 10 from molecule to molecule.

resnumbers 1

In this case, the residue numbers of the original pdb files are kept unmodified. This means that if you pack 10 proteins of 5 residues, the residue numbers will be preserved and, therefore, they will be repeated for equivalent residues in each molecule of the same protein.

resnumbers 2

In this case, the residue numbers of all residues for this structure will be numbered sequentially according to the number of residues that are printed previously in the same file. This means that if you pack 10 proteins of 5 residues, there will be residue numbers ranging from 1 to 50.

For example, this keyword may be used as in:

structure peptide.pdb
number 10
resnumbers 1
inside box 0. 0. 0. 20. 20. 20.
end structure


Default: The default behavior is to use 0 for structures with only one residue and 1 for structures with more than one residue.
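
The three numbering schemes can be compared side by side with a small Python sketch. It follows the descriptions above and reads resnumbers 0 as "every residue of a molecule carries that molecule's index"; the function name residue_numbers and the offset argument (residues already written to the file by earlier structures) are hypothetical.

```python
def residue_numbers(mode, n_molecules, original, offset=0):
    """Return one list of residue numbers per packed molecule.

    original: residue numbers of one molecule as found in its pdb file.
    offset:   residues printed previously in the same file (used by mode 2).
    """
    n_res = len(original)
    numbering = []
    for m in range(n_molecules):
        if mode == 0:      # molecule index for every residue
            numbering.append([m + 1] * n_res)
        elif mode == 1:    # original numbers kept unmodified
            numbering.append(list(original))
        elif mode == 2:    # sequential over the whole file
            start = offset + m * n_res
            numbering.append([start + k + 1 for k in range(n_res)])
        else:
            raise ValueError("resnumbers must be 0, 1 or 2")
    return numbering

for mode in (0, 1, 2):
    result = residue_numbers(mode, 10, [1, 2, 3, 4, 5])
    print(mode, result[0], result[-1])
```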

Convergence problems: what to try
Sometimes Packmol is not able to find an adequate packing solution. Here are some tips to try to overcome these difficulties:

  • Look at the best solution obtained; it is often good enough to be used.

  • Simulate the same problem with only a few molecules of each type. For example, instead of using 20 thousand water molecules, put 300, and see if they are in the correct regions.

  • If you have large molecules, try running the program twice: once to pack these molecules, and then again using that solution as a fixed molecule for the next packing, in which solvation is included. This may be particularly useful for building solvated membranes. Build the membrane first and then use it as a fixed molecule for a solvation run.

  • You can change some options of the packing procedure to try to improve the optimization:

    1. discale [real]
    This option controls the distance tolerance actually used in the local optimization method. It was found that using larger distances helps sometimes. Try setting discale to 1.5, for example.
    2. maxit [integer]
    This is the maximum number of iterations of the local optimizer (GENCAN) per loop. The default value is currently 20; changing it may improve (or worsen) the convergence.
Additional input options and keywords
    There are some input options which are probably not of interest to the general user, but may be useful in specific contexts. These keywords may be added at any position in the input file.

    Add the TER flag between every molecule (AMBER uses this):
    Usage: add_amber_ter

    Add box side information to output PDB File (GROMACS uses this):
    Usage: add_box_sides 1.0
    Where the "1.0" is an optional real number that will be added to the length of each side, if the actual sides of your simulation box will not be exactly the same as the maximum and minimum coordinates of the molecules of the system (usually, if you are using periodic boundary conditions, you may want to add "1.0" to avoid clashes at the boundary).

    Change random number generator seed:
    Usage: seed [integer] (ex: seed 191917)

    Use a truly random initial point for the minimization (the default option is to generate a homogeneous-density initial point with no overlaps with fixed molecules):
    Usage: randominitialpoint

    Change the maximum number of Gencan iterations per loop:
    Usage: maxit [integer]

    Change the maximum number of loops:
    Usage: nloop [integer]

    Change the frequency of output file writing:
    Usage: writeout [integer]

    Write the current point to the output file even if it is worse than the best point so far (used for checking purposes only):
    Usage: writebad

    Check the initial point: This is only for testing purposes. It just builds the initial approximation, writes it to the output file and exits.
    Usage: check

    Change the optimization subroutine printing output:
    Usage: iprint1 [integer] and/or iprint2 [integer]
    where the integer must be 0, 1, 2 or 3.

    Compare analytical and finite-difference gradients: This is only for testing purposes. Writes chkgrad.log file containing the comparison.
    Usage: chkgrad