Advanced input and output commands
Overview
Teaching: 20 min
Exercises: 30 minQuestions
How do I calculate a property every N time-steps?
How do I write a calculated property to file?
How can I use variables to make my input files easier to change?
Objectives
Understand the use of
compute,fix, and variables.
Advanced input commands
LAMMPS has a very powerful suite for calculating, and outputting all kinds of physical properties during the simulation.
Calculations are most often under a compute command,
and then the output is handled by a fix command.
We will now look at three examples, but there are (at the time of writing) over 150 different compute commands with many options each.
LAMMPS trajectory files
We can output information about a group of particles in our system by using the LAMMPS dump command.
In this example, we will be outputting the particle positions, but you can output many other attributes, such as particle velocities, types, angular momentum, etc.
The list of attributes (and examples of dump commands) can be found in the relevant LAMMPS manual page.
If we look at the starting input script in exercises/3-advanced-inputs-exercise/in.lj_start, we find that the following command has been added:
dump 1 all custom 100 nvt.lammpstrj id type x y z vx vy vz
dump_modify 1 sort id
The dump command defines the properties that we want to output, and how frequently we want these output.
In this case, we have set the ID of the dump command to 1 (we could have used any name/number and it would still work).
We want to output properties for all particles in our system.
We’ve set the output type to custom to have better control of what’s being output – there are default dump options but custom is generally the one that gets used.
We’ll be outputting every 100 time-steps to a file called nvt.lammpstrj
– a *.lammpstrj file can be recognised by some post-processing and post-analysis tools as a LAMMPS trajectory/dump file, and can save some time down the line.
Finally, we name the properties we want to output – in this case, we want to output the particle ID, type, (x,y,z) components of position, and the (x, y, z) components of velocity.
We’ve added a dump_modify command to get LAMMPS to sort the output by ID order
– we tell the dump_modify command which dump command we’d like to sort by giving the dump-ID
(in this case, our dump command had an ID of 1).
We then specify how we want to modify this dump command – we want to sort it by ID,
but there are many more options (that you can find in the LAMMPS manual).
The output nvt.lammpstrj file looks like this:
ITEM: TIMESTEP
100000
ITEM: NUMBER OF ATOMS
8000
ITEM: BOX BOUNDS pp pp pp
0.0000000000000000e+00 4.3088693800637678e+01
0.0000000000000000e+00 4.3088693800637678e+01
0.0000000000000000e+00 4.3088693800637678e+01
ITEM: ATOMS id type x y z vx vy vz
1 1 37.8488 34.4941 42.367 -0.288953 -0.71502 -0.690526
2 1 8.70692 28.34 10.3539 -1.23763 -0.708975 -1.07684
3 1 14.2888 33.2234 10.3076 -0.717269 0.696605 0.669823
4 1 2.31473 26.109 36.5071 -1.93238 -1.09695 1.34787
5 1 42.0214 26.0015 20.3317 -0.767786 0.693569 -0.0684248
6 1 7.36511 39.4736 37.7819 0.011605 0.376106 -0.680507
The lines with ITEM: let you know what is output on the next lines
(so ITEM: TIMESTEP lets you know that the next line will tell you the time-step for this frame – in this case, 100,000).
A LAMMPS trajectory file will usually contain the time-step, number of atoms, and information on box bounds before outputting the information we’d requested.
Restart files
To allow the continuation of the simulation (with the caveat that it must continue to run in the same number of cores as was originally used), we can create a restart file.
This binary file contains information about system topology, force-fields, but not about computes, fixes, etc, these need to be re-defined in a new input file.
You can write a restart file with the command:
write_restart restart2.lj.equil
An arguably better solution is to write a data file, which not only is a text file, but can then be used without restrictions in a different hardware configuration, or even LAMMPS version.
write_data lj.equil.data
You can use a restart or data file to start/restart a simulation by using a read command.
For example:
read_restart restart2.lj.equil
will read in our restart file and use that final point as the starting point of our new simulation.
Similarly:
read_data lj.equil.data
will do the same with the data file we’ve output (or any other data file).
Radial distribution functions (RDFs)
First, we will look at the Radial Distribution Function (RDF), g(r). This describes how the density of a particles varies as a function of the distance to a reference particle, compared with a uniform distribution (that is, at r → ∞, g(r) → 1). We can make LAMMPS compute the RDF by adding the following lines to our input script:
compute RDF all rdf 150 cutoff 3.5
fix RDF_OUTPUT all ave/time 25 100 5000 c_RDF[*] file rdf_lj.out mode vector
We’ve named this compute command RDF and are applying it to the atom-group all.
The compute is of style rdf, and we have set it to have with 150 bins for the RDF histogram (e.g. there are 150 discrete distances at which atoms can be placed).
We’ve set a maximum cutoff of 3.5σ, above which we stop considering particles.
Compute commands are instantaneous – they calculate the values for the current time-step, but that doesn’t mean they calculate quantities every time-step.
A compute only calculates quantities when needed, i.e., when called by another command.
In this case, we will use our compute with the fix ave/time command, that averages a quantity over time, and outputs it over a long timescale.
Our fix ave/time has the following parameters:
RDF_OUTPUTis the name of the fix,allis the group of particles it applies to.ave/timeis the style of fix (there are many others).- The group of three numbers
25 100 5000are theNevery,Nrepeat,Nfreqarguments. These can be quite tricky to understand, as they interplay with each other.Nfreqis how often a value is written to file.Nrepeatis how many sets of values we want to average over (number of samples)Neveryis how many time-steps in between samples.Nfreqmust be a multiple ofNevery, andNeverymust be non-zero even ifNrepeat = 1.- So, for example, an
Neveryof 2, withNrepeatof 3, andNfreqof 100: at every time-step multiple of 100 (Nfreq), there will be an average written to file, that was calculated by taking 3 samples (Nrepeat), 2 time-steps apart (Nevery). So, time-steps 96, 98, and 100 are averaged, and the average is written to file. Likewise at time-steps 196, 198, and 200, etc. - In this case, we take a sample every 25 time-steps, 100 times, and output at time-step number 5000 – so from time-step 2500 to 5000, sampling every 25 time-steps.
c_RDF[*], is the compute that we want to average over time.c_defines that we’re wanting to use a compute, andRDFis our compute name. The[*]wildcard in conjunction withmode vectormakes the fix calculate the average for all the columns in the compute ID.- The
file rdf_lj.outargument tells LAMMPS where to write the data to.
For this command, the file looks something like this:
# Time-averaged data for fix RDF_OUTPUT
# TimeStep Number-of-rows
# Row c_RDF[1] c_RDF[2] c_RDF[3]
105000 150
1 0.0116667 0 0
2 0.035 0 0
3 0.0583333 0 0
4 0.0816667 0 0
5 0.105 0 0
6 0.128333 0 0
...
RDFs at different densities
How does the RDF change as you vary the system density? We’ve been running with a density of 0.8. How does the RDF change if we increase the density to 1.2? And if we reduce it to 0.1?
Solution
At a reduced temperature of T* = 1.0, a Lennard-Jones system with a reduced density ρ* = 0.8 is in liquid state. At ρ* = 1.2, the system is in solid state, and at ρ* = 0.0025, the system is a gas. The RDFs are different for each of these phases.
YAML output
Since version 4May2022 of LAMMPS, writing to YAML files support was added. To write a YAML file, just change the extension accordingly
file rdf_lj.yaml. The file will then be in YAML format:# Time-averaged data for fix RDF_OUTPUT_YAML # TimeStep Number-of-rows # Row c_RDF[1] c_RDF[2] c_RDF[3] --- keywords: ['c_RDF[1]', 'c_RDF[2]', 'c_RDF[3]', ] data: 105000: - [0.011666666666666691, 0, 0, ] - [0.035000000000000045, 0, 0, ] - [0.05833333333333334, 0, 0, ] ...
Mean-squared diplacement (MSD)
The mean-squared displacement (MSD) is a measure of the average displacement that particles travel from their origin position at some given time.
The slope of the MSD is directly proportional to the diffusion coefficient of the system.
As with the RDF, we will require a compute command and a fix command to call that compute:
compute MSD all msd
fix MSD_OUTPUT all ave/correlate 1 50000 50000 c_MSD[4] file msd_lj.out ave running
As before, the compute command has a name (MSD), a group to which it applies (all), and a type (msd).
We’re using fix ave/correlate command to calculate how the total unwrapped squared displacement of every particle in the system changes throughout the simulation.
This works in a very similar way to the fix ave/time command that we used before.
For this, we define the following parameters:
- The fix name (
MSD_OUTPUT). - The group of particles to which this applies (all particles).
- The type of fix (
ave/correlate). - The
Nevery,Nrepeat, andNfreqvalues – here, we want to calculate the correlation every timestep throughout the simulation. - The
computecommand that’s being averaged (as before, we usec_to define that we need acompute, and give the ID of thecomputecommand). - The ouput type and name (
file msd_lj.out). - The style of correlation being used – here, we’re taking a running average of the MSD.
Confusingly, if we set this command to run for the entire simulation,
LAMMPS will output the MSD at the start of the simulation (when particles have not moved at all, and the MSD will be 0) and once more at the end of the simulation.
As a result, the start of the output msd_lj.out file is not very informative:
# Time-correlated data for fix MSD_OUTPUT
# Timestep Number-of-time-windows
# Index TimeDelta Ncount c_MSD[4]*c_MSD[4]
0 50000
1 0 1 0
2 1 0 0.0
3 2 0 0.0
4 3 0 0.0
5 4 0 0.0
6 5 0 0.0
7 6 0 0.0
8 7 0 0.0
9 8 0 0.0
10 9 0 0.0
...
Additionally, we’re only interested in the final part of the msd_lj.out file.
To get the output we’re interested in, we’ll run:
tail -n 50000 msd_lj.out > msd_end.out
Then, we can plot the final output MSD using e.g. gnuplot:
- The first column tells you which timestep this was output for.
- The second column is the amount of simulation time since the simulation start (for a Lennard-Jones system, these are the same as time-steps).
- The third column tells you how many times this value was averaged over.
- The fourth column tells you the mean squared displacement for your system.
Outputting a velocity autocorrelation function (optional)
It’s possible to use a similar approach to calculate the velocity autocorrelation function of a system. Try to use the
compute vacfandfix ave/correlatecommands to output the velocity autocorrelation function.Solution
You should be able to get a VACF with:
compute VACF all vacf fix VACF_OUTPUT all ave/correlate 1 25000 50000 c_VACF[*] file vacf_lj.out ave running
Variables and loops
LAMMPS input scripts can be quite complex, and it can be useful to run the same script many times with only a small difference (for example, temperature). For this reason, LAMMPS have implemented variables and loops – but this doesn’t mean that we can only use variables with loops.
A variable in LAMMPS is defined with the keyword variable, then a name, and then style and arguments, for example:
variable temperature equal 1.0
There are several variable styles. Of particular note are:
equalis the workhorse of the styles, and it can set a variable to a number,thermokeywords, maths operators or functions, among other things.deleteunsets a variable.loopandindexare similar, and will result in the variable changing to the next value in a list every time anextcommand is seen. The difference thatloopaccepts an integer or range, whileindexaccepts a list of strings.
To use a variable later in the script, just prepend a dollar sign, like so:
fix 1 all nvt temp $temperature $temperature 5.0
Example of loop:
variable a loop 10
label loop
dump 1 all atom 100 file.$a
run 10000
undump 1
next a
jump SELF loop
and of index:
variable d index run1 run2 run3 run4 run5 run6 run7 run8
label loop
shell cd $d
read_data data.polymer
run 10000
shell cd ..
clear
next d
jump SELF
LAMMPS Rerun
LAMMPS allows us to take already generated trajectory (
dump) files, and do a re-run of thesimulation, but allowing to calculate new properties. In this exercise, you will run a LJ simulation, outputting the trajectory asnvt.lammpstrjand, then, re-run the trajectory and calculate a RDF.What differences do you see in runtime, and output between the two inputs? How can you quickly plot the RDF, to check if your re-run was successful?
Solution
A full run takes around 1m, while the re-run takes less 20 seconds. This effect would be even more evident on larger simulations. Re-runs are, comparatively, much faster than proper simulations.
To quickly plot the RDF, you can load gnuplot with
module load gnuplotand, assuming your ssh session was started with X forwarding, you can start gnuplot with thegnuplotcommand, and plot the RDF with:plot 'rdf_lj.out' using 2:3.
Key Points
Using
computeandfixcommands, it’s possible to calculate myriad properties during a simulation.Variables make it easier to script several simulations.