Instructions common to all modes of structure solution | Structure solution by direct methods | Heavy-atom location by Patterson Interpretation | Partial structure expansion | Commands for communication with the program PATSEE |
---|---|---|---|---|
TITL CELL ZERR LATT SYMM SFAC elements SFAC a1 b1... UNIT REM MORE TIME OMIT s... OMIT h k l ESEL EGEN LIST FMAP GRID PLAN MOLE HKLF END | INIT PHAN TREF | PATT VECT | TEXP PHAS atomname MOVE | SPIN FRAG PSEE |

TITL [ ]

Title of up to 76 characters, to appear at suitable places in the output. The characters '!' and '=' may form part of the title. The title could include a chemical formula and/or space group, but one must be careful to update these if the UNIT or SYMM instructions are later changed!

CELL lambda a b c alpha beta gamma

Wavelength and unit-cell dimensions in Angstroms and degrees.

ZERR Z esd(a) esd(b) esd(c) esd(alpha) esd(beta) esd(gamma)

Z value (number of formula units per cell) followed by the estimated errors in the unit-cell dimensions. This information is not actually required by SHELXS-96 but is allowed for compatibility with SHELXL-96.

LATT N [1]

Lattice type: 1=P, 2=I, 3=rhombohedral obverse on hexagonal axes, 4=F, 5=A, 6=B, 7=C. N must be made negative if the structure is non-centrosymmetric.
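The LATT encoding can be illustrated with a small decoder. This is only a sketch: `decode_latt` and `LATTICE_TYPES` are hypothetical names chosen for this example and are not part of SHELXS.

```python
# Lattice codes as listed for the LATT instruction.
LATTICE_TYPES = {
    1: "P",
    2: "I",
    3: "R (rhombohedral obverse on hexagonal axes)",
    4: "F",
    5: "A",
    6: "B",
    7: "C",
}


def decode_latt(n=1):
    """Return (lattice symbol, is_centrosymmetric) for a LATT value N.

    A negative N marks a non-centrosymmetric structure; the default is 1.
    """
    if abs(n) not in LATTICE_TYPES:
        raise ValueError(f"|N| must be in the range 1..7, got {n}")
    return LATTICE_TYPES[abs(n)], n > 0


print(decode_latt(-7))
```

Here `LATT -7` would describe a C-centered, non-centrosymmetric structure.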

SYMM symmetry operation

Symmetry operators, i.e. coordinates of the general positions as given in International Tables. The operator X, Y, Z is always assumed, so may NOT be input. If the structure is centrosymmetric, the origin MUST lie on a centre of symmetry. Lattice centering should be indicated by LATT, not SYMM. The symmetry operators may be specified using decimal or fractional numbers, e.g. 0.5-x, 0.5+y, -z or Y-X, -X, Z+1/6; the three components are separated by commas. At least one SYMM instruction must be present unless the structure is triclinic.

SFAC elements

These element symbols define the order of scattering factors to be employed by the program. The first 94 elements of the periodic system are recognized. The element name may be preceded by '$' but this is not obligatory (the '$' character is allowed for logical consistency with certain SHELXL-96 instructions but is ignored). The program uses absorption coefficients from International Tables for Crystallography (1991), Volume C. For organic structures the first two SFAC types should be C and H, in that order; the E-Fourier recycling generally assigns the first SFAC type (i.e. C) to peaks.

SFAC a1 b1 a2 b2 a3 b3 a4 b4 c df' df" mu r wt

Scattering factor in the form of an exponential series, followed by real and imaginary corrections, linear absorption coefficient, covalent radius and atomic weight. Except for the atomic weight the format is the same as that used in SHELX-76. In addition, a 'label' consisting of up to 4 characters beginning with a letter (e.g. Ca2+) may be included before a1 (the first character may be a '$', but this is not obligatory). The two SFAC formats may be used in the same .ins file; the order of the SFAC instructions (and the order of element names in the first type of SFAC instruction) define the scattering factor numbers which are referenced by atom instructions. Not all numbers on this instruction are actually used by SHELXS-96, but the full data must be given for compatibility with SHELXL-96. For neutron data, c should be the scattering length (which may be negative) and a1..b4 will usually all be zero.

UNIT n1 n2 ...

Number of atoms of each type in the cell, in SFAC order.

REM

Followed by a comment on the same line. This comment is ignored by the program but is copied to the results file (.res). Note that comments beginning with one or more blanks are only copied to the .res file if the line is completely blank; REM comments are always copied.

MORE verbosity [1]

MORE sets the amount of (printer) output; verbosity takes a value in the range 0 (least) to 3 (most verbose).

TIME t [#]

If the time t (measured in seconds from the start of the job) is exceeded, SHELXS performs no further blocks of phase permutations (direct methods), but goes on to the final E-map recycling etc. In the case of Patterson interpretation no further vector superpositions are performed after this time has expired. The default value of t is installation dependent, and is usually set to a little less than the maximum time allocation for a particular job class. Usually t is 'CPU time', but on some simpler computer systems (e.g. PCs) the elapsed time has to be used instead.

OMIT s [4] 2theta(lim) [180]

Thresholds for flagging reflections as 'unobserved'. Note that if no OMIT instruction is given, ALL reflections are treated as 'observed'. Internally in the program s is halved and applied to F*F, so the test is roughly equivalent to suppressing all reflections with F < s * sigma(F), as required for consistency with SHELX-76. Note that s may be set to 0 (to suppress reflections with negative F*F) or even to a negative threshold (to suppress very negative F*F) which has no equivalent in SHELX-76. If 2theta(lim) is POSITIVE, it specifies a 2theta value ABOVE which the data are treated as 'unobserved'; if it is NEGATIVE, the absolute value is used as a LOWER 2theta cutoff.
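The two OMIT thresholds can be sketched as a single test. This is an illustration of the description above, not the program's actual code: the function name `is_observed` is hypothetical, and the exact comparison (F*F against (s/2) * sigma(F*F)) and the treatment of values lying exactly on a limit are assumptions.

```python
def is_observed(f2, sigma_f2, two_theta, s=4.0, two_theta_lim=180.0):
    """Return True if a reflection would be treated as 'observed'.

    Assumption: 's is halved and applied to F*F' is read here as
    F*F >= (s / 2) * sigma(F*F).
    """
    if f2 < 0.5 * s * sigma_f2:
        return False
    if two_theta_lim >= 0:
        # Positive limit: data above this 2theta are 'unobserved'.
        return two_theta <= two_theta_lim
    # Negative limit: the absolute value is a lower 2theta cutoff.
    return two_theta >= abs(two_theta_lim)


print(is_observed(10.0, 1.0, 30.0))           # passes the default thresholds
print(is_observed(-1.0, 1.0, 30.0, s=0.0))    # negative F*F suppressed by s = 0
```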

OMIT h k l

The reflection h k l is flagged as 'unobserved' in the list of merged reflections after data reduction. It will not be used directly in phase refinement or Fourier calculations, but is retained for statistical purposes and as a possible cross-term in a negative quartet. Thus if it is known that a strong reflection has been included accidentally in the .hkl file with a very small intensity (e.g. because it was cut off by the beam stop), it is advisable to delete it from the .hkl file rather than using OMIT (which is intended for imprecisely measured data rather than blunders).

ESEL Emin [1.2] Emax [5] dU [.005] renorm [.7] axis [0]

Emin sets the minimum E-value for the list of largest E-values which the program normally retains in memory; it should be set so as to give more than enough reflections for TREF etc. It is also the threshold used for tangent expansion and 'peak-list optimisation'. It is advisable to reduce Emin to about 1.0 for triclinic structures and pseudosymmetry problems. If Emin is negative, acentric triclinic data are generated for use in ALL calculations. The other parameters control the normalisation of the E-values ('**' = raised to the power of):

new(E) = old(E)*exp[8*dU*(pi*sin(theta)/lambda)**2]/[old(E)**-4+Emax**-4]**.25

renorm is a factor to control the parity group renormalisation; 0.0 implies no renormalisation, 1.0 sets full renormalisation, i.e. the mean value of E**2 becomes unity for each parity group.

If axis is 1, 2 or 3, an additional similar renormalisation is applied for groups defined by the absolute value of the h, k or l index respectively. If axis is set to zero, no such additional renormalisation is applied.

EGEN d(min) d(max)

All missing reflections in the resolution range d(min) to d(max) Angstroms (the order of d(min) and d(max) is unimportant) are generated on a statistical basis, assuming that they were skipped during the data collection because a prescan indicated that they were weak. These reflections will then be flagged as 'unobserved', but improve the estimation of the remaining E-values and enable an increased number of negative quartets to be identified. d(min) should be safely inside the resolution limit of the data and d(max) should be set so that there is no danger of regenerating strong reflections (as weak) which were cut off by the beam stop etc.

LIST m [0]

m = 1 and m = 2 write h, k, l, A and B lists to the name.res file, where A and B are the real and imaginary parts of a point atom structure factor respectively. If m = 1 the list corresponds to the phased E-values for the 'best' direct methods solution, before partial structure expansion (if any). If m = 2 the list is produced after the final cycle of partial structure expansion, and corresponds to weighted E-values used for the final Fourier synthesis. These options enable other Fourier programs to be used, e.g. for graphical display of 3D-Fouriers for data which do not give atomic resolution.

After data reduction and merging equivalent reflections, a list of h, k, l, F and sigma(F) (for m = 3) or h, k, l, F**2 and sigma(F**2) (for m = 4) is written to the name.res file. This provides a useful input file for programs such as DIRDIF and MULTAN, which do not include sort/merge and rejection of systematic absences etc. SHELXS-96 always averages Friedel opposites.

In all four cases the output format is (3I4,2F8.2), and the list is terminated by a dummy reflection 0,0,0.
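The fixed FORTRAN format (3I4,2F8.2) corresponds to three 4-character integer fields followed by two 8-character real fields with two decimals. A minimal sketch of writing such a list follows; `format_reflection` and `write_list` are hypothetical helper names, and filling the last two fields of the dummy 0,0,0 terminator with zeros is an assumption.

```python
def format_reflection(h, k, l, a, b):
    """Format one record as (3I4,2F8.2): three I4 fields, two F8.2 fields."""
    return f"{h:4d}{k:4d}{l:4d}{a:8.2f}{b:8.2f}"


def write_list(reflections):
    """Return the list as text, terminated by a dummy 0,0,0 reflection."""
    lines = [format_reflection(*r) for r in reflections]
    # Assumption: the remaining fields of the terminator record are zero.
    lines.append(format_reflection(0, 0, 0, 0.0, 0.0))
    return "\n".join(lines)


print(write_list([(1, -2, 3, 12.5, 0.75)]))
```

Values that need more than the available field width (e.g. |h| > 999) would overflow the columns; the sketch does not guard against that.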

FMAP code [#] axis [#] nl [#]

The unique unit of the cell for performing the Fourier calculation is set up automatically unless specified by the user using FMAP and GRID. The program chooses a 53 x 53 x nl or 103 x 103 x nl grid depending on the resolution of the data (the latter is not available for the MSDOS version or if the available memory is restricted).

code = 1 (F**2 Patterson), 3 (Patterson with coefficients input using HKLF 7; negative coefficients are allowed), 4 (E-map without peak-list optimisation, e.g. because the peaks correspond to unequal atoms), 5 (Fourier with A and B coefficients input using HKLF 3), 6 (E*F Patterson), code > 6 (E-map followed by [code - 6] cycles of peak-list optimization). Note that the peak-list optimization assigns all peaks to scattering factor type 1, so for many structures this should be specified as carbon on a SFAC instruction. FMAP 4 may be used with atoms but without TEXP etc. for an E-map based on calculated phases.

GRID sl [#] sa [#] sd [#] dl [#] da [#] dd [#]

Fourier grid, when not set automatically. Starting points and increments are multiplied by 100. s means starting value, d increment, l is the direction perpendicular to the layers, a is across the paper from left to right, and d is down the paper from top to bottom. Note that the grid is 53 x 53 x nl points, i.e. twice as large as in SHELX-76, and that sl and dl need not be integral. The 103 x 103 x nl grid is only available when it is set automatically by the program (see above).

PLAN npeaks [#] d1 [0.5] d2 [1.5]

If npeaks is positive it is the number of highest unique Fourier peaks which are written to the .res and .lst files; the remaining parameters are ignored.

If npeaks is given as negative, the program attempts to arrange the peaks into unique molecules taking the space group symmetry into account, and to 'plot' a projection of each such molecule on the printer (i.e. the .lst file). Distances involving peaks which are less than r1+r2+d1 (the covalent radii r are defined via SFAC; 1 and 2 refer to the two atoms concerned) are considered to be 'bonds' for purposes of the molecule assembly and tables. Distances involving atoms and/or peaks which are less than r1+r2+d2 are considered to be 'non-bonded interactions'. Such interactions are ignored when defining molecules, but the corresponding atoms and distances are included in the line-printer output. Thus an atom may appear in more than one map, or more than once on the same map. Negative d2 includes hydrogen atoms in these non-bonds, otherwise they are ignored (the absolute value of d2 is used in the test). Peaks are always assigned the radius of SFAC type 1, which is usually set to carbon. Peaks appear on the printout as numbers, but in the .res file they are given names beginning with 'Q' and followed by the same numbers.

To simplify interpretation of the lineprinter plots, extra symmetry-generated atoms are added, so that atoms or peaks may appear more than once. A table of the appropriate coordinates and symmetry transformations appears at the end of the output. See also MOLE for forcing molecules (and their environments) to be printed separately.
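The distance criteria used by PLAN with negative npeaks can be sketched as follows. The function name `classify_contact` and the radii used in the example calls (0.77 and 0.32 Angstroms) are illustrative assumptions; in the program the radii come from the SFAC instructions.

```python
def classify_contact(distance, r1, r2, d1=0.5, d2=1.5, involves_hydrogen=False):
    """Classify an interatomic distance following the PLAN description.

    r1 and r2 are the covalent radii of the two atoms concerned.
    """
    if distance < r1 + r2 + d1:
        return "bond"
    if distance < r1 + r2 + abs(d2):
        # Hydrogen atoms take part in non-bonds only when d2 is negative.
        if involves_hydrogen and d2 >= 0:
            return "ignored"
        return "non-bonded interaction"
    return "ignored"


print(classify_contact(1.54, 0.77, 0.77))   # within r1 + r2 + d1
print(classify_contact(2.50, 0.77, 0.77))   # within r1 + r2 + d2 only
```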

MOLE n [#]

Forces the following atoms, and atoms or peaks that are bonded to them, into molecule n of the PLAN output. n may not be greater than 99.

HKLF n [0] s [1] r11...r33 [1 0 0 0 1 0 0 0 1] wt [1] m [0]

Before running SHELXS-96, a reflection data file name.hkl must usually be prepared. The HKLF command tells the program which format has been chosen for this file, and allows the indices to be reorientated using a 3x3 matrix r11..r33 (which should have a positive determinant). n is negative if reflection data follow, otherwise they are read from the .hkl file. The data are read in fixed format 3I4,2F8.2 (except for n = 1) subject to FORTRAN-77 conventions. The data are terminated by a record with h, k and l all zero (except n=1, which contains a terminator and checksum). If batch numbers, direction cosines or wavelengths are present in the .hkl file (e.g. for use with SHELXL-96) they will be ignored. The multiplicative scale s multiplies both F*F and sigma(F*F) (or F and sigma(F) for n = 1 or 3). The multiplicative weight wt multiplies all 1/sigma**2 values and m is an integer 'offset' needed to read 'condensed data' (HKLF 1); both are included only for compatibility with SHELX-76. Usually simply 'HKLF 4' is all that will be required.

- n = 1

SHELX-76 condensed data. Although now obsolete this format is both ASCII and compact, and contains a checksum, so is sometimes used for network transmission and testing purposes.

- n = 3

`h k l F sigma(F)` or `h k l A B` depending on FMAP setting. In the first case the sign of F is ignored (for use with macromolecular delta-F data). This format should NOT be used for routine structure determination purposes because the approximation(s) required for the derivation of F and sigma(F) severely degrade the quality of the data.

- n = 4

`h k l F*F sigma(F*F)`. **THIS IS THE RECOMMENDED FORMAT** for all normal purposes (except macromolecular isomorphous or anomalous delta-F's).

- n = 7

`h k l E` or `h k l P` (Patterson coefficient) depending on FMAP.
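Reading HKLF 4 records can be sketched in a few lines. This is a simplified illustration, not the program's reader: `read_hklf4` is a hypothetical name, every record is assumed to be at least 28 characters long with no blank numeric fields (FORTRAN-77 blank handling is not emulated), and the row-wise application of the matrix (new h = r11*h + r12*k + r13*l, and so on) is an assumption.

```python
def read_hklf4(lines, matrix=(1, 0, 0, 0, 1, 0, 0, 0, 1), scale=1.0):
    """Parse h k l F*F sigma(F*F) records (3I4,2F8.2) up to the 0 0 0 terminator."""
    r11, r12, r13, r21, r22, r23, r31, r32, r33 = matrix
    reflections = []
    for line in lines:
        h, k, l = (int(line[i:i + 4]) for i in (0, 4, 8))
        if h == 0 and k == 0 and l == 0:
            break  # terminator record; anything after it is not read
        f2 = float(line[12:20])
        sigma = float(line[20:28])
        # Assumed convention: each new index is a row of the matrix times (h, k, l).
        hkl = (
            r11 * h + r12 * k + r13 * l,
            r21 * h + r22 * k + r23 * l,
            r31 * h + r32 * k + r33 * l,
        )
        # The scale s multiplies both F*F and sigma(F*F).
        reflections.append((hkl, scale * f2, scale * sigma))
    return reflections


records = [
    "   1   0   0  100.00    5.00",
    "   0   2  -1   50.50    2.25",
    "   0   0   0    0.00    0.00",
]
print(read_hklf4(records))
```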

END

This is the last instruction in the rare cases when the .ins file is not terminated by the HKLF instruction.

INIT nn [#] nf [#] s+ [0.8] s- [0.2] wr [0.2]

The first stage involves five cycles of weighted tangent formula refinement (based on triplet phase relations only) starting from nn reflections with random phases and weights of 1. Single phase seminvariants which have sigma-1 formula P+ values less than s- or greater than s+ are included with their predicted phases and unit weights. All these reflections are held fixed during the INIT stage but refined freely in the subsequent stages. The remaining reflections also start from random phases with initial weights wr, but both the phases and the weights are allowed to vary.

If nf is non-zero, the nf 'best' (based on the negative quartet and triplet consistency) phase sets are retained and the process repeated for ( npp - nf ) parallel phase sets, where npp is the previous number of phase sets processed in parallel (often 128). This is repeated for nf fewer phase sets each time until only a quarter of the original number are processed in parallel. This rather involved algorithm is required to make efficient use of available computer memory. Typically nf should be 8 or 16 for 128 parallel permutations.

The purpose of the INIT stage is to feed the phase annealing stage with relatively self-consistent phase sets, which turns out to be more efficient than starting the phase annealing from purely random phases. If TREF 0 is used to generate partial structure phases for all reflections, the INIT stage is skipped. To save time, only ns reflections and the strongest mtpr triplets for each reflection (or less, if not so many can be found) are used in the INIT stage; these numbers are given on the PHAN instruction.

PHAN steps [10] cool [0.9] Boltz [#] ns [#] mtpr [40] mnqr [10]

The second stage of phase refinement is based on 'phase annealing' [Acta Cryst., A46 (1990) 467-473]. This has proved to be an efficient search method for large structures, and possesses a number of beneficial side-effects. It is based on steps cycles of tangent formula refinement (one cycle is a pass through all ns phases), in which a correction is applied to the tangent formula phase. The phase annealing algorithm gives the magnitude of the correction (it is larger when the 'temperature' is higher; this corresponds to a larger value of Boltz), and the sign is chosen to give the best agreement with the negative quartets (if there are no negative quartets involving the reflection in question, a random sign is used instead). After each cycle through all ns phases, a new value for Boltz is obtained by multiplying the old value by cool; this corresponds to a reduction in the 'temperature'. To save time, only ns reflections are refined using the strongest mtpr triplets and mnqr quartets for each reflection (or less, if not so many phase relations can be found). The phase annealing parameters chosen by the program will rarely need to be altered; however if poor convergence is observed, the Boltz value should be reduced; it should usually be in the range 0.2 to 0.5. When the 'TEXP 0 / TREF' method of multisolution partial structure refinement is employed, Boltz should be set at a somewhat higher value (0.4 to 0.7) so that not too many solutions are duplicated.
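The cooling schedule described above (Boltz multiplied by cool after each cycle) can be sketched as follows. Only the schedule is shown, not the phase correction itself; `boltz_schedule` is a hypothetical name and the starting value 0.4 in the example is an arbitrary choice from the quoted 0.2 to 0.5 range.

```python
def boltz_schedule(boltz, steps=10, cool=0.9):
    """Return the Boltz value in force during each of the `steps` cycles."""
    values = []
    for _ in range(steps):
        values.append(boltz)
        boltz *= cool  # lower the 'temperature' after each full cycle
    return values


for cycle, value in enumerate(boltz_schedule(0.4), start=1):
    print(f"cycle {cycle:2d}: Boltz = {value:.4f}")
```

With the default cool of 0.9, ten cycles reduce Boltz to 0.9**9 (about 39%) of its starting value by the final cycle.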

TREF np [100] nE [#] kapscal [#] ntan [#] wn [#]

np is the number of direct methods attempts; if negative, only the solution with code number |np| is generated (the code number is in fact a random number seed). Since the random number generation is very machine dependent, this can only be relied upon to generate the same results when run on the same model of computer. This facility is used to generate E-maps for solutions which do not have the 'best' combined figure of merit. No other parameter may be changed if it is desired to repeat a solution in this way. For difficult structures, it may well be necessary to increase np (e.g. TREF 2000) and of course the computer time allocated for the job.

nE reflections are employed in the full tangent formula phase refinement. Values of nE that give fewer than 20 unique phase relations per reflection for the full phase refinement are not recommended.

kapscal multiplies the products of the three E-values used in triplet phase relations; it may be regarded as a fudge factor to allow for experimental errors and also to discourage overconsistent (uranium atom) solutions in symmorphic space groups. If it is negative the cross-term criteria for the negative quartets are relaxed (but all three cross-term reflections must still be measured), and more negative quartets are used in the phase refinement, which is also useful for symmorphic space groups.

ntan is the number of cycles of full tangent formula refinement, which follows the phase annealing stage and involves all nE reflections; it may be increased (at the cost of CPU time) if there is evidence that the refinement is not converging well. The tangent formula is modified to avoid overconsistency by applying a correction to the resulting phase of cos-1(

wn is a parameter used in calculating the combined figure of merit CFOM: CFOM = R(alpha) (NQUAL < wn) or R(alpha) + (wn-NQUAL)**2 (NQUAL >= wn); wn should be about 0.1 more negative than the anticipated value of NQUAL. If it is known that the measurements of the weak reflections are unreliable (i.e. have high standard deviations), e.g. because data were collected using the default options on a CAD-4 diffractometer, then the NQUAL figure of merit is less reliable. If the space group does not possess translation symmetry, it is essential to obtain good negative quartets, i.e. to measure ALL reflections for an adequate length of time.
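The combined figure of merit can be written directly from the expression above. The function name `cfom` and the numerical values in the example are illustrative only.

```python
def cfom(r_alpha, nqual, wn):
    """Combined figure of merit as defined for the TREF parameter wn."""
    if nqual < wn:
        return r_alpha
    # NQUAL >= wn: penalise by the squared distance from wn.
    return r_alpha + (wn - nqual) ** 2


# With wn = -0.8, an NQUAL of -0.9 adds no penalty; -0.5 adds (0.3)**2.
print(cfom(0.05, -0.9, -0.8))
print(cfom(0.05, -0.5, -0.8))
```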

PATT nv [#] dmin [#] resl [#] Nsup [#] Zmin [#] maxat [#]

nv is the number of superposition vectors to be tried; if it is negative the search for possible origin shifts is made more exhaustive by relaxing various tolerances etc.

dmin is the minimum allowed length for a heavy-atom to heavy-atom vector; it affects ONLY the choice of superposition vector. If it is negative, the program does not generate any atoms on special positions in stage 4 (useful for some macromolecular problems).

resl is the effective resolution in Angstroms as deduced from the reflection data, and is used for setting various tolerances. If the data extend further than the crystal actually diffracted, or if the outer data are incomplete, it may well be worth increasing this number. This parameter can be relatively critical for macromolecular structures.

Nsup is the number of unique peaks to be found by searching the superposition function.

Zmin is the minimum atomic number to be included as an atom in the crossword table etc. (if this is set too low, the calculation can take appreciably longer).

maxat is the maximum number of potential atoms to be included in the crossword table, and can also appreciably affect the time required for PATT.

VECT X Y Z

A superposition vector (with coordinates taken from the Patterson peak-list) may be input by hand by a VECT instruction, in which case the first two numbers on the PATT instruction are ignored (except for their signs!), and a PATT instruction will be automatically generated if not present in the .ins file. There may be any number of VECT instructions.

In the unlikely event of a routine PATT run failing to give an acceptable solution, the best approach - after checking the data reduction diagnostics carefully as explained above - is to select several potential heavy-atom to heavy-atom vectors by hand from the Patterson peak-list and specify them on VECT instructions (either in the same job or different jobs according to local circumstances) for use as superposition vectors. The exhaustiveness of the search can also be increased - at a significant cost in computer time - by making the first PATT parameter negative and/or by increasing the value of resl a little. The sign of the second PATT parameter (a negative sign excludes atoms on special positions) and the list of elements which might be present (SFAC/UNIT) should perhaps also be reconsidered.

TEXP na [#] nH [0] Ek [1.5]

na PHAS reflections with E(obs) > Ek and the largest values of E(calc)/E(obs) are generated for use in partial structure expansion or direct methods. The first nH atoms (heavy atoms) in the atom list are retained during partial structure expansion, the rest are thrown away after calculating phases. At least one atom MUST be given! TEXP automatically generates appropriate FMAP, GRID and PLAN instructions.

TEXP (and/or PHAS) may be used in conjunction with TREF to generate fixed phases for use in direct methods; the special TEXP option na = 0 provides point atom phases for ALL reflections, which are then refined during the phase annealing and tangent expansion stages of direct methods (as specified on the PHAN and TREF instructions). It is not necessary to use different starting phases for the different phase sets, because the phase annealing stage itself introduces (statistically distributed) random phase shifts! This is a powerful method of partial structure expansion for cases when the phasing power of the partial structure is not quite adequate, e.g. when it consists of only one atom (say P or S in a large organic structure). If at least 5 atoms have been correctly located then TEXP alone should suffice.

When TEXP is used without TREF a tangent formula expansion (to all reflections with E > Emin as specified on the ESEL instruction) is first performed, followed by several cycles (see FMAP) of E-Fouriers and peak-list optimization. TEXP is particularly useful for cases in which several not very heavy atoms (e.g. P, S) have been located by PATT followed by hand interpretation of the resulting 'crossword table'. In such cases nH should be set to the number of such atoms and na to about half the number of reflections with E > 1.5 (see the first page of the SHELXS-96 output).

TEXP may be used in conjunction with ESEL -1 for a partial structure expansion in the effective space group P1 (C1 etc. if the lattice is centered). This can be very effective if it is suspected that a fragment is correctly oriented but translated from its real position, or if the space group cannot be unambiguously assigned. Hand interpretation of the resulting E-map is then however necessary to locate the positions of the crystallographic symmetry elements.

PHAS h k l phi

A fixed phase for structure expansion or direct methods. PHAS may be used to fix single phase seminvariants that have been obtained from other programs or derived by examination of the best TREF solutions. The phase angle phi must be present, and should be given in degrees.

atomname sfac x y z sof [1] U (or U11 U22 U33 U23 U13 U12)

Atom instructions begin with an atom name (up to 4 characters which do not correspond to any of the SHELXS command names, and terminated by at least one blank) followed by a scattering factor number (which refers to the list defined by the SFAC instruction(s)), x, y, and z in fractional coordinates, and (optionally) a site occupation factor (s.o.f.) and an isotropic U or six anisotropic Uij components (both in Angstroms**2). The U or Uij values are ignored by SHELXS-96 but may be included for compatibility with SHELXL-96.

When SHELXS-96 writes the .res output file, a dummy U value is followed by a peak height (unless an atom type has been assigned by the program before the E-Fourier recycling). Both the dummy U and the peak height are ignored if the atom is read back into SHELXS-96 (e.g. for partial structure expansion). SHELXL-96 also ignores the peak height if found in the .ins file. In contrast to SHELX-76 it is not necessary to pad out the atom name to 4 characters with blanks, but it should be followed by at least one blank. References to 'free variables' and fixing of atom parameters by adding 10 as in SHELX-76 and SHELXL-96 will be interpreted correctly, but SHELXL-96 'residues' may not be used, and AFIX instructions are simply ignored (so idealized hydrogen atoms etc. are NOT generated). The site occupation factor for an atom in a special position should be divided by the number of atoms in the general position that have coalesced to give the special position. It may also be found by dividing the multiplicity of the special position (as given in International Tables) by the multiplicity of the general position. Thus an atom on a fourfold axis will usually have s.o.f. = 10.25 (i.e. 0.25, fixed by adding 10).
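The site occupation factor rule can be sketched as a one-line calculation. `fixed_sof` is a hypothetical helper; the multiplicities in the example (2 and 8) are illustrative values chosen to give the ratio 1/4 of the fourfold-axis case.

```python
def fixed_sof(special_multiplicity, general_multiplicity):
    """Return the s.o.f. for a fully occupied special position, fixed by adding 10."""
    sof = special_multiplicity / general_multiplicity
    return 10.0 + sof


# A site whose multiplicity is one quarter of the general position.
print(fixed_sof(2, 8))
```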

MOVE dx [0] dy [0] dz [0] sign [1]

The coordinates of the following atoms are changed to: x = dx + sign * x, y = dy + sign * y, z = dz + sign * z (after applying FRAG and SPIN - if present - according to PATSEE conventions); MOVE applies to all following atoms until superseded by a further MOVE. MOVE is normally used in conjunction with SPIN and FRAG (see below) but is also useful on its own for applying origin shifts.
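The MOVE transformation on its own (without FRAG or SPIN) is a simple affine change of fractional coordinates. A minimal sketch follows; `apply_move` is a hypothetical name.

```python
def apply_move(xyz, dx=0.0, dy=0.0, dz=0.0, sign=1):
    """Apply x = dx + sign * x (and likewise for y, z) to fractional coordinates."""
    x, y, z = xyz
    return (dx + sign * x, dy + sign * y, dz + sign * z)


# 'MOVE 1 1 1 -1' inverts the coordinates through the point (0.5, 0.5, 0.5).
print(apply_move((0.25, 0.5, 0.75), 1, 1, 1, -1))
```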

SPIN phi1 [0] phi2 [0] phi3 [0]

The following fragment (which should begin with a FRAG instruction) is rotated by the specified angles (in radians). This instruction is used to reinput angles from Patterson search programs (in particular PATSEE).

FRAG code [#] a [1] b [1] c [1] alpha [90] beta [90] gamma [90]

FRAG enables the PATSEE search fragment to be read in using the original cell or orthogonal coordinates. This instruction will usually be preceded by SPIN and MOVE commands to give the rotation angles and translation (same conventions as for PATSEE), and followed by a list of atoms. FRAG, SPIN and MOVE instructions remain in force until superseded by another instruction of the same type. code is ignored by SHELXS-96 but is included for compatibility with PATSEE and SHELXL-96 (where it is used for different purposes).

PSEE m [200] 2theta(max) [#]

The largest |m| E-values and the complete Patterson map are dumped into the name.res file in fixed format for use by Patterson search programs (in particular PATSEE) etc. 2theta(max) should be used to limit the resolution of the E-values generated; the default value uses sin(theta) = lambda/2. The 2theta(max) value is also written to the .res file, so it is possible to restrict the resolution of the E-values actually used by PATSEE to a lower 2theta(max) by editing this file without rerunning SHELXS-96; of course the E-values with higher 2theta than the value used in SHELXS-96 were not written to the .res file and so cannot be recovered in this way. When m is negative a 'super-sharp' Patterson with coefficients Sqrt(E**3*F) is used; if m is positive a standard sharpened Patterson with coefficients (E*F) is employed. The resulting name.res file must be renamed name.inp (or name.pat if the search fragment and encoded Patterson are to be read from separate files) for use by PATSEE. After a PSEE instruction, UNIT is followed by the strongest E-values and the full Patterson map in this output file (which may be rather long!).
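The choice of Patterson coefficient controlled by the sign of m can be sketched as below. `patterson_coefficient` is a hypothetical name, and E and F are assumed to be non-negative amplitudes for a single reflection.

```python
import math


def patterson_coefficient(e, f, m=200):
    """Return the Patterson coefficient selected by the sign of the PSEE parameter m."""
    if m < 0:
        # 'Super-sharp' Patterson: Sqrt(E**3 * F)
        return math.sqrt(e ** 3 * f)
    # Standard sharpened Patterson: E * F
    return e * f


print(patterson_coefficient(2.0, 8.0))          # E*F
print(patterson_coefficient(2.0, 8.0, m=-200))  # Sqrt(E**3*F)
```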