Tripos Bookshelf  >  Almond

File

Alt-F

 

Menu File

[Open data file...][Import series...][Import fields...][Import activity...][Import structure list...][Export data][Object type][Model library...][Direct projection...] [Quit...]

 

This menu gives access to the basic file operations, such as opening previous ALMOND projects, importing series of compounds and importing fields. The option to quit the program is also in this menu.

 


 

File>>>Open data file...

Alt-O

 

Every time ALMOND imports one or more fields it creates an output file, usually with the extension .alm. This file can be opened afterwards to reload the results of the analysis and to access the variables and models obtained.

This command opens a file selection dialog, where the User can select the file to open.

 

File Selection Dialog

 

Files

This list shows all the files matching the filter pattern. Click on any of these files to select it. In addition, the list displays other accessible directories and allows the User to navigate through the directory tree. To see the contents of any descendant directory, simply click on its item in the list. To see the contents of the parent directory, click on the ".. [Parent Directory]" item.

 

Select ALMOND file (.alm)

Under this label the User can find three elements which can be used to select the file to open.

 

The Filter button brings up a dialog where the User can see and edit the filter pattern. This filter pattern string uses standard wildcard characters to define the files presented in the Files list. For example, in this case the filter pattern string is *.alm and therefore the list shows only files with the extension .alm (ALMOND files usually have this extension). To show files with different extensions, press the Filter button and enter the appropriate wildcards in the dialog.

Once the selection is done, press the OK button and the file will be read. If the file has an improper format, an error message will appear and the file will be refused. Press the Cancel button to abort the operation.

 

 

After loading an .alm file, the program is at the same point it was when the User quit the program. If the User excluded objects or variables in the last run, these will be automatically excluded now. The same applies to non-standard data scaling.

 


 

File>>>Import series...

 

From version 3.1.0, GRID has been integrated in ALMOND, so the program can generate molecular interaction fields (MIF) on its own. The interface is almost the same as in previous versions of ALMOND. It is able to read one or many 3D molecular structures written in a standard file format (GRID kout, SYBYL mol2 or MDL SDFile) and process them automatically as indicated in the scheme below.

 

Scheme of import series

 

The ALMOND conversion engine starts processing the molecule or molecules defined and assigns atom types. If ALMOND is unable to assign atom types, an error message will be shown and the User will be asked whether to continue using only the structures successfully converted. Atom type assignment is made by analyzing the atomic elements and the bond orders. SYBYL atom types are not considered at all and do not need to be correctly set for a successful conversion. On the other hand, correct bond orders and the presence of all the hydrogen atoms are essential. In aromatic compounds, a sensible Kekule form is preferred to aromatic bonds, even if in most cases both representations give equally good results.

Charged compounds will be recognized from the structure. During the conversion, the charge assigned to the compound will be presented in the main text window. A correct charge is a good indicator that the atom type assignment is working properly.

Once the GRID atom types are assigned, the program works molecule after molecule: for each one it runs GRIN producing a kout file. A GRID engine is used to produce a MIF that may be optionally written to the disk. The ALMOND encoding engine works on the MIF, storing the results in an ALMOND output file (.alm).

 

Multiple structures can be imported at the same time. In this case, the MACC2 variables generated by ALMOND are stored in an X matrix representing the whole series of compounds. Multiple structures can be imported in four ways:

wildcards: The User can type a line containing some wildcard characters. Files with names matching the wildcard string will be imported sequentially. For example, file###.mol2 will import files such as file000.mol2, file001.mol2 and file003.mol2.
list file: The User can enter the name of a file that contains a list of the files to import. A file is recognized by ALMOND as a "list file" when it has the extension .lst or .list. List files must contain only valid filenames, one per line, without any other information.
SDFiles: A single MDL SDFile may contain many structures, which will be processed sequentially.
Multi-mol2: A single SYBYL mol2 file may contain many structures, which will be processed sequentially.
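
The wildcard and list file mechanisms can be illustrated with a short Python sketch. It is not ALMOND code: the function names are hypothetical, and it assumes that each # in the wildcard string stands for exactly one digit, which the example above suggests but the manual does not state.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a pattern such as file###.mol2 into a regular expression.

    Assumption: each '#' matches exactly one digit; every other
    character is matched literally.
    """
    parts = [r"\d" if ch == "#" else re.escape(ch) for ch in pattern]
    return re.compile("".join(parts) + r"\Z")


def resolve_inputs(spec, available):
    """Return the input files selected by `spec`.

    `available` is the list of file names to match wildcards against,
    for instance the result of os.listdir(".").
    """
    if spec.endswith((".lst", ".list")):
        # List file: one valid filename per line, nothing else.
        with open(spec) as handle:
            return [line.strip() for line in handle if line.strip()]
    if "#" in spec:
        regex = wildcard_to_regex(spec)
        return sorted(name for name in available if regex.match(name))
    # Anything else is treated as a single file name.
    return [spec]


if __name__ == "__main__":
    names = ["file000.mol2", "file001.mol2", "file003.mol2", "notes.txt"]
    print(resolve_inputs("file###.mol2", names))
```

Files are returned in sorted order here to mimic sequential import; the order ALMOND actually uses is not documented in this section.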

 

ALMOND 3.3.0 can import the following formats:

Format        Single file   Multiple files (wildcards)   Multiple files (list file)
SYBYL mol2    yes           yes                          yes
GRID kout     yes           yes                          yes
MDL SDFiles   yes           no                           no
Multi-mol2    yes           no                           no

 

The command opens a dialog like this:

 

Import Series Dialog

 

Series type

The User selects here the format of the files that contain the molecular structure/s. The choices are SYBYL mol2, GRIN kout, MDL SDFiles or SYBYL Multi-mol2.

 

Task

For SYBYL mol2, Multi-mol2 and MDL SDFiles, the User has the choice of running the whole process, as specified in the scheme above, or running only the ALMOND conversion engine and GRIN in order to obtain the kout files.

 

Input series file/s

In this text field the User can enter the name of a single file or specify multiple files, either by entering the name of a list file or a string with wildcard characters. If the desired input corresponds to an actual file name of the type specified in the Series type control above, the User can press the Find button to open a standard file selection dialog. The selected file will then be shown in the input line immediately to the left of the button pressed. See the File>>>Open data file command for details about the file selection dialog.

 

ALMOND new file (.alm)

In this text field the User must enter the name of the output file where the newly generated variables and related information will be stored. This file can be opened afterwards in order to inspect the variables, build models, export the variables in ASCII or GOLPE format, etc. We strongly recommend assigning the extension .alm to ALMOND output files. It is possible to type the file name or press the Find button to open a standard file selection dialog. The selected file will then be shown in the input line immediately to the left of the button pressed. See the File>>>Open data file command for details about the file selection dialog.

 

Probes (up to 4 selections)

In this control, the User selects the probes used in the GRID analysis. For ALMOND calculations we suggest using the following probes: DRY, O and N1 (see the background section for more details), but other probes can be selected in special situations. To select a probe, simply click on the corresponding line and it will be highlighted. Up to four selections are allowed.

 

TIP (shape probe)

This control is used to activate the shape probe, which is a special ALMOND probe (see the background section). It is enabled when at least one GRID probe other than DRY is selected. The DRY field cannot be used for shape calculation since it has no positive field. If four GRID probes are selected, the shape probe is disabled since a maximum of four probes is allowed in ALMOND.

 

TIP Options

Adv Options

Clicking on this button opens the Shape parameters dialog window. It is enabled when the shape probe chooser has been turned on.

 

grid spacing (in Å)

This control defines the spacing of the nodes used to compute the MIF in GRID. Usually, this is defined in GRID by the directive NPLA. While NPLA indicates the number of planes computed per Angstrom, here the User must enter the actual separation between the planes, in Angstroms. Therefore, to use a grid spacing of 0.5 Å, the GRID directive NPLA would be set to 2, while here the sliding control is simply moved to 0.5. For most ALMOND analyses, a grid spacing of 0.5 is recommended.
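
The relation between the two conventions is a simple reciprocal. The following Python sketch (the function names are hypothetical, chosen only for illustration) converts between the GRID directive NPLA and the spacing entered in this control:

```python
def npla_to_spacing(npla):
    """Convert NPLA (planes per Angstrom) to a grid spacing in Angstroms."""
    if npla <= 0:
        raise ValueError("NPLA must be positive")
    return 1.0 / npla


def spacing_to_npla(spacing):
    """Convert a grid spacing in Angstroms to the equivalent NPLA value."""
    if spacing <= 0:
        raise ValueError("grid spacing must be positive")
    return 1.0 / spacing


if __name__ == "__main__":
    # NPLA = 2 corresponds to the recommended spacing of 0.5 Angstrom.
    print(npla_to_spacing(2))
    print(spacing_to_npla(0.5))
```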

 

keep KONT

The results of the GRID analysis may be stored in kont files, which are usually quite large. ALMOND can keep these files on disk. This option is not activated by default in order to save disk space.

 

Inter-atom only

This option keeps track of the target atom that interacts most strongly at each grid point. When it is activated, only pairs of nodes belonging to different atoms are considered for building GRIND. This removes many distances measured within a single cluster of nodes produced by a single atom. In most cases the default (on) is a good choice; the control was included mainly to maintain compatibility with old models. The scale immediately to the right of this control defines the weight assigned to the electrostatic contribution when determining the atom that contributes the most to a certain node. The default (1.0) is appropriate for uncharged molecules. For charged molecules, smaller values help to decrease the importance of charged atoms in this assignment.

Adv. Options

In many cases, the structures provided by the Users contain mistakes in the saturation of some atoms. In these situations, the expert system in charge of assigning GRID atom types has two choices: assume that the structure provided is correct, even if this produces a charged compound, or force a certain charge status (neutral or charged). Up to version 3.2, ALMOND always accepted the structures as given. In version 3.3 it is also possible to set up ALMOND to force neutrality or a certain charge in the compounds given.

This button opens a Dialog like this:

Adv Options

COOH ionization

Depending on the setting, ALMOND will assume that the carboxyl groups found have the ionization state provided, charged or neutral.

NH3 ionization

Depending on the setting, ALMOND will assume that the amino groups found have the ionization state provided, neutral or charged.

IMPORTANT: This setting is not stored as a permanent option. The default is always to use the structures as provided. Any different choice must be set again for every series imported.

 

 

Once the controls are set, press the OK button and ALMOND will start processing the files, or press the Cancel button to abort the operation.

 

When importing SDFiles, the program will ask the User for the name of the field label that ALMOND should use to assign the names of the molecules (often <MOLNAME>). If no input is given, ALMOND will use the contents of the first line of each molecule block. If this is also not suitable as an object name, a generic name based on the sequential position of the object in the file will be used.
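
The naming fallback described above can be sketched in Python as follows. This is an illustration only, not the ALMOND implementation: the function names and the generic name format (object001, object002, ...) are hypothetical, "not suitable" is assumed to mean "empty", and a record lacking the requested field is assumed to fall back to its first line. The sketch relies only on the standard SDFile layout, in which records end with a $$$$ line and data items are introduced by a header line such as "> <MOLNAME>".

```python
def split_records(sdf_text):
    """Split SDFile text into records, each a list of lines."""
    records, current = [], []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            records.append(current)
            current = []
        else:
            current.append(line)
    if any(line.strip() for line in current):
        records.append(current)  # last record without a terminator
    return records


def data_field(lines, label):
    """Return the first value line of the data item <label>, or ''."""
    for i, line in enumerate(lines):
        if line.startswith(">") and "<%s>" % label in line:
            if i + 1 < len(lines):
                return lines[i + 1].strip()
    return ""


def sdf_object_names(sdf_text, label=None):
    """Choose one object name per record using the fallback chain."""
    names = []
    for index, lines in enumerate(split_records(sdf_text), start=1):
        name = data_field(lines, label) if label else ""
        if not name and lines:
            name = lines[0].strip()  # first line of the molecule block
        if not name:
            name = "object%03d" % index  # hypothetical generic name
        names.append(name)
    return names
```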

 

 

If any of the files is not found, has an incorrect format or cannot be converted successfully, an error message will be shown. Otherwise, the Calculation Parameters Dialog is shown:

 

Method Dialog

 

In this dialog, the User can change the different parameters that control the ALMOND encoding engine. Follow this link for a detailed description of the Calculation Parameters Dialog.

 


 

File>>>Import fields...

 

ALMOND can generate molecular interaction field (MIF) data with the integrated GRID engine, using the interface described above (see File>>>Import series...). MIFs computed beforehand in GRID or in other programs can be imported using this command.

 

ALMOND supports three external MIF formats:

  1. GRID kont, containing one or several fields for one or many molecules
  2. GRID ASCII, containing one or several fields for one or many molecules. File must have been generated with the directive LIST set to -1
  3. GOLPE dat, containing one or several fields for one or many molecules

 

ALMOND can import just one or a set of these files. There are two methods to define the import of multiple MIFs:

wildcards: The User can type a line containing some wildcard characters. Files with names matching the wildcard string will be imported sequentially. For example, file###.dat will import files such as file000.dat, file001.dat and file003.dat.
list file: The User can enter the name of a file that contains a list of the files to import. A file is recognized by ALMOND as a "list file" because it ends with the extension .lst or .list. List files must contain only valid filenames, one per line, without any other information.

 

Therefore, ALMOND can import data for multiple molecules in several ways.

In this case, the MACC2 variables generated by ALMOND are stored in an X matrix representing the whole series of compounds.

 

 

The command opens a dialog like this:

Import Field Dialog

 

Field type

The User defines here the format of the MIF files. Possible choices are GRID kont, GOLPE dat and GRID ASCII.

 

Input field file/s

In this text field the User can enter the name of a single file or specify multiple files, either by entering the name of a list file or a filename with wildcard characters. If the desired input corresponds to an actual file name of the type specified in the Field type control above, the User can press the Find button to open a standard file selection dialog. The selected file will then be shown in the input line immediately to the left of the button pressed. See the File>>>Open data file... command for details about the file selection dialog.

 

ALMOND new file (.alm)

In this text field the User must enter the name of the output file where the newly generated variables and related information will be stored. This file can be opened later in order to inspect the variables, build models, export the variables in ASCII or GOLPE format, etc. We strongly recommend assigning the extension .alm to ALMOND output files. It is possible to type the file name or press the Find button to open a standard file selection dialog. The selected file will then be shown in the input line immediately to the left of the button pressed. See the File>>>Open data file... command for details about the file selection dialog.

 

 

Once the controls are set, press the OK button and ALMOND will start processing the files, or press the Cancel button to abort the operation.

 

 

If any of the files is not found, has an incorrect format or cannot be converted successfully, an error message will be shown. Otherwise, the Calculation Parameters Dialog is shown:

 

Method Dialog

 

In this dialog, the User can change the different parameters that control the ALMOND encoding engine. Follow this link for a detailed description of the Calculation Parameters Dialog.

 


 

File>>>Import activity...

This command allows the User to import up to 4 dependent variables. These values are often a measure of the biological activity of the compounds, and the crude values need to be transformed to a logarithmic scale. This command imports such data and performs the required transformations in a simple and straightforward way.

The command opens a dialog like this:

 

Import Activity Dialog

 

Find...

Press this button to open a standard file selection dialog. The selected file will then be shown in the input line immediately to the left of the button pressed. See the File>>>Open data file... command for details about the file selection dialog.

 

Activity File (ASCII)

Enter here the name of an ASCII file containing one line for each compound and up to five columns separated by spaces or tabs.

The following examples are valid ASCII activity files:

With a first column of labels:

mol1 0.34
mol2 0.56
mol3 0.01
mol4 0.97

Without a first column of labels, three values of activity:

0.25	0.45	0.34
0.76	1.25	0.98
0.84	0.96	1.56
3.00	2.98	3.00
0.75	1.10	0.85
1.23	1.00	1.25

Please note that the first column is optional and is only a reminder for the User. ALMOND will not compare these names with the actual names of the objects in the file. The first value in this file will be assigned to the first object (in the order in which the objects were first read), the second to the second object, and so on.
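
As an illustration of this layout, the following Python sketch reads an activity file the way the description suggests. It is only a sketch: the function name and the skip_first_field parameter are hypothetical names mirroring the dialog controls, and the check for one to four values per line is inferred from the limit of 4 dependent variables stated above.

```python
def read_activity(text, skip_first_field=False):
    """Parse activity data: one line per object, up to 4 values per line.

    Values are assigned to objects purely by line order; a leading
    label column, if present, is skipped and never compared with the
    object names.
    """
    rows = []
    for number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()  # splits on spaces or tabs
        if not fields:
            continue  # ignore empty lines
        if skip_first_field:
            fields = fields[1:]
        if not 1 <= len(fields) <= 4:
            raise ValueError("line %d: expected 1 to 4 values" % number)
        rows.append([float(field) for field in fields])
    return rows


if __name__ == "__main__":
    labelled = "mol1 0.34\nmol2 0.56\nmol3 0.01\nmol4 0.97\n"
    print(read_activity(labelled, skip_first_field=True))
```

Reading the labelled example without skip_first_field raises a ValueError from float(), which corresponds to the situation the Skip first field control is meant to prevent.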

 

pK (-log)

If this control is selected, the numerical data x will be converted to -log(x). An error will be shown if any of the values is zero or negative. Biological data are usually provided in millimolar, micromolar or nanomolar concentrations. In this dialog it is possible to select the appropriate units for the data.
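
A minimal Python sketch of this transformation is shown below. It assumes a base-10 logarithm and that the value is converted to molar concentration before the logarithm is taken; neither detail is stated explicitly in this section, and the names to_pk and UNIT_TO_MOLAR are hypothetical.

```python
import math

# Conversion factors to molar concentration (assumed unit handling).
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9}


def to_pk(value, unit="M"):
    """Return -log10 of a concentration expressed in the given unit."""
    if value <= 0:
        # Mirrors the error shown for zero or negative values.
        raise ValueError("activity values must be positive")
    return -math.log10(value * UNIT_TO_MOLAR[unit])


if __name__ == "__main__":
    # A 1 micromolar value maps to a pK of about 6.
    print(round(to_pk(1.0, "uM"), 3))
```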

 

Skip first field

As mentioned above, the first column can be used to write a label that reminds the User of the names of the objects. If the file contains such a column, select this control to prevent ALMOND from reading it as numerical data.

 

 

Please notice that, when importing activity data, the new data will replace any previously loaded activity data. Activity data might contain missing values, represented as -99.000, provided that more than one column is imported. It is not allowed to import activity data that assigns only missing values to a certain object.

IMPORTANT: From version 3.0, the activity data are loaded only for the non-excluded objects. That is to say, objects excluded before activity importing are ignored and have all their activity values set to -99.000 (missing values).

 


 

File>>>Import structure list...

This command allows the User to import the name of the structures to load in the plotGrid application.

During field importing, the names of the files containing the molecular structures are not extracted from the field file. These file names can be imported from a text file containing one file name per line. Each line corresponds to an object of the ALMOND file.

 

Import structure list window

 

Structures located in a directory other than the current one can be imported if the path of the directory containing the files precedes the filename in the structure list.

The following example is a valid structure list for five molecules:

file1.kout
./file2.kout
./FILE_LIST/file3.kout
./FILE_LIST/file4.kout
./FILE_LIST/file5.mol2


 


 

File>>>Export data

 

Export Data Menu

 

The results of the ALMOND encoding engine can be exported as GOLPE .dat files or as plain text files in tabular format.

In any case, the data is always exported "as it is" in the program at a certain time. If the User applies scaling or removes objects or variables, these transformations will be also applied to the data exported.


File>>>Export data>>>GOLPE format...

The variables produced by the ALMOND encoding engine can be exported to GOLPE using this command. It opens a standard file selection dialog where the User can enter the name of the output file.

 

Export Data GOLPE Dialog

 

The command will produce the output of the .dat file plus some auxiliary files that define the blocks of variables and their names.

The commands Model>>>External PCA predictions... and Model>>>External PLS predictions... require as input GOLPE dat files like the ones created with this command. However, when exporting data with this purpose please make sure that the data is not scaled. Only data with RAW scaling is suitable for external predictions.

 


 

File>>>Export data>>>ASCII format...

The variables produced by the ALMOND encoding engine can be exported to many applications using this command. It opens a standard file selection dialog where the User can enter the name of the output file.

 

Export Data ASCII Dialog

 

The output files have a tabular, tab-separated format, so the lines can be extremely long. This format is not intended to be human-readable but to be imported into generic spreadsheets, statistical analysis or plotting programs.

 


 

File>>>Object type

 

Type of Objects menu

 


 

File>>>Object type>>>List...

 

Shows in the main window a summary of the type of objects and the colors associated to them.

 


 

File>>>Object type>>>Modify (manual)...

 

Often, the objects of a data set can be considered to belong to different types. For instance, they can be molecules of two chemical families, obtained from modifications of two different prototypes, or samples obtained from different sources, etc... In such cases it is interesting to highlight these peculiarities in the 2D and 3D objects plots, in order to identify any clustering of the objects related with those properties.

ALMOND allows the User to define different types of objects and to assign a different color to each type. Please notice that the types will be used only to give different colors to the objects in the plots and they have absolutely no influence in the results of the analysis.

In data files containing few objects the User can assign the colors directly. In data files with hundreds or thousands of objects, however, it is more convenient to use the File>>>Object type>>>Modify (rule based) option.

 

Type of Object dialog

 

Object Type

The User can choose here the type (color) to assign. There are 10 options: white, red, green, blue, magenta, cyan, yellow, purple, brown and ivory.

 

Objects:

This is a list of all the objects present in the dataset. Each time the User clicks on an object name, the type of the object will change to the type currently active in Object Type. By default, before making any type change, all the objects are assigned to type 1 (white).

 

When the OK button is pressed, the object types are stored on disk and from this moment they can be seen in the plots. Changes can be applied at any time.

 


 

File>>>Object type>>>Modify (rule based)...

 

Often, the objects of a data set can be considered to belong to different types. For instance, they can be molecules of two chemical families, obtained from modifications of two different prototypes, or samples obtained from different sources, etc... In such cases it is interesting to highlight these peculiarities in the 2D and 3D objects plots, in order to identify any clustering of the objects related with those properties.

ALMOND allows the User to define different types of objects and to assign a different color to each type. Please notice that the types will be used only to give different colors to the objects in the plots and they have absolutely no influence in the results of the analysis.

In data files containing hundreds or thousands of objects it is not practical to define the type by picking one object at a time. When the type is associated with the name of the objects, it is possible to define a rule that can then be used to assign the type of the objects automatically.

 

Type of Object dialog

 

Rule definition

This input field initially contains a line of asterisks. The User should replace the asterisks in this line with the character # in those positions that distinguish the type of the objects. For example, if the data file contains a set of objects called X01, X02, X03... and another series called Y01, Y02, Y03..., the rule #************** will automatically assign different types to the two series, since the first character is the key that identifies the series to which each object belongs. The names of the objects can be more complicated; for instance, there can be a first series of EA2748_A_01, EA2749_A_02, CAB347_A_03 ... and a second one of EA2748_B_01, EA2749_B_02, CAB347_B_03 ... In this case we can use the eighth character (A or B) to classify the objects into the two series, with the rule string *******#**********.

Formally, the rule works by comparing the names of the objects only in those positions marked by the # character. If, in those positions, an object does not match any previous object, it is assigned to a new type. Since there is a maximum of 10 types (colors), type 11 will be assigned color 1 and so on.

Once the rule string has been entered, the User should press the Apply button in order to carry out the type assignment. The results of the assignment can be inspected in the Objects names and types list. If the assignment is satisfactory, press the OK button. Otherwise press the Cancel button and repeat the rule definition.
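
The rule mechanism can be sketched in Python as shown below. The function name assign_types is hypothetical, and the sketch assumes that names shorter than the rule are compared as if the missing positions were empty, a detail the manual does not specify.

```python
def assign_types(names, rule, max_types=10):
    """Assign a type (1..max_types) to each name using a '#' rule.

    Only the positions marked with '#' are compared. Each new
    combination of characters found there starts a new type; types
    beyond max_types wrap around, so type 11 reuses color 1.
    """
    positions = [i for i, ch in enumerate(rule) if ch == "#"]
    seen = {}
    types = []
    for name in names:
        # Assumption: positions beyond the end of a name count as empty.
        key = tuple(name[i] if i < len(name) else "" for i in positions)
        if key not in seen:
            seen[key] = len(seen) + 1
        types.append((seen[key] - 1) % max_types + 1)
    return types


if __name__ == "__main__":
    names = ["EA2748_A_01", "EA2749_A_02", "CAB347_A_03",
             "EA2748_B_01", "EA2749_B_02", "CAB347_B_03"]
    # The eighth character (A or B) separates the two series.
    print(assign_types(names, "*******#**********"))
```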

 

When the OK button is pressed, the object types are stored on disk and from this moment they can be seen in the plots. Changes can be applied at any time.

 


 

File>>>Object type>>>Hide...

 

In some data files there are dense clusters of objects that make it difficult to see some particular object in the 2D and 3D plots. In such situations it is possible to toggle the hidden or visible status of some objects in the plots. It is important to notice that this status affects only the way the objects are shown in the plots. Hidden objects participate in any analysis in the same way as visible objects.

 

Type of Object dialog

 

In this dialog the visible status of the objects can be changed either by picking the objects individually or by using a rule definition.

 

Rule definition

This input field initially contains a line of asterisks. The User should replace the asterisks in this line with the characters that identify the objects to hide. For example, if the data file contains the objects X01, X02, X03, X04, Y01 and Y02 and we want to hide the last two objects, the rule Y************** will automatically assign these two objects a hidden visibility status.

Formally, the rule works by comparing the names of the objects only in those positions not containing an asterisk. If an object matches the rule in those positions, it is assigned a hidden status; if not, it is assigned a visible status.

Once the rule string has been entered, the User should press the Apply button in order to carry out the visibility status assignment. The results of the assignment can be inspected in the Objects names list. If the assignment is satisfactory, press the OK button. Otherwise press the Cancel button and repeat the rule definition.
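
A short Python sketch of this matching is given below. The function name hidden_by_rule is hypothetical. The manual does not say what a rule consisting only of asterisks does; in this sketch such a rule matches (and therefore hides) every object.

```python
def hidden_by_rule(names, rule):
    """Return one flag per name: True if the rule hides the object.

    Only the positions of the rule that do not contain an asterisk
    are compared with the object name.
    """
    def matches(name):
        for i, ch in enumerate(rule):
            if ch == "*":
                continue  # this position is not compared
            if i >= len(name) or name[i] != ch:
                return False
        return True

    return [matches(name) for name in names]


if __name__ == "__main__":
    names = ["X01", "X02", "X03", "X04", "Y01", "Y02"]
    # Hides only the two objects whose name starts with Y.
    print(hidden_by_rule(names, "Y**************"))
```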

 

In addition the user can click on the names of individual objects in the Objects names list. Each click will switch the visible-hidden status of this object.

 

When the OK button is pressed, the visibility status of the objects is stored on disk and from this moment it is reflected in the plots. Changes can be applied at any time.

 


 

File>>>Object type>>>Define spectrum...

 

In ALMOND it is possible to color the objects according to a chromatic scale that represents the values of a certain variable, for example the activity.

First, the User must select the variable from a list containing all the ALMOND variables, in a dialog like this.

 

Spectrum Dialog I

 

Once a variable is selected pressing the OK button the following dialog is shown:

 

Define Spectrum Dialog II

 

Minimum value

Maximum value

The User can move these sliders to define the minimum and the maximum values between which the spectrum will show color variation. Objects with values higher than the maximum or lower than the minimum will be assigned the to and from colors, respectively. Objects with values within the range defined by these two values are drawn in the plots with colors intermediate between the color defined as from and the color defined as to.

 

from

to

In these menus, the User can choose the colors associated with low values of the spectrum variable (from) and with high values of the spectrum variable (to). The two colors must be different. The choices are red, blue and green.
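
The color assignment can be pictured with the following Python sketch. It assumes a linear interpolation between the two colors in RGB space, which is one plausible reading of "intermediate values"; the actual blending used by ALMOND is not documented here, and the names spectrum_color and COLORS are hypothetical.

```python
# RGB components of the three available colors.
COLORS = {
    "red": (1.0, 0.0, 0.0),
    "green": (0.0, 1.0, 0.0),
    "blue": (0.0, 0.0, 1.0),
}


def spectrum_color(value, minimum, maximum, from_color="blue", to_color="red"):
    """Map a value to an RGB color between from_color and to_color.

    Values below the minimum get the from color and values above the
    maximum get the to color (assumed linear blend in between).
    """
    if maximum <= minimum:
        raise ValueError("maximum must be greater than minimum")
    if from_color == to_color:
        raise ValueError("the two colors must be different")
    fraction = (value - minimum) / (maximum - minimum)
    fraction = min(1.0, max(0.0, fraction))  # clamp to the range
    start, end = COLORS[from_color], COLORS[to_color]
    return tuple(s + fraction * (e - s) for s, e in zip(start, end))


if __name__ == "__main__":
    # A value halfway between minimum and maximum.
    print(spectrum_color(2.0, 1.0, 3.0))
```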

 

When the OK button is pressed, the spectrum is saved to disk. From this moment, correlograms and object plots will show the option to color the points according to the defined spectrum.

 


File>>>Model library...

Any PCA or PLS model generated by ALMOND may be saved and managed with the Model Library window.

The command opens a dialog like this:

Model Library Dialog

The selected model can be retrieved by clicking the "Open" button.

Open

Press Open to retrieve the selected ALMOND model.

Add

The "Add" button allows the User to store in the Model Library the model just built, modified or optimized with ALMOND. Only a model resident in memory can be stored in the Model Library. When a model is stored, ALMOND asks for a descriptive label for the model (e.g. Active transport). A new line will then be added to the list of the Model Library.

Delete

The "Delete" button allows the User to delete a model from the list of Library Models.

PLEASE NOTE - Both Add and Delete buttons need write permissions in the Model Library directory to work properly, so please check your permissions on the library directory if you're experiencing problems with these two commands.

Expand

Adds more information to the descriptive label associated with the models.

Info

Retrieves the calculation parameters used in the building phase of the selected library model.

Exit

Use this command to exit the Model Library dialog window.

 


File>>>Direct projection...

 This command can be used to obtain fully automatic predictions (projections on models) starting directly from the 3D structure of chemicals.

Direct projection

Series type

The User selects here the format of the files that contain the molecular structure/s. The choices are SYBYL mol2, GRIN kout, MDL SDFiles or SYBYL Multi-mol2. Go to File>>>Import series for more information about the conditions of use of each file format.

 

Input series file/s

In this text field the User can enter the name of a single file or specify multiple files, either by entering the name of a list file or a string with wildcard characters. If the desired input corresponds to an actual file name of the type specified in the Series type control above, the User can press the Find button to open a standard file selection dialog. The selected file will then be shown in the input line immediately to the left of the button pressed. See the File>>>Open data file command for details about the file selection dialog.

 

Models at Model library Path

The User must select a precomputed library model for the projection.

 

keep KONT

The results of the GRID analysis may be stored in kont files, which are usually quite large. ALMOND can keep these files on disk. This option is not activated by default in order to save disk space.

 

Adv. Options

In many cases, the structures provided by the Users contain mistakes in the saturation of some atoms. In these situations, the expert system in charge of assigning GRID atom types has two choices: assume that the structure provided is correct, even if this produces a charged compound, or force a certain charge status (neutral or charged). Up to version 3.2, ALMOND always accepted the structures as given. In version 3.3 it is also possible to set up ALMOND to force neutrality or a certain charge in the compounds given.

This button opens a Dialog like this:

Adv Options

COOH ionization

Depending on the setting, ALMOND will assume that the carboxyl groups found have the ionization state provided, charged or neutral.

NH3 ionization

Depending on the setting, ALMOND will assume that the amino groups found have the ionization state provided, neutral or charged.

IMPORTANT: This setting is not stored as a permanent option. The default is always to use the structures as provided. Any different choice must be set again for every series imported.

 

Once the OK button is pressed, the projection procedure starts. If a file projection.alm already exists in the current directory, a dialog box asks the User whether to abort the procedure in order to save the files created during a prior projection. If the User chooses to continue, GRIND descriptors are calculated using the same parameters as for the precomputed model and are saved into temporary files starting with the prefix projection.alm. Dummy activity values (i.e. 0.0) are assigned to every object and the prediction palette is opened.


File>>>Quit...

Alt-Q

Select this option to exit from ALMOND. A dialog window will appear to confirm the exit.

 

Exit Dialog

 

Press the OK button to exit from ALMOND or the Cancel button to close the dialog window and continue working.

