e-Science Central

Inkspot Software Tools

Fri, 1 May 2009 11:15
David Leahy

The example workflow shown as HSA_HERG uses a block created from an R script. This R script uses the "rcdk" package by Rajarshi Guha for cheminformatics. The workflow also has one standard block for accessing a file, in this case an .sdf file downloaded from QSAR World that contains chemical structures and hERG toxicity data.

The block readSDF is written as an R script using the rcdk library. It reads an SDF file and produces two outputs: a text data matrix holding the SMILES of the chemical structures, and the property values from the SDF. The block lets the user define a property; in this case it is the SDF tag that labels the property values in the file.
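To make the idea of an SDF tag concrete, here is a minimal base R sketch that pulls the values stored under one tag from a couple of made-up SDF records. It is not how readSDF works (that block uses rcdk); the tag name "Activity", the records, and the helper read_sdf_tag are all assumptions for illustration.

```r
# Two made-up SDF records; the tag name "Activity" is hypothetical.
sdf_lines <- c(
  "ethanol", "  sketch", "",
  "  0  0  0  0  0  0  0  0  0  0999 V2000",
  "M  END",
  "> <Activity>", "5.2", "",
  "$$$$",
  "benzene", "  sketch", "",
  "  0  0  0  0  0  0  0  0  0  0999 V2000",
  "M  END",
  "> <Activity>", "6.8", "",
  "$$$$"
)

# Return the numeric values stored under one data tag, one per record.
# Assumes each value sits on the single line after its "> <tag>" header.
read_sdf_tag <- function(lines, tag) {
  is_header <- startsWith(lines, ">") &
    grepl(paste0("<", tag, ">"), lines, fixed = TRUE)
  as.double(lines[which(is_header) + 1])
}

activity <- read_sdf_tag(sdf_lines, "Activity")
print(activity)
```

The string passed as the tag plays the same role as the property the user sets on the block: it selects which data field of each record becomes the property output.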

To write a script, go to the Apps tab, open the Service Editor and click "Add new service".

Enter the name of the new service and a text description. Overwrite the "My Services" entry if you want to locate the service under a different heading. Add the invocation URL, which locates the RServe service that will run your block, and give the service a name in the final field.

Click on the external script button and select the R service, which is what we are using in this case. Note that Inkspot will be adding more scripting engines for Octave, Python and others in the near future.

Create a new script file by clicking on "New" and write the R script that you would like to use. The script for readSDF is shown here. It loads the rcdk library and parses the molecules by reading the SDF from an input file produced by the previous block, using the load.molecules function from rcdk.

The script then converts the molecules to a matrix of SMILES strings for output into the next block. It also gets the property values from the SDF and passes them as a second output.


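# Load the rcdk library
library(rcdk)

# Parse the molecules from the SDF file passed in by the previous block.
# `infile` is the block input defined in the service editor.
mol <- load.molecules(infile[1])

# First output: the SMILES strings as a text matrix
smi <- data.matrix(sapply(mol, get.smiles))

# Optional diagnostic, visible in the editor's output viewer
print(smi)

# Second output: the property values stored under the SDF tag named by
# the `label` block property, as a matrix of doubles
prop <- data.matrix(as.double(lapply(mol, get.property, key = label)))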
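To see what shape the two outputs take without an RServe instance, rcdk or Java to hand, the sketch below runs the same two conversion lines on stand-in data. The list-based "molecules" and the functions fake_get_smiles and fake_get_property are hypothetical replacements for rcdk objects and accessors, and the values are made up.

```r
# Hypothetical stand-ins for rcdk molecule objects and accessors,
# used only so the sketch runs without rcdk installed.
mol <- list(
  list(smiles = "CCO", props = list(Label = 5.2)),
  list(smiles = "c1ccccc1", props = list(Label = 6.8))
)
fake_get_smiles <- function(m) m$smiles
fake_get_property <- function(m, key) m$props[[key]]

label <- "Label"  # in Inkspot this would come from the block property

# Same conversions as in the readSDF script, with the stand-in accessors
smi <- data.matrix(sapply(mol, fake_get_smiles))
prop <- data.matrix(as.double(lapply(mol, fake_get_property, key = label)))

# Inspect type and dimensions of each output
str(smi)
str(prop)
```

With this stand-in data, smi is a one-column character matrix and prop is a one-column numeric matrix, with one row per molecule.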


Save the script file with a meaningful name. We then need to define the inputs and outputs for the block, as well as any properties that the user can set.

Select "add new input" and give it the name you used in the script, in this case "infile". It is a file wrapper so select that radio button and save.

Select "add new output" and add the outputs. In this case there are two, "smi" and "prop", and both are data wrappers.

Finally, click on "Add new property" and fill in the editor form. In this case the property is a string called "label", which the script uses to identify the property tag in the SDF. We give it a default value of "Label".

Save the new block and it is ready for testing.

If you now return to the workflow editor you should see the block in the services list, under the heading you used. You may need to click refresh. You can test it out on some real data files.

To help with testing, you can add some print statements. The output can be viewed by clicking the small magnifying glass in the bottom right corner of the editor window.
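As a rough illustration of the kind of print statements that can help, the sketch below reports how many molecules were read and how many property values are missing. The smi and prop values here are made-up stand-ins so the snippet runs on its own; in a real block they would be the matrices produced by the script.

```r
# Stand-in outputs (made-up values) so the sketch runs on its own
smi <- data.matrix(c("CCO", "c1ccccc1"))
prop <- data.matrix(c(5.2, NA))

# Diagnostics: row count, missing values, and a peek at the first rows
n_mol <- nrow(smi)
n_missing <- sum(is.na(prop))
cat("molecules read:", n_mol, "\n")
cat("missing property values:", n_missing, "\n")
print(head(smi))
```

A count of missing values is a quick way to spot a property label that does not match the tag used in the file.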

Note that you are allowed to pass variables as matrices of doubles or of text. If your input is text, simply append the input variable name as "input_text" in your script.

 

 

 


Comment by: David Leahy
Wed, 6 Jan 2010 20:33

This is a test email of the notification
