0. Requires the latest version of ERANGE and the SOM code:

    export ERANGEPATH=/woldlab/loxcyc/data00/alim/commoncode
    export SOMPATH=/woldlab/loxcyc/data00/alim/testcode/SOM

   Also make sure that a working version of the SOM C extension has been
   compiled and copied into the SOM directory. If necessary, you can
   recompile the extension using the following commands:

    cd $SOMPATH
    rm -r $SOMPATH/build
    python extension_setup.py build
    cp $SOMPATH/build/lib*/_csom.so .

1. Create a file that lists the ERANGE region files to use for partitioning
   the genome, one per line, of the form:

    /path/to/data/factor1.regionsLC.txt
    /path/to/data/factor2.regionsLC.txt
    ...

   You can skip this step if you only want to train the SOM on a finite set
   of regions (as long as they are in a single region file).

2. Create a file that lists the datasets to be scored on the partition, one
   per line, of the form:

    someid1,/path/to/data/factor1.rds
    someid2,/path/to/data/factor2.rds
    ...

3. Run the partition.sh script with a name, a minimum partition size, and the
   file listing the region files (resulting file: myotubechromN.part, where N
   is the number of region files).

    . $ERANGEPATH/partition.sh myotubechrom 400 ../partition.files

4. Get the RPKMs for each of the datasets listed in the dataset file.

    . $ERANGEPATH/regionCounts.sh myotubechrom6.part ../chrommyotube.files 3000000

5. Build a matrix of the datasets (resulting file: myotubechrom.matrix.tab).

    . $ERANGEPATH/buildMatrix.sh myotubechrom ../chrommyotube.files 100 -rescale

6. Build the SOM. This only saves the weights of the SOM.

    python $SOMPATH/trainsom.py 40 60 myotubechrom.matrix.tab myotubechrom40_60.som -trials 3 -timesteps 4000000

7. Score the SOM. This scores the SOM on a dataset (in this case, the dataset
   used for training).

    python $SOMPATH/scoresom.py myotubechrom40_60.som myotubechrom.matrix.tab myotubechrom40_60.scores

8. Map the SOM. This saves the region distribution, component plane weights,
   and various dataset maps and PNGs in the somvis2 directory with the
   myo_all prefix.

    python $SOMPATH/mapsom.py myotubechrom40_60.som myotubechrom40_60.scores somvis2/myo_all -data myotubechrom.matrix.tab -all -dataheader -savemap

   Note that you can translate the SOM (and hence the PNGs/maps) by dx,dy
   using the additional parameter -translate, e.g. -translate -15,15

9. Subtract two maps to get a third map (and PNG). Be careful to only mix
   maps that are from the same SOM and have been translated to the same set
   of coordinates.

    python $SOMPATH/diffmap.py ../myotubechrom40_60.som ../myotubechrom40_60.scores NRSF_60h_10686.map Control_60h_10194.map NRSF_60h-control.map -png NRSF-control -title 'NRSF 60h - control'

10. You can build additional matrices using steps 4 & 5, which can then be
    mapped using step 8 *without retraining or rescoring the SOM*, by just
    changing the -data parameter of mapsom.py.

11. Another way to generate map differences quickly is to create a 3-column,
    tab-delimited file (named combined.diffs in the example below) listing
    the new map name, the starting map, and the map to subtract, using the
    matrix column names, e.g.:

    Myog_diff        Myog_60h_10158  Myog_exp_10599
    Myog_rdiff       Myog_exp_10599  Myog_60h_10158
    Myog_60h_clean   Myog_60h_10158  Control_60h_10194
    Myog_exp_clean   Myog_exp_10599  Control_exp_10135
    Myog_diff_clean  Myog_60h_clean  Myog_exp_clean

    Note that you can reuse computed difference maps in later lines as long
    as they were calculated on an earlier line. You can then use this file
    with mapsom.py to generate those maps at the same time as the ones in
    your scoring matrix:

    python $SOMPATH/mapsom.py myoboth200.som myoboth200.scores 200som/combScaled -data combScaled.matrix.tab -all -dataheader -translate -20,10 -savemap -diffs combined.diffs -count

    In this case, the additional -count flag will score the sum of the
    signal in the unit, rather than the average.
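
Because a difference map in the diffs file may only refer to matrix columns or to difference maps defined on an earlier line, it can be worth checking the file before a long mapsom.py run. The following is a minimal sketch of such a check. It is not part of the SOM code: the function name check_diffs and the way problems are reported are assumptions made for illustration, and how mapsom.py itself reacts to an out-of-order file is not documented here.

```python
import csv


def check_diffs(diff_lines, matrix_columns):
    """Return a list of problems found in a 3-column, tab-delimited diffs listing.

    diff_lines     -- iterable of text lines (e.g. an open combined.diffs file)
    matrix_columns -- column names available in the scoring matrix
    Hypothetical helper: it only checks ordering, it does not compute any maps.
    """
    known = set(matrix_columns)  # maps that exist before any difference is computed
    problems = []
    for lineno, row in enumerate(csv.reader(diff_lines, delimiter="\t"), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            problems.append(f"line {lineno}: expected 3 columns, got {len(row)}")
            continue
        new_map, start_map, subtract_map = (field.strip() for field in row)
        for name in (start_map, subtract_map):
            if name not in known:
                problems.append(f"line {lineno}: {name} is not defined yet")
        known.add(new_map)  # later lines may now reuse this difference map
    return problems


# Column names taken from the example diffs file above.
matrix_columns = [
    "Myog_60h_10158", "Myog_exp_10599",
    "Control_60h_10194", "Control_exp_10135",
]
example = (
    "Myog_diff\tMyog_60h_10158\tMyog_exp_10599\n"
    "Myog_rdiff\tMyog_exp_10599\tMyog_60h_10158\n"
    "Myog_60h_clean\tMyog_60h_10158\tControl_60h_10194\n"
    "Myog_exp_clean\tMyog_exp_10599\tControl_exp_10135\n"
    "Myog_diff_clean\tMyog_60h_clean\tMyog_exp_clean\n"
)

problems = check_diffs(example.splitlines(), matrix_columns)
for problem in problems:
    print(problem)
```

To check a file on disk, pass the open file object instead, e.g. check_diffs(open("combined.diffs", newline=""), matrix_columns). An empty result means every line only uses names that were already available when it was reached.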
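
The difference between the default scoring and -count can be illustrated with a toy calculation. This sketch only restates the description above (average versus sum of the signal of the regions assigned to one SOM unit). The function unit_signal and the sample values are hypothetical, and the actual implementation in mapsom.py may differ in detail, for example in how empty units are handled.

```python
def unit_signal(values, count=False):
    """Summarize the signal of the regions assigned to a single SOM unit.

    values -- signal values of the regions that map to the unit
    count  -- if True, return the sum (as described for -count);
              otherwise return the average
    Hypothetical helper; empty units are given 0.0 here as an assumption.
    """
    if not values:
        return 0.0
    total = sum(values)
    return total if count else total / len(values)


# Three made-up regions falling into the same unit.
signals = [2.0, 4.0, 6.0]
print(unit_signal(signals))              # average of the three values
print(unit_signal(signals, count=True))  # sum of the three values
```

With the sum, units that collect many regions stand out even when each region has a modest signal, while the average normalizes for the number of regions in the unit.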