1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116
| #!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long;
my(%extract_jobs_id,$gtf,$submit,$hold_jid);
GetOptions( "gtf:s" => \$gtf,
"submit:s" => \$submit);
$submit ||= "no";
if (!defined $gtf) {
die "Error : missing gtf file\n";
}
my (@gtf_files);
if (!defined $gtf) {
die "Error : define input gtf file(s) with -gtf file1.gtf[,file2.gtf]\n";
} else {
@gtf_files = split(",",$gtf);
}
foreach my $gtf (@gtf_files) {
#create a directory for each bam
my @nn=split(/\./,$gtf);
my $prefix=$nn[0];
my $command = "mkdir $prefix";
qx{$command};
#generate qsub file and launch it
my $qsuboutfile = "$prefix\_transcripts_extraction.qsub";
open (QSUB,">$prefix/$qsuboutfile")or die "\nError : $qsuboutfile file could not be created: $!\n";
print QSUB qq(#!/bin/bash
#\$ -S /bin/bash
#\$ -m e
#\$ -o $qsuboutfile.out
#\$ -e $qsuboutfile.err
#\$ -cwd
#\$ -q unlimitq
# command(s):
hostname >&2
awk '/ transcript_id "CUFF.*"/' ../$prefix.transcripts.gtf > $prefix\_news_exons.gtf
#sed -e "s/^Chr//" $prefix\_news_exons.gtf > $prefix\_news_exons_corrige.gtf
gffread -w $prefix\_new_transcrits.fa -g ../Danio_rerio.Zv9.69.dna.toplevel.fa $prefix\_news_exons.gtf
#gffread -w $prefix\_new_transcrits.fa -g ../Danio_rerio.Zv9.69.dna.toplevel.fa $prefix\_news_exons_corrige.gtf
);
close QSUB;
#launch qsub job if asked
my $jobid;
if ($submit eq "yes") {
my $dependencies = "";
if (defined $hold_jid) {
$dependencies = "-hold_jid $hold_jid";
}
my $instruction = "cd $prefix/; qsub $dependencies $qsuboutfile | awk '{print \$3}' ";
$jobid=qx{$instruction};
chomp $jobid;
print STDERR ".";
} else {
$extract_jobs_id{$gtf} = "";
}
if ($submit eq "yes") {
print STDERR "\n";
}
if ($submit eq "yes") {
print STDERR "scriptR";
}
my @blast_jobs;
my $fisher_test = "$prefix\_stats.R";
open (ROUT,">$prefix\_stats.txt") or die "\nError : $fisher_test file could not be created: $!\n";
print ROUT qq(##read input files
mydata <- read.table("$prefix\_stats.txt")
pdf (file=paste ("$prefix\_stats.pdf"),paper="a4")
layout(matrix(1:4, 2, 2))
hist(mydata\$V4,xlab="exons length",ylab="frequency",col="blue",main="mean size of exons in transcripts")
hist(mydata\$V2,xlab="transcripts length",ylab="frequency",col="blue",main="length of transcripts")
hist(mydata\$V3,xlab="exons number",ylab="frequency",col="blue",main="number of exons in transcripts")
dev.off()
);
close ROUT;
my $qsuboutfile2 = "$prefix\_blast.qsub";
open (QSUB,">$prefix/$qsuboutfile2")or die "\nError : $qsuboutfile2 file could not be created: $!\n";
print QSUB qq(#!/bin/bash
#\$ -S /bin/bash
#\$ -m e
#\$ -o $qsuboutfile2.out
#\$ -e $qsuboutfile2.err
#\$ -cwd
#\$ -q unlimitq
# command(s):
#hostname >&2
blastall -p blastx -d nr -i $prefix\_new_transcrits.fa -e 0.00001 -o $prefix\_new_transcrits.blast
stat_1.pl -gtf_file -| stat_2.pl $prefix\_news_exons.gtf > $prefix\_stats.txt
);
close QSUB;
#launch qsub job if asked
if ($submit eq "yes") {
my $dependencies = "-hold_jid $extract_jobs_id{$gtf}";
my $instruction = "cd $prefix/;qsub $dependencies $qsuboutfile2 | awk '{print \$3}'";
$jobid=qx{$instruction};
chomp $jobid;
push (@blast_jobs,$jobid);
print STDERR ".";
} else {
push (@blast_jobs,"");
} |
Partager